If you are wondering where the data of this site comes from, please visit https://api.github.com/users/xvrl/events. GitMemory does not store any data; it only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.

xvrl/charts 0

Curated applications for Kubernetes

xvrl/common 0

Common utilities library containing metrics, config and utils

xvrl/confluent-docker-utils 0

Common Python utils for testing Confluent's Docker images

xvrl/cp-docker-images 0

Docker images for Confluent Platform.

xvrl/druid 0

Column oriented distributed data store ideal for powering interactive applications

xvrl/druid-docker 0

Docker container running Druid.io

xvrl/ducktape 0

System integration and performance tests

xvrl/FlameGraph 0

Stack trace visualizer

xvrl/homebrew-core 0

:beers: Core formulae for the Homebrew package manager

PullRequestReviewEvent

Pull request review comment confluentinc/druid

[METRICS-3631] Added compile phases before checks

 blocks:
   - name: "forbidden api checks"
     commands:
+      - ${MVN} compile ${MAVEN_SKIP} ${MAVEN_SKIP_TESTS}

why not test-compile here as well since we also check test code below?

IvanVan

comment created time in 7 days

PullRequestReviewEvent
PullRequestReviewEvent
PullRequestReviewEvent

Pull request review comment confluentinc/druid

Kafka input format changes

+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.data.input.kafkainput;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.druid.data.input.InputEntityReader;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.data.input.InputRowListPlusRawValues;
+import org.apache.druid.data.input.InputRowSchema;
+import org.apache.druid.data.input.MapBasedInputRow;
+import org.apache.druid.data.input.kafka.KafkaRecordEntity;
+import org.apache.druid.java.util.common.CloseableIterators;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.java.util.common.parsers.CloseableIterator;
+import org.apache.druid.java.util.common.parsers.ParseException;
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+
+public class KafkaInputReader implements InputEntityReader
+{
+  private static final Logger log = new Logger(KafkaInputReader.class);
+
+  private final InputRowSchema inputRowSchema;
+  private final KafkaRecordEntity record;
+  private final KafkaHeaderReader headerParser;
+  private final InputEntityReader keyParser;
+  private final InputEntityReader valueParser;
+  private final String keyColumnPrefix;
+  private final String recordTimestampColumnPrefix;
+
+  /**
+   *
+   * @param inputRowSchema Actual schema from the ingestion spec
+   * @param record kafka record containing header, key & value
+   * @param headerParser Header parser for parsing the header section, kafkaInputFormat allows users to skip header parsing section and hence an be null
+   * @param keyParser Key parser for key section, can be null as well
+   * @param valueParser Value parser is a required section in kafkaInputFormat, but because of tombstone records we can have a null parser here.
+   * @param keyColumnPrefix Default key column prefix
+   * @param recordTimestampColumnPrefix Default kafka record's timestamp prefix
+   */
+  public KafkaInputReader(
+      InputRowSchema inputRowSchema,
+      KafkaRecordEntity record,
+      KafkaHeaderReader headerParser,
+      InputEntityReader keyParser,
+      InputEntityReader valueParser,
+      String keyColumnPrefix,
+      String recordTimestampColumnPrefix
+  )
+  {
+    this.inputRowSchema = inputRowSchema;
+    this.record = record;
+    this.headerParser = headerParser;
+    this.keyParser = keyParser;
+    this.valueParser = valueParser;
+    this.keyColumnPrefix = keyColumnPrefix;
+    this.recordTimestampColumnPrefix = recordTimestampColumnPrefix;
+  }
+
+  private CloseableIterator<InputRow> buildBlendedRows(InputEntityReader valueParser, Map<String, Object> headerKeyList) throws IOException
+  {
+    return valueParser.read().map(
+        r -> {
+          MapBasedInputRow valueRow;
+          try {
+            // Return type for the value parser should be of type MapBasedInputRow
+            // Parsers returning other types are not compatible currently.
+            valueRow = (MapBasedInputRow) r;
+          }
+          catch (ClassCastException e) {
+            throw new ParseException("Unsupported input format in valueFormat. KafkaInputformat only supports input format that return MapBasedInputRow rows");
+          }
+          Map<String, Object> event = new HashMap<>(headerKeyList);
+          /* Currently we prefer payload attributes if there is a collision in names.
+              We can change this beahvior in later changes with a config knob. This default
+              behavior lets easy porting of existing inputFormats to the new one without any changes.
+            */
+          event.putAll(valueRow.getEvent());
+
+          HashSet<String> newDimensions = new HashSet<String>(valueRow.getDimensions());
+          newDimensions.addAll(headerKeyList.keySet());
+          // Remove the dummy timestamp added in KafkaInputFormat
+          newDimensions.remove("__kif_auto_timestamp");
+
+          final List<String> schemaDimensions = inputRowSchema.getDimensionsSpec().getDimensionNames();
+          final List<String> dimensions;
+          if (!schemaDimensions.isEmpty()) {
+            dimensions = schemaDimensions;
+          } else {
+            dimensions = Lists.newArrayList(
+                Sets.difference(newDimensions, inputRowSchema.getDimensionsSpec().getDimensionExclusions())
+            );
+          }
+
+          return new MapBasedInputRow(
+              inputRowSchema.getTimestampSpec().extractTimestamp(event),
+              dimensions,
+              event
+          );
+        }
+    );
+  }
+
+  private CloseableIterator<InputRow> buildRowsWithoutValuePayload(Map<String, Object> headerKeyList)
+  {
+    HashSet<String> newDimensions = new HashSet<String>(headerKeyList.keySet());
+    final List<String> schemaDimensions = inputRowSchema.getDimensionsSpec().getDimensionNames();
+    final List<String> dimensions;
+    if (!schemaDimensions.isEmpty()) {
+      dimensions = schemaDimensions;
+    } else {
+      dimensions = Lists.newArrayList(
+          Sets.difference(newDimensions, inputRowSchema.getDimensionsSpec().getDimensionExclusions())
+      );
+    }
+    InputRow row = new MapBasedInputRow(
+        inputRowSchema.getTimestampSpec().extractTimestamp(headerKeyList),
+        dimensions,
+        headerKeyList
+    );
+    List<InputRow> rows = Collections.singletonList(row);
+    return CloseableIterators.withEmptyBaggage(rows.iterator());
+  }
+
+  @Override
+  public CloseableIterator<InputRow> read() throws IOException
+  {
+    Map<String, Object> mergeList = new HashMap<>();

nit, calling it a list is a bit of a misnomer.

lokesh-lingarajan

comment created time in a month

PullRequestReviewEvent
PullRequestReviewEvent

Pull request review comment confluentinc/druid

Kafka input format changes

+  private CloseableIterator<InputRow> buildRowsWithoutValuePayload(Map<String, Object> headerKeyList)
+  {
+    HashSet<String> newDimensions = new HashSet<String>(headerKeyList.keySet());
+    final List<String> schemaDimensions = inputRowSchema.getDimensionsSpec().getDimensionNames();
+    final List<String> dimensions;
+    if (!schemaDimensions.isEmpty()) {
+      dimensions = schemaDimensions;
+    } else {
+      dimensions = Lists.newArrayList(
+          Sets.difference(newDimensions, inputRowSchema.getDimensionsSpec().getDimensionExclusions())
+      );
+    }

we seem to be repeating this code twice, any chance we can avoid this repetition?
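One way to address the repetition, as a rough sketch: hoist the dimension-list computation into a private helper that both buildBlendedRows and buildRowsWithoutValuePayload call. The helper name getFinalDimensionList is made up here, not taken from the PR.

  // Hypothetical helper consolidating the duplicated dimension-list logic.
  // Assumes java.util.Set is imported alongside the existing java.util imports.
  private List<String> getFinalDimensionList(Set<String> newDimensions)
  {
    final List<String> schemaDimensions = inputRowSchema.getDimensionsSpec().getDimensionNames();
    if (!schemaDimensions.isEmpty()) {
      return schemaDimensions;
    }
    return Lists.newArrayList(
        Sets.difference(newDimensions, inputRowSchema.getDimensionsSpec().getDimensionExclusions())
    );
  }

Both call sites would then reduce to a single line such as final List<String> dimensions = getFinalDimensionList(newDimensions);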

lokesh-lingarajan

comment created time in a month

Pull request review comment confluentinc/druid

Kafka input format changes

+          HashSet<String> newDimensions = new HashSet<String>(valueRow.getDimensions());
+          newDimensions.addAll(headerKeyList.keySet());
+          // Remove the dummy timestamp added in KafkaInputFormat
+          newDimensions.remove("__kif_auto_timestamp");

can we make this string a constant?
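A minimal sketch of what the reviewer is asking for, hoisting the magic string into a class-level constant; the constant name below is illustrative, not from the PR.

  // Hypothetical constant replacing the magic string.
  private static final String DEFAULT_AUTO_TIMESTAMP_STRING = "__kif_auto_timestamp";

  // The call site above would then read:
  // Remove the dummy timestamp added in KafkaInputFormat
  newDimensions.remove(DEFAULT_AUTO_TIMESTAMP_STRING);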

lokesh-lingarajan

comment created time in a month

PullRequestReviewEvent

Pull request review comment confluentinc/druid

Kafka input format changes

 import org.apache.druid.java.util.common.Pair;
 import org.apache.druid.java.util.common.logger.Logger;
 import org.apache.druid.java.util.common.parsers.CloseableIterator;
+import org.apache.druid.java.util.common.parsers.ParseException;
 
 import java.io.IOException;
-import java.util.ArrayList;
 import java.util.Collections;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
-import java.util.NoSuchElementException;
 
 public class KafkaInputReader implements InputEntityReader
 {
   private static final Logger log = new Logger(KafkaInputReader.class);
-  private static final String DEFAULT_KEY_STRING = "key";
-  private static final String DEFAULT_TIMESTAMP_STRING = "timestamp";
 
   private final InputRowSchema inputRowSchema;
   private final KafkaRecordEntity record;
   private final KafkaHeaderReader headerParser;
   private final InputEntityReader keyParser;
   private final InputEntityReader valueParser;
-  private final String keyLabelPrefix;
-  private final String recordTimestampLabelPrefix;
-
+  private final String keyColumnPrefix;
+  private final String recordTimestampColumnPrefix;
+
+  /**
+   *
+   * @param inputRowSchema Actual schema from the ingestion spec
+   * @param record kafka record containing header, key & value
+   * @param headerParser Header parser for parsing the header section, kafkaInputFormat allows users to skip header parsing section and hence an be null
+   * @param keyParser Key parser for key section, can be null as well
+   * @param valueParser Value parser is a required section in kafkaInputFormat, but because of tombstone records we can have a null parser here.
+   * @param keyColumnPrefix Default key column prefix
+   * @param recordTimestampColumnPrefix Default kafka record's timestamp prefix
+   */
   public KafkaInputReader(
       InputRowSchema inputRowSchema,
       KafkaRecordEntity record,
       KafkaHeaderReader headerParser,
       InputEntityReader keyParser,
       InputEntityReader valueParser,
-      String keyLabelPrefix,
-      String recordTimestampLabelPrefix
+      String keyColumnPrefix,
+      String recordTimestampColumnPrefix

those two fields are no longer prefixes, they are the actual column names, no?

lokesh-lingarajan

comment created time in a month

PullRequestReviewEvent
PullRequestReviewEvent

push event apache/druid

sthetland

commit sha 95c5bc3a6ddeb2ce10f696a352bcda492dba853a

Clarify when changes to credentialIterations take effect (#11590)

This change updates doc to clarify when and how a change to druid.auth.authenticator.basic.credentialIterations takes effect: changes apply only to new users or existing users upon changing their password via the credentials API, which may not be the expectation.

view details

push time in a month

PR merged apache/druid

Clarify when changes to credentialIterations take effect
Area - Documentation

This PR updates doc to clarify when and how a change to druid.auth.authenticator.basic.credentialIterations takes effect: changes apply only to new users or existing users upon changing their password via the credentials API, which may not be the expectation.

+9 -2

0 comment

1 changed file

sthetland

pr closed time in a month

pull request comment apache/kafka

MINOR: fix mbean tag name ordering in JMX reporter

@dajac agree there might be some surprises for users that relied on an ordering that appeared constant but was never guaranteed. Unfortunately, that order depends on a particular JVM's HashMap implementation, which in practice might not vary often, but has changed in the past and could vary by vendor (e.g. Java 8 even called out some changes to the ordering: https://docs.oracle.com/javase/8/docs/technotes/guides/collections/changes8.html)

I'll file a ticket so that we can at least include it in the release notes and call it out as one of the things to be aware of when upgrading. Since we never guaranteed tag ordering though, I don't think we need a KIP for this.
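For illustration only, a self-contained sketch (not the JmxReporter code itself) of why the registered MBean name can vary and how sorting the tag keys makes it deterministic; the tag names used here are just examples.

import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MBeanTagOrderSketch
{
    // Builds the "tag1=value1,tag2=value2" suffix of an MBean name from a tag map.
    // Iterating a HashMap directly gives an unspecified, JVM-dependent order;
    // copying the tags into a TreeMap sorts the keys and yields a stable name.
    static String tagSuffix(Map<String, String> tags)
    {
        return new TreeMap<>(tags).entrySet().stream()
            .map(e -> e.getKey() + "=" + e.getValue())
            .collect(Collectors.joining(","));
    }

    public static void main(String[] args)
    {
        Map<String, String> tags = new HashMap<>();
        tags.put("node-id", "node-0");
        tags.put("client-id", "producer-1");
        // Always prints client-id=producer-1,node-id=node-0, regardless of
        // insertion order or HashMap iteration order.
        System.out.println(tagSuffix(tags));
    }
}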

xvrl

comment created time in a month

PullRequestReviewEvent

Pull request review comment apache/kafka

MINOR: fix mbean tag name ordering in JMX reporter

     public void testJmxRegistration() throws Exception {
         }
     }
 
+    @Test
+    public void testMbeanTagOrdering() {
+        Map<String, String> tags = new HashMap<>();
+        tags.put("tag_a", "x");
+        tags.put("tag_b", "y");
+        tags.put("tag_c", "z");
+        tags.put("tag_d", "1,2");
+        tags.put("tag_e", "");
+        tags.put("tag_f", "3");

the test already fails if I remove the ordering code, but I can change the order if that makes us feel better.

xvrl

comment created time in a month

Pull request review comment confluentinc/druid

Kafka input format changes

+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.data.input.kafkainput;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.druid.data.input.InputEntityReader;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.data.input.InputRowListPlusRawValues;
+import org.apache.druid.data.input.InputRowSchema;
+import org.apache.druid.data.input.MapBasedInputRow;
+import org.apache.druid.data.input.kafka.KafkaRecordEntity;
+import org.apache.druid.java.util.common.CloseableIterators;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.java.util.common.parsers.CloseableIterator;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.NoSuchElementException;
+
+public class KafkaInputReader implements InputEntityReader
+{
+  private static final Logger log = new Logger(KafkaInputReader.class);
+  private static final String DEFAULT_KEY_STRING = "key";
+  private static final String DEFAULT_TIMESTAMP_STRING = "timestamp";
+
+  private final InputRowSchema inputRowSchema;
+  private final KafkaRecordEntity record;
+  private final KafkaHeaderReader headerParser;
+  private final InputEntityReader keyParser;
+  private final InputEntityReader valueParser;
+  private final String keyLabelPrefix;
+  private final String recordTimestampLabelPrefix;
+
+  public KafkaInputReader(
+      InputRowSchema inputRowSchema,
+      KafkaRecordEntity record,
+      KafkaHeaderReader headerParser,
+      InputEntityReader keyParser,
+      InputEntityReader valueParser,
+      String keyLabelPrefix,
+      String recordTimestampLabelPrefix
+  )
+  {
+    this.inputRowSchema = inputRowSchema;
+    this.record = record;
+    this.headerParser = headerParser; //Header parser can be null by config
+    this.keyParser = keyParser; //Key parser can be null by config and data
+    this.valueParser = valueParser; //value parser can be null by data (tombstone records)
+    this.keyLabelPrefix = keyLabelPrefix;
+    this.recordTimestampLabelPrefix = recordTimestampLabelPrefix;
+  }
+
+  @Override
+  public CloseableIterator<InputRow> read() throws IOException
+  {
+    // Add kafka record timestamp to the mergelist
+    List<Pair<String, Object>> mergeList = new ArrayList<>();
+    if (headerParser != null) {
+      List<Pair<String, Object>> headerList = headerParser.read();
+      mergeList.addAll(headerList);
+    }
+
+    Pair ts = new Pair(recordTimestampLabelPrefix + DEFAULT_TIMESTAMP_STRING,
+                      record.getRecord().timestamp());
+    if (!mergeList.contains(ts)) {
+      mergeList.add(ts);
+    }
+
+    if (keyParser != null) {
+      try (CloseableIterator<InputRow> keyIterator = keyParser.read()) {
+        // Key currently only takes the first row and ignores the rest.
+        if (keyIterator.hasNext()) {
+          // Return type for the key parser should be of type MapBasedInputRow
+          // Parsers returning other types are not compatible currently.
+          MapBasedInputRow keyRow = (MapBasedInputRow) keyIterator.next();
+          Pair key = new Pair(keyLabelPrefix + DEFAULT_KEY_STRING,
+                            keyRow.getEvent().entrySet().stream().findFirst().get().getValue());
+          if (!mergeList.contains(key)) {

ok, same here, let's add a comment to explain

lokesh-lingarajan

comment created time in a month

PullRequestReviewEvent

Pull request review comment confluentinc/druid

Kafka input format changes

+  @Override
+  public CloseableIterator<InputRow> read() throws IOException
+  {
+    // Add kafka record timestamp to the mergelist
+    List<Pair<String, Object>> mergeList = new ArrayList<>();
+    if (headerParser != null) {
+      List<Pair<String, Object>> headerList = headerParser.read();
+      mergeList.addAll(headerList);
+    }
+
+    Pair ts = new Pair(recordTimestampLabelPrefix + DEFAULT_TIMESTAMP_STRING,
+                      record.getRecord().timestamp());
+    if (!mergeList.contains(ts)) {

ok let's add a comment to explain why we'd skip entries though
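A sketch of what that comment might say; the stated rationale is an assumption (the diff only shows that duplicate entries are skipped), while the two code lines are taken from the hunk shown earlier.

    // Possible comment wording (assumed intent): skip the record-level timestamp when
    // an identical entry is already present, e.g. one emitted by the header parser,
    // so the merge list does not accumulate duplicates.
    if (!mergeList.contains(ts)) {
      mergeList.add(ts);
    }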

lokesh-lingarajan

comment created time in a month

PullRequestReviewEvent

delete branch confluentinc/druid

delete branch : upstream_cherry-pick

delete time in a month

push event confluentinc/druid

Abhishek Agarwal

commit sha 51a973ad9b34c91e3fc9defbac618df66fbe08e9

Avoid expensive findEntry call in segment metadata query (#10892)

* Avoid expensive findEntry call in segment metadata query
* other places
* Remove findEntry
* Fix add cost
* Refactor a bit
* Add performance test
* Add comment
* Review comments
* intellij

(cherry picked from commit 489f5b1a03c8c3d35374c9eb6fd140f33c4f3fb1)

view details

push time in a month

PR merged confluentinc/druid

Avoid expensive findEntry call in segment metadata query (#10892)
  • Avoid expensive findEntry call in segment metadata query

  • other places

  • Remove findEntry

  • Fix add cost

  • Refactor a bit

  • Add performance test

  • Add comment

  • Review comments

  • intellij

(cherry picked from commit 489f5b1a03c8c3d35374c9eb6fd140f33c4f3fb1)

+448 -134

0 comment

16 changed files

harinirajendran

pr closed time in a month

PullRequestReviewEvent

push event apache/druid

Harini Rajendran

commit sha ccd362d228ba996b51bd406845f824571871cf40

Fix FileIteratingFirehoseTest to extend NullHandlingTest (#11581)

view details

push time in a month

delete branch confluentinc/druid

delete branch : upstream_master

delete time in a month

PR merged apache/druid

Fix FileIteratingFirehoseTest to extend NullHandlingTest
Area - Testing

Problem

FileIteratingFirehoseTest was failing with the following error:

[ERROR] Tests run: 192, Failures: 0, Errors: 180, Skipped: 0, Time elapsed: 1.021 s <<< FAILURE! - in org.apache.druid.data.input.impl.FileIteratingFirehoseTest
[ERROR] testClose[2000,foo], 0 Time elapsed: 0.017 s <<< ERROR!
java.lang.Exception: Unexpected exception, expected<java.lang.RuntimeException> but was<java.lang.ExceptionInInitializerError>
....
Caused by: java.lang.IllegalStateException: NullHandling module not initialized, call NullHandling.initializeForTests()
    at org.apache.druid.common.config.NullHandling.replaceWithDefault(NullHandling.java:71)
    at org.apache.druid.data.input.impl.CsvInputFormat.createOpenCsvParser(CsvInputFormat.java:84)
    at org.apache.druid.data.input.impl.CsvInputFormat.<clinit>(CsvInputFormat.java:41)

Fix

Made FileIteratingFirehoseTest extend NullHandlingTest
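A minimal sketch of that fix, assuming the class names are as described above (package declaration and imports omitted):

// Extending NullHandlingTest presumably runs NullHandling.initializeForTests() before
// CsvInputFormat's static initializer reaches NullHandling.replaceWithDefault(), which
// is where the failure above originates.
public class FileIteratingFirehoseTest extends NullHandlingTest
{
  // existing test cases unchanged
}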

+2 -1

0 comment

1 changed file

harinirajendran

pr closed time in a month