Guozhang Wang guozhangwang @confluentinc San Francisco https://www.linkedin.com/in/guozhangwang/ @apache Kafka PMC and committer.

guozhangwang/awesome-streaming 18

a curated list of awesome streaming frameworks, applications, etc

guozhangwang/kafka-streams-examples 4

Demo applications and code examples for Apache Kafka's Streams API.

abbccdda/kafka 2

Mirror of Apache Kafka

guozhangwang/kafka 2

Mirror of Apache Kafka

guozhangwang/bashreduce 0

mapreduce in bash

guozhangwang/common-docker 0

To be deprecated - Confluent Commons with support for building and testing Docker images.

guozhangwang/delta 0

An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.

guozhangwang/examples 0

Example code on running Confluent Platform

guozhangwang/flink 0

Apache Flink

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 public void maybeWaitForNonEmptyTopology(final Supplier<State> threadState) {
         }
     }

-    public void registerAndBuildNewTopology(final InternalTopologyBuilder newTopologyBuilder) {
+    /**
+     * Adds the topology and registers a future that listens for all threads on the older version to see the update
+     */
+    public KafkaFuture<Void> registerAndBuildNewTopology(final InternalTopologyBuilder newTopologyBuilder) {
+        final KafkaFutureImpl<Void> future = new KafkaFutureImpl<>();
         try {
             lock();
             version.topologyVersion.incrementAndGet();
             log.info("Adding NamedTopology {}, latest topology version is {}", newTopologyBuilder.topologyName(), version.topologyVersion.get());
+            version.activeTopologyWaiters.add(new TopologyVersionWaiters(topologyVersion(), future));

Please see my other comment: instead of keeping a single list of topologyVersionWaiters, could we just keep a list of futures per thread and, once a thread has updated, immediately complete all of its futures?

wcarlson5

comment created time in 13 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 private void initializeAndRestorePhase() {

     // Check if the topology has been updated since we last checked, ie via #addNamedTopology or #removeNamedTopology
     private void checkForTopologyUpdates() {
-        if (lastSeenTopologyVersion < topologyMetadata.topologyVersion() || topologyMetadata.isEmpty()) {
-            lastSeenTopologyVersion = topologyMetadata.topologyVersion();
+        if (topologyMetadata.isEmpty() || topologyMetadata.needsUpdate(getName())) {
             taskManager.handleTopologyUpdates();
+            log.info("StreamThread has detected an update to the topology, triggering a rebalance to refresh the assignment");
+            if (topologyMetadata.isEmpty()) {

Why do we add this if condition? I can't see the motivation here.

wcarlson5

comment created time in 13 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 private void unlock() {
         version.topologyLock.unlock();
     }

+    public Collection<String> sourceTopicsForTopology(final String name) {
+        return builders.get(name).sourceTopicCollection();
+    }
+
+    public boolean needsUpdate(final String threadName) {
+        return threadVersions.get(threadName) < topologyVersion();
+    }
+
+    public void registerThread(final String threadName) {
+        threadVersions.put(threadName, 0L);
+    }
+
+    public void unregisterThread(final String threadName) {
+        threadVersions.remove(threadName);
+    }
+
+    public void reachedLatestVersion(final String threadName) {
+        try {
+            lock();
+            final Iterator<TopologyVersionWaiters> iterator = version.activeTopologyWaiters.listIterator();

Correct me if I'm thinking crazily here :P My read is that a thread's topology version can ONLY be incremented (maybe by more than 1), never decremented. So we do not actually need to remember the "waiting" versions as a list; instead we can just keep a list of futures for each thread, and when a thread has successfully run handleTopologyUpdates, we can immediately complete all of its currently maintained futures, since no existing future can be tied to a version newer than the one this thread has just updated to.

With that, we can 1) get rid of the topologyVersionWaiters, and 2) drop the nested loop of while + stream() below to complete futures. Instead we just keep a list of futures per thread, without any additional versions associated with them.
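A minimal sketch of the per-thread futures idea, under stated assumptions: the class name `PerThreadFutures` and its method names are hypothetical, and the JDK's `CompletableFuture` stands in for Kafka's `KafkaFuture` so that the example is self-contained. It is not the code from the PR.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: one list of pending futures per thread, no version bookkeeping.
class PerThreadFutures {
    // thread name -> futures waiting for that thread to pick up the latest topology
    private final Map<String, List<CompletableFuture<Void>>> pending = new HashMap<>();

    synchronized void registerThread(final String threadName) {
        pending.putIfAbsent(threadName, new ArrayList<>());
    }

    // Called on every topology change. The returned future completes once every
    // thread registered at this moment has reported that it caught up.
    synchronized CompletableFuture<Void> registerUpdate() {
        final List<CompletableFuture<Void>> perThread = new ArrayList<>();
        for (final List<CompletableFuture<Void>> futures : pending.values()) {
            final CompletableFuture<Void> future = new CompletableFuture<>();
            futures.add(future);
            perThread.add(future);
        }
        return CompletableFuture.allOf(perThread.toArray(new CompletableFuture[0]));
    }

    // Called by a thread after it has handled topology updates. Versions only grow,
    // so every future queued for this thread so far is satisfied at once.
    synchronized void reachedLatestVersion(final String threadName) {
        final List<CompletableFuture<Void>> futures = pending.get(threadName);
        if (futures == null) {
            return;
        }
        futures.forEach(future -> future.complete(null));
        futures.clear();
    }
}
```

With two registered threads and two consecutive updates, neither returned future is done until both threads have called `reachedLatestVersion`, after which both complete together.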

wcarlson5

comment created time in 13 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 private void unlock() {
         version.topologyLock.unlock();
     }

+    public Collection<String> sourceTopicsForTopology(final String name) {
+        return builders.get(name).sourceTopicCollection();
+    }
+
+    public boolean needsUpdate(final String threadName) {
+        return threadVersions.get(threadName) < topologyVersion();
+    }
+
+    public void registerThread(final String threadName) {
+        threadVersions.put(threadName, 0L);

Does the version start at 0 or at 1? If it starts at 0, is it possible that a newly added thread would not get notified of the initial version 0?

wcarlson5

comment created time in 13 days

Pull request review comment apache/kafka

KAFKA-13454: kafka has duplicate configuration information log information printin…

 class DynamicBrokerConfig(private val kafkaConfig: KafkaConfig) extends Logging
     dynamicDefaultConfigs.clone()
   }

-  private[server] def updateBrokerConfig(brokerId: Int, persistentProps: Properties): Unit = CoreUtils.inWriteLock(lock) {
+  private[server] def updateBrokerConfig(brokerId: Int, persistentProps: Properties, doLog: Boolean = false): Unit = CoreUtils.inWriteLock(lock) {

I think that, in order not to modify the behavior in ConfigHandler, we need to make doLog default to true and then explicitly pass false at line 215 above.

zzccctv

comment created time in 18 days

Pull request review comment apache/kafka

KAFKA-13419: Only reset generation ID when ILLEGAL_GENERATION error

 protected void onJoinPrepare(int generation, String memberId) {

     @Override
     public void onLeavePrepare() {
-        // Save the current Generation and use that to get the memberId, as the hb thread can change it at any time
+        // Save the current Generation, as the hb thread can change it at any time
         final Generation currentGeneration = generation();
-        final String memberId = currentGeneration.memberId;

-        log.debug("Executing onLeavePrepare with generation {} and memberId {}", currentGeneration, memberId);

Cool, thanks.

showuon

comment created time in 18 days

pull request comment apache/kafka

KAFKA-13454: kafka has duplicate configuration information log information printin…

Ah, thanks @zzccctv, that helps.

I'm now wondering whether it is very useful to print the KafkaConfigs in updateDefaultConfig for the cluster-wide configs. What do you think about printing only once, in updateBrokerConfig, since those are the final configs used by this specific broker during initialization anyway?

If you agree, we can augment the updateCurrentConfig function with the validateOnly flag, which would be passed in as false in updateDefaultConfig so we only print it once.

zzccctv

comment created time in 19 days

pull request comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

Hey @wcarlson5, sorry for being late on this review. Overall it looks good, and I think adding the future makes sense. I just feel that we can simplify the thread coordination around triggering a single rebalance a bit.

Also, it seems some related tests are failing.

wcarlson5

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 public NamedTopologyBuilder newNamedTopologyBuilder(final String topologyName) {

     /**
      * Add a new NamedTopology to a running Kafka Streams app. If multiple instances of the application are running,
-     * you should inform all of them by calling {@link #addNamedTopology(NamedTopology)} on each client in order for
+     * you should inform all of them by calling {@code #addNamedTopology(NamedTopology)} on each client in order for
      * it to begin processing the new topology.
      *
      * @throws IllegalArgumentException if this topology name is already in use
      * @throws IllegalStateException    if streams has not been started or has already shut down
      * @throws TopologyException        if this topology subscribes to any input topics or pattern already in use
      */
-    public void addNamedTopology(final NamedTopology newTopology) {
+    public AddNamedTopologyResult addNamedTopology(final NamedTopology newTopology) {
+        log.error("adding {}", newTopology.name());
         if (hasStartedOrFinishedShuttingDown()) {
             throw new IllegalStateException("Cannot add a NamedTopology while the state is " + super.state);
         } else if (getTopologyByName(newTopology.name()).isPresent()) {
             throw new IllegalArgumentException("Unable to add the new NamedTopology " + newTopology.name() +
                                                    " as another of the same name already exists");
         }
-        topologyMetadata.registerAndBuildNewTopology(newTopology.internalTopologyBuilder());
+        return new AddNamedTopologyResult(
+            topologyMetadata.registerAndBuildNewTopology(newTopology.internalTopologyBuilder())
+        );
     }

     /**
      * Remove an existing NamedTopology from a running Kafka Streams app. If multiple instances of the application are
-     * running, you should inform all of them by calling {@link #removeNamedTopology(String)} on each client to ensure
+     * running, you should inform all of them by calling {@code #removeNamedTopology(String)} on each client to ensure
      * it stops processing the old topology.
      *
+     * @param topologyToRemove          name of the topology to be removed
+     * @param resetOffsets              whether to reset the committed offsets for any source topics
+     *
      * @throws IllegalArgumentException if this topology name cannot be found
      * @throws IllegalStateException    if streams has not been started or has already shut down
      * @throws TopologyException        if this topology subscribes to any input topics or pattern already in use
      */
-    public void removeNamedTopology(final String topologyToRemove) {
+    public RemoveNamedTopologyResult removeNamedTopology(final String topologyToRemove, final boolean resetOffsets) {
+        log.error("Removing {}", topologyToRemove);
         if (!isRunningOrRebalancing()) {
             throw new IllegalStateException("Cannot remove a NamedTopology while the state is " + super.state);
         } else if (!getTopologyByName(topologyToRemove).isPresent()) {
             throw new IllegalArgumentException("Unable to locate for removal a NamedTopology called " + topologyToRemove);
         }
+        final Set<TopicPartition> partitionsToReset = metadataForLocalThreads()
+            .stream()
+            .flatMap(t -> {
+                final HashSet<TaskMetadata> tasks = new HashSet<>();
+                tasks.addAll(t.activeTasks());
+                tasks.addAll(t.standbyTasks());
+                return tasks.stream();
+            })
+            .flatMap(t -> t.topicPartitions().stream())
+//            .filter(t -> topologyMetadata.sourceTopicCollection().contains(t))

Thanks! Just wanted to check that we would only want to remove partitions related to the topologyToRemove here :)

wcarlson5

comment created time in 19 days

pull request comment apache/kafka

KAFKA-12960: Enforcing strict retention time for WindowStore and Sess…

@mjsax @ableegoldman could you chime in here?

vamossagar12

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 public void maybeWaitForNonEmptyTopology(final Supplier<State> threadState) {
         if (isEmpty() && threadState.get().isAlive()) {
             try {
                 lock();
-                while (isEmpty() && threadState.get().isAlive()) {

See my other comment: I think this function maybeWaitForNonEmptyTopology would not be needed any more, since its only value is the version.topologyCV.await(); part, which we can just move up into the caller's while loop.

wcarlson5

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 private void initializeAndRestorePhase() {

     // Check if the topology has been updated since we last checked, ie via #addNamedTopology or #removeNamedTopology
     private void checkForTopologyUpdates() {
-        if (lastSeenTopologyVersion < topologyMetadata.topologyVersion() || topologyMetadata.isEmpty()) {
-            lastSeenTopologyVersion = topologyMetadata.topologyVersion();
-            taskManager.handleTopologyUpdates();
-

For line 915 below: do we still need this line? Could we just inline the Condition topologyCV waiting within this while loop now?

wcarlson5

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 private void initializeAndRestorePhase() {

     // Check if the topology has been updated since we last checked, ie via #addNamedTopology or #removeNamedTopology
     private void checkForTopologyUpdates() {
-        if (lastSeenTopologyVersion < topologyMetadata.topologyVersion() || topologyMetadata.isEmpty()) {
-            lastSeenTopologyVersion = topologyMetadata.topologyVersion();
-            taskManager.handleTopologyUpdates();
-
+        do {
             topologyMetadata.maybeWaitForNonEmptyTopology(() -> state);
+            if (lastSeenTopologyVersion < topologyMetadata.topologyVersion()) {

As for the future: I think it's okay to let the first future wait a bit longer if there are consecutive operations, i.e. in the above implementation we would complete all futures that have been registered so far once every still-alive thread has eventually caught up with the bumped version.

wcarlson5

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 public NamedTopologyBuilder newNamedTopologyBuilder(final String topologyName) {

     /**
      * Add a new NamedTopology to a running Kafka Streams app. If multiple instances of the application are running,
-     * you should inform all of them by calling {@link #addNamedTopology(NamedTopology)} on each client in order for
+     * you should inform all of them by calling {@code #addNamedTopology(NamedTopology)} on each client in order for
      * it to begin processing the new topology.
      *
      * @throws IllegalArgumentException if this topology name is already in use
      * @throws IllegalStateException    if streams has not been started or has already shut down
      * @throws TopologyException        if this topology subscribes to any input topics or pattern already in use
      */
-    public void addNamedTopology(final NamedTopology newTopology) {
+    public AddNamedTopologyResult addNamedTopology(final NamedTopology newTopology) {
+        log.error("adding {}", newTopology.name());
         if (hasStartedOrFinishedShuttingDown()) {
             throw new IllegalStateException("Cannot add a NamedTopology while the state is " + super.state);
         } else if (getTopologyByName(newTopology.name()).isPresent()) {
             throw new IllegalArgumentException("Unable to add the new NamedTopology " + newTopology.name() +
                                                    " as another of the same name already exists");
         }
-        topologyMetadata.registerAndBuildNewTopology(newTopology.internalTopologyBuilder());
+        return new AddNamedTopologyResult(
+            topologyMetadata.registerAndBuildNewTopology(newTopology.internalTopologyBuilder())
+        );
     }

     /**
      * Remove an existing NamedTopology from a running Kafka Streams app. If multiple instances of the application are
-     * running, you should inform all of them by calling {@link #removeNamedTopology(String)} on each client to ensure
+     * running, you should inform all of them by calling {@code #removeNamedTopology(String)} on each client to ensure
      * it stops processing the old topology.
      *
+     * @param topologyToRemove          name of the topology to be removed
+     * @param resetOffsets              whether to reset the committed offsets for any source topics
+     *
      * @throws IllegalArgumentException if this topology name cannot be found
      * @throws IllegalStateException    if streams has not been started or has already shut down
      * @throws TopologyException        if this topology subscribes to any input topics or pattern already in use
      */
-    public void removeNamedTopology(final String topologyToRemove) {
+    public RemoveNamedTopologyResult removeNamedTopology(final String topologyToRemove, final boolean resetOffsets) {
+        log.error("Removing {}", topologyToRemove);
         if (!isRunningOrRebalancing()) {
             throw new IllegalStateException("Cannot remove a NamedTopology while the state is " + super.state);
         } else if (!getTopologyByName(topologyToRemove).isPresent()) {
             throw new IllegalArgumentException("Unable to locate for removal a NamedTopology called " + topologyToRemove);
         }
+        final Set<TopicPartition> partitionsToReset = metadataForLocalThreads()
+            .stream()
+            .flatMap(t -> {
+                final HashSet<TaskMetadata> tasks = new HashSet<>();
+                tasks.addAll(t.activeTasks());
+                tasks.addAll(t.standbyTasks());
+                return tasks.stream();
+            })
+            .flatMap(t -> t.topicPartitions().stream())
+//            .filter(t -> topologyMetadata.sourceTopicCollection().contains(t))

Intentional?

wcarlson5

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 private void initializeAndRestorePhase() {

     // Check if the topology has been updated since we last checked, ie via #addNamedTopology or #removeNamedTopology
     private void checkForTopologyUpdates() {
-        if (lastSeenTopologyVersion < topologyMetadata.topologyVersion() || topologyMetadata.isEmpty()) {
-            lastSeenTopologyVersion = topologyMetadata.topologyVersion();
-            taskManager.handleTopologyUpdates();
-
+        do {
             topologyMetadata.maybeWaitForNonEmptyTopology(() -> state);
+            if (lastSeenTopologyVersion < topologyMetadata.topologyVersion()) {

I feel this is getting more complicated than necessary for KAFKA-12648: we have a topologyMetadata at the KafkaStreams layer, while at the StreamThread layer we keep a long lastSeenTopologyVersion. If we let the topologyMetadata object, which is shared among all threads (as well as their task managers etc.), track each thread's version, then we can simplify this. Just a sketchy thought here:

  • In TopologyMetadata we maintain a Map<String, Long> from thread name to that thread's current topology version. When new threads are added or threads are removed, this map would be updated as well (in a synchronized way).
  • Inside a thread's checkForTopologyUpdate, in a synchronized block, check whether all threads' versions except this thread's are equal to the current version: if yes, this thread would update its own map entry as well and then trigger a rebalance; otherwise, it only updates its own map entry.

So if consecutive topology updates are being issued, we would only trigger a rebalance at the end, when all threads reach the final topology version.
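The two bullets above could be sketched roughly as follows. This is only an illustration: `ThreadVersions`, `bumpTopologyVersion`, and the boolean return value are hypothetical names and choices, as are the early return for a thread that is already up to date and the choice to register new threads at the current version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a shared, synchronized thread-name -> version map.
class ThreadVersions {
    private final Map<String, Long> threadVersions = new HashMap<>();
    private long topologyVersion = 0L;

    synchronized void registerThread(final String threadName) {
        // Assumption: a new thread starts at the current version.
        threadVersions.put(threadName, topologyVersion);
    }

    synchronized void unregisterThread(final String threadName) {
        threadVersions.remove(threadName);
    }

    // Stands in for adding or removing a named topology.
    synchronized void bumpTopologyVersion() {
        topologyVersion++;
    }

    // Returns true if the calling thread is the last one to catch up and should
    // therefore trigger the rebalance; otherwise it only records its own progress.
    synchronized boolean checkForTopologyUpdate(final String threadName) {
        final Long seen = threadVersions.get(threadName);
        if (seen == null || seen == topologyVersion) {
            return false; // unknown thread, or nothing new to handle
        }
        boolean othersCaughtUp = true;
        for (final Map.Entry<String, Long> entry : threadVersions.entrySet()) {
            if (!entry.getKey().equals(threadName) && entry.getValue() != topologyVersion) {
                othersCaughtUp = false;
                break;
            }
        }
        threadVersions.put(threadName, topologyVersion);
        return othersCaughtUp;
    }
}
```

With two threads and two consecutive version bumps, the first thread to check only records its new version, and the second one is the single caller that gets `true`.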

wcarlson5

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 public NamedTopologyBuilder newNamedTopologyBuilder(final String topologyName) {

     /**
      * Add a new NamedTopology to a running Kafka Streams app. If multiple instances of the application are running,
-     * you should inform all of them by calling {@link #addNamedTopology(NamedTopology)} on each client in order for
+     * you should inform all of them by calling {@code #addNamedTopology(NamedTopology)} on each client in order for
      * it to begin processing the new topology.
      *
      * @throws IllegalArgumentException if this topology name is already in use
      * @throws IllegalStateException    if streams has not been started or has already shut down
      * @throws TopologyException        if this topology subscribes to any input topics or pattern already in use
      */
-    public void addNamedTopology(final NamedTopology newTopology) {
+    public AddNamedTopologyResult addNamedTopology(final NamedTopology newTopology) {
+        log.error("adding {}", newTopology.name());
         if (hasStartedOrFinishedShuttingDown()) {
             throw new IllegalStateException("Cannot add a NamedTopology while the state is " + super.state);
         } else if (getTopologyByName(newTopology.name()).isPresent()) {
             throw new IllegalArgumentException("Unable to add the new NamedTopology " + newTopology.name() +
                                                    " as another of the same name already exists");
         }
-        topologyMetadata.registerAndBuildNewTopology(newTopology.internalTopologyBuilder());
+        return new AddNamedTopologyResult(
+            topologyMetadata.registerAndBuildNewTopology(newTopology.internalTopologyBuilder())
+        );
     }

     /**
      * Remove an existing NamedTopology from a running Kafka Streams app. If multiple instances of the application are
-     * running, you should inform all of them by calling {@link #removeNamedTopology(String)} on each client to ensure
+     * running, you should inform all of them by calling {@code #removeNamedTopology(String)} on each client to ensure
      * it stops processing the old topology.
      *
+     * @param topologyToRemove          name of the topology to be removed
+     * @param resetOffsets              whether to reset the committed offsets for any source topics
+     *
      * @throws IllegalArgumentException if this topology name cannot be found
      * @throws IllegalStateException    if streams has not been started or has already shut down
      * @throws TopologyException        if this topology subscribes to any input topics or pattern already in use
      */
-    public void removeNamedTopology(final String topologyToRemove) {
+    public RemoveNamedTopologyResult removeNamedTopology(final String topologyToRemove, final boolean resetOffsets) {
+        log.error("Removing {}", topologyToRemove);
         if (!isRunningOrRebalancing()) {
             throw new IllegalStateException("Cannot remove a NamedTopology while the state is " + super.state);
         } else if (!getTopologyByName(topologyToRemove).isPresent()) {
             throw new IllegalArgumentException("Unable to locate for removal a NamedTopology called " + topologyToRemove);
         }
+        final Set<TopicPartition> partitionsToReset = metadataForLocalThreads()
+            .stream()
+            .flatMap(t -> {
+                final HashSet<TaskMetadata> tasks = new HashSet<>();
+                tasks.addAll(t.activeTasks());
+                tasks.addAll(t.standbyTasks());
+                return tasks.stream();
+            })
+            .flatMap(t -> t.topicPartitions().stream())
+//            .filter(t -> topologyMetadata.sourceTopicCollection().contains(t))

Also, this logic seems like it would just include all partitions of all tasks, since it does not involve the topologyToRemove at all?
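One way the restriction to the removed topology might look, as a sketch only: it assumes the topology's source topics are available as a collection of topic names (as `sourceTopicsForTopology` in the same PR appears to return), and it uses a hypothetical stand-in `TopicPartition` class plus a hypothetical helper `partitionsToReset` so the example stays self-contained.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class PartitionFilterSketch {
    // Hypothetical stand-in for Kafka's TopicPartition.
    static final class TopicPartition {
        final String topic;
        final int partition;

        TopicPartition(final String topic, final int partition) {
            this.topic = topic;
            this.partition = partition;
        }

        @Override
        public boolean equals(final Object o) {
            if (!(o instanceof TopicPartition)) {
                return false;
            }
            final TopicPartition other = (TopicPartition) o;
            return partition == other.partition && topic.equals(other.topic);
        }

        @Override
        public int hashCode() {
            return Objects.hash(topic, partition);
        }
    }

    // Keep only partitions whose topic is a source topic of the topology being removed.
    // Note the comparison is on the topic name, not on the TopicPartition object itself.
    static Set<TopicPartition> partitionsToReset(final Collection<TopicPartition> allLocalPartitions,
                                                 final Collection<String> sourceTopicsOfRemovedTopology) {
        final Set<String> topics = new HashSet<>(sourceTopicsOfRemovedTopology);
        return allLocalPartitions.stream()
            .filter(tp -> topics.contains(tp.topic))
            .collect(Collectors.toSet());
    }
}
```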

wcarlson5

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-12648: Make changing the named topologies blocking

 public TopologyMetadata(final InternalTopologyBuilder builder,
         } else {
             builders.put(UNNAMED_TOPOLOGY, builder);
         }
+        getStreamThreadCount = () -> getNumStreamThreads(config);

Why do we need to initialize this supplier in the constructor (ditto below)? This variable is only called in reachedVersion, when stream threads have already been initialized.

wcarlson5

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-13419: Only reset generation ID when ILLEGAL_GENERATION error

 protected void onJoinPrepare(int generation, String memberId) {

     @Override
     public void onLeavePrepare() {
-        // Save the current Generation and use that to get the memberId, as the hb thread can change it at any time
+        // Save the current Generation, as the hb thread can change it at any time
         final Generation currentGeneration = generation();
-        final String memberId = currentGeneration.memberId;

-        log.debug("Executing onLeavePrepare with generation {} and memberId {}", currentGeneration, memberId);
+        log.debug("Executing onLeavePrepare with generation {}", currentGeneration);

         // we should reset assignment and trigger the callback before leaving group
         Set<TopicPartition> droppedPartitions = new HashSet<>(subscriptions.assignedPartitions());

         if (subscriptions.hasAutoAssignedPartitions() && !droppedPartitions.isEmpty()) {
             final Exception e;
-            if (generation() == Generation.NO_GENERATION || rebalanceInProgress()) {
+            if (currentGeneration.equals(Generation.NO_GENERATION) || rebalanceInProgress()) {

Good catch.

showuon

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-13419: Only reset generation ID when ILLEGAL_GENERATION error

 public void handle(JoinGroupResponse joinResponse, RequestFuture<ByteBuffer> fut
                 // only need to reset the member id if generation has not been changed,
                 // then retry immediately
                 if (generationUnchanged())
-                    resetGenerationOnResponseError(ApiKeys.JOIN_GROUP, error);
+                    resetGenerationOnResponseError(ApiKeys.JOIN_GROUP, error, true);

nit: maybe it's now better to rename this function, to resetStateOnResponseError?

showuon

comment created time in 19 days

Pull request review comment apache/kafka

KAFKA-13419: Only reset generation ID when ILLEGAL_GENERATION error

 protected void onJoinPrepare(int generation, String memberId) {

     @Override
     public void onLeavePrepare() {
-        // Save the current Generation and use that to get the memberId, as the hb thread can change it at any time
+        // Save the current Generation, as the hb thread can change it at any time
         final Generation currentGeneration = generation();
-        final String memberId = currentGeneration.memberId;

-        log.debug("Executing onLeavePrepare with generation {} and memberId {}", currentGeneration, memberId);

What's the rationale for removing the member id from the logging?

showuon

comment created time in 19 days
