3 Comments

It seems the problem is that Kafka's default unit of parallelism is the partition (N consumers map to N partitions, giving N-way parallelism), but what Uber wanted is parallelism even within a partition.

Wouldn't a much simpler solution be to increase the number of partitions to `numConsumers * numThreadsPerConsumer` and make every consumer thread a separate Kafka consumer of its own? I believe this is how the Kafka creators envisioned it being done.

Another option would be to use Amazon SQS instead of Kafka: it has no concept of partitions, so we can simply scale the number of consumers up however we want.
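To make the partition arithmetic in the first suggestion concrete, here is a minimal sketch. It does not talk to Kafka at all; `plan_partitions` and the one-partition-per-thread layout are hypothetical names and assumptions used only to illustrate the `numConsumers * numThreadsPerConsumer` idea.

```python
def plan_partitions(num_consumers: int, threads_per_consumer: int) -> dict[tuple[int, int], int]:
    """Give every (consumer, thread) pair its own partition index.

    Hypothetical helper: in a real deployment the consumer group
    coordinator decides the assignment, not application code.
    """
    if num_consumers <= 0 or threads_per_consumer <= 0:
        raise ValueError("both counts must be positive")
    return {
        (consumer, thread): consumer * threads_per_consumer + thread
        for consumer in range(num_consumers)
        for thread in range(threads_per_consumer)
    }


if __name__ == "__main__":
    plan = plan_partitions(num_consumers=3, threads_per_consumer=4)
    # The topic would need one partition per consumer thread.
    print("partitions needed:", len(plan))
    for (consumer, thread), partition in sorted(plan.items()):
        print(f"consumer {consumer}, thread {thread} -> partition {partition}")
```

With 3 consumers and 4 threads each, the topic would need 12 partitions, so the partition count has to grow every time more parallelism is wanted.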


How do the consumers register themselves with the proxy?


How is the order of the messages maintained? The proxy layer could read multiple messages from the same user and send them to downstream services in parallel. Can this lead to race conditions?
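One common way to avoid the race described here is to route every message for a given user to the same worker lane, so that messages for one user are processed sequentially while different users proceed in parallel. The sketch below only illustrates that general technique; it is an assumption, not a description of how the proxy in the article works, and `lane_for` and `dispatch` are hypothetical names.

```python
import zlib


def lane_for(user_id: str, num_lanes: int) -> int:
    """Pick a lane from a stable hash so a user always maps to the same lane."""
    return zlib.crc32(user_id.encode("utf-8")) % num_lanes


def dispatch(messages: list[tuple[str, int]], num_lanes: int) -> list[list[tuple[str, int]]]:
    """Append each (user_id, sequence) message to its user's lane in arrival order."""
    lanes: list[list[tuple[str, int]]] = [[] for _ in range(num_lanes)]
    for user_id, sequence in messages:
        lanes[lane_for(user_id, num_lanes)].append((user_id, sequence))
    return lanes


if __name__ == "__main__":
    incoming = [("alice", 1), ("bob", 1), ("alice", 2), ("carol", 1), ("bob", 2), ("alice", 3)]
    for index, lane in enumerate(dispatch(incoming, num_lanes=3)):
        print(f"lane {index}: {lane}")
```

Each lane would be drained by a single worker, so per-user order is preserved at the cost of limiting one user's throughput to one worker.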
