KIP-932 - Queues for Kafka - Analysis of EDA Compatibility

Overview

After reading Gunnar Morling’s blog post investigating queues for Kafka, I started wondering whether the way queues are implemented is at odds with EDA principles and best practices. This is a fairly formal analysis, mostly because of the number of formal concerns.

This analysis examines whether KIP-932 “Queues for Kafka” contradicts or undermines event-driven architecture (EDA) principles. The key question is whether letting several consumers share the records of a single partition represents a fundamental departure from EDA principles or simply introduces an additional consumption model within Kafka’s event streaming paradigm.

Key Finding: KIP-932 does not fundamentally change Kafka’s nature as an event streaming platform but rather supplements it with queue-like processing capabilities. It maintains the core EDA principle of events as the source of truth while adding flexibility in consumption patterns.

High Value Use Cases

These are the scenarios where KIP-932 provides substantial benefits:

Partition Count Limitations: Services that need more consumer parallelism than the current partition count allows, enabling scaling beyond the traditional “one consumer per partition” constraint
Cost Optimization: Reduces the operational cost of maintaining large numbers of partitions in cloud-hosted Kafka deployments (especially relevant for Confluent Cloud pricing models)
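Processing Parallelism: Enables parallel processing of events within a partition, improving throughput for workloads whose records can be processed independently; share groups do not guarantee processing order, so order-sensitive consumers should stay on classic consumer groups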
Dynamic Scaling: Allows horizontal scaling of consumers in response to load without reconfiguring partition counts, especially valuable in container/Kubernetes environments with tools like KEDA - this is basically an unlock to make KEDA amazing!
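
To make the partition-count constraint concrete, here is a minimal sketch comparing how many consumers can do useful work under each model. The function names are hypothetical, and the model is deliberately simplified: it assumes a single topic and ignores any broker-side limit on share group size.

```python
def active_consumers_classic(consumers: int, partitions: int) -> int:
    """Classic consumer group: each partition is owned by exactly one member,
    so members beyond the partition count sit idle."""
    return min(consumers, partitions)


def active_consumers_share(consumers: int, partitions: int) -> int:
    """Share group (simplified model): members may fetch from the same
    partition, so all of them can work as long as a partition exists."""
    return consumers if partitions > 0 else 0


if __name__ == "__main__":
    partitions = 6
    for consumers in (3, 6, 12):
        print(
            consumers,
            active_consumers_classic(consumers, partitions),
            active_consumers_share(consumers, partitions),
        )
```

With 6 partitions and 12 consumers, the classic model leaves half of the consumers idle, which is exactly the ceiling an autoscaler such as KEDA runs into.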

KIP-932 Overview

KIP-932 introduces cooperative consumption through “share groups,” allowing multiple consumers to collectively process messages from the same partition. This feature addresses a significant limitation in Kafka’s original consumer group model, which ties scaling to partition count.

Core Features of KIP-932

Share Groups: Multiple consumers can process records from the same partition
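Record-level Acknowledgment: Consumers acknowledge individual records (accept, release, or reject) rather than committing offsets
Delivery Without Key Affinity: Records are handed to whichever group member fetches next; unlike a classic consumer group, a share group does not route all records with the same key to one consumer and does not guarantee processing order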
Work Sharing: Allows horizontal scaling of consumers independent of partition count
Cooperative Processing: Consumers work together on partitions rather than exclusively owning them
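
The features above can be pictured with a toy in-memory model of the state a broker keeps for one shared partition. This is a sketch under stated assumptions, not the broker's implementation: the class and method names are hypothetical, the state and acknowledgment names follow my reading of the KIP, the delivery limit of 5 is an assumed default, and acquisition-lock timeouts are left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    AVAILABLE = "available"
    ACQUIRED = "acquired"
    ACKNOWLEDGED = "acknowledged"
    ARCHIVED = "archived"


class Ack(Enum):
    ACCEPT = "accept"    # processed successfully
    RELEASE = "release"  # give the record back for another attempt
    REJECT = "reject"    # unprocessable, do not redeliver


@dataclass
class SharePartition:
    """Toy stand-in for the broker-side state of one partition."""
    size: int
    delivery_limit: int = 5  # assumed default, check your broker config
    state: dict = field(default_factory=dict)
    owner: dict = field(default_factory=dict)
    deliveries: dict = field(default_factory=dict)

    def __post_init__(self):
        for offset in range(self.size):
            self.state[offset] = State.AVAILABLE
            self.deliveries[offset] = 0

    def acquire(self, consumer: str, max_records: int) -> list:
        """Hand up to `max_records` available offsets to `consumer`."""
        batch = []
        for offset in range(self.size):
            if len(batch) == max_records:
                break
            if self.state[offset] is State.AVAILABLE:
                self.state[offset] = State.ACQUIRED
                self.owner[offset] = consumer
                self.deliveries[offset] += 1
                batch.append(offset)
        return batch

    def acknowledge(self, consumer: str, offset: int, ack: Ack) -> None:
        """Apply one record-level acknowledgment from `consumer`."""
        if (self.state[offset] is not State.ACQUIRED
                or self.owner.get(offset) != consumer):
            raise ValueError(f"offset {offset} is not acquired by {consumer}")
        del self.owner[offset]
        if ack is Ack.ACCEPT:
            self.state[offset] = State.ACKNOWLEDGED
        elif ack is Ack.REJECT:
            self.state[offset] = State.ARCHIVED
        elif self.deliveries[offset] >= self.delivery_limit:
            self.state[offset] = State.ARCHIVED   # retries exhausted
        else:
            self.state[offset] = State.AVAILABLE  # eligible for redelivery


if __name__ == "__main__":
    sp = SharePartition(size=4)
    first = sp.acquire("consumer-a", 2)
    second = sp.acquire("consumer-b", 2)
    sp.acknowledge("consumer-a", 0, Ack.ACCEPT)
    sp.acknowledge("consumer-a", 1, Ack.RELEASE)
    print(first, second, sp.acquire("consumer-b", 2))
```

Two consumers hold records from the same partition at once, and the released record is picked up by whichever consumer fetches next. That also shows why processing order is not guaranteed.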

Compatibility with Event-Driven Architecture Principles

EDA Principle	KIP-932 Compatibility	Analysis
Events as First-class Citizens	✅ Compatible	Events remain the fundamental unit and source of truth
Event Log as Source of Truth	✅ Compatible	The log structure remains unchanged; new consumption model added on top
Decoupling of Producers/Consumers	✅ Compatible	Maintains or enhances decoupling by allowing flexible consumption
Immutability of Events	✅ Compatible	Events remain immutable in the log
Temporal Ordering	✅ Compatible	The log preserves temporal ordering within partitions; the order in which share-group consumers process records is not guaranteed
Event Replay Capability	✅ Compatible	Log replay capabilities remain intact

Architectural Implications

1. What KIP-932 Changes

Consumption Model: Introduces queue-like consumption patterns without changing event production
Scaling Model: Decouples consumer scaling from partition count
Message Processing Guarantees: Enables record-level acknowledgment rather than just offset-based
Consumer Coordination: Allows cooperative work on partitions instead of exclusive ownership

2. What KIP-932 Preserves

Log-based Storage: Events are still stored in an immutable, append-only log
Event-first Paradigm: Events remain the primary integration mechanism
Producer Independence: Producers are unaffected and continue working exactly as before
Event Replay: The ability to replay events from any point remains intact
Temporal Order: The order of events within each partition of the log is preserved, although share-group consumers may process them out of order

Queue vs. EDA: Feature Comparison

Feature	Traditional Queue	Traditional Kafka	Kafka with KIP-932
Message Retention	Removed after processing	Configurable retention	Configurable retention
Consumption Model	Competing consumers	Consumer groups tied to partitions	Flexible: Traditional or cooperative
Processing Acknowledgment	Message-level ack	Offset-based	Both offset and record-level available
Message Replay	Limited/None	Full replay capability	Full replay capability
Scalability	Limited by competing consumer model	Limited by partition count	Independent of partition count

Assessment of Architectural Impact

KIP-932 represents an evolutionary rather than revolutionary change to Kafka’s architecture. It adds queue-like features while preserving the core event streaming foundation. The primary architectural implication is increased flexibility in how events are consumed, not a fundamental change to how events are produced, stored, or conceptualized.

Key Analysis Points:

Adding vs. Replacing: KIP-932 adds capabilities rather than replacing existing ones
Opt-in Feature: Traditional consumer groups remain fully supported
Log Foundation: The underlying log-based architecture remains unchanged
Event Immutability: Events remain immutable in the log, preserving a core EDA principle
Temporal Ordering: Event ordering within partitions of the log is preserved; only the processing order seen by share-group consumers is relaxed

Recommendations for Maintaining EDA Principles

For teams concerned about maintaining pure EDA principles while adopting KIP-932:

Maintain Event-First Thinking: Continue to model domain changes as events
Use Share Groups Judiciously: Apply queue-like processing only where scaling or specific message distribution is needed
Preserve Event Sourcing Patterns: Continue using events as the source of truth
Document Consumption Models: Clearly separate traditional consumer groups from share groups in documentation
Establish Architecture Guidelines: Create clear guidelines for when each consumption model is appropriate

Technical Implementation Considerations

KIP-932 aligns with Kafka’s core principles while extending consumption capabilities. These are specific considerations when implementing:

Technical Implementation Focus Areas

Area	Specific Consideration	Implementation Approach
Key-Based Processing	Share groups leave key-based partitioning on the producer side untouched, but they do not carry per-key ordering through to consumption, because records from one partition can be processed by different consumers concurrently	Keep existing key-partitioning strategies for producers and classic consumer groups; use share groups only where records can be processed independently of one another
Consumer Code Changes	Moving from offset commits to record acknowledgment requires specific API usage	Use the share consumer API (KafkaShareConsumer in the Java client) explicitly rather than attempting to modify current consumer group implementations
Unkeyed Topics	Share groups suit unkeyed topics whose records are independent units of work; they are a poor fit wherever processing order matters	Reserve share groups for workloads that tolerate out-of-order processing, and keep classic consumer groups where order matters
Rebalance Handling	Share group rebalancing behavior differs from consumer groups and requires specific handling	Implement explicit tests for rebalance scenarios; behavior is well-defined but different
Client Library Support	Adoption will depend on client library implementation across languages	Verify Share Group API support in your programming language’s client libraries before planning implementation

Organizational Adoption Focus

Area	Specific Focus	Implementation Approach
Usage Guidelines	Define clear criteria for when share groups are appropriate (e.g., when partition count limits are reached, when scaling is needed for throughput)	Document specific use cases with concrete examples, focusing on situations where consumer scaling shouldn’t be constrained by partition count
Team Knowledge	Ensure engineers understand that share groups trade Kafka’s per-partition (and therefore per-key) processing order for parallelism	Focused training on which ordering guarantees still hold in the log and which no longer hold during share-group consumption
Implementation Consistency	Standardize how teams implement record acknowledgment and error handling	Create organization-specific client wrappers with standardized acknowledgment patterns; focus especially on error scenarios
Organizational Alignment	Address potential disagreements on adoption by focusing on metrics and use cases	Establish objective criteria for adoption such as CPU utilization improvements, throughput gains, or reduced partition count
Quick Wins	Identify existing bottlenecks that are perfect candidates for share groups	Target services with known key hotspots or those that require partition counts that exceed reasonable management overhead

Specific Application Scenarios for KIP-932

KIP-932 addresses very specific technical challenges that occur in real-world Kafka deployments:

Implementation Checklist

For teams preparing to adopt Share Groups, focus on these specific technical aspects:

Consumer Parallelism Analysis: Use monitoring tools to identify consumer groups that would benefit from additional parallelism beyond partition count; look for services with high lag or slower message processing times where adding more consumers would improve throughput
Consumer Logic Review: Examine the current consumer implementation to ensure idempotent processing where required, since a record can be redelivered after a failure or timeout
Client Library Verification: Confirm that your client library supports share groups (for example, KafkaShareConsumer in the Java client) and record-level acknowledgment
Partition Count Optimization: Calculate the optimal partition count based on producer throughput rather than consumer parallelism requirements; this enables right-sizing partition counts to data volume rather than scaling needs
Order-Sensitivity Assessment: Identify whether your processing depends on per-key or per-partition ordering (share groups do not preserve it, so keep classic consumer groups) or treats records as independent units of work (where share groups fit best)
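
The consumer logic review in this checklist asks for idempotent processing. A minimal sketch of one way to get there is to deduplicate on the record's coordinates. The class name is hypothetical, and the in-memory set is an assumption made for illustration; a real service would need a durable store shared by every consumer in the group.

```python
class IdempotentHandler:
    """Wraps a handler so a redelivered record is processed at most once.

    Deduplicates on (topic, partition, offset). The in-memory set is for
    illustration only; it does not survive restarts and is not shared
    between consumer instances.
    """

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()

    def handle(self, topic: str, partition: int, offset: int, value) -> bool:
        """Return True if the handler ran, False if the record was a duplicate."""
        record_id = (topic, partition, offset)
        if record_id in self._seen:
            return False
        self._handler(value)
        # Mark as seen only after the handler succeeded, so a failure
        # leaves the record eligible for a retry.
        self._seen.add(record_id)
        return True


if __name__ == "__main__":
    processed = []
    handler = IdempotentHandler(processed.append)
    handler.handle("orders", 0, 42, "create order 1")
    handler.handle("orders", 0, 42, "create order 1")  # redelivery, skipped
    print(processed)
```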

Technical Implementation Guide

Specific guidance for engineering teams implementing share groups:

Record Acknowledgment Pattern: Implement consistent record acknowledgment patterns with proper error handling to ensure processing reliability
Key Distribution Analysis: Analyze your message key distribution to find hot partitions; share groups let several consumers drain a hot partition, provided its records can be processed in any order
Monitoring Instrumentation: Add specific metrics for share group operations - tracking record processing times, acknowledgment rates, and consumer resource utilization
Scaling Automation: Integrate with container orchestration platforms like Kubernetes to enable dynamic scaling based on message processing metrics
State Management: Review your application’s state management approach, as a consumer in a share group no longer sees every record of a partition (or of a key) and cannot rely on partition-local state
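
For the record acknowledgment pattern in this guide, one option is to centralize the mapping from processing outcome to acknowledgment type in a single function. The sketch below is an assumption-laden illustration: `RetryableError` and `process_and_decide` are hypothetical names, and the three outcomes mirror the accept, release, and reject acknowledgment types rather than calling a real client.

```python
from enum import Enum


class AckType(Enum):
    ACCEPT = "accept"
    RELEASE = "release"
    REJECT = "reject"


class RetryableError(Exception):
    """Hypothetical marker for transient failures (timeouts, throttling)."""


def process_and_decide(handler, record) -> AckType:
    """Run `handler` and map its outcome onto one acknowledgment type."""
    try:
        handler(record)
    except RetryableError:
        return AckType.RELEASE   # transient: let the broker redeliver
    except Exception:
        return AckType.REJECT    # poison record: do not redeliver
    return AckType.ACCEPT


if __name__ == "__main__":
    def flaky(record):
        raise RetryableError("downstream timeout")

    print(process_and_decide(print, "record-1"))
    print(process_and_decide(flaky, "record-2"))
```

Keeping this decision in one place is what makes an organization-wide client wrapper practical: teams supply the handler, and the wrapper owns the error-to-acknowledgment policy.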

Technical Implementation Details

Specific technical aspects of KIP-932 implementation that engineers should understand:

Share Group Protocol Specifics

Key technical details of the share group protocol implementation:

Record Delivery Mechanism: Share groups use record-level acknowledgment, enabling multiple consumers to process messages from the same partition concurrently
Broker-side Management: The broker maintains the state of acknowledged records, extending Kafka’s traditional broker responsibilities (a departure from Kafka’s traditional “dumb brokers, smart clients” philosophy)
Message Distribution: The partition leader hands available records to whichever group member fetches next and locks them for a limited time while they are processed; there is no key affinity
Rebalance Protocol: Builds on the broker-side group coordinator and heartbeat-based membership introduced by KIP-848 rather than on the client-side cooperative rebalancing of KIP-429; KIP-848 is not yet available (at the time of writing) in Confluent Cloud
Consumer Coordination: Share group consumers coordinate processing through the broker rather than through direct partition ownership

Monitoring Metrics That Matter

Specific metrics to implement for share group monitoring:

Record Processing Latency: Track processing time for records to identify throughput bottlenecks by implementing custom metrics in your consumer application
Message Consumption Rate: Monitor the rate at which messages are being consumed by the share group
Record Acknowledgment Rate: Track record acknowledgment rates to identify processing issues
Consumer Resource Utilization: Monitor CPU, memory, and network usage per consumer to optimize scaling
Rebalance Frequency & Duration: Measure rebalance operations which may affect processing latency
Unacknowledged Record Count: Track records that remain unacknowledged beyond expected processing timeframes
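
Several of these metrics can be collected inside the consumer application itself. The following is a minimal sketch of such a tracker; the class and method names are hypothetical, and it only gathers the numbers, leaving the export to whatever metrics system you already use.

```python
import time


class ShareGroupMetrics:
    """Hypothetical in-process tracker for record-level processing metrics."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._in_flight = {}   # record id -> time processing started
        self.latencies = []    # seconds per acknowledged record
        self.acknowledged = 0

    def started(self, record_id) -> None:
        """Call when a record is handed to the processing logic."""
        self._in_flight[record_id] = self._clock()

    def acked(self, record_id) -> None:
        """Call when the record has been acknowledged."""
        begin = self._in_flight.pop(record_id)
        self.latencies.append(self._clock() - begin)
        self.acknowledged += 1

    def overdue(self, max_age_seconds: float) -> list:
        """Record ids still unacknowledged after `max_age_seconds`."""
        now = self._clock()
        return [rid for rid, begin in self._in_flight.items()
                if now - begin > max_age_seconds]


if __name__ == "__main__":
    metrics = ShareGroupMetrics()
    metrics.started(("orders", 0, 42))
    metrics.acked(("orders", 0, 42))
    print(metrics.acknowledged, len(metrics.latencies), metrics.overdue(30.0))
```

The clock is injectable so the tracker can be unit-tested without sleeping.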

Architectural Solutions Enabled by KIP-932

Specific architectural patterns that share groups enable:

Parallel Processing Model: Process messages from the same partition in parallel when the order in which they are processed does not matter
Consumer Scaling Beyond Partition Limits: Scale consumers beyond the traditional partition count limitation
Partition Count Optimization: Size partition count based on producer throughput and storage requirements rather than consumer scaling needs
Dynamic Consumer Scaling: Scale consumers up/down independently of partition structure
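
Two of these patterns reduce to simple arithmetic, sketched below under stated assumptions. The function names, the per-partition throughput figure, and the backlog-per-consumer target are all hypothetical inputs you would measure or choose yourself; the calculation is a generic backlog-based rule, not the formula of any particular autoscaler.

```python
import math
from typing import Optional


def partitions_for_throughput(target_mb_s: float, per_partition_mb_s: float) -> int:
    """Partitions needed to carry the produce load, ignoring consumer count."""
    return max(1, math.ceil(target_mb_s / per_partition_mb_s))


def desired_consumers(backlog: int, records_per_consumer: int,
                      max_consumers: int,
                      partition_cap: Optional[int] = None) -> int:
    """Replica count for a backlog-based autoscaler.

    `partition_cap` models a classic consumer group, where replicas beyond
    the partition count would sit idle; leave it as None for a share group.
    """
    wanted = max(1, math.ceil(backlog / records_per_consumer))
    wanted = min(wanted, max_consumers)
    if partition_cap is not None:
        wanted = min(wanted, partition_cap)
    return wanted


if __name__ == "__main__":
    partitions = partitions_for_throughput(target_mb_s=50, per_partition_mb_s=10)
    print(partitions)
    print(desired_consumers(10_000, 500, max_consumers=50, partition_cap=partitions))
    print(desired_consumers(10_000, 500, max_consumers=50))
```

In this example the produce load needs only a handful of partitions, while the backlog calls for many more consumers than that; only the share-group case can actually use them.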

Comparison with RabbitMQ

System	Feature	Specific Technical Comparison to KIP-932
RabbitMQ	Competing Consumers	Like competing consumers on a queue, share groups give up strict ordering in exchange for parallel consumption; unlike a traditional queue, they keep Kafka’s immutable log, so records remain available for replay

Conclusion: Technical Reality

KIP-932 “Queues for Kafka” represents a natural evolution of Kafka’s consumption model that addresses real operational challenges without compromising its fundamental principles. The key points to understand:

Alignment with Core Principles: Share groups leave Kafka’s log, partitioning, and producer model untouched and add a consumption mode that trades processing-order guarantees for flexibility
Performance Optimization: Share groups enable more efficient resource utilization by allowing consumer scaling independent of partition count constraints
Technical Continuity: The feature preserves event immutability, temporal ordering in the log, log persistence, and replay capabilities, so these core EDA principles remain intact
Implementation Considerations: Share groups introduce record-level acknowledgment and new broker responsibilities, representing a shift from Kafka’s traditional “dumb brokers, smart clients” approach
Operational Benefits: Direct benefits include right-sizing partition counts based on data needs rather than scaling constraints, enabling more flexible consumer scaling models, and optimizing resource utilization

The value proposition is clear: KIP-932 adds capabilities that address real operational constraints while preserving Kafka’s core architectural strengths. It enhances Kafka’s consumption model without requiring teams to compromise on EDA principles, making it a pragmatic enhancement that respects Kafka’s fundamental design philosophy. For additional insights, Gunnar Morling provides an excellent overview in his analysis.
