Optimizing Cloud Messaging Performance with Pub/Sub Batching
Chapter 1: Introduction to Pub/Sub and Batching
Google's Pub/Sub platform is an essential resource for establishing loosely coupled services in the cloud, providing a robust framework for asynchronous communication. Nevertheless, utilizing cloud solutions like Pub/Sub presents specific challenges, particularly concerning network interactions.
Network issues such as latency, partitioning, and the potential for message loss can adversely affect performance. These challenges, often concealed by the abstraction layers in client libraries, may result in inconsistent service performance. Therefore, it is vital to understand and alleviate the effects of network calls to maintain peak system performance. One effective approach is to reduce the frequency of these calls, especially during high-traffic periods.
In this article, we will delve into how message batching in Pub/Sub can improve performance, leading to a more reliable, efficient, and economical message processing system.
Pub/Sub Best Practices: Publishing - YouTube: This video discusses essential strategies for effectively publishing messages using Google Cloud Pub/Sub.
Section 1.1: Evaluating Pub/Sub Performance Without Batching
Before we explore batching, it's crucial to examine Pub/Sub's performance without it. For this purpose I wrote a Kotlin script, available on GitHub, that sends a set number of messages to Pub/Sub, waits for processing, and records the elapsed time.
To better mimic real-world conditions, the script introduces a slight delay of 0 to 5 milliseconds between each publish, typically averaging less than 2 milliseconds. While these tests may not be flawless due to factors like computer performance and internet quality, they provide a general overview of operational effectiveness and highlight areas for potential improvement.
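The measurement loop described above can be sketched roughly as follows. This is not the script from the repository: `timeRun`, `publish`, and `awaitAll` are hypothetical names, with `publish` standing in for the real Pub/Sub call, and the uniform 0 to 5 ms pause is an assumption about how the delay is drawn.

```kotlin
import kotlin.random.Random
import kotlin.system.measureTimeMillis

// Hypothetical harness: `publish` stands in for the real Pub/Sub call and
// `awaitAll` for waiting on the outstanding publish results.
fun timeRun(
    messageCount: Int,
    maxDelayMs: Long = 5,
    random: Random = Random.Default,
    publish: (String) -> Unit,
    awaitAll: () -> Unit = {},
): Long = measureTimeMillis {
    repeat(messageCount) { i ->
        publish("message-$i")
        // Random pause of 0..maxDelayMs milliseconds between publishes
        // (assumed to be uniformly distributed).
        Thread.sleep(random.nextLong(0, maxDelayMs + 1))
    }
    // The elapsed time includes waiting for processing to finish.
    awaitAll()
}
```

Passing the publish step as a lambda keeps the timing logic independent of the client library, so the same loop can time runs with and without batching.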
The results without batching are reported using the following columns:
- Msgs: Total number of messages published during each test run.
- Med. Time (ms): Median time taken to publish all messages over 25 trials, serving as a reliable performance metric.
- Msgs/sec: Rate of successful message publications per second.
- Time to Publish One Million: Estimated time to publish one million messages based on median performance.
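As a rough illustration of how these columns relate to each other, the sketch below derives the rate and the one-million estimate from a median run time. The function names are hypothetical, and the extrapolation assumes the rate stays constant as the volume grows.

```kotlin
import java.time.Duration

// Median of the elapsed times (in ms) across repeated trials.
fun median(timesMs: List<Long>): Double {
    require(timesMs.isNotEmpty()) { "need at least one trial" }
    val sorted = timesMs.sorted()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) sorted[mid].toDouble()
    else (sorted[mid - 1] + sorted[mid]) / 2.0
}

// Messages published per second, given the median time for one run.
fun messagesPerSecond(messages: Int, medianMs: Double): Double =
    messages / (medianMs / 1000.0)

// Extrapolated time to publish one million messages at the same rate.
fun timeToPublishOneMillion(messages: Int, medianMs: Double): Duration =
    Duration.ofMillis((1_000_000.0 / messages * medianMs).toLong())
```

With made-up figures, a run of 1,000 messages and a median of 4,000 ms gives 250 messages per second and an estimate of 4,000 seconds (about 66 minutes and 40 seconds) for one million messages.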
There appears to be some initial warm-up. The messages-per-second rate increases significantly with larger runs (10,000 vs. 1,000 messages), but the difference between 10,000 and 25,000 messages is less pronounced, indicating diminishing returns at higher volumes. This is consistent with Pub/Sub's horizontally scalable design, which handles more topics, subscriptions, or messages by adding server instances.
Section 1.2: Implementing Batching
To enable message batching, the Publisher must be configured with suitable BatchingSettings. Here’s an example in Kotlin using Google's Java client library:
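    import com.google.api.gax.batching.BatchingSettings
    import com.google.cloud.pubsub.v1.Publisher

    // Replace each ... with a value suited to your workload.
    val batchingSettings =
        BatchingSettings.newBuilder()
            .setIsEnabled(true)
            .setDelayThreshold(...)
            .setElementCountThreshold(...)
            .setRequestByteThreshold(...)
            .build()

    // build() returns the Publisher itself, not a builder.
    val publisher =
        Publisher.newBuilder("projects/$GCP_PROJECT/topics/$GCP_TOPIC")
            .setBatchingSettings(batchingSettings)
            .build()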
- isEnabled: Activates batching functionality.
- elementCountThreshold: Maximum number of messages per batch.
- requestByteThreshold: Maximum total size of a batch in bytes.
- delayThreshold: Maximum wait time before sending an incomplete batch if the other two thresholds aren't met.
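To make the interaction between the three thresholds concrete, here is a simplified model on a simulated clock. It is only a sketch of the rules listed above, not the client library's implementation; `Incoming` and `simulateBatches` are hypothetical names, and the delay window is assumed to start with the first message of each batch.

```kotlin
// One incoming message: arrival time on a simulated clock and payload size.
data class Incoming(val arrivalMs: Long, val sizeBytes: Int)

// Simplified model: a batch is closed when it reaches the element count,
// reaches the byte size, or when the next message arrives after the delay
// window that started with the batch's first message.
fun simulateBatches(
    messages: List<Incoming>,
    elementCountThreshold: Int,
    requestByteThreshold: Int,
    delayThresholdMs: Long,
): List<List<Incoming>> {
    val batches = mutableListOf<List<Incoming>>()
    var current = mutableListOf<Incoming>()
    var bytes = 0
    for (msg in messages) {
        // Delay threshold: the timer would have fired before this arrival,
        // so the incomplete batch has already been sent.
        if (current.isNotEmpty() &&
            msg.arrivalMs - current.first().arrivalMs >= delayThresholdMs
        ) {
            batches.add(current)
            current = mutableListOf()
            bytes = 0
        }
        current.add(msg)
        bytes += msg.sizeBytes
        // Count or byte threshold reached: send immediately.
        if (current.size >= elementCountThreshold || bytes >= requestByteThreshold) {
            batches.add(current)
            current = mutableListOf()
            bytes = 0
        }
    }
    // Whatever is left is eventually flushed by the delay timer.
    if (current.isNotEmpty()) batches.add(current)
    return batches
}
```

In this model, five small messages arriving 1 ms apart with an element count threshold of 2 produce batches of 2, 2, and 1 messages, while two messages arriving 50 ms apart with a 10 ms delay threshold are each sent alone.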
These settings may vary slightly across client libraries. If you use Spring Boot with PubSubTemplate, set all of the batching properties in your application.yaml file: if any of them is missing, batching may silently not be applied, which is easy to overlook.
    spring:
      cloud:
        gcp:
          pubsub:
            enabled: true
            publisher:
              batching:
                enabled: true
                element-count-threshold: 10
                delay-threshold-seconds: 1
                request-byte-threshold: 100000
Pub/Sub Best Practices: Latency & Reliability - YouTube: This video provides insights into minimizing latency and ensuring reliable message delivery in Google Cloud Pub/Sub.
Subsection 1.2.1: Assessing the Impact of Batching
Next, we repeat the previous experiment with batching enabled, focusing on the 1,000-message scenario. I ran a grid search over the batch parameters, testing various combinations to find the best settings. The findings are as follows:
- Best Performance:
  - Settings: delayThreshold 10ms, elementCountThreshold 5, requestByteThreshold 4096
  - Time to Publish One Million: Approximately 52 minutes and 24 seconds
  - Performance Improvement: 35.79% faster than the baseline
- Worst Performance:
  - Settings: delayThreshold 1000ms, elementCountThreshold 50, requestByteThreshold 2048
  - Time to Publish One Million: Approximately 1 day, 2 hours, and 43 minutes
  - Performance Decline: 200.47% slower than the baseline
While more exhaustive parameter searches could yield even better configurations, this level of experimentation suffices for demonstration.
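A grid search of this kind can be sketched as below. The names `BatchParams`, `parameterGrid`, `gridSearch`, and the `measureMedianMs` hook are all hypothetical, and the candidate values passed in are up to you; the exact grid used in the experiment is not reproduced here.

```kotlin
// One combination of the three batching thresholds.
data class BatchParams(
    val delayThresholdMs: Long,
    val elementCountThreshold: Long,
    val requestByteThreshold: Long,
)

// Cartesian product of the candidate values for each parameter.
fun parameterGrid(
    delaysMs: List<Long>,
    counts: List<Long>,
    byteSizes: List<Long>,
): List<BatchParams> =
    delaysMs.flatMap { d ->
        counts.flatMap { c ->
            byteSizes.map { b -> BatchParams(d, c, b) }
        }
    }

// Runs `measureMedianMs` (a hypothetical benchmark hook returning the median
// run time in ms) for every combination and sorts from fastest to slowest.
fun gridSearch(
    grid: List<BatchParams>,
    measureMedianMs: (BatchParams) -> Double,
): List<Pair<BatchParams, Double>> =
    grid.map { it to measureMedianMs(it) }.sortedBy { it.second }
```

The first and last entries of the returned list correspond to the best and worst configurations. Because the grid grows multiplicatively with each list of candidates, a few values per parameter already means dozens of benchmark runs.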
It's important to note that incorrect batch parameter settings can lead to significant performance issues, resulting in inefficient processing and prolonged operational times. Finding the right balance between these three parameters is essential to optimize latency and network traffic. A low delay threshold may lead to frequent dispatch of incomplete batches, while a high threshold might create idle times, delaying message sending.
Section 1.3: Managing Batching Flow
To keep message batching in Pub/Sub efficient, effective flow control is vital, especially when generating messages at high rates. Without it, messages produced faster than the publisher can send them pile up in the buffer: memory usage can grow until it is exhausted, and messages that wait too long can fail with a DEADLINE_EXCEEDED error.
Flow control can be managed using three key parameters:
- Buffer Size in Bytes (setMaxOutstandingRequestBytes): This defines the maximum buffer size for messages in bytes, regulating memory usage by capping the total size of messages waiting to be sent.
- Buffer Size in Terms of Events (setMaxOutstandingElementCount): This sets a limit on the number of messages in the buffer, ensuring it doesn't exceed a specified count.
- Behavior When Limits are Exceeded (setLimitExceededBehavior): This parameter determines the action taken when limits are surpassed. Options include ignoring the limit, throwing an exception, or blocking until there’s space.
These parameters collectively ensure the efficiency of the batching process, preventing bottlenecks and supporting a smooth message flow.
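As a minimal sketch of how these three parameters fit together in Google's Java client library, the function below attaches flow control to existing batching settings. It assumes the google-cloud-pubsub dependency is on the classpath; `withFlowControl` is a hypothetical helper, and the limits are placeholder values rather than recommendations.

```kotlin
import com.google.api.gax.batching.BatchingSettings
import com.google.api.gax.batching.FlowControlSettings
import com.google.api.gax.batching.FlowController.LimitExceededBehavior

// Adds flow control to existing batching settings. The limits below are
// placeholder values for illustration only.
fun withFlowControl(batching: BatchingSettings): BatchingSettings {
    val flowControl = FlowControlSettings.newBuilder()
        // Cap on the total bytes of messages waiting to be sent.
        .setMaxOutstandingRequestBytes(10L * 1024 * 1024)
        // Cap on the number of messages waiting to be sent.
        .setMaxOutstandingElementCount(1_000L)
        // Block the caller until space frees up; the alternatives are
        // LimitExceededBehavior.Ignore and LimitExceededBehavior.ThrowException.
        .setLimitExceededBehavior(LimitExceededBehavior.Block)
        .build()
    return batching.toBuilder()
        .setFlowControlSettings(flowControl)
        .build()
}
```

The result is passed to the Publisher builder through setBatchingSettings, in the same way as plain batching settings. Blocking is a reasonable default when the producer can afford to slow down; throwing an exception suits callers that prefer to shed load explicitly.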
Chapter 2: Conclusion
Implementing message batching in Pub/Sub can significantly boost your application's performance, but it requires careful tuning of parameters. It's essential to have a clear grasp of your application's performance before enabling batching. Be sure to identify any other constraints impacting performance prior to making changes.
Finding the optimal batching settings in my experimental scenario was a lengthy process, and real-world situations can be considerably more complex. Moreover, batching is typically more effective with a steady stream of events; it may struggle in scenarios with abrupt traffic spikes.