How to loosely couple microservices
Introduction
In my last blog, I showed how to make a simple Kubernetes application resilient. To do so, I had to wrap some key code in a transaction lock to guarantee that only one request updates the database at a time.
But there is still an issue with the current architecture: what happens when the Redis master dies, or when the backend times out (perhaps while waiting for the lock)?
In this article, I will show how to loosely couple the frontend and backend components.
Current architecture
Here is the current solution architecture:
All these components are deployed in HA mode (3 instances), except the Redis master, which has a single instance. If the Redis master dies, Kubernetes will promptly deploy a new one. However, the backend is not prepared for this temporary unavailability, so it might fail and lose a message.
Certainly, we could implement logic in the backend to retry the Redis master connection, or even consider using an Istio VirtualService for this purpose. However, there is a way to delegate this responsibility to an independent component.
A message queue decouples the producer from the consumer and provides the capability to ensure the message has been processed successfully.
New architecture
Here is the new architecture:
There are two new components:
- Messaging: It can be implemented with any message queue, such as RabbitMQ or IBM Cloud Pak for Integration.
- Consumer: This component will retrieve the messages from the queue and store them in the database.
The backend will continue retrieving the messages synchronously (using the Redis slave). But for the append operation, it will send the message to the queue instead. The consumer will then retrieve the message from the queue and persist it in the database (using the Redis master).
The code for this version is available here.
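To make this concrete, here is a minimal sketch of the backend’s append path, assuming amqplib. The queue name 'messages' matches the checkQueue output shown later in this article; the RABBITMQ_URL environment variable, the message format, and the function names are my assumptions, not necessarily the actual code.

```javascript
// Minimal sketch: the backend publishes appends to RabbitMQ instead of
// writing to the Redis master directly (names are assumptions).
const amqp = require('amqplib');

const QUEUE = 'messages';
let channel;

async function initQueue() {
  // Connect once at startup and reuse the channel for every request.
  const connection = await amqp.connect(process.env.RABBITMQ_URL);
  channel = await connection.createChannel();
  // Make sure the queue exists before publishing to it.
  await channel.assertQueue(QUEUE, { durable: true });
}

function appendMessage(text) {
  // Hand the message to the queue; the consumer persists it later.
  channel.sendToQueue(QUEUE, Buffer.from(JSON.stringify({ text })), {
    persistent: true,
  });
}

module.exports = { initQueue, appendMessage };
```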
A consequence of the asynchronicity
The asynchronous handling of the append operation decouples the backend from the Redis master. However, it introduces a new problem. So far, the operations were all synchronous: when a message was appended, the list of all messages was returned as the result. Now, the append operation simply adds the message to the queue, and we don’t know when the consumer will process the message.
To fix this problem, we need to change the way the frontend receives the list of messages. Instead of receiving it as the result of the append operation, the frontend should poll for the messages periodically and update the UI.
This implementation is available here.
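As a rough sketch of that polling loop (the /messages endpoint, the interval, and the element id are assumptions on my part):

```javascript
// Poll the backend periodically instead of relying on the append response.
async function refreshMessages() {
  const response = await fetch('/messages');
  const messages = await response.json();
  render(messages);
}

function render(messages) {
  // Placeholder UI update: replace with the real rendering logic.
  document.getElementById('messages').textContent = messages.join('\n');
}

setInterval(refreshMessages, 1000); // refresh the UI every second
```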
A look at the test application
The test application does the following flow:
- Clear all messages in the Redis database
- Ensure that the number of messages is zero
- Append N messages
- Retrieve the messages
- Ensure that the number of messages is N
As the processing of the messages is now asynchronous, we don’t know when they will be persisted in the database. So, we need to change the logic above to include an additional step:
- Clear all messages in the Redis database
- Ensure that the number of messages is zero
- Append N messages
- Wait for the queue to be empty
- Retrieve the messages
- Ensure that the number of messages is N
The code for this solution is available here.
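Here is a sketch of the updated test flow, assuming amqplib for the queue check; clearMessages, appendMessage, and getMessages are hypothetical stand-ins for the test application’s own helpers.

```javascript
// Sketch of the updated test flow; clearMessages, appendMessage, and
// getMessages are hypothetical stand-ins for the test app's helpers.
const assert = require('assert');

async function waitForEmptyQueue(channel, queue) {
  // Poll the queue depth until RabbitMQ reports no pending messages.
  for (;;) {
    const { messageCount } = await channel.checkQueue(queue);
    if (messageCount === 0) return;
    await new Promise((resolve) => setTimeout(resolve, 500));
  }
}

async function runTest(channel, n) {
  await clearMessages();                                // clear the database
  assert.strictEqual((await getMessages()).length, 0);  // must start empty
  for (let i = 0; i < n; i++) {
    await appendMessage(`message ${i}`);                // append N messages
  }
  await waitForEmptyQueue(channel, 'messages');         // the new step
  assert.strictEqual((await getMessages()).length, n);  // all persisted?
}
```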
An issue with auto_ack
Now, the solution proposed so far seems robust, but there is still an issue.
When I send 100 messages, only a fraction of them is returned in the retrieval step of the flow above.
The reason is that even though we are checking that the queue is empty, the messages might not have been fully processed yet (that is, persisted in the database).
Let’s look at the consumer logic (a code sketch follows the list):
- Receive a message from the queue
- Lock the database (to prevent the concurrency problem described in the previous blog)
- Retrieve the messages from the database
- Append the message to the messages
- Persist the messages
- Unlock the database
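Here is a minimal sketch of that loop with auto-acknowledge enabled, assuming amqplib; lock, unlock, getMessages, and saveMessages are hypothetical stand-ins for the locking and persistence code from the previous blog.

```javascript
// Consumer loop sketch with auto-acknowledge; lock/unlock and the
// persistence helpers are hypothetical stand-ins.
const amqp = require('amqplib');

async function startConsumer(redis) {
  const connection = await amqp.connect(process.env.RABBITMQ_URL);
  const channel = await connection.createChannel();
  await channel.assertQueue('messages', { durable: true });

  // noAck: true removes the message from the queue on delivery,
  // before it has been persisted in the database.
  channel.consume('messages', async (msg) => {
    const { text } = JSON.parse(msg.content.toString());
    await lock(redis);                          // serialize database updates
    const messages = await getMessages(redis);  // read the current list
    messages.push(text);                        // append the new message
    await saveMessages(redis, messages);        // persist the list
    await unlock(redis);
  }, { noAck: true });
}
```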
Because auto-acknowledge is enabled, a message is removed from the queue (acknowledged) as soon as it is consumed. At that point, many messages might still be in the middle of being processed, either waiting for the database lock or executing one of the steps above.
The solution here is to defer the acknowledgment of each message, to guarantee it is persisted before it is removed from the queue.
This implementation is available here.
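Continuing the previous sketch, the only change is to disable auto-acknowledge and acknowledge explicitly after the persist:

```javascript
// Manual-acknowledge variant of the consumer callback above.
channel.consume('messages', async (msg) => {
  const { text } = JSON.parse(msg.content.toString());
  await lock(redis);
  const messages = await getMessages(redis);
  messages.push(text);
  await saveMessages(redis, messages);
  await unlock(redis);
  channel.ack(msg); // only now is the message removed from the queue
}, { noAck: false });
```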
Problem not solved
Even though I manually acknowledge each message at the end of the cycle, the number of messages in the queue returned by RabbitMQ / amqplib always seems to be zero.
So let me try the following scenario (the queue-depth check is sketched after the list):
- Scale the number of consumers to 0
- Send 1000 messages
- Watch the number of messages in the queue (it should be 1000)
- Scale the number of consumers to 1
- The number of messages should gradually decrease
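The queue depth in the output below comes from a check like this one, a sketch assuming amqplib’s checkQueue:

```javascript
// Sketch of the queue-depth check that produces the output below.
async function logQueueDepth(channel) {
  const { queue, messageCount, consumerCount } =
    await channel.checkQueue('messages');
  console.log('checkQueue:', { queue, messageCount, consumerCount });
}
```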
As I scale the number of consumer replicas to 0, I see the following:
```
checkQueue: { queue: 'messages', messageCount: 0, consumerCount: 0 }
```
Then as I send the 1000 messages, I see the following:
```
checkQueue: { queue: 'messages', messageCount: 1000, consumerCount: 0 }
```
as expected. The test application is patiently waiting for the queue to become empty.
Now, let me start one consumer pod (by scaling the deployment to 1). Immediately after I do that, I see the following message:
```
checkQueue: { queue: 'messages', messageCount: 0, consumerCount: 1 }
```
and, certainly, the consumer didn’t have enough time to process the 1000 messages.
So it seems RabbitMQ (or the amqplib module) has an issue with reporting the number of messages in the queue.
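One possibility I have not verified in this setup: messageCount may only count messages that are ready for delivery, not messages already pushed to the consumer and awaiting acknowledgment, so an unlimited prefetch would drain the “ready” count immediately. If that is the cause, capping the prefetch should keep undelivered messages visible:

```javascript
// Assumption, not verified here: cap in-flight deliveries so messages
// stay "ready" (and counted by checkQueue) until the consumer takes them.
// Call this on the channel before channel.consume.
await channel.prefetch(1);
```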
Do you know a solution? Let me know.
For now, I am moving to a different kind of queue.
Conclusion
In this article, I showed how to decouple some microservice components by using a queue.
The solution seemed solid until I reached a limitation (or issue) with the RabbitMQ library regarding reporting the number of messages in the queue.