Outbox Pattern with Debezium – Part 1

When working with microservices, data exchange between services is always essential. This data exchange can be synchronous or asynchronous and must ensure data consistency.

With synchronous calls, services exchange data through RESTful Web Services, gRPC, or some other synchronous API. The disadvantage is that the services become coupled and dependent on each other: the caller needs to know where the other service is and how to call it. Even when it knows this, the call can still fail because the other service is unreachable due to network issues or is returning internal errors. Service Mesh features such as retries and circuit breakers mitigate these problems, but they have limitations of their own.

Another drawback of using synchronous calls is that we cannot replay or reprocess data if the previous request ultimately fails.

Both of these drawbacks can be solved by using an asynchronous call. Instead of directly calling the service, we publish the data to a topic in a Message Broker like Apache Kafka. Other services will subscribe to this topic to retrieve and process the data.

A data consistency issue can arise when an application saves its data and then also has to publish it to a Message Broker so that other services can retrieve and process it. If an error occurs during publication, the data is already in the database but never reaches the broker, and the two systems are left inconsistent. This is known as the Dual Writes problem!
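The failure mode described above can be reproduced with a minimal sketch. It assumes an in-memory SQLite database as the service database and a hypothetical `FlakyBroker` class standing in for the Message Broker; neither the class nor the `orders` table comes from a real Kafka client or from this series.

```python
import sqlite3


class BrokerUnavailable(Exception):
    """Raised by the hypothetical broker when a publish fails."""


class FlakyBroker:
    """In-memory stand-in for a Message Broker (not a real Kafka client)."""

    def __init__(self, fail=False):
        self.fail = fail
        self.messages = []

    def publish(self, topic, payload):
        if self.fail:
            raise BrokerUnavailable(topic)
        self.messages.append((topic, payload))


def place_order_dual_write(conn, broker, order_id, item):
    # Write 1: commit the order to the service database.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item)
        )
    # Write 2: publish to the broker. If this raises, write 1 is already
    # committed and cannot be rolled back together with the failed publish.
    broker.publish("orders", {"id": order_id, "item": item})


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
broker = FlakyBroker(fail=True)  # simulate a broker outage

try:
    place_order_dual_write(conn, broker, 1, "book")
except BrokerUnavailable:
    pass  # the caller sees an error, but the order row is already stored

rows_in_db = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
published = len(broker.messages)
print(rows_in_db, published)
```

After the simulated outage the database holds the order while the broker holds nothing, which is exactly the inconsistency the Dual Writes problem refers to.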

To solve the Dual Writes problem, the service should write to only one of the two systems, the database or the Message Broker, and let a separate mechanism propagate the data to the other. If you write to a Message Broker like Apache Kafka first, you lose Read-Your-Writes consistency, which requires that once a service has accepted new data, subsequent reads from that service include it. Because the database is updated from the broker asynchronously, a read issued right after the write may not see the new data yet.

We should store the data in the service’s database first and then use the Outbox Pattern to populate this data into the Message Broker!

With the Outbox Pattern, we define an outbox table in the service database. When the service stores its data, it also inserts a record into the outbox table in the same database transaction, so either both writes are committed or neither is. This record describes the event for the new data. A change data capture tool like Debezium picks it up and publishes it to the Message Broker, so other services can learn about and process the event.
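The write side of the pattern can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for the service database, and the `orders` table, the `outbox` column names, and the `place_order` function are hypothetical rather than the schema used in Part 2. The capture step performed by Debezium is not shown.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
# Hypothetical outbox schema: one row per event to be published.
conn.execute(
    "CREATE TABLE outbox ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " aggregate_type TEXT,"
    " aggregate_id TEXT,"
    " event_type TEXT,"
    " payload TEXT)"
)


def place_order(conn, order_id, item, fail_before_outbox=False):
    # "with conn" commits on success and rolls back if an exception escapes,
    # so the business row and the outbox row succeed or fail together.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item)
        )
        if fail_before_outbox:
            raise RuntimeError("simulated failure between the two inserts")
        conn.execute(
            "INSERT INTO outbox"
            " (aggregate_type, aggregate_id, event_type, payload)"
            " VALUES (?, ?, ?, ?)",
            (
                "order",
                str(order_id),
                "OrderCreated",
                json.dumps({"id": order_id, "item": item}),
            ),
        )


place_order(conn, 1, "book")  # both rows are committed

try:
    place_order(conn, 2, "pen", fail_before_outbox=True)
except RuntimeError:
    pass  # the whole transaction was rolled back, including the order row

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
outbox_count = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
print(order_count, outbox_count)
```

Because the order and its event share one transaction, there is never an order without a matching outbox record, and the capture tool can publish every committed event later without the service talking to the broker directly.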

For details on implementing the Outbox Pattern with Debezium, please read Part 2!
