Change Data Capture
Change Data Capture (CDC) is a technique for capturing and tracking changes made to data in a database, so that you know what has been added, updated, or deleted. By integrating Debezium, Kafka, the Semi-Structured Transformer, and the Graph Writer, this CDC data can be used to update any graph that is based on data from the source database. In this way the source database and the graph database can be kept continuously in sync. The steps of the process are as follows:
A change is made in the SQL DB (create, update, or delete)
Debezium (running on Kafka Connect) creates a payload describing the change and writes it to Kafka
The Semi-Structured Transformer (used because the payload is represented as JSON) takes the data from the Kafka topic and transforms it into RDF (or CSV for LPG)
The graph data is stored as a file, the URL of which is pushed to a Kafka topic
The Graph Writer takes the graph data and writes it to the specified Knowledge Graph
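To make the first steps above more concrete, here is a minimal sketch of what reading a Debezium change event and turning it into graph data could look like. It is not the Semi-Structured Transformer's actual implementation or mapping format. The envelope fields `op`, `before`, `after`, and `source.table` are part of Debezium's standard change event; the function name `change_event_to_triples`, the `key_column` parameter, the `http://example.org/` base IRI, and the N-Triples-style output are assumptions made purely for illustration.

```python
import json

# Hypothetical base IRI, chosen for illustration only.
BASE = "http://example.org/"


def change_event_to_triples(raw, key_column="id"):
    """Map one Debezium change event (a JSON string) to (op, triples)."""
    event = json.loads(raw)
    # With schemas enabled, Debezium nests the event under "payload";
    # otherwise the envelope fields sit at the top level.
    payload = event.get("payload", event)
    op = payload["op"]  # "c" create, "u" update, "d" delete, "r" snapshot read
    # A delete carries the old row state in "before"; other operations use "after".
    row = payload["before"] if op == "d" else payload["after"]
    table = payload["source"]["table"]
    subject = f"<{BASE}{table}/{row[key_column]}>"
    # One triple per non-null column, with every value written as a plain string literal.
    triples = [
        f"{subject} <{BASE}{table}#{column}> {json.dumps(str(value))} ."
        for column, value in row.items()
        if value is not None
    ]
    return op, triples


# A hand-written sample event, trimmed to the fields used above.
sample = json.dumps({
    "payload": {
        "before": None,
        "after": {"id": 7, "name": "Ada"},
        "source": {"table": "customers"},
        "op": "c",
    }
})

op, triples = change_event_to_triples(sample)
print(op)
for triple in triples:
    print(triple)
```

In the real pipeline the event would be consumed from the Kafka topic rather than built by hand, and the returned `op` is what a downstream writer would use to decide whether to insert, replace, or remove the corresponding data in the graph.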
Examples of how to set up a development stack to test CDC for various databases are included.

