Modelling Multiple Semi-Structured Source Files
The Semi-Structured Transformer allows up to five different source data files to be used within a single model. Relationships (edges) can be created between entities (nodes) from different sources. However, to prevent the relationship from generating a Cartesian product—where every instance of node type X is linked to every instance of node type Y—a join condition is required. The join condition ensures that an edge or relationship is only created when a specified field from each source file contains matching values.
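The effect of a join condition can be illustrated with a short Python sketch. This is not how the Transformer is implemented; the two record lists and the field name `key` are invented purely to show how the number of edges changes once matching values are required.

```python
from itertools import product

# Hypothetical records from two sources; "key" is an illustrative field name.
source_x = [
    {"id": "x1", "key": "A"},
    {"id": "x2", "key": "B"},
    {"id": "x3", "key": "C"},
]
source_y = [
    {"id": "y1", "key": "A"},
    {"id": "y2", "key": "B"},
    {"id": "y3", "key": "B"},
]

# Without a join condition: every X is linked to every Y (Cartesian product).
cartesian_edges = [(x["id"], y["id"]) for x, y in product(source_x, source_y)]

# With a join condition: an edge exists only where the chosen fields match.
joined_edges = [
    (x["id"], y["id"])
    for x, y in product(source_x, source_y)
    if x["key"] == y["key"]
]

print(len(cartesian_edges))  # 9
print(joined_edges)          # [('x1', 'y1'), ('x2', 'y2'), ('x2', 'y3')]
```

With three records on each side, the unconstrained case yields nine edges, while the join condition keeps only the three pairs whose values agree.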
In the example below, two source files are used: one containing transaction data and another containing person and card data.
The Transaction and Location nodes are sourced from the new-transactions.csv file.
The Person node is sourced from the person-card.json file.

An edge has been created to link the Person node to the Location node using the madeTransactionIn relationship. To ensure that edges are only created between people and their relevant transaction data, a join condition must be specified on the edge.

In this case, the if and equals fields are used to select data from each source file. Both files contain credit card numbers, so edges will only be created between nodes for records where the card numbers match.
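As a rough sketch of the matching logic, the Python below builds madeTransactionIn edges only where card numbers agree. The sample records, the field names (`card_number`, `location`, `name`) and the helper `made_transaction_in` are all assumptions made for illustration; the actual file contents and the way the Transformer evaluates the if and equals fields are not described here.

```python
from collections import defaultdict

# Hypothetical rows standing in for new-transactions.csv (field names assumed).
transactions = [
    {"transaction_id": "t1", "location": "London", "card_number": "4111-0001"},
    {"transaction_id": "t2", "location": "Paris", "card_number": "4111-0002"},
    {"transaction_id": "t3", "location": "Leeds", "card_number": "4111-0001"},
]

# Hypothetical records standing in for person-card.json (field names assumed).
people = [
    {"name": "Alice", "card_number": "4111-0001"},
    {"name": "Bob", "card_number": "4111-0003"},
]


def made_transaction_in(people, transactions, if_field, equals_field):
    """Return (person, location) edges where the two named fields match.

    if_field is read from each person record and equals_field from each
    transaction row; which source each field belongs to is an assumption.
    """
    # Index transaction locations by the join value to avoid comparing every pair.
    locations_by_card = defaultdict(list)
    for row in transactions:
        locations_by_card[row[equals_field]].append(row["location"])

    return [
        (person["name"], location)
        for person in people
        for location in locations_by_card.get(person[if_field], [])
    ]


edges = made_transaction_in(people, transactions, "card_number", "card_number")
print(edges)  # [('Alice', 'London'), ('Alice', 'Leeds')]
```

In this sample, Bob's card number appears in no transaction, so no edge is created for him, and the Paris transaction is linked to nobody.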