Data Blending
Data blending integrates data from two data sources based on common identifiers or keys. This approach combines complementary data from different sources to form a single view, similar to the JOIN operator in SQL.
Create a New Data Flow with Blending
- Navigate to the Flows page and click on Create Flow in the top right corner.
- Click on Connect Your Data and select the first source.
- Hover over the newly added source for the menu to show and click on Combine.
- Select Blend Sources.
- Click on Select Source to choose the sources you wish to blend.
- Choose a Join Key that is the same for both sources. Dataddo will blend the datasets based on this particular key.
- Click on the button between the sources to configure the joining type. Left Join is selected by default.
- Left Join: All records from the left source are kept, and matching records from the right source are joined to them.
- Inner Join: Returns only the records whose Join Key values match in both sources.
- Click on Save Source.
- Add your data destination, finish configuring your flow, and click on Save Flow.
Select columns from each source by dragging the fields from the list on the left or right. The sources don't have to have the same columns.
If changes affecting the database schema are made in the flow (e.g. change in field names, number of columns, data types), go to the database and delete the previously created table. Then, save the changes and refresh.
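The difference between the two joining types can be illustrated with a small plain-Python sketch. This is only a model of standard left and inner join semantics, not Dataddo's implementation; the `blend` function, the sample rows, and the field names (`campaign_id`, `clicks`, `revenue`) are hypothetical.

```python
# Illustrative only: a plain-Python model of Left Join vs. Inner Join on a shared key.
# The function name, sample rows, and field names are made up for this sketch.

def blend(left, right, key, how="left"):
    """Blend two lists of row dicts on `key` using a left or inner join."""
    if how not in ("left", "inner"):
        raise ValueError("how must be 'left' or 'inner'")

    # Index the right-hand rows by their key value.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    # Columns contributed by the right source (everything except the key).
    right_cols = []
    for row in right:
        for col in row:
            if col != key and col not in right_cols:
                right_cols.append(col)

    blended = []
    for lrow in left:
        matches = right_by_key.get(lrow[key], [])
        if matches:
            # One output row per matching right-hand row.
            for rrow in matches:
                merged = dict(lrow)
                merged.update({c: rrow.get(c) for c in right_cols})
                blended.append(merged)
        elif how == "left":
            # A left join keeps unmatched left rows with empty right-hand values.
            merged = dict(lrow)
            merged.update({c: None for c in right_cols})
            blended.append(merged)
        # An inner join drops left rows that have no match.
    return blended


ads = [{"campaign_id": 1, "clicks": 120}, {"campaign_id": 2, "clicks": 45}]
crm = [{"campaign_id": 1, "revenue": 300.0}]

left_result = blend(ads, crm, "campaign_id", how="left")
inner_result = blend(ads, crm, "campaign_id", how="inner")
```

With this sample data, the left join keeps both campaigns (campaign 2 gets an empty `revenue`), while the inner join keeps only campaign 1, the one key present in both sources.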
Troubleshooting
Issue with Repeated Column Names
ERROR MESSAGE
The column name ‘___’ is specified more than once
This problem can be fixed by renaming the affected column in one of the two data sources.
- On the Sources page, click on one of the sources you wish to blend.
- Navigate to the Schema tab, rename the affected column (e.g. to propertyid_2), and click on Save.
Your data sources can now be blended, because no column name is defined twice.
Duplicated Field
When using data blending, combining columns from two different data sources may result in a duplicated field (such as ID), which can break the flow. To fix this, rename one of the two affected columns as described in the previous troubleshooting section.
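Before blending, it can help to check which column names the two sources share. The following is a minimal sketch in plain Python, assuming you have each source's column names as a list; the helper names, the sample columns, and the `_2` suffix convention are illustrative and not part of Dataddo.

```python
# Illustrative only: detect column names shared by two sources and rename
# the conflicting ones in one source. All names here are hypothetical.

def shared_column_names(left_cols, right_cols):
    """Return the names that appear in both column lists, in left-hand order."""
    right_set = set(right_cols)
    return [col for col in left_cols if col in right_set]


def rename_conflicts(cols, conflicts, suffix="_2"):
    """Append `suffix` to every column in `cols` that is listed in `conflicts`."""
    conflict_set = set(conflicts)
    return [col + suffix if col in conflict_set else col for col in cols]


left_cols = ["propertyid", "sessions"]
right_cols = ["propertyid", "revenue"]

conflicts = shared_column_names(left_cols, right_cols)
right_renamed = rename_conflicts(right_cols, conflicts)
remaining = shared_column_names(left_cols, right_renamed)
```

Here `conflicts` contains `propertyid`, the right-hand column becomes `propertyid_2`, and `remaining` is empty, which mirrors the manual rename on the Schema tab.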