Data Sources Overview
A data source in Dataddo represents a connection to a third-party service from which data is extracted. Supported services include SaaS apps such as Salesforce, NetSuite, HubSpot, Stripe, Klaviyo, Facebook Ads, and Google Analytics 4; databases such as MySQL, Postgres, and SQL Server; cloud data warehouses such as BigQuery, Redshift, and Snowflake; and file storage such as S3 and SFTP. The initial setup of a source is configured through a Connector, which can be a Universal Connector, a Fixed-Schema Connector, or a Custom-Schema Connector. Once configured, a source is paired with a destination to form a data flow that moves data between the two systems. The execution of this data flow, and hence the actual data transport, is scheduled based on the source setup, in particular the snapshotting frequency you choose.
Creating a Source
To create a source in Dataddo, start by selecting a Connector, which defines the initial setup. Dataddo offers three main types of Connectors, each tailored to different needs:
- Universal Connectors. Examples include JSON, XML, CSV, and database options like MySQL and Postgres. These allow for highly customizable, low-level interactions with various systems.
- Fixed-Schema Connectors. Options like Mailchimp, ExactOnline, and Google Search Console offer pre-defined schemas. These are easy to set up but have limited customization capabilities.
- Custom-Schema Connectors. Specialized for platforms like HubSpot, Google Analytics 4, Salesforce, and Facebook Ads, these allow for dynamic schema definitions tailored to services with custom attributes.
Edit Data Source
- Navigate to Sources, click on the three dots next to the source you want to edit and click on Edit.
- Apply your changes to the settings, such as the Authorizer associated with the source or the configuration of the data extraction itself.
- Click on Save to confirm.
Changes that affect the schema of the source are not permitted. This ensures data consistency and avoids unintended disruptions in downstream systems, such as your data warehouse. If your changes require a modification to the schema, we recommend using the source cloning feature: create a clone of the existing source and make the schema-related changes there.
Duplicate/Clone a Data Source
- Navigate to Sources and click on the Clone Settings icon next to the source you want to duplicate.
- You will be directed to create a new source with the same configuration as the existing one, where you can edit data sets, attributes, dimensions, and other settings.
- Once your new configuration is ready, click on Save.
Versioning Data Source
While Dataddo does not offer built-in versioning for data sources, you can achieve a similar outcome by using the Source Cloning feature. By cloning the existing source, you can make modifications to its parameters without affecting the original setup. This allows you to maintain various versions of a source configuration to meet different needs.
Delete a Source
To delete a source, navigate to the "Sources" section and click on the trash can icon next to the source you wish to remove. A warning window will appear, asking you to confirm the deletion. This window also lists the flows that are connected to the source and that will be deleted automatically along with it. To proceed with the deletion, type "DELETE" into the provided field.
In the Dataddo platform, a source can exist in one of four different statuses, each represented by a color for quick identification:
- Live (green)
- Reconnecting (orange)
- Broken (red)
- Inactive (gray)
When a source is initially created, it enters the Live status. As long as Dataddo is able to successfully extract data from this source, it will continue to stay in the Live state.
If an attempt to extract data fails, the source's status changes to Reconnecting. Dataddo will make two more attempts to extract the data. If either of these attempts is successful, the source will return to the Live status.
If all three attempts to extract data fail, the source will transition to a Broken status. When a source is in this state, it must be manually fixed and restarted by the user. Until then, no data extraction will occur for this source.
The Inactive status indicates that the source is not currently being used for data extraction. To resume data extraction, you will need to manually restart the source, at which point it will return to the Live status.
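The status transitions described above can be pictured as a small state machine. The following is only an illustrative sketch, not Dataddo's implementation: the names `SourceStatus`, `MAX_ATTEMPTS`, and `run_extraction` are hypothetical, and the `extract` callable stands in for a single extraction attempt that returns `True` on success.

```python
from enum import Enum
from typing import Callable, List, Tuple


class SourceStatus(Enum):
    # Colors follow the status list in the text above.
    LIVE = "green"
    RECONNECTING = "orange"
    BROKEN = "red"
    INACTIVE = "gray"


# Assumption: one initial attempt plus the two retries mentioned in the text.
MAX_ATTEMPTS = 3


def run_extraction(extract: Callable[[], bool]) -> Tuple[SourceStatus, List[SourceStatus]]:
    """Simulate one scheduled extraction for a Live source.

    Returns the final status and the sequence of statuses passed through.
    """
    status = SourceStatus.LIVE
    history = [status]
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if extract():
            # Any successful attempt returns the source to Live.
            status = SourceStatus.LIVE
        elif attempt < MAX_ATTEMPTS:
            # A failure with retries remaining puts the source in Reconnecting.
            status = SourceStatus.RECONNECTING
        else:
            # All attempts failed: the source must be fixed and restarted manually.
            status = SourceStatus.BROKEN
        if history[-1] is not status:
            history.append(status)
        if status is SourceStatus.LIVE:
            break
    return status, history


if __name__ == "__main__":
    outcomes = iter([False, True])  # first attempt fails, second succeeds
    final, path = run_extraction(lambda: next(outcomes))
    print(final.name, [s.name for s in path])
```

The Inactive status is left out of the transitions because, per the text, a source only leaves it through a manual restart.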
Recovery from Broken State
When a source is in the broken state, the system ceases attempting to extract data. To recover the source from this state, please follow the steps below:
- In your Dataddo account, navigate to the Sources tab and click on the Restart Source button next to your broken source.
- If the restart is unsuccessful, check the extraction logs and fix the source using the Troubleshooting section.
- If restart is successful, a prompt will appear for loading any missing data (known as data backfilling). This step is optional and can be skipped if not needed.
- Should you need to perform data backfilling, ensure the correct date range is set to capture all missing data.
- Based on the data destination of the attached data flows, set the Snapshot Keeping Policy.
- Once set up, type CONFIRM and click on Load Data.
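When choosing the backfill date range in the steps above, it can help to work out which days are actually missing. The sketch below is a hypothetical helper, not part of Dataddo: it assumes daily snapshots, that you know the date of the last successful extraction (for example from the extraction logs), and that the restart day itself should be included in the range.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def backfill_range(last_success: date, restarted_on: date) -> Optional[Tuple[date, date]]:
    """Return the inclusive (start, end) range of days to backfill.

    Assumes daily snapshots. Returns None when no day is missing.
    """
    start = last_success + timedelta(days=1)  # first day without data
    end = restarted_on                        # assumption: include the restart day
    if start > end:
        return None
    return start, end


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(backfill_range(date(2024, 3, 1), date(2024, 3, 5)))
```

If your source uses a different snapshotting frequency, adjust the range accordingly.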
Dataddo provides extraction logs for each source, offering users a transparent view into every step of the extraction process. These logs chronicle each extraction event, regardless of its success or failure.
To view the logs, click on the three-dots icon next to the respective source and choose Extraction Logs. Should you encounter any problems, review the most recent log entry and consult the Troubleshooting section for solutions.
Source ID & Extraction ID
When you encounter a problem and wish to report an error, you will be asked to provide your source ID and extraction ID.
- To get your source and extraction IDs, navigate to the Sources tab.
- Click on the three dots next to your source and click on Edit.
- At the bottom of the Basic Info tab, you can find your source and extraction IDs. Click on the Copy button and send us the ID.