A Source is a dynamic collection of snapshots from a service or system connected via a Dataddo Connector. The data in a Source is refreshed automatically according to its snapshotting frequency and snapshot retention policy.
Creating a source
Dataddo uses Connectors to connect to your service or system (e.g. Salesforce CRM, HubSpot, Google Analytics). When the connector configuration is finished, a Source is created and the first Snapshot is taken. From then on, the data in the Source is refreshed according to the configured snapshotting frequency and snapshot retention policy.
No-code connectors
Dataddo can connect to a wide variety of online services and systems, ranging from CRMs and ERPs through marketing analytics to payment systems. All of these services can be connected using no-code connectors, so you no longer have to rely on scripting and SQL transformations to take a data snapshot from your service or system. A job that previously required several man-days from highly qualified developers can be done by an intern simply by choosing the right metrics. Dataddo can connect to almost any online service, provided it has a JSON-, CSV-, or XML-based API (this covers most services). Even for proprietary APIs, our team is available to build new connectors within 1-2 business days. You can request a new connector here.
Although Dataddo is designed for flexible data integration, the number of off-the-shelf connectors with dedicated no-code interfaces is finite. If the service you need is not in our list of off-the-shelf connectors, you can always use one of our Universal connectors to access your data.
Automatic refreshing of the data in a Source
Once a Source is created, its data is refreshed according to the snapshotting frequency and snapshot retention policy.
Snapshotting frequency determines how often a snapshot is taken. It is defined using multiple parameters, including timezone, minute, hour, day of the week, and day of the month. By combining these parameters, you can set the snapshotting frequency anywhere from every 5 minutes to once a month. More information is available on the dedicated snapshotting frequency page.
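To make the combination of parameters more concrete, here is a minimal Python sketch of how such a schedule could be modeled. It is an illustration only, not Dataddo's actual configuration format or API: the `SnapshotSchedule` class, its field names, and the `is_due` method are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone, tzinfo

ALL_MINUTES = frozenset(range(60))
ALL_HOURS = frozenset(range(24))
ALL_WEEKDAYS = frozenset(range(7))        # 0 = Monday ... 6 = Sunday
ALL_MONTH_DAYS = frozenset(range(1, 32))


@dataclass(frozen=True)
class SnapshotSchedule:
    """Hypothetical model of a snapshotting frequency (not Dataddo's API)."""

    tz: tzinfo
    minutes: frozenset = ALL_MINUTES
    hours: frozenset = ALL_HOURS
    weekdays: frozenset = ALL_WEEKDAYS
    month_days: frozenset = ALL_MONTH_DAYS

    def is_due(self, moment: datetime) -> bool:
        """Return True if a snapshot should be taken at this timezone-aware moment."""
        local = moment.astimezone(self.tz)  # evaluate in the schedule's timezone
        return (
            local.minute in self.minutes
            and local.hour in self.hours
            and local.weekday() in self.weekdays
            and local.day in self.month_days
        )


# Every 5 minutes, around the clock, every day.
every_five_minutes = SnapshotSchedule(
    tz=timezone.utc,
    minutes=frozenset(range(0, 60, 5)),
)

# Once a month: at 06:00 on the 1st, evaluated in a UTC+1 timezone.
monthly = SnapshotSchedule(
    tz=timezone(timedelta(hours=1)),
    minutes=frozenset({0}),
    hours=frozenset({6}),
    month_days=frozenset({1}),
)
```

A scheduler built on this sketch would call `is_due` once per minute. The timezone matters because 06:00 in UTC+1 corresponds to 05:00 UTC.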
In the sources overview, you can see each source's state, when the last snapshot was taken, and when the next one is scheduled.
Each time a snapshot is taken, the data in the Source is updated. The retention policy determines how the existing data in the Source is treated. When the Keep latest only option is used, only the data from the latest snapshot is kept in the source. When Merge with existing is set, the content of the latest snapshot is merged with the existing data in the source. More information is available on the dedicated snapshot retention page.
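The difference between the two retention options can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function and constant names are hypothetical, and "merge" is modeled here as simply appending the new rows, whereas the actual merge logic may deduplicate or update rows (see the snapshot retention page).

```python
from typing import Dict, List

Row = Dict[str, object]

# Hypothetical identifiers for the two retention options.
KEEP_LATEST_ONLY = "keep_latest_only"
MERGE_WITH_EXISTING = "merge_with_existing"


def apply_retention(existing: List[Row], snapshot: List[Row], policy: str) -> List[Row]:
    """Return the source content after a new snapshot arrives."""
    if policy == KEEP_LATEST_ONLY:
        # Existing data is discarded; only the newest snapshot remains.
        return list(snapshot)
    if policy == MERGE_WITH_EXISTING:
        # Assumption: merging is modeled as appending the new rows.
        return list(existing) + list(snapshot)
    raise ValueError(f"unknown retention policy: {policy!r}")


existing = [{"day": "2024-01-01", "clicks": 10}]
snapshot = [{"day": "2024-01-02", "clicks": 12}]

latest_only = apply_retention(existing, snapshot, KEEP_LATEST_ONLY)
merged = apply_retention(existing, snapshot, MERGE_WITH_EXISTING)
```

With these inputs, `latest_only` holds only the 2024-01-02 row, while `merged` holds both rows.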
With respect to automatic data refreshing, a Source can be in one of the following states:
- Live. Snapshots are taken according to the schedule without any issues.
- Offline. Snapshot taking was not set up in the connector, so the data in the source is not being updated.
- Connecting. An issue was encountered during the last attempt to take a snapshot. The operation is scheduled for a retry in the following hour. If the operation fails 3 times in a row, snapshotting is blocked and has to be restarted manually.
- Paused. The snapshotting is manually paused by the user.
- Broken. Snapshotting is blocked after multiple consecutive failures of the process.
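The transitions between the Live, Connecting, and Broken states described in the list above can be sketched as a small state machine. This is a simplified model, not Dataddo's implementation: the class and method names are hypothetical, and the hourly retry timing is left out so that only the failure counting is shown.

```python
from enum import Enum


class SourceState(Enum):
    LIVE = "Live"
    OFFLINE = "Offline"
    CONNECTING = "Connecting"
    PAUSED = "Paused"
    BROKEN = "Broken"


MAX_CONSECUTIVE_FAILURES = 3  # from the text: 3 failures in a row block snapshotting


class SourceStatus:
    """Hypothetical tracker of a source's refresh state."""

    def __init__(self) -> None:
        self.state = SourceState.LIVE
        self.consecutive_failures = 0

    def record_attempt(self, succeeded: bool) -> SourceState:
        """Update the state after one snapshot attempt and return it."""
        if self.state in (SourceState.OFFLINE, SourceState.PAUSED, SourceState.BROKEN):
            # No snapshots are attempted in these states.
            return self.state
        if succeeded:
            self.consecutive_failures = 0
            self.state = SourceState.LIVE
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
                self.state = SourceState.BROKEN
            else:
                self.state = SourceState.CONNECTING  # a retry will follow
        return self.state


status = SourceStatus()
history = [status.record_attempt(False) for _ in range(3)]
```

After three failed attempts in a row, `history` ends in `SourceState.BROKEN`, and further calls to `record_attempt` leave the state unchanged until the source is restarted manually.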
You can use "Ad-hoc data load" for a one-off load of historical data into your source. More details are available in a dedicated topic.
Recovery from the broken state
When a source is in the "BROKEN" state, the system no longer attempts to take a data snapshot from the service. To recover the source from this state, you must first run the "Debug" command. It provides detailed system output that can help you identify the issue. Once the debug command returns "OK", you can run "Restart" to return the source to its standard snapshotting schedule.
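The order of the two recovery steps can be illustrated with a short Python sketch. Everything here is hypothetical: `RecoverableSource`, `debug`, and `restart` only mirror the "Debug" and "Restart" commands described above and are not a Dataddo API. The sketch also assumes that a restart is refused until the last debug run returned "OK".

```python
from typing import Callable


class RecoverableSource:
    """Hypothetical stand-in for a source stuck in the Broken state."""

    def __init__(self, check_connection: Callable[[], str]) -> None:
        self.state = "Broken"
        self._check_connection = check_connection
        self._last_debug_output = ""

    def debug(self) -> str:
        """Run the diagnostic check and remember its output."""
        self._last_debug_output = self._check_connection()
        return self._last_debug_output

    def restart(self) -> None:
        """Return the source to the schedule, but only after a clean debug run."""
        if self.state != "Broken":
            raise RuntimeError("only a Broken source needs a restart")
        if self._last_debug_output != "OK":
            raise RuntimeError("run debug() and fix the reported issue first")
        self.state = "Live"


# Simulated diagnostics: the first run reports a problem, the second is clean.
outputs = iter(["authorization expired", "OK"])
source = RecoverableSource(lambda: next(outputs))

first_output = source.debug()   # reports the issue to fix
second_output = source.debug()  # run again once the issue is resolved
source.restart()
```

In this sketch, calling `restart()` right after the first `debug()` would raise an error, because the reported issue has not been resolved yet.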
Using Sources in a Data flow
To deliver the data from one or more Sources to a destination, the Sources must be orchestrated in a Data flow. Sources and Data flows have an M:N relationship: you can include multiple Sources in a single Data flow, and a single Source can be associated with multiple Data flows.
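As an illustration of the M:N relationship, the following Python sketch keeps a lookup in both directions. The `FlowRegistry` class and the source and flow names are hypothetical and only serve to show the shape of the relationship.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class FlowRegistry:
    """Hypothetical registry of which sources belong to which data flows."""

    def __init__(self) -> None:
        self._sources_by_flow: DefaultDict[str, Set[str]] = defaultdict(set)
        self._flows_by_source: DefaultDict[str, Set[str]] = defaultdict(set)

    def attach(self, source: str, flow: str) -> None:
        """Associate a source with a data flow (recorded in both directions)."""
        self._sources_by_flow[flow].add(source)
        self._flows_by_source[source].add(flow)

    def sources_in(self, flow: str) -> Set[str]:
        return set(self._sources_by_flow.get(flow, set()))

    def flows_of(self, source: str) -> Set[str]:
        return set(self._flows_by_source.get(source, set()))


registry = FlowRegistry()
registry.attach("crm_contacts", "warehouse_flow")   # one flow, several sources
registry.attach("ad_spend", "warehouse_flow")
registry.attach("crm_contacts", "dashboard_flow")   # one source, several flows
```

Here `warehouse_flow` contains two sources, while `crm_contacts` takes part in two flows.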
Data flow with single source
When a Data flow contains a single Source, the data in that Source is delivered to a storage or provided to a dashboarding app, depending on the chosen destination.
Data flow with multiple sources
When multiple sources are included in the Data flow, the data is either left unjoined or joined together, depending on the Data flow configuration. More information is available on the dedicated page.
Editing a source
Data types and column names
Data types and column names can be edited by clicking the "Edit" icon for the particular source. Dataddo detects the data types in each source automatically, but you might occasionally need to change the definition. After changing a data type, force-run the source synchronization to apply the new definition to the values.
In the modal window, it is possible to edit the data types and column names of the source.
It is NOT possible to edit the chosen attributes (columns) in the source definition. To maintain data consistency, the source structure, including the column ordering, is fixed. If you need to include different attributes in the source, the best practice is to recreate the source using the “duplicate” button, which initiates the creation of a new data connector with the parameters used in the original source.