Data Sources
A Source is a dynamic collection of snapshots from a service or system connected via a Dataddo Connector. Data within a Source is automatically refreshed based on the snapshotting frequency and snapshot retention policy.
Creating a Source
Dataddo uses Connectors to connect to your service or system (e.g. Salesforce CRM, Hubspot, Google Analytics). When the connector configuration is finished, a Source is created and a Snapshot is taken. Based on the configuration, the data in the source will be refreshed according to the snapshotting frequency and snapshot retention policy.
Connectors
No-code Connectors
Dataddo offers great flexibility when it comes to connecting to various online services and systems. We currently cover a wide variety of services, ranging from CRMs and ERPs through marketing analytics to payment systems. All of these services can be connected using no-code connectors, meaning that you no longer have to rely on scripting and SQL transformations to take a data snapshot from your service or system. With our no-code connectors, a job that once required several days of work from highly qualified developers can be performed by interns simply by choosing the right metrics. Thanks to our unique technology, Dataddo can connect to almost any online service, provided it has a JSON-, CSV-, or XML-based API (which covers most services).
NOTE: Our team is available to build new connectors within a few business weeks. You can request a new connector here.
To request a new connector, you will need to provide us with a list of the exact API endpoints of the service that you'd like to access, or the types of data you'd like to access. The connector development process usually takes about 4 weeks (depending on the volume of requests). If you're interested in Dataddo creating a custom connector for you, please contact your account representative or our solutions team at support@dataddo.com.
Universal Connectors
Although Dataddo is designed to provide unprecedented data integration flexibility, we offer a finite number of off-the-shelf connectors with dedicated no-code interfaces. If your desired service is not on our off-the-shelf connector list, you can always use one of our Universal connectors to access your data.
Automatic Refreshing of the Data in a Source
When a source is created, its data is refreshed according to the snapshotting frequency and snapshot retention policy.
Snapshotting Frequency
Snapshotting frequency determines how often a snapshot is taken. It is defined using multiple parameters, including timezone, minute, hour, day of the week, and day of the month. By combining these parameters, you can set the snapshotting frequency anywhere from 5-minute updates to once a month. More information is available on the dedicated snapshotting frequency page.
In the sources overview, you can see each source's state, when the last snapshot was taken, and when the next one will be taken.
Snapshot Retention
Each time a snapshot is taken, the data in the source is updated. The retention policy defines how the already existing data in the source is treated. When the Keep latest only option is used, only the data from the latest snapshot is kept in the source. When Merge with existing is set, the content of the latest snapshot is merged with the existing data in the source. More information is available on the dedicated snapshot retention page.
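The difference between the two policies can be sketched with a few lines of Python. This is an illustrative model only (rows as dicts keyed by a unique identifier); the function and policy names are assumptions, not Dataddo's internal implementation.

```python
# Hypothetical sketch of the two retention policies; NOT Dataddo's code.
def apply_retention(existing: list, snapshot: list,
                    policy: str, key: str = "id") -> list:
    if policy == "keep_latest_only":
        # Discard everything except the newest snapshot.
        return list(snapshot)
    if policy == "merge_with_existing":
        # Rows from the new snapshot overwrite existing rows
        # that share the same key; other existing rows are kept.
        merged = {row[key]: row for row in existing}
        merged.update({row[key]: row for row in snapshot})
        return list(merged.values())
    raise ValueError(f"unknown policy: {policy}")
```

With Keep latest only, the source ends up containing exactly the latest snapshot; with Merge with existing, older rows survive unless a newer row with the same key replaces them.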
Ad-Hoc / Historic Data Load
You can use the Ad-hoc data load function to perform a one-off load of historical data into your source. More details on how to load ad-hoc or historical data are available in the dedicated topic.
Source State
Each Source can be in one of several states related to the automatic refreshing of its data:
- Live. Snapshots are taken according to the schedule; the source works without any issues.
- Offline. Snapshot taking was not set in the connector; the data in the source is not being updated.
- Connecting. An issue was encountered during the last attempt to take a snapshot. The operation is set for retry in the following hour. If the operation fails 3 times in a row, snapshotting is blocked and has to be manually restarted.
- Paused. Snapshotting has been manually paused by the user.
- Broken. Snapshotting is blocked due to multiple failures of the process.
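The retry behaviour described above (retry on failure, block after 3 consecutive failures) can be modelled as a small state machine. The class and method names below are illustrative assumptions for explanation only, not part of Dataddo's API.

```python
# Hypothetical model of the Live -> Connecting -> Broken transitions;
# names are illustrative, NOT Dataddo's API.
MAX_CONSECUTIVE_FAILURES = 3

class SourceScheduler:
    def __init__(self):
        self.state = "live"
        self.failures = 0

    def record_snapshot_result(self, ok: bool) -> str:
        if ok:
            # Any successful snapshot resets the failure counter.
            self.failures = 0
            self.state = "live"
        else:
            self.failures += 1
            if self.failures >= MAX_CONSECUTIVE_FAILURES:
                # Blocked; requires a manual restart (see below).
                self.state = "broken"
            else:
                # Will be retried in the following hour.
                self.state = "connecting"
        return self.state
```

Note that a single success at any point returns the source to Live, while the third consecutive failure moves it to Broken, from which only manual intervention recovers it.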
Recovery from the Broken State
When a source is in the Broken state, the system no longer attempts to take data snapshots from the service. To recover the source from this state, you must run the Debug command first. You will be provided with detailed system output that can help you identify the issue. Once the debug command returns OK, you can trigger a manual data load (we recommend loading yesterday's data) to return the source to its standard snapshotting schedule. Read more on how to fix a broken source here.
Using Sources in a Data flow
To deliver data from one or more sources to a destination, the sources must be orchestrated in a Data Flow. Sources and data flows have an M:N relationship: you can include multiple sources in a single data flow, and a single source can be associated with multiple data flows.
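The M:N relationship can be pictured with a simple mapping. The flow and source names below are invented for illustration; they do not correspond to any real Dataddo configuration.

```python
# Illustrative M:N association between sources and data flows
# (hypothetical names and structures, NOT Dataddo's schema).
flows = {
    # one flow can contain many sources...
    "flow_marketing": ["src_google_analytics", "src_facebook_ads"],
    # ...and one source can appear in many flows.
    "flow_reporting": ["src_google_analytics"],
}

def flows_for_source(source: str) -> list:
    """Return every flow that includes the given source."""
    return [name for name, sources in flows.items() if source in sources]
```

Here `src_google_analytics` participates in both flows, while `flow_marketing` combines two sources.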
Data Flow with a Single Source
The data contained in the single source is delivered to a storage or provided to a dashboarding app, depending on the chosen destination.
Data Flow with Multiple Sources
When multiple sources are included in a flow, the data is either kept separate or joined together, based on the data flow configuration. Learn more in the flow overview. You can also blend sources.
Editing a Source
You can edit parts of your data source configuration, such as the source name, types of data, or snapshotting.
Editing your data source may result in a broken source or flow, or in data errors such as duplicates. We recommend cloning your data source and setting up a new configuration instead.
See this article for more information on how to edit a data source.
Cloning a Data Source
To maintain data consistency, the source structure, including column order, is fixed. If you need to include different attributes in the source, the best practice is to recreate the source using the Clone button, which initiates the creation of a new data connector with the parameters used in the original source.