Loading Historical Data

Dataddo enables you to load historical or ad-hoc data and add them to your Data Sources, Data Flows & Data Destinations.

If you come across a case where you need to add a specific range of your historical data to your data set, you can do it simply by a one-time request in the Dataddo app depending on your destination, whether it is a dashboarding app (eg. Google Data Studio, Power BI, Tableau... etc) or data warehouse (eg. PostgreSQL, MySQL, Google Big Query, S3... etc).

Differences between locations of data loads

  • A) Data Source - If you load your data to your source, it will affect all the flows and destinations to which the source is connected. 
  • B) Data Flow -  If you load your data to your flow, it will be loaded only to the specific destination but will not affect other flows with the same source.

Option A - Loading data to your Data Source

If you want to load historical data to an existing data source, you don't need to create it again, you can go straight to STEP 2.

STEP 1 - Creating a data source

Before loading historical data, first, you need to connect a data source. Read more details in Data Source configuration guide, or search for the name of the connector in the search bar.

STEP 2 - Loading historical data to a data source

   
Number 1

Under Sources, click on the three dots, next to the source and choose Ad Hoc Data Load, where you want to load the historic data.
Data source - data load step 2

Number 2

Set up the data load: choose the limits, date range, snapshotting policy and Confirm.Data source - data load step 3

Option B - How to load historic data to your Data Flow

If you want to load historical data to an existing data source, you don't need to create it again, you can go straight to STEP 2.

STEP 1 - Creating a data flow

Before loading historical data to a flow, first, you need to create a data flow. Read more details in Data Flow configuration guide.

STEP 2 - Loading historical data to a data flow

   
Number 1

Under Flows, click on the three dots, next to the flow, where you want to load the historic data, and choose Ad-hoc data insert.
Data flow - data load step 2

Number 2

Confirm the data insert and click on Insert data.
Data flow - data load step 3


Need assistance?

Feel free to contact us or create a ticket and you will help you set up the Data Source.

How to change Date Range in Ad-hoc data load

Another case is that selecting DATE RANGE in the Ad-hoc data load panel is not possible. When this occurs, there is still a way to get your historical data. This process is a bit more advanced but feasible following this guide.

   
Number 1

Navigate to Data Sources in your Dataddo account, find a specific source, you want to work with and click on the Edit button.
Data source - troubleshooting step 1

Number 2

Navigate to the Advanced Settings tab and search for the URL field. Here, you can see the current setting. You will always see something such as "startDate" and "endDate" in the link, in this example, you have "since_send_time" and "before_send_time". These specify the date range of data to be fetched by Dataddo.
Data source - troubleshooting step 2

The expression in the curly brackets defines a dynamic date range. This date range is originally set to the date range you have chosen during the configuration of the source (such as Yesterday, Last 7 days, Last 30 days, etc). In this case, you have the range set to last 30 days including today.

 

If you want to load historical data, the easiest way how to do this is by either specifying the exact start and end date and time of the period for which you want to get data or using the dynamic date range. You need to set this in the URL.

 

Case 1: Static date range

If you wanted to pull data from 1 September 2020 until the end of 14 September 2021. Then you would need to change the curly brackets term to the exact date in ISO 8601 time format, hence our example URL would be:

https://us16.api.mailchimp.com/3.0/campaigns?offset=0&count=500&since_send_time=2020-09-01T00:00:00+00:00&before_send_time=2020-09-14T23:59:59+00:00
 

Case 2: Dynamic date range

If you wanted to pull data for last year. Then the expression in the curly brackets needs to be set to 1y1, hence our example URL would be:

https://us16.api.mailchimp.com/3.0/campaigns?offset=0&count=500&since_send_time={{1y1}}T00:00:00+00:00&before_send_time={{1y1}}T23:59:59+00:00


If you wanted for example 90 days ago till yesterday, you would use 90d1, hence the example URL would be:

https://us16.api.mailchimp.com/3.0/campaigns?offset=0&count=500&since_send_time={{90d1}}T00:00:00+00:00&before_send_time={{90d1}}T23:59:59+00:00 
Number 3

Once the date range of the historical data that you want to load is set in the URL, Save the changes. Now, you are ready to load the historical data. This is simply done by clicking on the Ad-hoc data load button. You need to choose whether you want to append these historical data to the one you have already fetched, or whether you want to replace them. You confirm and the historical data gets loaded. You can once again Show logs on your source panel and check if the operation went successfully.

Number 4

The last thing you need to do is either return the URL to the original setting or pick a new date range for which you want to pull data every time Dataddo takes a new snapshot, using the dynamic range.

 

For example, if you have the schedule set to Weekly and you always want to collect only the data from the last seven days (not including today), you use the expression 7d1 in the URL. You can also choose whether you want the new snapshot to replace all the previous, or append to them. Choosing the further, our edit window would look like this:

Test extraction

After the URL is updated to whatever data range they want, you can use Test extraction to see the API request. And, if something goes wrong, then an error message will be displayed, such as a time-out error or syntax error. And you can know for sure that something went wrong. In the best-case scenario, you will see all the API requests and the total number of rows (at the very button of the page).  

Tip & Tricks for specific connectors

Facebook Pages

You need to load historical data in batches of 90 days, up to 5 years back.

Facebook Posts

Data cannot be broken down by days, you can only extract the lifetime values of your metrics such as impressions or clicks. You can keep a current snapshot and see the actual situation, or "Append" the data and build a time series of your data.

Instagram Business (page)

  • The limit for historical data is 30 days, therefore you will need to load the data in batches of 30 days and you can go as far as one year back.

NOTE: Exception is the New followers count metric. You can get information on the number of followers each day, but you can only load the last 30 days, older data is not possible to load. You can keep a current snapshot and see the actual situation, or "append" the data and build a time series of your data.

Instagram Story 

  • Due to the API limitations, Dataddo can only access data while the story is active (24-hour window). You can create a source only when you have some story running, otherwise, there is no data to access.
  • We recommend setting up the source to hourly syncing frequency in order to get data as often as possible and to be able to get the values at the end of the 24-hour window.
  • Allow empty - Option to pull the data only even if the data is not there when the story is posted or when there is no story, the source will not get broken because even “empty” data will be pushed - this feature can be enabled by contacting our support team.
  • Timestamp - the data will show the same story with different times and increased numbers.

Google Analytics

  • Not all combinations of metrics work. You can check Google Analytics official documentation for the proper combination.
  • There is a limited number of metrics and dimensions you can select per one source - a maximum of 10 metrics and 7 dimensions.
  • You can edit the limit of how much data is being loaded for the historic load.

TIP: How to check if you reached the limit?

When looking in the log, a round number (e.g. 10000) is visible, most probably you did not load all the data. An easy way to estimate the limit is to check how much daily data is being loaded, when creating the source in the preview, or go to Sources and click on the log of the source to see Row counts from the last load). 

Hubspot

  • You cannot select a date range, Dataddo will pull the full list of the selected contacts or deals.
  • To keep track of historical records, choose a timestamp to Append to build your time series in time.

TIP: to get owners' names for deals, you need to create a new source with a specific dataset - (deal) owners. Then use the Data Blending feature in Flows, where the owner ID will be the Join Key field.

Stripe

To change a date range you need to use advanced settings and change the date range in the URL (1d1,1m1)

TIP: Go to advanced settings and change the URL from 1d1 - loading last day of data to 1m1 - that means to load last month of data. After you load this data, you need to load the URL back so for the next data load you only get the new fresh data. For more info about dynamic date ranges, follow our guide.

Google Search Console

You can load data only 16 months back.

Google my business

You can load data only 14 months back.

Google Analytics 4

You can load data only 1 year back.

Standard connectors

For most of the standard connectors, you can change the data range in the URL under Advanced settings. For Sendinblue, Simplicate, Klaviyo, Google Search Console, dynamic data range has been implemented.


If you still need some help with the loading of historical data, feel free to contact us