Loading Historical Data
  • 8 Minutes to read
  • Dark
    Light

Loading Historical Data

  • Dark
    Light

Dataddo enables you to load historical data on an ad-hoc basis to be sent to your data destination.

If you need to add historical data from a specific date range, you can do it simply by a one-time request in the Dataddo app depending on your destination, whether it is a dashboarding app (eg. Google Data Studio, Power BI, Tableau... etc) or data warehouse (eg. PostgreSQL, MySQL, Google Big Query, S3... etc).

A historical data load will affect all the flows and destinations connected to that source.


Loading Data to Your Data Source

If you want to load historical data from an existing data source, you don't need to create the source again, just go straight to Step 2.

Step 1 - Creating a Data Source

Before loading historical data, first, you need to create a data source. Read more details in Data Source configuration guide, or search for the name of the connector in the search bar.

Step 2 - Loading Historical Data to a Data Source

  1. Under 'Sources', find the source you would like to load historical data from and click on the Manual Data Load icon.

  2. Set up the data load: choose the limits, date range, snapshot keeping policy and Confirm.
    Data source - data load step 3

NOTE

The snapshot keeping policy depends on the destination. If you are connecting your data directly to a dashboarding app, Dataddo will be used as your storage for all of your snapshots. Therefore, in most cases, you want to append the historical data to your already connected data/snapshots. If you are sending your data to a data warehouse, you do not need to keep the snapshots in the source, therefore, you can replace the current snapshots and only load the historical data to the source, that you will send to your data warehouse afterward.

How to Load Historic Data to Your Data Flow

If your destination is a database, you will need to manually send the new data load to the database.

For dashboarding tools, the historical data will be sent automatically.

If you want to load historical data to an existing data flow, you don't need to create the source again, just go straight to Step 2.

Step 1 - Creating a Data Flow

First, create a data flow with the source you'd like historical data from. Read more details in Data Flow configuration guide.

Step 2 - Loading Historical Data to a Data Flow

  1. Under 'Flows', find the flow you would like to load historical data from and click on the Manual Data Load icon.
    NOTE: When the destination is a dashboarding app (e.g. Google Data Studio), the data from your source is loaded automatically. For databases, you need to manually load the data to the flow after you loaded them to your source.
    Data flow - data load step 2

  2. Confirm the data insert and click on Insert data.
    Data flow - data load step 3


How to Change Date Range for a Manual Data Load

For some sources, selecting the date range in the Manual Data Load panel is not possible. When this occurs, there is still a way to get your historical data.

  1. Navigate to 'Sources' in your Dataddo account. Find the specific source you want to work with, click on the three dots and go to Edit .

  2. Navigate to the Advanced Settings tab and search for the URL field. Here, you can see the current setting. You will see things such as "startDate" and "endDate" in the link. In this example, you have "since_send_time" and "before_send_time". These specify the date range of data to be fetched by Dataddo.
    Data source - troubleshooting step 2

The expression in the brackets defines a dynamics date range. This date range is set to the date range you have chosen during the configuration of the source (such as yesterday, last 7 days, last 30 days, etc). In this example, the range set to the last 30 days, including today.
If you want to load historical data, the easiest way to do this is by either specifying the exact start date, end date, and time of the period for which you want to get data. Or you can use a dynamic date range, which needs to be set in the URL.

Case 1: Static Date Range

To pull data from 1 September 2020 until the end of 14 September 2021, change the terms in the brackets exact date in ISO 8601 time format. An example URL would be:

https://us16.api.mailchimp.com/3.0/campaigns?offset=0&count=500&since_send_time=2020-09-01T00:00:00+00:00&before_send_time=2020-09-14T23:59:59+00:00

Case 2: Dynamic Date Range
To pull data for last year, the expression in the brackets needs to be set to 1y1. An example URL would be:

https://us16.api.mailchimp.com/3.0/campaigns?offset=0&count=500&since_send_time={{1y1}}T00:00:00+00:00&before_send_time={{1y1}}T23:59:59+00:00

If you wanted, for example, data from 90 days ago, not including today, you would use 90d1 An example URL would be:

 https://us16.api.mailchimp.com/3.0/campaigns?offset=0&count=500&since_send_time={{90d1}}
  1. Once the date range is set in the URL, Save the changes. Now, you are ready to load the historical data. This is simply done by clicking on the Manual Data Load button. You need to choose whether you want to append the historical data to the data you have already fetched, or if you want to replace the existing data. Once you click Confirm your historical data will be loaded.
    In the source panel, click on the three dots and select Show Logs to check if the operation was successful.

  2. Finally, return the URL to the original setting. Or you can choose to set up a dynamic date range. Dataddo will pull data from that dynamic data range for every future snapshot.

For example, if you have the snapshotting schedule set to Weekly and you only want data from the last seven days (not including today), you use the expression 7d1 in the URL. You can also choose whether you want the new snapshot to replace all the previous, or be appended to your existing snapshots.

Test Extraction

After the URL is updated to whatever data range you want, you can use Test Extraction to see the API request. If something goes wrong, an error message will be displayed, such as a time-out error or syntax error. In the best-case scenario, you will see all the API requests and the total number of rows (at the very bottom of the page).

Tip & Tricks for Specific Connectors

Facebook Pages

You need to load historical data in batches of 90 days, up to 5 years back.

Facebook Ads

The Facebook Ads API limits the historical data load to the last 37 months (i.e. 3 years).

Facebook Posts

Data cannot be broken down by days, you can only extract the lifetime values of your metrics such as impressions or clicks. Set the snapshot keeping policy to Append to build a time series of your data.

Instagram Business (page)

The limit for historical data is 30 days, therefore you will need to load the data in batches of 30 days and you can go as far as one year back.

NOTE:
An exception is the New Followers Count metric. The Instagram Business API limits this metric to only pull the number of followers each day and can only load the last 30 days. It is not possible to load older data for this metric. Set your snapshot keeping policy to Append the data and build a time series of your data.

Instagram Story

  • Due to API limitations, Dataddo can only access data while the story is active (24-hour window). You can create a source only when you have a story currently posted to the account, otherwise, there is no data to access.
  • We recommend setting up the source to hourly syncing frequency in order to get data as often as possible and to be able to get the values at the end of the 24-hour window.
  • Allow Empty is an option that will pull the data even if there is no story posted. The source will not break because empty data will be pushed. Contact our Solutions Team to enable this feature.
  • By selecting Timestamp, the data will show the metrics over time of a single story, along with the timestamp.

Google Analytics

  • Due to a Google Analytics API limitation, some combinations of metrics will not work. You can check Google Analytics official documentation for the metrics that can be queried together.
  • There is a limited number of metrics and dimensions you can select per one source. You may select a maximum of 10 metrics and 7 dimensions.
  • You can edit the limit of how much data is being loaded for the historical load.
TIP: How can I see if I've reached the limit?

If you see a round number (e.g. 10,000) in the logs, you probably haven't loaded all of the data. An easy way to estimate the limit is to check how much daily data is being loaded when creating the source in the data preview, or go to Sources and click on the log of the source to see the row counts from the last load.

HubSpot

  • You cannot select a date range. Dataddo will pull the full list of the selected contacts or deals.
  • To keep track of historical records, choose a timestamp and set the snapshot keeping policy to Append to build a time series.

TIP:
To get owners' names for deals, you need to create a new source with a specific dataset: Deal Owners. Then use the Data Blending feature when creating a flow, where the owner ID will be the Join Key field.

Stripe

To change a date range you need to use the advanced settings and change the date range in the URL (e.g. 1d1, 1m1) .

TIP:
Go to advanced settings and change the URL from 1d1 (loading the last day of data) to 1m1 to load the last month of data. For more info about dynamic date ranges, follow our guide.

Google Search Console

You can only load data from the past 16 months.

Google My Business

You can only load data from the past 14 months.

Google Analytics 4

You can only load data from the past year.

Pinterest

When loading historical data, data must be loaded in batches of 6 months. However, there is no limit to how far back you can go.

Standard connectors

For most of the standard connectors, you can change the date range in the URL under Advanced Settings. For Sendinblue, Simplicate, Klaviyo, Google Search Console, dynamic data range has been implemented.


Need assistance?

Feel free to contact us and we will help you with the setup of your sources and/or flows. To speed up the process of resolving your ticket, make sure you provide us with sufficient information.


Was this article helpful?

What's Next