---
title: "Data Backfilling for Database Replication"
slug: "data-backfilling-for-database-replication"
description: "Simply load historical data while replicating your database with Dataddo. Easily load historical data from sources to destinations with one-time requests."
tags: ["Data flow", "Data backfilling"]
updated: 2023-10-26T10:59:36Z
published: 2023-10-26T10:59:36Z
---

> ## Documentation Index
> Fetch the complete documentation index at: https://docs.dataddo.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Data Backfilling for Database Replication

Dataddo's **Data Backfilling** feature is ideal for database replication tasks. It facilitates the data transfer from more traditional database systems such as [MySQL](/docs/mysql), [PostgreSQL](/docs/postgres), or [SQL Server](/docs/universal-sql-server) to cloud-based data storage solutions like [Snowflake](/docs/snowflake), [BigQuery](/docs/google-bigquery), [Redshift](/docs/redshift), or [Databricks](/docs/databricks).

          
          

In case you need to backfill data from **business apps** or other **cloud services**, please refer to [this article](/docs/data-backfilling-to-storages).

## The Process

1. Go to the [**Sources**](https://app.dataddo.com/sources) page and [trigger the extraction](/v1/docs/data-backfilling-for-database-replication#trigger-the-extraction) for your selected source. If needed, specify the partition, e.g. using OFFSET and LIMIT options.
2. Initiate the [data writing](/docs/data-backfilling-for-database-replication#trigger-data-writing-to-the-storage) in the relevant flow. For large data sets, you may need to use [partitioning](/docs/data-backfilling-for-database-replication#update-the-extraction-query). The recommended **maximum** size of single partition is **5M rows**.

![Core Concepts - Data Backfilling - DB Replication](https://cdn.document360.io/084ed225-3f99-4644-a2da-39ca0cd5ef45/Images/Documentation/Core%20Concepts%20-%20Data%20Backfilling%20-%20DB%20Replication.png)

## Trigger the Extraction

1. Go to the [**Sources**](https://app.dataddo.com/sources) page and click on your source.
2. Go to the **Advanced Settings** tab and edit the **Statement** field to temporarily update the extraction query (e.g. add partitioning clauses). For more information, see [the section on updating the extraction query](/docs/data-backfilling-for-database-replication#update-the-extraction-query).
3. **Save** your configuration.
4. Click on the **Manual Data Load** button next to your source and set the **Snapshot Keeping Policy** to **Replace**.
5. Click on **Load Data** to initiate the extraction.

![Database Replication - data backfiling - ad hoc load](https://cdn.document360.io/084ed225-3f99-4644-a2da-39ca0cd5ef45/Images/Documentation/Database%20Replication%20-%20data%20backfiling%20-%20ad%20hoc%20load.png)

### Update the Extraction Query

          Once data backfilling is complete, please revert the source configuration to its original settings to ensure smooth regular data extractions.

          

To partition an extensive data set, you'll need to modify the extraction query.

![Database Replication - data backfiling](https://cdn.document360.io/084ed225-3f99-4644-a2da-39ca0cd5ef45/Images/Documentation/Database%20Replication%20-%20data%20backfiling.png)

1. Click on your source and navigate to the **Advanced Settings** tab.
2. Locate the **Statement** field, which contains the SQL query responsible for data extraction.
3. Add partitioning clauses to this query to break down the data extraction into manageable chunks. For example, you can use SQL syntax like LIMIT and OFFSET as shown below to extract the first 5 million rows from a table.
4. **Save** your configuration.

```
SELECT * FROM mytable LIMIT 0, 5000000
```

## Trigger Data Writing to the Storage

          **DATADDO TIP**

          

Before starting this operation, please **check the [write mode](/docs/data-storages#write-modes) of your flow**. We recommend using the [**upsert**](https://docs.dataddo.com/docs/data-storages#upsert) write mode, which either adds new records or updates existing ones, reducing the risk of duplicate entries.

1. Go to the [**Flows**](https://app.dataddo.com/flows) page and click on the **Manual Data Insert** button next to your flow.
2. **Confirm** the data insert and click on **Insert Data**.
3. Confirm the operation was successful by checking logs or inspecting the warehouse directly.

![Historical data load - load data to flow](https://cdn.document360.io/084ed225-3f99-4644-a2da-39ca0cd5ef45/Images/Documentation/Historical%20data%20load%20-%20load%20data%20to%20flow.png)

---

## Troubleshooting

### Context Deadline Exceeded Error

```
rpc error: code = DeadlineExceeded desc = context deadline exceeded
```

This issue may be caused by **extracting too many rows**. Please [update your extraction query](/v1/docs/data-backfilling-for-database-replication#update-the-extraction-query) to smaller partitions. The recommended maximum partition size is 5M rows.

### Cannot Change Write Mode to Upsert in Existing Flows

Changing to the [upsert](https://docs.dataddo.com/docs/data-storages#upsert) write mode for an existing flow is not possible for some specific destinations like [MySQL](/docs/mysql-destination). In these cases, Dataddo creates indexes to ensure smooth system operations and as such, write mode chages to existing flows are restricted.

**Use one of the following workarounds:**

1. If you can **lose the data** in the existing table without consequences or **recover it through backfilling**:
  1. **Re-create** the flow using the [**upsert**](/docs/data-storages#upsert) write mode to ensure that the table and indexes are correctly established.
2. If **preserving** the data is crucial:
  1. **Delete** the existing flow.
  2. Add a **unique key index** on the columns you wish to include in the [**composite key**](/docs/data-storages#upsert). For example, using the SQL query below.
  3. Create a new flow with **upsert** as the write mode and a [composite key](/docs/data-storages#upsert) that incorporates the column on which you created the unique index.

```
ALTER TABLE table_name
ADD UNIQUE INDEX unique_index_name (column1, column2);
```

---
