Data Quality Firewall

Data Quality Firewall is a feature available for data flows delivering data to data storage destinations (e.g. BigQuery, Snowflake, Synapse, or Amazon S3).

Its primary function is to ensure the accuracy and quality of the data being transferred. The firewall checks and validates data before it reaches its destination, which prevents corrupted or non-compliant data from entering your systems. Intercepting such data early notably reduces the complexity and overhead otherwise required to maintain the desired level of data quality.

[Image: Core Concepts - Data Quality Firewall]

Key Features

  1. Data Checks: Perform checks on null values, zero values, and anomalies. For more details on this process, refer to the deep dive on data quality features.
  2. Column-level Business Rules: Configure and apply specific rules for each column. Column-level granularity allows for precise control, ensuring that only data meeting specific criteria is accepted for transfer.
  3. Blocking and Non-blocking Mode: Select which operational mode Data Quality Firewall should use when an issue is detected.
    1. Blocking Mode: Stops the data transfer if the data doesn't meet the established column-specific business rules. This ensures that downstream systems only receive compliant data.
    2. Non-blocking Mode: Allows the data to proceed even if issues are detected, logs the discrepancies for review, and sends a notification to the user.
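
The interplay between column-level checks and the two modes can be sketched in a few lines of Python. This is only an illustration of the concept, not Dataddo's implementation or API: the names `ColumnRule`, `find_violations`, and `run_firewall` are hypothetical, and the anomaly check uses a simple z-score threshold as a stand-in, since the actual anomaly detection method is not described here.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class ColumnRule:
    """Hypothetical column-level rule; not a Dataddo type."""
    column: str
    check: str             # "null", "zero", or "anomaly"
    blocking: bool = True  # True = blocking mode, False = non-blocking mode


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def find_violations(rows, rule, z_threshold=3.0):
    """Return the indexes of rows that fail the rule's check."""
    values = [row.get(rule.column) for row in rows]
    if rule.check == "null":
        return [i for i, v in enumerate(values) if v is None]
    if rule.check == "zero":
        return [i for i, v in enumerate(values) if _is_number(v) and v == 0]
    if rule.check == "anomaly":
        # Assumption: a value is anomalous when its z-score exceeds the threshold.
        numbers = [v for v in values if _is_number(v)]
        if len(numbers) < 2:
            return []
        mu, sigma = mean(numbers), pstdev(numbers)
        if sigma == 0:
            return []
        return [i for i, v in enumerate(values)
                if _is_number(v) and abs(v - mu) / sigma > z_threshold]
    raise ValueError(f"unknown check: {rule.check}")


def run_firewall(rows, rules):
    """Return (deliver, messages). deliver is False once a blocking rule fails."""
    messages = []
    for rule in rules:
        bad_rows = find_violations(rows, rule)
        if not bad_rows:
            continue
        messages.append(
            f"{rule.check} check failed on '{rule.column}' at rows {bad_rows}"
        )
        if rule.blocking:
            return False, messages  # blocking mode: stop the transfer
    return True, messages           # non-blocking mode: deliver, report issues


rows = [
    {"id": 1, "revenue": 10.0},
    {"id": 2, "revenue": None},
    {"id": 3, "revenue": 0},
]
blocked = run_firewall(rows, [ColumnRule("revenue", "null", blocking=True)])
warned = run_firewall(rows, [ColumnRule("revenue", "zero", blocking=False)])
```

In this sketch, `blocked` reports that the transfer should stop, while `warned` lets the data through and carries a message that could be logged or sent as a notification.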

Configure Data Quality Firewall

  1. In the Flows tab, click on your data flow to edit it or create a new flow.
  2. Navigate to the Data Quality Rules tab and click on Add Rule.
  3. Select the columns you want to check for null, zero, or anomalous values.
  4. For next action steps, select Block to temporarily halt the entire data extraction process until you decide how to proceed.
  5. Test your rules before saving the flow.

[Image: Data Quality Watcher - Set up]

Overriding Data Quality Firewall

Once your rules are set up, they are enforced even during a manual data insert. This can be undesirable because a failed rule may stop a historical data load altogether.

When manually inserting your data, you can choose to ignore the rules by checking the Skip flow rules validation box.
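
Conceptually, the checkbox acts like a flag that bypasses rule evaluation for that one insert. The following minimal Python sketch shows the idea under stated assumptions: `manual_insert`, `null_revenue`, and the `skip_flow_rules_validation` parameter are hypothetical names chosen for illustration and do not correspond to a Dataddo API.

```python
def manual_insert(rows, validators, skip_flow_rules_validation=False):
    """Hypothetical manual insert; returns the number of rows written."""
    if not skip_flow_rules_validation:
        for validate in validators:
            problems = validate(rows)
            if problems:
                # Rules are enforced by default, so a failing rule stops the load.
                raise ValueError(f"insert blocked: {problems}")
    # Stand-in for the actual write to the destination.
    return len(rows)


def null_revenue(rows):
    """Example rule: report every row whose revenue is missing."""
    return [f"row {i}: revenue is null"
            for i, row in enumerate(rows) if row.get("revenue") is None]


historical = [{"revenue": 12.5}, {"revenue": None}]

# With the flag set, the historical load proceeds despite the null value.
written = manual_insert(historical, [null_revenue],
                        skip_flow_rules_validation=True)
```

Without the flag, the same call raises an error on the second row, which mirrors how an enforced rule can halt a historical load.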

[Image: Data Quality - Skip rules]
