---
title: "PII Exclusion and Hashing"
slug: "pii-exclusion-and-hashing"
description: "Excluding personally identifiable information (PII) in Dataddo during the data extraction process is a proactive measure to enhance data privacy and security."
tags: ["Resource", "Data flow", "Data transformation"]
updated: 2025-11-23T14:02:40Z
published: 2025-11-23T14:02:40Z
---


# PII Exclusion and Hashing

Protecting personally identifiable information (PII) is essential in data management. Dataddo supports data privacy and security by automatically identifying sensitive fields and letting you **exclude** or **hash** them during connector configuration.

By offering the flexibility to exclude or hash sensitive data, Dataddo enables organizations to tailor their data privacy strategies to meet their specific needs while maintaining both compliance and security.

**Benefits of PII Identification**

- **Enhanced Privacy**: Sensitive information like names, social security numbers, and personal addresses is identified and can be excluded or hashed to minimize exposure.
- **Compliance**: Supports alignment with data privacy regulations (e.g., GDPR, CCPA) by preventing sensitive data from entering your data warehouse.
- **Reduced Risks**: Minimizes the likelihood of data breaches or compliance violations by handling sensitive data securely at the source.
- **Simplified Data Governance**: Keeps sensitive data out of your warehouse, streamlining governance and reducing complexity.

![Identify PII](https://cdn.document360.io/084ed225-3f99-4644-a2da-39ca0cd5ef45/Images/Documentation/PII%20identification.png)

## Exclude PII

Dataddo allows you to proactively remove any sensitive information, such as names, social security numbers, and personal addresses, from the dataset before it reaches the data warehouse.

**Key Benefits of PII Exclusion:**

- Prevents sensitive data from ever entering the data warehouse or downstream systems.
- Simplifies compliance by ensuring no PII is retained.
- Reduces the scope of data security measures needed.

During ***data source*** creation, exclude PII from data extraction by deselecting the metrics and attributes that contain it.

![PII exclusion](https://cdn.document360.io/084ed225-3f99-4644-a2da-39ca0cd5ef45/Images/Documentation/PII%20Exclusion.png)
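
Conceptually, deselecting a field means it never appears in the extracted records. The sketch below illustrates that effect in plain Python. It is not Dataddo code: the field names, the `PII_FIELDS` set, and the `exclude_fields` helper are all hypothetical.

```python
# Hypothetical set of fields treated as PII; in Dataddo these would be
# the metrics and attributes left unselected during source creation.
PII_FIELDS = {"name", "ssn", "address"}


def exclude_fields(record: dict, excluded: set) -> dict:
    """Return a copy of the record without the excluded fields."""
    return {key: value for key, value in record.items() if key not in excluded}


record = {
    "order_id": 1001,
    "name": "John Doe",
    "ssn": "000-00-0000",
    "total": 59.9,
}

print(exclude_fields(record, PII_FIELDS))
# {'order_id': 1001, 'total': 59.9}
```

Because the sensitive fields are dropped before loading, nothing downstream needs to mask or protect them.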

## Sensitive Data Hashing

Hashing transforms sensitive information into a fixed-length, unreadable string called a hash value using a cryptographic hash function. The hash value acts as a digital fingerprint of the input: different inputs are overwhelmingly unlikely to produce the same value, and the original data cannot practically be recovered from it.

**Key Characteristics of Hashing:**

- **Irreversible**: Unlike encryption, hashing is a one-way process. The original input data cannot be reconstructed from the hash value.
- **Deterministic**: The same input will always produce the same hash value, making it useful for data matching and validation.
- **Fixed-Length Output**: Regardless of the input size, the hash value generated is always of a fixed length (e.g., 256 bits for SHA-256).
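
The characteristics above can be checked with a few lines of Python. This is a minimal sketch using SHA-256 from the standard `hashlib` module. The helper name `sha256_hex` is made up for illustration, and SHA-256 is an assumption here, not necessarily the algorithm Dataddo applies.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Fixed-length output: 256 bits = 64 hex characters, whatever the input size.
print(len(sha256_hex("a")), len(sha256_hex("a" * 10_000)))
# 64 64

# Deterministic: the same input always yields the same hash value.
print(sha256_hex("JohnDoe@example.com") == sha256_hex("JohnDoe@example.com"))
# True

# Any change to the input, even letter case, yields a different hash value.
print(sha256_hex("JohnDoe@example.com") == sha256_hex("johndoe@example.com"))
# False
```

Irreversibility cannot be demonstrated by running code: the hash function offers no inverse operation, so the only way back to the input is guessing candidates and hashing each one.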

**Example:**

- Input: `"JohnDoe@example.com"`
- Hash (illustrative 32-character hexadecimal value): `d4c74594d841139328695756648b6bd6`

This ensures the data is secure and can still be used for matching or validation without exposing sensitive information.
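
As a rough illustration of matching on hashed values, the sketch below joins two datasets on a hashed email instead of the raw address. The datasets, field names, and `hash_email` helper are hypothetical, and the normalization step (trimming and lowercasing before hashing) is an assumption of this example, not documented Dataddo behavior.

```python
import hashlib


def hash_email(email: str) -> str:
    """Hash an email with SHA-256 after trimming whitespace and lowercasing."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical rows from two separate sources.
crm_rows = [
    {"email": "JohnDoe@example.com", "plan": "pro"},
    {"email": "jane@example.com", "plan": "free"},
]
billing_rows = [
    {"email": "johndoe@example.com ", "amount": 49.0},
]

# Replace the raw email with its hash before the data is stored.
crm_hashed = [
    {"email_hash": hash_email(row["email"]), "plan": row["plan"]}
    for row in crm_rows
]
billing_hashed = [
    {"email_hash": hash_email(row["email"]), "amount": row["amount"]}
    for row in billing_rows
]

# Join on the hash: equal inputs give equal hashes, so rows still line up.
plans_by_hash = {row["email_hash"]: row["plan"] for row in crm_hashed}
matched = [
    {**row, "plan": plans_by_hash[row["email_hash"]]}
    for row in billing_hashed
    if row["email_hash"] in plans_by_hash
]

print(len(matched), matched[0]["plan"], matched[0]["amount"])
# 1 pro 49.0
```

Without the normalization step, `JohnDoe@example.com` and `johndoe@example.com ` would hash to different values and the rows would not match, so both sides of a join need to hash the same canonical form.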

**Key Benefits of Hashing:**

- Allows you to retain sensitive information for matching or validation without exposing its original form.
- Enhances compliance by hashing PII before it enters the data warehouse.
- Uses modern cryptographic algorithms for robust protection.

If you need to keep sensitive fields for analysis, you can choose to **hash** these fields rather than excluding them. The values are then stored only in hashed form and remain usable for analysis where needed.

