Amazon S3


Amazon S3 (Simple Storage Service) is a scalable cloud storage service provided by Amazon Web Services (AWS). It offers a way to store and retrieve data, such as files, images, videos, and backups, in a highly durable and easily accessible manner, making it a foundational component for various cloud-based applications and services.

Prerequisites

  • You have created an AWS S3 bucket.
  • You have created an IAM user with the AmazonS3FullAccess policy, or a custom policy granting access to your bucket, attached.

Authorize Connection to Amazon S3

In the AWS Console

Create S3 Bucket

  1. Sign in to the AWS Management Console and navigate to the S3 console.
  2. Click on Create bucket. Name your bucket and choose an AWS Region. You will need to provide this information to Dataddo.
  3. Click on Create bucket to finalize the process. Alternatively, you can script this step, as sketched below.
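
A minimal sketch of the same step with Python and boto3; the bucket name and Region are placeholders to replace with your own values.

import boto3

REGION = "eu-central-1"      # placeholder: your chosen AWS Region
BUCKET = "your-bucket-name"  # placeholder: your bucket identifier

s3 = boto3.client("s3", region_name=REGION)

# Outside us-east-1, S3 requires an explicit LocationConstraint;
# in us-east-1, omit CreateBucketConfiguration entirely.
s3.create_bucket(
    Bucket=BUCKET,
    CreateBucketConfiguration={"LocationConstraint": REGION},
)
print(f"Created bucket {BUCKET} in {REGION}")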

Configure the Access Permissions

  1. Navigate to the IAM service, click on Users and continue with Add User.
  2. Name your user and select Programmatic access as the access type.
  3. In the Set permissions step, choose Attach existing policies directly. Attach a policy that grants the necessary S3 permissions. This could be one of the following:
    • The AmazonS3FullAccess policy, which grants full access to all S3 resources.
    • A custom policy that only grants access to the necessary bucket and actions.
  4. Save the IAM user's credentials. You will be given an access key and a secret key. These credentials will be needed for configuring the connection to your S3 bucket in Dataddo. These steps can also be scripted, as sketched below.
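
A minimal sketch of steps 1-4 with Python and boto3, assuming you attach the AmazonS3FullAccess managed policy; the user name is a placeholder.

import boto3

USER = "dataddo-s3-user"  # placeholder user name

iam = boto3.client("iam")

# Steps 1-2: create the IAM user.
iam.create_user(UserName=USER)

# Step 3: attach the AWS-managed AmazonS3FullAccess policy.
iam.attach_user_policy(
    UserName=USER,
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3FullAccess",
)

# Step 4: create programmatic credentials for Dataddo.
resp = iam.create_access_key(UserName=USER)
print("Access key:", resp["AccessKey"]["AccessKeyId"])
print("Secret key:", resp["AccessKey"]["SecretAccessKey"])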

If you opt for a custom policy instead, use the template below. Make sure to replace your-bucket-name with your bucket identifier.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:GetObject",
        "s3:GetObjectAcl",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
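
Before entering the credentials in Dataddo, you can verify that they work, a minimal sketch with Python and boto3 (keys, Region, and bucket name are placeholders):

import boto3

BUCKET = "your-bucket-name"

s3 = boto3.client(
    "s3",
    aws_access_key_id="YOUR_ACCESS_KEY",      # placeholder
    aws_secret_access_key="YOUR_SECRET_KEY",  # placeholder
    region_name="eu-central-1",               # placeholder: bucket's Region
)

# Write a test object, list the bucket, then clean up.
s3.put_object(Bucket=BUCKET, Key="dataddo-test.txt", Body=b"hello")
resp = s3.list_objects_v2(Bucket=BUCKET, MaxKeys=5)
print([obj["Key"] for obj in resp.get("Contents", [])])
s3.delete_object(Bucket=BUCKET, Key="dataddo-test.txt")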

In Dataddo

  1. In the Authorizers tab, click on Authorize New Service and select S3.
  2. You will be asked to fill in the following fields:
    1. Bucket: Provide the identifier of the S3 bucket you want to use for reading or writing data.
    2. Region: Provide the AWS Region of the S3 bucket.
    3. Key: Provide your AWS Access Key.
    4. Secret: Provide your AWS Secret Key.
  3. Click on Save.

Create a New S3 Destination

  1. Under the Destinations tab, click on the Create Destination button and select the destination from the list.
  2. Select your account from the drop-down menu.
  3. Fill in the Path. Use the name of the folder in your bucket followed by a slash (e.g. "database/").
  4. Name your destination and click on Save to create your destination.
Need to authorize another connection?

Click on Add new Authorizer in the drop-down menu during authorizer selection and follow the on-screen prompts. You can also go to the Authorizers page and click on Add New Service.

Creating a Flow to S3 Storage

  1. Navigate to Flows and click on Create Flow.
  2. Click on Connect Your Data to add your source(s).
  3. Click on Connect Your Data Destination to add the destination.
  4. Choose the write mode and fill in the other required information.
  5. Check the Data Preview to see if your configuration is correct.
  6. Name your flow and click on Create Flow to finish the setup.

File Partitioning

File partitioning splits large datasets into smaller, manageable partitions, based on criteria like date. This technique enhances data organization, query performance, and management by grouping subsets of data with shared attributes.

During flow creation:

  • Select one of the predefined file name patterns, or
  • Define your own custom name to suit your partitioning needs.

Example of a custom file name
When creating a custom file name, build on the offered file name patterns.

For example, take a base file name and add a date pattern:

xyz_{{1d1|Ymd}}

With this file name, Dataddo will create a new file every day, e.g. xyz_20xx0101, xyz_20xx0102, etc.
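
To illustrate how such date-suffixed partition names expand, here is a minimal sketch in Python. The assumption that the Ymd token formats a date as YYYYMMDD (i.e. strftime's %Y%m%d, as the example output suggests) is ours; consult Dataddo's placeholder reference for the exact semantics of {{1d1|Ymd}}.

from datetime import date, timedelta

BASE = "xyz"

# Assumption: Ymd formats a date as YYYYMMDD, i.e. strftime("%Y%m%d").
def partition_name(day: date) -> str:
    return f"{BASE}_{day.strftime('%Y%m%d')}"

start = date(2024, 1, 1)
for offset in range(3):
    print(partition_name(start + timedelta(days=offset)))
# xyz_20240101
# xyz_20240102
# xyz_20240103

One file per day keeps partitions small and lets downstream consumers prune by date prefix.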


Troubleshooting

Error Message kms:GenerateDataKey

ERROR MESSAGE

Action failed: stream transfer: write data from stream: upload JSON to s3: uploading data to S3 bucket 'bucket-name': operation error S3: PutObject, https response error StatusCode: 403, RequestID: request_id, HostID: host_id api error AccessDenied: User: user_name is not authorized to perform: kms:GenerateDataKey on resource: arn:aws:s3:::bucket-name/ because no identity-based policy allows the kms:GenerateDataKey action

This issue is most likely caused by your bucket using server-side encryption with an AWS KMS key (SSE-KMS). To solve this, add the kms:GenerateDataKey permission to your IAM user, for example using the template below. Make sure to replace the Resource with the ARN of the KMS key that encrypts your bucket.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "kms:GenerateDataKey",
            "Resource": "arn:aws:kms:your-region:your-account-id:key/your-key-id"
        }
    ]
}
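
If you are not sure which KMS key encrypts your bucket, you can look it up, a minimal sketch with Python and boto3 (the bucket name is a placeholder):

import boto3

s3 = boto3.client("s3")

# Returns the bucket's default server-side encryption configuration.
resp = s3.get_bucket_encryption(Bucket="your-bucket-name")
for rule in resp["ServerSideEncryptionConfiguration"]["Rules"]:
    sse = rule["ApplyServerSideEncryptionByDefault"]
    print("Algorithm:", sse["SSEAlgorithm"])      # e.g. aws:kms
    print("KMS key:", sse.get("KMSMasterKeyID"))  # present for SSE-KMS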

