Amazon S3

Overview of the Amazon S3 managed sink connector.

The Amazon Simple Storage Service (S3) sink connector allows you to send data from your GlassFlow pipeline directly to an S3 bucket, making it easy to store processed data for further analysis or archiving.

Connector Details

To configure the AWS S3 sink connector, you will need the following details:

  • AWS Access Key ID: Your AWS access key ID.
  • AWS Secret Access Key: Your AWS secret access key.
  • AWS Region: The AWS region where your S3 bucket is located.
  • S3 Bucket Name: The name of the S3 bucket where the data will be stored.
  • S3 Folder Name: The folder path within the bucket where your data files will be saved.
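As an illustration, the five details above can be collected and sanity-checked before they are entered into a pipeline. The sketch below is only one possible approach: the `S3SinkSettings` class and its field names are hypothetical and are not part of the GlassFlow SDK, and the `S3_BUCKET_NAME` and `S3_FOLDER_NAME` environment variable names are assumptions chosen for this example. `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_REGION` are the names commonly read by AWS SDKs.

```python
import os
import re
from dataclasses import dataclass

# General-purpose bucket naming rule: 3-63 characters, lowercase letters,
# digits, dots and hyphens, starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


@dataclass(frozen=True)
class S3SinkSettings:
    """Hypothetical container for the connector details (not a GlassFlow type)."""

    access_key_id: str
    secret_access_key: str
    region: str
    bucket: str
    folder: str = ""

    def __post_init__(self):
        # Reject empty credentials or region early.
        for name in ("access_key_id", "secret_access_key", "region"):
            if not getattr(self, name):
                raise ValueError(f"{name} must not be empty")
        if not _BUCKET_RE.match(self.bucket):
            raise ValueError(f"invalid bucket name: {self.bucket!r}")
        # Store the folder path without leading or trailing slashes.
        object.__setattr__(self, "folder", self.folder.strip("/"))

    @classmethod
    def from_env(cls, env=os.environ):
        """Build settings from a mapping of environment variables."""
        return cls(
            access_key_id=env.get("AWS_ACCESS_KEY_ID", ""),
            secret_access_key=env.get("AWS_SECRET_ACCESS_KEY", ""),
            region=env.get("AWS_REGION", ""),
            bucket=env.get("S3_BUCKET_NAME", ""),   # assumed variable name
            folder=env.get("S3_FOLDER_NAME", ""),   # assumed variable name
        )
```

Reading the secret access key from the environment rather than hard-coding it keeps it out of notebooks and version control.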

Obtaining Connection Credentials

To obtain your AWS credentials, follow these steps:

  1. Log in to the AWS Management Console.
  2. Navigate to the IAM Dashboard and create a new user or use an existing one with permissions to write to S3. For more details, refer to the AWS IAM Documentation.
  3. Attach the necessary policies (e.g., AmazonS3FullAccess).
  4. Generate Access Keys and note down the Access Key ID and Secret Access Key. For detailed steps, see Managing access keys.
  5. Create or locate an existing S3 bucket in the AWS Management Console. Ensure the bucket permissions allow data writing, and specify a folder path if needed.
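If AmazonS3FullAccess is broader than you would like, a narrower policy can be attached in step 3 instead. The sketch below builds an IAM policy document that allows only object writes under one folder of one bucket. It assumes that `s3:PutObject` is sufficient for the connector, which this page does not confirm; the connector may need further actions (for example `s3:ListBucket`). The `build_write_policy` helper, the `Sid` value, and the bucket and folder names are hypothetical.

```python
import json


def build_write_policy(bucket: str, folder: str = "") -> dict:
    """Return an IAM policy document allowing writes to bucket/folder only."""
    prefix = folder.strip("/")
    # Object ARNs take the form arn:aws:s3:::<bucket>/<key>; the trailing
    # wildcard covers every object under the chosen folder.
    if prefix:
        resource = f"arn:aws:s3:::{bucket}/{prefix}/*"
    else:
        resource = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSinkWrites",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": ["s3:PutObject"],  # assumed to be enough for the sink
                "Resource": [resource],
            }
        ],
    }


# Print the JSON so it can be pasted into the IAM policy editor.
print(json.dumps(build_write_policy("my-bucket", "processed/events"), indent=2))
```

The printed JSON can be used as a customer-managed or inline policy for the IAM user whose access keys you give to GlassFlow.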

Setting Up the AWS S3 Sink Connector

Using WebApp

  1. Log in to the GlassFlow WebApp and navigate to the "Pipelines" section.
  2. Create a new pipeline or open an existing one.
  3. Configure the Data Sink:
    • Choose "Amazon S3" as the data sink type.
    • Enter your AWS Access Key ID, AWS Secret Access Key, AWS Region, S3 Bucket Name, and S3 Folder Name.
  4. Click Next Step and confirm your pipeline data sink settings.

Using Python SDK

Try out the Jupyter notebook example in the GitHub repo to integrate with Amazon S3 using the Python SDK.

Once configured, your GlassFlow pipeline will start sending processed data directly to the specified Amazon S3 bucket and folder, enabling easy storage and retrieval.