
Copy Objects between S3 Buckets using AWS Lambda

In this article we will use the AWS Lambda service to copy objects/files from one S3 bucket to another. Below are the steps we will follow:

  1. Create two buckets in S3 for source and destination.
  2. Create an IAM role and policy that can read from the source bucket and write to the destination bucket. 
  3. Create a Lambda function to copy the objects between buckets.
  4. Assign IAM role to the Lambda function.
  5. Create an S3 event trigger to execute the Lambda function. 

1. Create S3 Buckets:

Create 2 buckets in S3 for the source and destination. You can refer to my previous post for the steps to create an S3 bucket. I have created the buckets highlighted in blue below, which I will be using in this example:



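If you would rather script this step, the two buckets can also be created with boto3. A minimal sketch, assuming the bucket names used in this article and the default region us-east-1 (bucket names must be globally unique, so adjust them to your own):

    import boto3

    s3 = boto3.client("s3")  # assumes the default region is us-east-1

    for bucket in ("source-bucket104", "dest-bucket104"):
        # In regions other than us-east-1, also pass
        # CreateBucketConfiguration={"LocationConstraint": "<your-region>"}
        s3.create_bucket(Bucket=bucket)
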
2. Create IAM Policy and Role:

Now go to Security -> IAM (Identity and Access Management).
  • Click on Policies -> Create policy
  • Click on the JSON tab and enter the policy below. You will need to modify the two "Resource" ARNs with the source and destination buckets that you have created.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::source-bucket104/*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::dest-bucket104/*"
        }
    ]
}




In the above policy, I am granting read permission ("s3:GetObject") on the source bucket "source-bucket104" and write permission ("s3:PutObject") on the destination bucket "dest-bucket104".
  • Click on Review policy.
  • Provide a name for your policy and click "Create Policy".
  • Now click on Roles -> Create role
  • Under "Select type of trusted entity", select "AWS Service"
  • Select "Lambda" as the service that will use this role, since the role will be assumed by the Lambda function.
  • Click "Next: Permissions"
  • Now attach the policy that you created in the previous step to this role by selecting the checkbox next to the policy name and clicking Next.

  • Click Next on the add tags screen.
  • Provide a name for the role and click "Create role".
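The policy and role can also be created with boto3 instead of the console. A minimal sketch, where the policy and role names ("s3-copy-policy", "s3-copy-lambda-role") are hypothetical placeholders; note that the trust policy names lambda.amazonaws.com, because the role will be assumed by the Lambda function:

    import json
    import boto3

    iam = boto3.client("iam")

    # Same permissions as the JSON policy shown above (adjust the bucket ARNs)
    policy_document = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:GetObject",
             "Resource": "arn:aws:s3:::source-bucket104/*"},
            {"Effect": "Allow", "Action": "s3:PutObject",
             "Resource": "arn:aws:s3:::dest-bucket104/*"},
        ],
    }

    # Trust policy: allow the Lambda service to assume this role
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow",
             "Principal": {"Service": "lambda.amazonaws.com"},
             "Action": "sts:AssumeRole"},
        ],
    }

    policy = iam.create_policy(
        PolicyName="s3-copy-policy",            # hypothetical name
        PolicyDocument=json.dumps(policy_document),
    )
    iam.create_role(
        RoleName="s3-copy-lambda-role",         # hypothetical name
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )
    iam.attach_role_policy(
        RoleName="s3-copy-lambda-role",
        PolicyArn=policy["Policy"]["Arn"],
    )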
3. Create Lambda Function:
  • Go to Services -> Compute -> Lambda
  • Click "Create function"
  • Provide a name for the function.
  • Select runtime as "Python 3.8"
  • Under "Permissions", click on "Choose or create an execution role".
  • From the drop down list choose the role that was created in previous step.
  • Click "Use an existing role".
  • Click "Create function"

  • Under the function code, enter the code below:
    import json
    import boto3

    # S3 client used to copy the uploaded object
    s31 = boto3.client("s3")

    def lambda_handler(event, context):
        dest_bucket = 'dest-bucket104'
        # The source bucket and object key come from the S3 event record
        src_bucket = event['Records'][0]['s3']['bucket']['name']
        filename = event['Records'][0]['s3']['object']['key']
        copy_source = {'Bucket': src_bucket, 'Key': filename}
        # Copy the object to the destination bucket under the same key
        s31.copy_object(CopySource=copy_source, Bucket=dest_bucket, Key=filename)
        return {
            'statusCode': 200,
            'body': json.dumps('Files Copied')
        }
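
To sanity-check the handler before wiring up the trigger, you can invoke it locally with a hand-built event that mirrors the shape of an S3 "Put" record. A minimal sketch, assuming the code above is saved locally as lambda_function.py, valid AWS credentials are configured, and example.txt already exists in the source bucket:

    from lambda_function import lambda_handler

    # Minimal stand-in for the event S3 sends when an object is uploaded
    sample_event = {
        "Records": [
            {
                "s3": {
                    "bucket": {"name": "source-bucket104"},
                    "object": {"key": "example.txt"},
                }
            }
        ]
    }

    print(lambda_handler(sample_event, None))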



4. Create Trigger:
  • Click "Add Trigger"
  • In Trigger configuration, select S3.
  • Under "Bucket", select the source bucket.
  • Under "Event Type", select "Put"
  • Click "Add" 

  • Click Save
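The same trigger can also be set up with boto3. A minimal sketch, where the function name "copy-s3-objects" is a hypothetical placeholder; S3 must first be granted permission to invoke the function, and the notification configuration then points "Put" events at it:

    import boto3

    lambda_client = boto3.client("lambda")
    s3 = boto3.client("s3")

    function_name = "copy-s3-objects"  # hypothetical; use your function's name
    fn_arn = lambda_client.get_function(
        FunctionName=function_name)["Configuration"]["FunctionArn"]

    # Allow the source bucket to invoke the Lambda function
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId="s3-invoke-copy-function",
        Action="lambda:InvokeFunction",
        Principal="s3.amazonaws.com",
        SourceArn="arn:aws:s3:::source-bucket104",
    )

    # Fire the function on every "Put" in the source bucket
    s3.put_bucket_notification_configuration(
        Bucket="source-bucket104",
        NotificationConfiguration={
            "LambdaFunctionConfigurations": [
                {"LambdaFunctionArn": fn_arn, "Events": ["s3:ObjectCreated:Put"]}
            ]
        },
    )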
The Lambda function should now look like the screenshot below:


Test your Lambda function:

Do a test by uploading a file to the source S3 bucket. If all the configuration is correct, the file should be copied to the destination bucket.
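
The upload and the check can also be scripted with boto3. A minimal sketch, assuming the bucket names used in this article and a small test file named test.txt:

    import time
    import boto3

    s3 = boto3.client("s3")

    # Upload a test object to the source bucket (this fires the trigger)
    s3.put_object(Bucket="source-bucket104", Key="test.txt", Body=b"hello")

    # Give the Lambda function a few seconds, then confirm the copy exists;
    # head_object raises a ClientError if the object is missing
    time.sleep(5)
    s3.head_object(Bucket="dest-bucket104", Key="test.txt")
    print("test.txt was copied to the destination bucket")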
