This post shows how to fetch files from S3 [AWS] with Talend. The requirement was to capture clickstream data files that arrive under various naming patterns. Some of the file names share a common prefix [e.g. Page, PageSummary, PageError].
For each tS3List:
- Uncheck "List all buckets objects"
- Provide your bucket name under "Bucket name" and set "Key prefix" as needed
In my case the bucket contains several directories, so I used "directory_name/File Prefix" as the key prefix.
This is how I distinguished the files in the example above. A plain "Page" prefix would also match PageSummary and PageError, so the first prefix is extended up to the date:
"directory_name/Page 2014-"
"directory_name/PageSummary"
"directory_name/PageError"
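The prefix behaviour described above can be sketched in plain Java. This is only an illustration of prefix matching with String.startsWith, not Talend or AWS code; the class name, the helper filterByPrefix and the three object keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KeyPrefixDemo {
    // Returns the keys that start with the given prefix,
    // mimicking how a key prefix narrows down a listing.
    static List<String> filterByPrefix(List<String> keys, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String key : keys) {
            if (key.startsWith(prefix)) {
                matches.add(key);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical object keys, made up for this example.
        List<String> keys = Arrays.asList(
                "directory_name/Page 2014-01-01.csv",
                "directory_name/PageSummary 2014-01-01.csv",
                "directory_name/PageError 2014-01-01.csv");

        // The short prefix matches all three keys.
        System.out.println(filterByPrefix(keys, "directory_name/Page"));
        // Extending the prefix up to the date leaves only the Page file.
        System.out.println(filterByPrefix(keys, "directory_name/Page 2014-"));
    }
}
```

The longer prefixes "directory_name/PageSummary" and "directory_name/PageError" are already unambiguous, which is why only the Page prefix needs the extra characters.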
tS3Get:
Bucket: Provide your bucket name
Key: ((String)globalMap.get("tS3List_1_CURRENT_KEY")) -- Note: NO double quotes around this expression
File: "/Users/shota/"+((String)globalMap.get("tS3List_1_CURRENT_KEY"))
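To see what the File expression evaluates to, here is a minimal Java sketch. Inside a Talend job the globalMap is provided for you; in this sketch a plain HashMap stands in for it, and the class name, the helper localPath and the object key are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class LocalPathDemo {
    // Builds the local target path the same way the File field does:
    // base directory + the key of the object currently being iterated.
    static String localPath(Map<String, Object> globalMap, String baseDir) {
        // globalMap stores values as Object, hence the (String) cast.
        String key = (String) globalMap.get("tS3List_1_CURRENT_KEY");
        return baseDir + key;
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap (assumption for this sketch).
        Map<String, Object> globalMap = new HashMap<>();
        // Hypothetical key, made up for this example.
        globalMap.put("tS3List_1_CURRENT_KEY", "directory_name/PageSummary.csv");

        System.out.println(localPath(globalMap, "/Users/shota/"));
    }
}
```

Because the key includes the directory part, the local file ends up in a subdirectory of the base path with the same name as the one in the bucket.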
PS: All of these components are connected to a central S3 connection object.
If you have any questions, please leave a comment.