# Oracle-to-S3 data uploader Wiki

Ad-hoc data uploader for Oracle and Amazon-S3

Brought to you by: alexbuz
Oracle-to-S3 data uploader lets you stream your Oracle table/query data to Amazon-S3 from the Windows CLI (command line).
## Features
- Streams Oracle table data to Amazon-S3.
- No need to create CSV extracts before the upload to S3.
- The data stream is compressed during upload to S3.
- No need for the Amazon AWS CLI.
- Works from your Windows desktop (command line).
- It's a single executable (`Oracle_To_S3_Uploader.exe`); no Python install is needed.
- It's 64-bit; it works from any vanilla command prompt on 64-bit Windows.
- AWS access keys are not passed as arguments.
- Written using Python/boto/PyInstaller.
## Version

## Purpose

## How it works

## Audience
Database/ETL developers, Data Integrators, Data Engineers, Business Analysts, AWS Developers, DevOps.
## Designated Environment
Pre-Prod (UAT/QA/DEV)
## Usage

## Example
Environment variables
set_env.bat:
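The contents of `set_env.bat` did not survive on this page. A minimal sketch, assuming the tool reads the standard AWS credential environment variables (the variable names are an assumption, not confirmed by this wiki; the values are placeholders):

```bat
@echo off
rem Hypothetical sketch -- variable names assumed, values are placeholders
set AWS_ACCESS_KEY_ID=<your access key id>
set AWS_SECRET_ACCESS_KEY=<your secret access key>
```

This matches the feature note above: the keys come from the environment, not from command-line arguments.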
Test upload with data dump

In this example the complete table `test2` gets uploaded to Amazon-S3 as a compressed CSV file.

Contents of the file `table_query.sql`:
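The file contents were lost from this page. For a whole-table upload like this one, the query file would presumably contain a plain select (a sketch, not the original file):

```sql
-- hypothetical reconstruction of table_query.sql: dump the whole table
select * from test2;
```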
A temporary dump file is also created for analysis (by default, no files are created). Use `-s, --create_data_dump` to dump the streamed data.

If the target bucket does not exist, it will be created in a user-controlled region. Use the `-t, --s3_location` argument to set the target region name.

Contents of the file `test.bat`:
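The contents of `test.bat` were also lost. A sketch using only the flags documented on this page (`-s` and `-t`); the query-file and bucket arguments are hypothetical, since the actual argument syntax is not preserved here:

```bat
@echo off
rem Hypothetical sketch -- only -s and -t are documented on this page;
rem the query-file/bucket argument syntax is assumed, not confirmed.
call set_env.bat
Oracle_To_S3_Uploader.exe table_query.sql my-target-bucket -s -t us-east-1
```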
Executing `test.bat`:

## Download
`git clone https://github.com/alexbuz/Oracle_To_S3_Data_Uploader`
oracle_to_s3_uploader 1.2
## FAQ
**Can it load Oracle data to an Amazon S3 file?**
Yes, that is the main purpose of this tool.
**Can developers integrate Oracle_To_S3_Data_Uploader into their ETL pipelines?**
Yes, assuming they are doing it on Windows.
**How fast is the data upload?**
As fast as any implementation of multi-part load using Python and boto.
**How can I increase upload speed?**
The input data stream is compressed before upload to S3, so not much can be done there. You may want to run the tool closer to the source or target for better performance.
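The compress-before-upload step can be sketched in Python (a sketch of the idea, not the tool's actual code):

```python
import gzip
import io

def compress_stream(reader, writer, chunk_size=64 * 1024):
    """Gzip-compress a byte stream chunk by chunk, without materializing it."""
    with gzip.GzipFile(fileobj=writer, mode="wb") as gz:
        while True:
            chunk = reader.read(chunk_size)
            if not chunk:
                break
            gz.write(chunk)

# Repetitive CSV-like data shrinks dramatically under gzip, which is why
# upload time is bounded by bandwidth rather than by this step.
src = io.BytesIO(b"col1,col2\n" * 10000)
dst = io.BytesIO()
compress_stream(src, dst)
assert gzip.decompress(dst.getvalue()) == b"col1,col2\n" * 10000
assert len(dst.getvalue()) < len(b"col1,col2\n" * 10000)
```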
**What are the other ways to move large amounts of data from Oracle to S3?**
You can write a Sqoop script that can be scheduled as an 'EMR Activity' under Data Pipeline.
**Does it create a temporary data file to facilitate the load to S3?**
No.
**Can I log transferred data for analysis?**
Yes, use `-s, --create_data_dump` to dump the streamed data.

**What is the first step of data transfer?**
The query file you provided is used to select data from the source Oracle server. The stream is compressed before the load to S3.
**What is the second step of data transfer?**
The compressed data is uploaded to S3 using the multipart upload protocol.
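The part-splitting logic at the heart of S3 multipart upload can be illustrated like this (a sketch, not the tool's code; S3 requires every part except the last to be at least 5 MB):

```python
import io

PART_SIZE = 5 * 1024 * 1024  # S3 minimum size for all parts except the last

def split_into_parts(stream, part_size=PART_SIZE):
    """Yield (part_number, chunk) pairs; each part is uploaded independently
    and S3 reassembles them in part-number order on completion."""
    part_number = 1
    while True:
        chunk = stream.read(part_size)
        if not chunk:
            break
        yield part_number, chunk
        part_number += 1

# With a 100-byte stream and a 40-byte part size we get parts of 40, 40 and 20 bytes.
parts = list(split_into_parts(io.BytesIO(b"x" * 100), part_size=40))
assert [(n, len(c)) for n, c in parts] == [(1, 40), (2, 40), (3, 20)]
```

In the tool itself this chunking and the per-part uploads are handled by boto's multipart upload support.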
**What technology was used to create this tool?**
I used SQL*Plus, Python, and boto to write it. boto is used to upload the file to S3; SQL*Plus is used to spool data to the compressor pipe.
**Where are the sources?**
Please contact me for the sources.
**Can you modify functionality and add features?**
Yes, please ask me about new features.
**Do you have any AWS certifications?**
Yes, AWS Certified Developer (Associate).
**Can you create a similar/custom data tool for our business?**
Yes, you can PM me here or email me at alex_buz@yahoo.com. I'll get back to you within hours.
## Links
Last edit: Alex Buzunov 2016-04-07