This Hadoop streaming jar recognizes plain-text input in fa (FASTA) or fq (FASTQ) format and converts each logical unit (one record) into a single line, so that a later Hadoop streaming application can process it line by line.
All the Java source code is in the src folder. It can be compiled and added to a jar by running update-jar.sh in the tools folder. After update-jar.sh has run, there will be a GaeaStreaming.jar in the tools folder.
The classes are compiled and added to a copy of the original Hadoop streaming jar:
contrib/streaming/hadoop-0.20.2-streaming.jar.
Users can use this new jar in place of the original jar.
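To illustrate what "convert a logical unit to a single line" means, here is a minimal standalone sketch. It is not the code in the src folder. The class name, the method names, and the tab separator are assumptions made for illustration, since this README does not state the actual output layout. The sketch also assumes that each FASTQ record is exactly four lines long.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; hypothetical names, not the classes shipped in GaeaStreaming.jar.
class FastqFlattenSketch {

    // Assumption: fields are joined with a tab. The README does not specify the separator.
    static final String SEPARATOR = "\t";

    // Joins each 4-line FASTQ record (header, sequence, '+' line, qualities) into one line.
    // A trailing incomplete record (fewer than 4 lines) is dropped.
    static List<String> flattenFastq(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 3 < lines.size(); i += 4) {
            out.add(String.join(SEPARATOR, lines.subList(i, i + 4)));
        }
        return out;
    }

    // Joins each FASTA record (a '>' header plus its sequence lines) into one line:
    // header, separator, concatenated sequence. Lines before the first header are ignored.
    static List<String> flattenFasta(List<String> lines) {
        List<String> out = new ArrayList<>();
        String header = null;
        StringBuilder seq = new StringBuilder();
        for (String line : lines) {
            if (line.startsWith(">")) {
                if (header != null) {
                    out.add(header + SEPARATOR + seq);
                }
                header = line;
                seq.setLength(0);
            } else if (header != null) {
                seq.append(line.trim());
            }
        }
        if (header != null) {
            out.add(header + SEPARATOR + seq);
        }
        return out;
    }

    // Reads records from standard input and prints one record per line.
    // The format is guessed from the first character of the first line.
    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        boolean isFasta = !lines.isEmpty() && lines.get(0).startsWith(">");
        List<String> records = isFasta ? flattenFasta(lines) : flattenFastq(lines);
        for (String record : records) {
            System.out.println(record);
        }
    }
}
```

With one record per line, a streaming mapper can treat every input line as a complete read and never has to reassemble a record that was split across lines.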
Last edit: luoxuefeng1 2012-06-01