TriageLength

Tomasz

Triage by length

The length tool selects reads based on the number of bases. It is useful for removing reads too short to align reliably.


Single-end samples

To extract reads longer than, say, 40 bases from a single-end sample:

java -jar triagetools.jar length --length 40 -i allreads.txt.gz -o myreads.txt.gz

This creates a single output file, myreads-lengths-40-max.txt.gz. To obtain a full partitioning of the input, i.e. one file with the long reads and another with the shorter ones, add the --all flag:

java -jar triagetools.jar length --length 40 --all -i allreads.txt.gz -o myreads.txt.gz

This creates two output files: myreads-lengths-0-40.txt.gz and myreads-lengths-40-max.txt.gz.

To partition reads into multiple length bins:

java -jar triagetools.jar length --length 30,40 -i allreads.txt.gz -o myreads.txt.gz
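The binning logic can be sketched in Python as follows. This only illustrates the idea and is not the tool's code. The function name length_bin is hypothetical, and the label format mirrors the file names shown above. Two points are assumptions: that --length 30,40 yields the bins 0-30, 30-40 and 40-max, and that a read whose length equals a threshold stays in the lower bin (consistent with the "longer than" wording).

```python
from bisect import bisect_left


def length_bin(read_length, thresholds):
    """Return a label such as '0-30', '30-40' or '40-max' for a read length.

    Assumption: a read exactly as long as a threshold stays in the lower bin,
    so only reads strictly longer than the threshold move up.
    """
    cuts = sorted(thresholds)
    # bisect_left counts the thresholds that are strictly smaller than read_length
    i = bisect_left(cuts, read_length)
    low = 0 if i == 0 else cuts[i - 1]
    high = "max" if i == len(cuts) else cuts[i]
    return f"{low}-{high}"


if __name__ == "__main__":
    for n in (25, 30, 35, 40, 41):
        print(n, length_bin(n, [30, 40]))
```

With a single threshold, length_bin(n, [40]) gives either 0-40 or 40-max, matching the two files produced by the --all example.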


Paired-end samples

Paired-end samples are processed using multiple -i and -o flags:

java -jar triagetools.jar length --length 40 -i allreads_1.txt.gz -i allreads_2.txt.gz \
    -o myreads_1.txt.gz -o myreads_2.txt.gz

In this case, a read pair is placed into a bin if either read in the pair is longer than the threshold.
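For a single threshold, the pair rule amounts to testing the longer of the two reads. A minimal Python sketch follows. The function name pair_passes is hypothetical, and the strict comparison is inferred from the phrase "longer than", so the tool's actual handling of reads exactly at the threshold may differ.

```python
def pair_passes(read1, read2, threshold):
    """Return True if either read in the pair is longer than the threshold."""
    # Testing the longer read is equivalent to "either read is longer"
    return max(len(read1), len(read2)) > threshold


if __name__ == "__main__":
    # One long mate is enough for the pair to be kept
    print(pair_passes("A" * 50, "A" * 20, 40))  # True
    # Both mates short: the pair is not kept
    print(pair_passes("A" * 30, "A" * 20, 40))  # False
```

Deciding per pair rather than per read keeps the two output files in step, so mates stay on matching lines.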

