flexible barcode and adapter removal for sequencing platforms
Flexbar preprocesses high-throughput sequencing data efficiently. It demultiplexes barcoded runs and removes adapter sequences. Moreover, trimming and filtering features are provided. Flexbar increases mapping rates and improves genome and transcriptome assemblies. It supports next-generation sequencing data in fasta/q and csfasta/q format from Illumina, Roche 454, and the SOLiD platform.
Parameter names changed in Flexbar. Please review scripts. In recent months, default settings were optimised, several bugs were fixed, and various improvements were made, e.g. a revamped command-line interface, new trimming modes, and lower time and memory consumption.
Matthias Dodt, Johannes T. Roehr, Rina Ahmed, Christoph Dieterich: Flexbar — flexible barcode and adapter processing for next-generation sequencing platforms. MDPI Biology 2012, 1(3):895-905.
- Demultiplexing of barcoded sequencing runs
- Detection and removal of adapter sequences
- Exact global alignment with free end-gaps
- Paired reads and separate barcode reads
- Color and letter space sequencing data
- Wildcard N for barcodes and adapters
- Filtering reads with uncalled bases
- Trimming based on phred quality scores
- Extensive logging features, e.g. alignments
- Detailed documentation on manual page
- Galaxy tool definition available in Tool Shed
- Sequence analysis based on SeqAn library
- Multi-threaded computation using TBB library
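
To illustrate what "global alignment with free end-gaps" and the wildcard N mean for adapter removal at the right end of a read, here is a minimal Python sketch. It is not Flexbar's implementation (which builds on the SeqAn library); the function name `trim_right_adapter`, the scoring values, and the `min_score` threshold are assumptions chosen for illustration only.

```python
def trim_right_adapter(read, adapter, match=1, mismatch=-1, gap=-2, min_score=3):
    """Illustrative sketch: cut `read` where `adapter` starts to align.

    Leading and trailing gaps are free, so the adapter may begin anywhere
    in the read and may overhang the read's right end. All scores and the
    threshold are hypothetical defaults, not Flexbar settings.
    """
    n, m = len(read), len(adapter)
    # score[i][j]: best score of an alignment ending at read[:i] / adapter[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # start[i][j]: read position at which that alignment begins
    start = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        start[i][0] = i  # free leading gap: adapter may begin at any read position

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a, b = read[i - 1], adapter[j - 1]
            # Assumption: N on either side counts as a match (wildcard)
            is_match = a == b or a == "N" or b == "N"
            diag = score[i - 1][j - 1] + (match if is_match else mismatch)
            up = score[i - 1][j] + gap      # read base aligned to a gap
            left = score[i][j - 1] + gap    # adapter base aligned to a gap
            best = max(diag, up, left)
            score[i][j] = best
            if best == diag:
                start[i][j] = start[i - 1][j - 1]
            elif best == up:
                start[i][j] = start[i - 1][j]
            else:
                start[i][j] = start[i][j - 1]

    # Free trailing gaps: an alignment may end in the last row (adapter
    # overhangs the read) or in the last column (adapter lies inside the read).
    candidates = [(score[n][j], start[n][j]) for j in range(1, m + 1)]
    candidates += [(score[i][m], start[i][m]) for i in range(1, n + 1)]
    best_score, cut = max(candidates, key=lambda c: (c[0], -c[1]))

    if best_score < min_score:
        return read          # no convincing adapter hit, keep the read
    return read[:cut]        # drop the adapter and everything after it


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGC"
    print(trim_right_adapter("ACGTACGTTTAGGC" + "AGATCGGAAG", adapter))
```

In this sketch a partial adapter at the read end is removed just like a complete one, because the unaligned remainder of the adapter is not penalised. A real tool would additionally bound the number of errors relative to the overlap length.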
There is a problem when I use version 2.0: the --cut-off parameter doesn't work, but version 1.8 has no such problem. Am I misunderstanding something about the program? Yours sincerely, Fei
Output files are truncated when processing sanger-fastq using this command line:
flexbar -n 32 -t flexbar-3 -f fastq-sanger -s 121025_0005_000000000-A2056_1.sanfastq -a linker3.fasta -ae RIGHT -at 1 -aa -m 15 >flexbar-3.fastq.log 2>&1
I've not looked at the sources, but it might be a failure to flush the output stream.
Fast download and it works; recommended.
I like your program, but the reporting of the stats is a little confusing for paired-end data. It is not easy to understand how many PAIRS you have removed for not passing one of the filters. If one read of a pair contains an N, you report 1 read (of two total) containing an N, but you also remove the other read (as you should) without reporting this number. So if BOTH reads contain an N, you ALSO report 1 for the Discard N but still only remove 1 read pair, even though there are 2 reads with an N. Also, it seems that the USED field is always the same as the READS IN FILE field. I would have expected the USED field to be READS IN FILE minus DISCARD_B minus DISCARD_QUAL. Or am I missing something?
theflexibleadap works perfectly.