
filterbyname

etindall
2015-05-21
2015-07-22
  • etindall

    etindall - 2015-05-21

    Brian, I'm having some trouble with the 'filterbyname' function not removing any reads.
    The command I am running:
    filterbyname.sh in=$inputdir/read1.fastq in2=$inputdir/read2.fastq out=$outdir/read1filtered.fastq out2=$outdir/read2filtered.fastq names=$inputdir/FilterReads.txt
    The 'names' file is a text file with one read name per line, e.g.:
    @D00580:D00580:AC58BGANXX:2:1211:12654:10833
    @D00580:D00580:AC58BGANXX:2:2309:12469:49507
    The tool seems to run fine, and I can definitely find these names in my original read1.fastq and read2.fastq files, but they are not being removed from the filtered files. 'Reads Processed' equals 'Reads Out', and I can still grep these read names in the filtered fastq files. Can you please let me know where I might be going wrong?
    Thank you.

     
  • Brian Bushnell

    Brian Bushnell - 2015-07-22

    "filterbyname" by default requires an exact name match. In fastq format, this line:
    @D00580:D00580:AC58BGANXX:2:1211:12654:10833
    ...indicates a read named:
    D00580:D00580:AC58BGANXX:2:1211:12654:10833

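    In other words, the @ symbol is not part of the name, so the entries in your names file never match exactly and no reads are removed. Deleting the leading @ from each line of the names file should fix it. In the future I will probably allow a leading @ or > symbol, as that seems to be a common use case.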
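
As a minimal sketch of that cleanup, assuming the names file holds one header per line as in the example above, a small sed filter can drop a single leading @ or > from each line. The helper name strip_header_symbol is hypothetical and not part of filterbyname.sh.

```shell
# Hypothetical helper (not part of BBTools): reads header lines on stdin
# and writes them to stdout with one leading '@' (FASTQ) or '>' (FASTA)
# removed. Lines without such a prefix pass through unchanged.
strip_header_symbol() {
    sed 's/^[@>]//'
}
```

It could be used as `strip_header_symbol < $inputdir/FilterReads.txt > $inputdir/FilterReads.clean.txt`, after which `names=$inputdir/FilterReads.clean.txt` would be passed to filterbyname.sh. The output file name FilterReads.clean.txt is only an illustration.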

     
