Activity for BBMap

  • BBMap BBMap released /BBMap_39.08.tar.gz

  • BBMap BBMap released /BBMap_39.08.tar.gz

  • Eric Boyden Eric Boyden posted a comment on discussion General Discussion

    I was looking for a java-based alternative to either cutadapt or ngmerge, and bbmerge seems to be able to do much of what both other tools can do (primarily I was looking for fixed read trimming, quality trimming, and paired read dovetail trimming). A few suggested improvements: * Allow simultaneous fixed trimming and q-trimming. Right now it seems like q-trimming supercedes fixed trimming if both are specified, regardless of which end you want to trim in each case; but there are situations where...

  • Eric Boyden Eric Boyden posted a comment on discussion General Discussion

    I was looking for a java-based global short-read aligner and this software performs excellently, with lots of neat features that many other aligners lack. In descending order of priority, here are some ideas for improvement: * Update SAM spec to v1.6 (currently 1.3/1.4), primarily to produce an updated @HD line to indicate query grouped output (GO:query). Many downstream tools (e.g. some fgbio tools) rely on this and won't work unless this is explicitly specified, and resorting just to add this annotation...

  • Brian Bushnell Brian Bushnell posted a comment on ticket #69

    When I wrote that, PacBio did not have paired reads. They have a new sequencing machine now for short reads that I think does produce pairs but I have not seen any data for it so I'm not sure of the header structure.

  • Max Rozenblum Max Rozenblum posted a comment on ticket #69

    I will check in with the lab, but my understanding is that these came from a NextSeq or NovaSeq and didn't have any modifications. Thanks for the quick response. I took a peak at FASTQ.java and saw the following code block: // Here we try to weed out PacBio, which will differ after the last slash: for (int i = idxSlash1 + 2; i < len1; i++) { if (id1.charAt(i) != id2.charAt(i)) { return false; } } I am using reformat.sh to do the following: - make sure reads are paired - count the number of reads/bases...

  • Brian Bushnell Brian Bushnell modified a comment on ticket #69

    The problem here is that the read headers differ in two places. Normally, Illumina uses one of these two formats: @stuff/1 @stuff/2 or @stuff 1:morestuff @stuff 2:morestuff Of these, the /1 and /2 is obsolete for Illumina as far as I know, though Complete Genomics / BGI are adopting it. My effort to determine pairing is based on observation of Illumina data since there is no formal fastq specification regarding pair naming conventions, and they usually put the read identifier in the "optional description"....

  • Brian Bushnell Brian Bushnell posted a comment on ticket #69

    The problem here is that the reads differ in two places. Normally, Illumina uses one of these two formats: @stuff/1 @stuff/2 or @stuff 1:morestuff @stuff 2:morestuff Of these, the /1 and /2 is obsolete for Illumina as far as I know, though Complete Genomics / BGI are adopting it. My effort to determine pairing is based on observation of Illumina data since there is no formal fastq specification regarding pair naming conventions, and they usually put the read identifier in the "optional description"....

  • Max Rozenblum Max Rozenblum modified a comment on ticket #69

    I decided to subset my FASTQ to a single read so the files are more than manageable to demonstrate the issue: - bad_R* -> this FASTQ pair is the read as shown in the post above. This fails with vpair enabled. - bad-no-desc_R* -> this FASTQ pair is the same read where the optional description (text after the space) has been trimmed. This succeeds with vpair enabled. - bad-no-trail_R* -> this FASTQ pair is the same read except the /1 and /2 has been removed from the sequence identifier. This succeeds...

  • Max Rozenblum Max Rozenblum posted a comment on ticket #69

    I decided to subset my FASTQ to a single read so the files are more than manageable to demonstrate the issue: - bad_R -> this FASTQ pair is the read as shown in the post above. This fails with vpair enabled. - bad-no-desc_R -> this FASTQ pair is the same read where the optional description (text after the space) has been trimmed. This succeeds with vpair enabled. - bad-no-trail_R -> this FASTQ pair is the same read except the /1 and /2 has been removed from the sequence identifier. This succeeds...

  • Max Rozenblum Max Rozenblum created ticket #69

    reformat.sh vpair fails matching reads.

  • Max Rozenblum Max Rozenblum posted a comment on ticket #65

    clarifying that I missed cq=f. This can be closed.

  • Robert Riley Robert Riley created ticket #68

    Deprecate basecov files in pileup.sh

  • Marc Hoeppner Marc Hoeppner created ticket #67

    sendsketch does not obey proxy settings

  • Martin Mokrejs Martin Mokrejs created ticket #66

    Stitching read pair R1 andf R2 sequences based on alignment from SAM/BAM

  • Max Rozenblum Max Rozenblum created ticket #65

    `refromat.sh` mode that does not change base quality scores

  • Priscilla Glenn Priscilla Glenn modified a comment on discussion General Discussion

    Hi, I recently read on biostars HERE that I could use your tools to determine the GC content of my reads. My reads are paired reads though and I wanted to adjust this to determine the GC content for each chromosome. I was able to manually split my bam file into the various chromosomes but am unsure how best to use reformat and stats on these files. When I tried to tell reformat that it was paired, and provide two output file names, it didn't work. It runs properly if I say they are unpaired but I'm...

  • Priscilla Glenn Priscilla Glenn modified a comment on discussion General Discussion

    Hi, I recently read on biostars [https://www.biostars.org/p/9546248/]HERE that I could use your tools to determine the GC content of my reads. My reads are paired reads though and I wanted to adjust this to determine the GC content for each chromosome. I was able to manually split my bam file into the various chromosomes but am unsure how best to use reformat and stats on these files. When I tried to tell reformat that it was paired, and provide two output file names, it didn't work. It runs properly...

  • Priscilla Glenn Priscilla Glenn modified a comment on discussion General Discussion

    Hi, I recently read on biostars [https://www.biostars.org/p/9546248/] that I could use your tools to determine the GC content of my reads. My reads are paired reads though and I wanted to adjust this to determine the GC content for each chromosome. I was able to manually split my bam file into the various chromosomes but am unsure how best to use reformat and stats on these files. When I tried to tell reformat that it was paired, and provide two output file names, it didn't work. It runs properly...

  • Priscilla Glenn Priscilla Glenn posted a comment on discussion General Discussion

    Hi, I recently read on biostars [https://www.biostars.org/p/9546248/]HERE that I could use your tools to determine the GC content of my reads. My reads are paired reads though and I wanted to adjust this to determine the GC content for each chromosome. I was able to manually split my bam file into the various chromosomes but am unsure how best to use reformat and stats on these files. When I tried to tell reformat that it was paired, and provide two output file names, it didn't work. It runs properly...

  • BBMap BBMap released /BBMap_39.06.tar.gz

  • BBMap BBMap released /BBMap_39.05.tar.gz

  • Brian Bushnell Brian Bushnell posted a comment on ticket #64

    Good suggestion; I'm opening a new process for bgzip and piping the input. Shouldn't be too hard to catch the error code.

  • Max Rozenblum Max Rozenblum posted a comment on ticket #64

    Another alternative solution for me might just be to run gzip -dct ${FASTQ_PATH} on each gzipped fastq I'd like to analyze. This will catch gzip corruption. However it might be useful to propagate errors like these directly through bbmap suite.

  • Max Rozenblum Max Rozenblum created ticket #64

    Propagation of internal error codes

  • Brian Bushnell Brian Bushnell posted a comment on ticket #63

    I was able to replicate this behavior and it is fixed for the next release (39.05, probably this week). Sometimes I don't notice this kind of issue because I always keep paired reads interleaved in a single file.

  • Max Rozenblum Max Rozenblum posted a comment on ticket #63

    Though the issue above still stands. I realize that running repair.sh on paired FASTQs will toss the singletons and reorder the reads into the same order (though it appears to be randomly ordered relative to the two input FASTQs) as out and out2.

  • Max Rozenblum Max Rozenblum created ticket #63

    sortbyname.sh not creating out2 even when specified

  • BBMap BBMap released /BBMap_39.04.tar.gz

  • C.S. C.S. created ticket #62

    bbduk.sh with maq option and outs keeps low-quality mates

  • Sabrin Abdelghany Sabrin Abdelghany created ticket #61

    filtering all not microbial reads

  • Brian Bushnell Brian Bushnell posted a comment on ticket #60

    Hi Pierre, Sorry about that, those two and a couple other versions of CrisprFinder made it into the release accidentally. Only CrisprFinder.java should be there. I'll delete them for the next release. Thanks for notifying me! -Brian

  • Pierre Gruet Pierre Gruet created ticket #60

    Errors when compiling all Java classes in version 39.03

  • BBMap BBMap released /BBMap_39.03.tar.gz

  • BBMap BBMap released /BBMap_39.02.tar.gz

  • Eric Blanc Eric Blanc created ticket #59

    bbduk.sh fails with error"java.lang.NoSuchMethodError: java.lang.Process.isAlive()Z"

  • Jay Osvatic Jay Osvatic posted a comment on discussion General Discussion

    Hello, I am currently trying to make a more reproducible workflow (removing weird paths to files mostly) and I am having trouble figuring out the shortcut that can be used for phiX174 as a reference for BBDuk. It looks like the adapters file can be called like this: "ref=adapters", eliminating the extra path before it. Is there a similar way to write the phiX174_ill.ref.fa.gz file? Thanks!

  • Zachary Foster Zachary Foster created ticket #58

    Error from `taxtree.sh`: No level found for strain

  • Zachary Foster Zachary Foster created ticket #57

    taxtree.sh usage guide outdated?

  • Joshua McCauley Joshua McCauley created ticket #56

    dedupe.sh tries to access Index Integer.MAX_VALUE

  • BBMap BBMap released /BBMap_39.01.tar.gz

  • Jake Jake posted a comment on discussion General Discussion

    Seems like it is done with -10*math.log(sum([10**(-(ord(c)-33)/10) for c in line4])/len(line4),10) rather than just sum([ord(c)-33 for c in line4])/len(line4) So, its the quality score of the average probability.

  • BBMap BBMap released /BBMap_39.00.tar.gz

  • Jake Jake posted a comment on discussion General Discussion

    I've been using bbduk for quality filtering and I pretty much just took it for granted. However, I've been looking at it as I usually use minavequality=15 which removes about 15-20% of our reads. minavequality=30 removes almost 100% of them. However when I do the quality averaging by hand, almost all of my reads have an average quality above 30. I've been taking the ASCII values of the quality scores and subtracting 33, summing them up and dividing by the length. How is bbduk computing the average...

  • BBMap BBMap released /BBMap_38.99.tar.gz

  • BBMap BBMap released /BBMap_38.98.tar.gz

  • Muku Muku posted a comment on ticket #54

    Hi, I am having issues with java version as well. Did you figure out ? I am using HPC, no luck. I have also tried installing using conda environment. Still no luck . Let me know if you have figured out the java version . Thank you.

  • Muku Muku created ticket #55

    java.lang.ClassNotFoundException: align2.BBSplitter

  • Muku Muku posted a comment on discussion General Discussion

    I am trying to run bbsplit: bbsplit.sh build=1 threads=12 ref_x=$ref_Human ref_y=$ref_Mouse path=$indexded_ref I am running into following error: Error: Could not find or load main class align2.BBSplitter Caused by: java.lang.ClassNotFoundException: align2.BBSplitter srun: error: b001: task 0: Exited with exit code 1 Is there a way to solve this issue ? Thank you.

  • BBMap BBMap released /BBMap_38.97.tar.gz

  • darked90 darked90 created ticket #54

    info about compatible java versions/benchmarks

  • Samuel Ruiz-Pérez Samuel Ruiz-Pérez modified a comment on ticket #25

    Hi! Is this still the current chromosome size limit? I'm trying to use BBSplit to do a dual RNA seq analysis (human contamination) and the indexing part fails with a “The reference file appears to be empty” error too: BBSplit Error: Creating merged reference file ~/ref/genome/1/merged_ref_6286725898541.fa.gz Ref merge time: 133.394 seconds. Executing align2.BBMap [ow=t, fastareadlen=500, minhits=1, minratio=0.56, maxindel=20, qtrim=rl, untrim=t, trimq=6, in1=R1_001.fastq.gz, in2=R2_001.fastq.gz,...

  • Samuel Ruiz-Pérez Samuel Ruiz-Pérez posted a comment on ticket #25

    Hi! Is this still the current chromosome size limit? I'm trying to use BBSplit to do a dual RNA seq analysis (human contamination) and the indexing part fails with a “The reference file appears to be empty” error too: BBSplit Error: Creating merged reference file ~/ref/genome/1/merged_ref_6286725898541.fa.gz Ref merge time: 133.394 seconds. Executing align2.BBMap [ow=t, fastareadlen=500, minhits=1, minratio=0.56, maxindel=20, qtrim=rl, untrim=t, trimq=6, in1=R1_001.fastq.gz, in2=R2_001.fastq.gz,...

  • Linton Freund Linton Freund created ticket #53

    pileup crashing

  • BBMap BBMap released /BBMap_38.96.tar.gz

  • Tiffany Hu Tiffany Hu posted a comment on ticket #52

    I forget to mention: we are trying to have the alignment run as quickly as possible, and we want to feed the alignment file into another function via a pipeline, which is why we need the tool to finish successfully.

  • Tiffany Hu Tiffany Hu created ticket #52

    bbmap taking large amounts of time // freezing?

  • BBMap BBMap released /BBMap_38.95.tar.gz

  • Brian Bushnell Brian Bushnell posted a comment on ticket #51

    Have you retried this recently? There was a problem with the configuration of the servers for a few weeks which should have been resolved on ~Jan 6th, but I'm not 100% sure it's fixed for everyone. Also, can you tell me which version of Java and BBTools you are using?

  • Joshua Kling Joshua Kling created ticket #51

    sendsketch error writing to server

  • Linton Freund Linton Freund modified a comment on discussion General Discussion

  • Linton Freund Linton Freund posted a comment on discussion General Discussion

    I am currently using pileup.sh to determine the coverage of genes in my metagenomes by comparing my read mapping alignment file to the gene predictions of the contigs identified by Prodigal. I was wondering if anyone could tell me the output for the gene_coverage.tmp file, and if there are the same headers as the contig_comverage.tmp files? I am not sure if I should use the avgDepth value or the depthSum for my coverage value when I go to normalize these counts to counts per million (CPM). I am adding...

  • Linton Freund Linton Freund posted a comment on discussion General Discussion

    I am currently using pileup.sh to determine the coverage of genes in my metagenomes by comparing my read mapping alignment file to the gene predictions of the contigs identified by Prodigal. I was wondering if anyone could tell me the output for the gene_coverage.tmp file, and if there are the same headers as the contig_comverage.tmp files? I am not sure if I should use the avgDepth value or the depthSum for my coverage value when I go to normalize these counts to counts per million (CPM). I am adding...

  • Britton Britton created ticket #50

    BBMap is unintentionally chopping up my sequences

  • jdv jdv created ticket #49

    bbduk: 'rename' doesn't work if 'mincovfraction' is set

  • BBMap BBMap released /BBMap_38.94.tar.gz

  • Silas Kieser Silas Kieser posted a comment on ticket #48

    I see the same problem when running only bbmap.sh ref=Human/SAMEA103958167/SAMEA103958167_contigs.fasta in1=Human/SAMEA103958167/sequence_quality_control/SAMEA103958167_QC_R1.fastq.gz in2=Human/SAMEA103958167/sequence_quality_control/SAMEA103958167_QC_R2.fastq.gz minid=0.9 nodisk out=mapped.sam -Xmx50g threads=5

  • Silas Kieser Silas Kieser posted a comment on ticket #48

    I set minid=0.9 and minratio was 0.56

  • Silas Kieser Silas Kieser created ticket #48

    Minid is not respected in bbwrap

  • BBMap BBMap released /BBMap_38.93.tar.gz

  • BBMap BBMap released /BBMap_38.92.tar.gz

  • BBMap BBMap released /BBMap_38.91.tar.gz

  • BBMap BBMap released /BBMap_38.91.tar.gz

  • benjbuch benjbuch posted a comment on ticket #38

    I have a probably related "issue" where mismatches to "real bases" are preferred over matches to "NNNN", something like: ACGTNNNNNNGGC (reference) ACGT––––––AT– (observed) ACGTAT––––––– (expected) So, I would need to change the gap extension penalty or some other parameters, which I don't see to be accessible in the user interface. (I have worked around it by deleting the 3' end for the moment.) However, I'm not sure if these parameters are at all accessible since BBMap uses a "convex gap penalty"...

  • Jordi Camps Jordi Camps created ticket #47

    Chastity filter and flags

  • Jordi Camps Jordi Camps posted a comment on ticket #46

    I'm having the same issue with currently latest version (38.90).

  • Kevin Bryan Kevin Bryan created ticket #46

    calcmem.sh: invalid use of [ -v

  • BBMap BBMap released /BBMap_38.90.tar.gz

  • BBMap BBMap released /BBMap_38.89.tar.gz

  • BBMap BBMap released /BBMap_38.88.tar.gz

  • Jacob Nearing Jacob Nearing posted a comment on ticket #45

    I would like to give an update to this. I found the issue it occurs when you pause the job and then try and resume it in the background using the "bg" command on a unix system.

  • Jacob Nearing Jacob Nearing created ticket #45

    bbmap.sh identifying "broken reads" that are not broken

  • darked90 darked90 created ticket #44

    reformat.sh

  • Andrew Schriefer Andrew Schriefer created ticket #43

    reformat.sh extin flag not working for bash process substitutions

  • Tania Holtzem Tania Holtzem created ticket #42

    callvariants.sh & marked duplicated reads

  • BBMap BBMap released /BBMap_38.87.tar.gz

  • Samuel White Samuel White posted a comment on ticket #41

    To be clear, the values are swapped in all stats.sh output formats, not just format=7.

  • Samuel White Samuel White created ticket #41

    stats.sh: N50 and L50 values are switched

  • Shahar Rezenman Shahar Rezenman posted a comment on discussion General Discussion

    Hey all, First of all I have to say that I really appreciate all the work that has been done on BBmap, Thanks! I wondered if there is any option of adding a flag to allow mismatches in the indices sequences while demultiplexing. When looking into it I noticed that I can retreive many reads that have only one mismatch in one of indices and that's a shame throwing them away. When designing and choosing my indices I make sure the hamming distance is large enough so I am not concerned by accidently classifying...

  • Jordi Camps Jordi Camps modified a comment on ticket #40

    Just discovered the extin=fq parameter. But having reformat.sh in1=<(preprocess "$fq1") in2=<(preprocess "$fq2") extin=fq qin=33 ... does not solve the issue either: $ reformat.sh qin=33 extin=fq in1=<(<test_1.fastq) in2=<(<test_2.fastq) out=test.sam java -ea -Xms300m -cp /apps/BBMAP/38.57/bbmap/current/ jgi.ReformatReads qin=33 extin=fq in1=/dev/fd/63 in2=/dev/fd/62 out=test.sam Executing jgi.ReformatReads [qin=33, extin=fq, in1=/dev/fd/63, in2=/dev/fd/62, out=test.sam] Set INTERLEAVED to false...

  • Jordi Camps Jordi Camps posted a comment on ticket #40

    Just discovered the extin=fq parameter. But having reformat.sh in1=<(preprocess "$fq1").fq in2=<(preprocess "$fq2").fq extin=fq qin=33 ... does not solve the issue either: $ reformat.sh qin=33 extin=fq in1=<(<test_1.fastq) in2=<(<test_2.fastq) out=test.sam java -ea -Xms300m -cp /apps/BBMAP/38.57/bbmap/current/ jgi.ReformatReads qin=33 extin=fq in1=/dev/fd/63 in2=/dev/fd/62 out=test.sam Executing jgi.ReformatReads [qin=33, extin=fq, in1=/dev/fd/63, in2=/dev/fd/62, out=test.sam] Set INTERLEAVED to...

  • Jordi Camps Jordi Camps created ticket #40

    Multiple streams input

  • Chidera Nwosu Chidera Nwosu created ticket #39

    Understanding BBMAP output

  • G M G M created ticket #38

    Change gap opening/extension penalty

  • Hjruscheweyh Hjruscheweyh created ticket #37

    Sequences lost in dedupe.sh

  • BBMap BBMap released /BBMap_38.86.tar.gz

  • jnylander jnylander created ticket #36

    Warning against using phylip2fasta.sh

  • J C J C created ticket #35

    Internal Evolutionary Model for BBMap(?)

  • BBMap BBMap released /BBMap_38.84.tar.gz

1 >