I was looking for a Java-based alternative to either cutadapt or NGmerge, and bbmerge seems to be able to do much of what both tools can do (primarily I was looking for fixed read trimming, quality trimming, and paired-read dovetail trimming). A few suggested improvements: * Allow simultaneous fixed trimming and q-trimming. Right now it seems like q-trimming supersedes fixed trimming if both are specified, regardless of which end you want to trim in each case; but there are situations where...
I was looking for a Java-based global short-read aligner and this software performs excellently, with lots of neat features that many other aligners lack. In descending order of priority, here are some ideas for improvement: * Update the SAM spec version to v1.6 (currently 1.3/1.4), primarily to produce an updated @HD line to indicate query-grouped output (GO:query). Many downstream tools (e.g. some fgbio tools) rely on this and won't work unless it is explicitly specified, and resorting just to add this annotation...
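For reference, the requested header line under SAM v1.6 would look something like this (tab-separated; GO:query declares query-grouped output):

```
@HD	VN:1.6	GO:query
```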
When I wrote that, PacBio did not have paired reads. They have a new sequencing machine now for short reads that I think does produce pairs but I have not seen any data for it so I'm not sure of the header structure.
I will check in with the lab, but my understanding is that these came from a NextSeq or NovaSeq and didn't have any modifications. Thanks for the quick response. I took a peek at FASTQ.java and saw the following code block:

```java
// Here we try to weed out PacBio, which will differ after the last slash:
for (int i = idxSlash1 + 2; i < len1; i++) {
    if (id1.charAt(i) != id2.charAt(i)) {
        return false;
    }
}
```

I am using reformat.sh to do the following: - make sure reads are paired - count the number of reads/bases...
The problem here is that the read headers differ in two places. Normally, Illumina uses one of these two formats: @stuff/1 and @stuff/2, or @stuff 1:morestuff and @stuff 2:morestuff. Of these, the /1 and /2 style is obsolete for Illumina as far as I know, though Complete Genomics / BGI are adopting it. My effort to determine pairing is based on observation of Illumina data, since there is no formal FASTQ specification regarding pair-naming conventions, and they usually put the read identifier in the "optional description"...
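The two naming conventions described above can be sketched as follows (an illustrative helper only, not BBTools' actual pairing logic):

```python
def headers_look_paired(id1: str, id2: str) -> bool:
    """Check two FASTQ header lines against the two common Illumina
    pairing conventions: a trailing /1 and /2, or a Casava 1.8+ style
    description beginning with "1:" and "2:" after the first space.
    Illustrative sketch only, not BBTools code."""
    # Old convention: identical names up to a trailing /1 and /2.
    if id1.endswith("/1") and id2.endswith("/2"):
        return id1[:-2] == id2[:-2]
    # Casava 1.8+ convention: "@name 1:..." vs "@name 2:..." with
    # identical names before the space.
    name1, _, desc1 = id1.partition(" ")
    name2, _, desc2 = id2.partition(" ")
    return name1 == name2 and desc1.startswith("1:") and desc2.startswith("2:")

# Examples (hypothetical headers):
headers_look_paired("@M001:1:AB/1", "@M001:1:AB/2")               # True
headers_look_paired("@M001:1:AB 1:N:0:1", "@M001:1:AB 2:N:0:1")   # True
headers_look_paired("@M001:1:AB/1", "@M001:1:AC/2")               # False
```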
I decided to subset my FASTQ to a single read so the files are more than manageable to demonstrate the issue:
- bad_R* -> this FASTQ pair is the read as shown in the post above. This fails with vpair enabled.
- bad-no-desc_R* -> this FASTQ pair is the same read where the optional description (text after the space) has been trimmed. This succeeds with vpair enabled.
- bad-no-trail_R* -> this FASTQ pair is the same read except the /1 and /2 have been removed from the sequence identifier. This succeeds...
reformat.sh vpair fails matching reads.
clarifying that I missed cq=f. This can be closed.
Deprecate basecov files in pileup.sh
sendsketch does not obey proxy settings
Stitching read pair R1 and R2 sequences based on alignment from SAM/BAM
`reformat.sh` mode that does not change base quality scores
Hi, I recently read on biostars [https://www.biostars.org/p/9546248/] that I could use your tools to determine the GC content of my reads. My reads are paired reads though, and I wanted to adjust this to determine the GC content for each chromosome. I was able to manually split my bam file into the various chromosomes but am unsure how best to use reformat and stats on these files. When I tried to tell reformat that it was paired, and provide two output file names, it didn't work. It runs properly if I say they are unpaired but I'm...
Good suggestion; I'm opening a new process for bgzip and piping the input. Shouldn't be too hard to catch the error code.
Another alternative solution for me might just be to run gzip -dct ${FASTQ_PATH} on each gzipped FASTQ I'd like to analyze; this will catch gzip corruption. However, it might be useful to propagate errors like these directly through the bbmap suite.
Propagation of internal error codes
I was able to replicate this behavior and it is fixed for the next release (39.05, probably this week). Sometimes I don't notice this kind of issue because I always keep paired reads interleaved in a single file.
The issue above still stands, though. I realize that running repair.sh on paired FASTQs will toss the singletons and write the reads to out and out2 in matching order (though that order appears to be random relative to the two input FASTQs).
sortbyname.sh not creating out2 even when specified
bbduk.sh with maq option and outs keeps low-quality mates
filtering all non-microbial reads
Hi Pierre, Sorry about that, those two and a couple other versions of CrisprFinder made it into the release accidentally. Only CrisprFinder.java should be there. I'll delete them for the next release. Thanks for notifying me! -Brian
Errors when compiling all Java classes in version 39.03
bbduk.sh fails with error "java.lang.NoSuchMethodError: java.lang.Process.isAlive()Z"
Hello, I am currently trying to make a more reproducible workflow (removing weird paths to files mostly) and I am having trouble figuring out the shortcut that can be used for phiX174 as a reference for BBDuk. It looks like the adapters file can be called like this: "ref=adapters", eliminating the extra path before it. Is there a similar way to write the phiX174_ill.ref.fa.gz file? Thanks!
Error from `taxtree.sh`: No level found for strain
taxtree.sh usage guide outdated?
dedupe.sh tries to access Index Integer.MAX_VALUE
Seems like it is done with -10*math.log(sum([10**(-(ord(c)-33)/10) for c in line4])/len(line4), 10) rather than just sum([ord(c)-33 for c in line4])/len(line4). So, it's the quality score of the average error probability.
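Spelled out, the two definitions from the expressions above (assuming Phred+33 encoding):

```python
import math

def avg_phred_arithmetic(qual: str) -> float:
    """Plain arithmetic mean of the Phred scores -- what most people
    compute by hand."""
    scores = [ord(c) - 33 for c in qual]
    return sum(scores) / len(scores)

def avg_phred_probability(qual: str) -> float:
    """Phred score of the mean error probability: convert each score to
    an error probability, average those, convert back. This is always
    <= the arithmetic mean and is dominated by the worst bases."""
    probs = [10 ** (-(ord(c) - 33) / 10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))

# Ten Q40 bases plus a single Q2 base: one bad base drags the
# probability-based average far below the arithmetic one.
q = "IIIIIIIIII#"
round(avg_phred_arithmetic(q), 1)   # 36.5
round(avg_phred_probability(q), 1)  # 12.4
```

This difference explains why a threshold like minavequality=30 can discard reads whose hand-computed arithmetic average is above 30.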
I've been using bbduk for quality filtering and I pretty much just took it for granted. However, I've been looking at it more closely, as I usually use minavequality=15, which removes about 15-20% of our reads; minavequality=30 removes almost 100% of them. Yet when I do the quality averaging by hand, almost all of my reads have an average quality above 30. I've been taking the ASCII values of the quality scores, subtracting 33, summing them up, and dividing by the length. How is bbduk computing the average...
Hi, I am having issues with the Java version as well. Did you figure it out? I am using an HPC, with no luck. I have also tried installing via a conda environment; still no luck. Let me know if you have figured out the Java version issue. Thank you.
java.lang.ClassNotFoundException: align2.BBSplitter
I am trying to run bbsplit:

bbsplit.sh build=1 threads=12 ref_x=$ref_Human ref_y=$ref_Mouse path=$indexded_ref

I am running into the following error:

Error: Could not find or load main class align2.BBSplitter
Caused by: java.lang.ClassNotFoundException: align2.BBSplitter
srun: error: b001: task 0: Exited with exit code 1

Is there a way to solve this issue? Thank you.
info about compatible java versions/benchmarks
Hi! Is this still the current chromosome size limit? I'm trying to use BBSplit to do a dual RNA seq analysis (human contamination) and the indexing part fails with a “The reference file appears to be empty” error too: BBSplit Error: Creating merged reference file ~/ref/genome/1/merged_ref_6286725898541.fa.gz Ref merge time: 133.394 seconds. Executing align2.BBMap [ow=t, fastareadlen=500, minhits=1, minratio=0.56, maxindel=20, qtrim=rl, untrim=t, trimq=6, in1=R1_001.fastq.gz, in2=R2_001.fastq.gz,...
pileup crashing
I forgot to mention: we are trying to have the alignment run as quickly as possible, and we want to feed the alignment file into another function via a pipeline, which is why we need the tool to finish successfully.
bbmap taking large amounts of time // freezing?
Have you retried this recently? There was a problem with the configuration of the servers for a few weeks which should have been resolved on ~Jan 6th, but I'm not 100% sure it's fixed for everyone. Also, can you tell me which version of Java and BBTools you are using?
sendsketch error writing to server
I am currently using pileup.sh to determine the coverage of genes in my metagenomes by comparing my read-mapping alignment file to the gene predictions of the contigs identified by Prodigal. I was wondering if anyone could tell me the output format of the gene_coverage.tmp file, and whether it has the same headers as the contig_coverage.tmp files? I am not sure if I should use the avgDepth value or the depthSum for my coverage value when I go to normalize these counts to counts per million (CPM). I am adding...
BBMap is unintentionally chopping up my sequences
bbduk: 'rename' doesn't work if 'mincovfraction' is set
I see the same problem when running only bbmap.sh ref=Human/SAMEA103958167/SAMEA103958167_contigs.fasta in1=Human/SAMEA103958167/sequence_quality_control/SAMEA103958167_QC_R1.fastq.gz in2=Human/SAMEA103958167/sequence_quality_control/SAMEA103958167_QC_R2.fastq.gz minid=0.9 nodisk out=mapped.sam -Xmx50g threads=5
I set minid=0.9 and minratio was 0.56
Minid is not respected in bbwrap
I have a probably related "issue" where mismatches to real bases are preferred over matches to NNNN, something like:

ACGTNNNNNNGGC (reference)
ACGT------AT- (observed)
ACGTAT------- (expected)

So, I would need to change the gap extension penalty or some other parameters, which don't seem to be accessible in the user interface. (I have worked around it by deleting the 3' end for the moment.) However, I'm not sure if these parameters are accessible at all, since BBMap uses a "convex gap penalty"...
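For intuition only (this is not BBMap's actual scoring function, whose constants and tiers I don't know): a convex gap penalty charges less per base as a gap grows, so one long gap costs less than the same number of gap bases split across several gaps, which is why the aligner can prefer extending a single deletion over opening a second one near an N run.

```python
def convex_gap_cost(length: int, open_cost: int = 10,
                    base_cost: int = 2, decay_at: int = 5,
                    long_cost: int = 1) -> int:
    """Toy convex gap penalty: a fixed opening cost plus a per-base cost
    that drops from base_cost to long_cost after decay_at bases.
    All constants here are made up purely for illustration."""
    if length <= 0:
        return 0
    short = min(length, decay_at)
    return open_cost + short * base_cost + (length - short) * long_cost

# One 10-base gap is cheaper than two 5-base gaps under a convex scheme:
convex_gap_cost(10)      # 10 + 5*2 + 5*1 = 25
convex_gap_cost(5) * 2   # (10 + 5*2) * 2 = 40
```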
Chastity filter and flags
I'm having the same issue with currently latest version (38.90).
calcmem.sh: invalid use of [ -v
I would like to give an update to this. I found the issue: it occurs when you pause the job and then try to resume it in the background using the "bg" command on a Unix system.
bbmap.sh identifying "broken reads" that are not broken
reformat.sh
reformat.sh extin flag not working for bash process substitutions
callvariants.sh & marked duplicated reads
To be clear, the values are swapped in all stats.sh output formats, not just format=7.
stats.sh: N50 and L50 values are switched
Hey all, First of all I have to say that I really appreciate all the work that has been done on BBmap, thanks! I wondered if there is any option of adding a flag to allow mismatches in the index sequences while demultiplexing. When looking into it I noticed that I can retrieve many reads that have only one mismatch in one of the indices, and it's a shame to throw them away. When designing and choosing my indices I make sure the Hamming distance is large enough, so I am not concerned by accidentally classifying...
Just discovered the extin=fq parameter. But having reformat.sh in1=<(preprocess "$fq1") in2=<(preprocess "$fq2") extin=fq qin=33 ... does not solve the issue either: $ reformat.sh qin=33 extin=fq in1=<(<test_1.fastq) in2=<(<test_2.fastq) out=test.sam java -ea -Xms300m -cp /apps/BBMAP/38.57/bbmap/current/ jgi.ReformatReads qin=33 extin=fq in1=/dev/fd/63 in2=/dev/fd/62 out=test.sam Executing jgi.ReformatReads [qin=33, extin=fq, in1=/dev/fd/63, in2=/dev/fd/62, out=test.sam] Set INTERLEAVED to false...
Multiple streams input
Understanding BBMAP output
Change gap opening/extension penalty
Sequences lost in dedupe.sh
Warning against using phylip2fasta.sh
Internal Evolutionary Model for BBMap(?)