
#25 Can't build index: "The reference file appears to be empty."

Milestone: 1.0

Status: open

Owner: nobody

Labels: None

Updated: 2022-05-17

Created: 2019-10-29

Creator: Brook Byrns

Private: No

I'm trying to create a reference index with the following command (openjdk version "1.8.0_232"):
bbmap.sh ref=Lancer.fixed.fasta

And I receive the following error:
java -ea -Xmx414496m -Xms414496m -cp /home/brook/src/bbmap/current/ align2.BBMap build=1 overwrite=true fastareadlen=500 ref=Lancer.fixed.fasta
Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, ref=Lancer.fixed.fasta]
Version 38.70

No output file.
Writing reference.
Executing dna.FastaToChromArrays2 [Lancer.fixed.fasta, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stoppad=8000, nodisk=false]

Set genScaffoldInfo=true
Set genome to 1

Loaded Reference: 0.009 seconds.
Exception in thread "main" java.lang.AssertionError: 1, 0
The reference file appears to be empty.
at align2.BBIndex.loadIndex(BBIndex.java:96)
at align2.BBMap.loadIndex(BBMap.java:372)
at align2.BBMap.main(BBMap.java:33)

I am able to run the same command with the phix reference fasta included with BBMap, but I cannot spot any relevant differences between the example fasta and my fasta. The fasta I am using contains 22 (wheat) chromosomes with the names (chr1A, chr1B, chr1D... etc). Any help would be appreciated.

Discussion

Brian Bushnell - 2019-10-29

Hi - I'm really sorry about that, but BBMap does not support wheat, as it has a chromosome longer than 500 Mbp, the current limit. It's the only organism I'm aware of that has this issue. I'll try to clarify the error message. You could break the chromosome at the centromere, but you're probably better off using a different aligner.
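
As a rough way to check whether a reference runs into this limit before indexing, the sketch below lists FASTA records longer than a threshold. It is only a sketch under stated assumptions: the default threshold is the `maxlen=536670912` value printed in the `dna.FastaToChromArrays2` log line above, and treating that value as the exact per-chromosome limit is an assumption, not something confirmed in this thread. The function names `fasta_lengths` and `too_long` are made up for illustration and are not part of BBMap.

```python
import io

# Assumption: the maxlen value from the FastaToChromArrays2 log line is the
# per-sequence limit being hit. This is not confirmed in the thread.
MAXLEN = 536670912


def fasta_lengths(handle):
    """Yield (name, length) for each record in a FASTA text stream."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def too_long(handle, maxlen=MAXLEN):
    """Return the (name, length) pairs whose length exceeds maxlen."""
    return [(n, l) for n, l in fasta_lengths(handle) if l > maxlen]


# Tiny in-memory demo with an artificially small threshold.
demo = io.StringIO(">chrA example\nACGTACGT\n>chrB\nAC\n")
for name, length in too_long(demo, maxlen=4):
    print(f"{name}\t{length}")
```

For a real reference, the same function could be called on an opened file, for example `too_long(open("Lancer.fixed.fasta"))`; any record it reports would presumably be one that exceeds the limit described above.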

- Mattias - 2020-03-23
  
  I am running into the same issue. Wheat chromosomes go up to 830 Mbp. Would it be possible to increase the limit with a command?
  A lot of plant genomes have chromosomes of that size, see https://en.wikipedia.org/wiki/List_of_sequenced_plant_genomes and https://www.researchgate.net/publication/321833590_List_of_plant_genome_sequenced_with_genome_size_and_chromosome_numbers
  
- Samuel Ruiz-Pérez - 2022-05-17
  
  Hi!
  
  Is this still the current chromosome size limit? I'm trying to use BBSplit to do a dual RNA-seq analysis (human contamination) and the indexing step also fails with a “The reference file appears to be empty” error:
  
  BBSplit Error:
  Creating merged reference file ~/ref/genome/1/merged_ref_6286725898541.fa.gz
  Ref merge time: 133.394 seconds.
  Executing align2.BBMap [ow=t, fastareadlen=500, minhits=1, minratio=0.56, maxindel=20, qtrim=rl, untrim=t, trimq=6, in1=R1_001.fastq.gz, in2=R2_001.fastq.gz, ambiguous2=all, minratio=0.56, minhits=1, maxindel=16000, outu1=clean1.fq, outu2=clean2.fq, ref=ref/genome/1/merged_ref_6286725898541.fa.gz, out_x=outx_#.fq, out_y=outy_#.fq]
  Version 38.96
  Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.560
  Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.560
  Retaining first best site only for ambiguous mappings.
  NOTE: Deleting contents of ref/genome/1 because reference is specified and overwrite=true
  Writing reference.
  Executing dna.FastaToChromArrays2 [ref/genome/1/merged_ref_6286725898541.fa.gz, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stoppad=8000, nodisk=false]
  Set genScaffoldInfo=true
  Set genome to 1
  Loaded Reference: 0.002 seconds.
  Exception in thread "main" java.lang.AssertionError: 1, 0
  The reference file appears to be empty.
  at align2.BBIndex.loadIndex(BBIndex.java:96)
  at align2.BBMap.loadIndex(BBMap.java:371)
  at align2.BBMap.main(BBMap.java:32)
  at align2.BBSplitter.main(BBSplitter.java:48)
  
  Some chromosomes of the other genome I'm using exceed 1,500 Mbp. Should I use a different aligner then?
  
  Thanks.
  
  Last edit: Samuel Ruiz-Pérez 2022-05-17
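
For anyone who wants to try the workaround mentioned earlier in this thread (breaking an overlong chromosome into pieces), here is a minimal Python sketch. Note what it does and does not do: it cuts at a fixed piece length rather than at the centromere, it loads each sequence fully into memory, and the `_offset<start>` naming scheme and the helper names (`read_fasta`, `split_long`, `write_fasta`) are hypothetical choices for illustration. Whether a reference split this way indexes cleanly has not been confirmed here, and reads spanning a cut point would not map across it.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first token of the header as the record name.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_long(records, max_piece):
    """Split sequences longer than max_piece into consecutive pieces.

    Piece names carry the 0-based start offset (hypothetical convention) so
    that mapped coordinates can be converted back to the original chromosome.
    """
    for name, seq in records:
        if len(seq) <= max_piece:
            yield name, seq
            continue
        for start in range(0, len(seq), max_piece):
            yield f"{name}_offset{start}", seq[start:start + max_piece]


def write_fasta(records, out, width=60):
    """Write records as FASTA, wrapping sequence lines at `width` characters."""
    for name, seq in records:
        out.write(f">{name}\n")
        for i in range(0, len(seq), width):
            out.write(seq[i:i + width] + "\n")


# Tiny in-memory demo with an artificially small piece length.
src = io.StringIO(">chrA\nACGTACGTAC\n>chrB\nACG\n")
out = io.StringIO()
write_fasta(split_long(read_fasta(src), max_piece=4), out)
print(out.getvalue())
```

With real data, `max_piece` would need to stay below the limit discussed above, and positions reported for a piece would have to be shifted by its offset to recover coordinates on the original chromosome.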
  
