
#25 Can't build index: "The reference file appears to be empty."

1.0
open
nobody
None
2022-05-17
2019-10-29
Brook Byrns
No

I'm trying to create a reference index with the following command (openjdk version "1.8.0_232"):
bbmap.sh ref=Lancer.fixed.fasta

And I receive the following error:
java -ea -Xmx414496m -Xms414496m -cp /home/brook/src/bbmap/current/ align2.BBMap build=1 overwrite=true fastareadlen=500 ref=Lancer.fixed.fasta
Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, ref=Lancer.fixed.fasta]
Version 38.70

No output file.
Writing reference.
Executing dna.FastaToChromArrays2 [Lancer.fixed.fasta, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stoppad=8000, nodisk=false]

Set genScaffoldInfo=true
Set genome to 1

Loaded Reference: 0.009 seconds.
Exception in thread "main" java.lang.AssertionError: 1, 0
The reference file appears to be empty.
at align2.BBIndex.loadIndex(BBIndex.java:96)
at align2.BBMap.loadIndex(BBMap.java:372)
at align2.BBMap.main(BBMap.java:33)

I can run the same command with the phiX reference FASTA included with BBMap, but I cannot spot any relevant differences between that example FASTA and mine. My FASTA contains 22 (wheat) chromosomes named chr1A, chr1B, chr1D, etc. Any help would be appreciated.

Discussion

  • Brian Bushnell

    Brian Bushnell - 2019-10-29

    Hi - I'm really sorry about that, but BBMap does not support Wheat as it has a chromosome longer than 500Mbp, the current limit. It's the only organism I'm aware of that has this issue. I'll try to clarify the error message. You could break the chromosome at the centromere, but you're probably better off using a different aligner.
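The workaround of breaking a long chromosome into shorter pieces could be sketched roughly as below. This is only an illustration, not part of BBMap: it cuts at fixed-size intervals rather than at the centromere, the function names and the `_part<N>_offset<start>` naming scheme are hypothetical, and it holds each sequence in memory, which may be costly for chromosomes of this size. Alignment coordinates on a piece would need the offset added back to recover positions on the original chromosome.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_record(header, seq, max_len):
    """Split one sequence into pieces of at most max_len bases.

    Sequences that already fit are returned unchanged. Pieces are named with
    a hypothetical suffix: <id>_part<N>_offset<0-based start>.
    """
    if len(seq) <= max_len:
        return [(header, seq)]
    seq_id, _, desc = header.partition(" ")
    pieces = []
    for part, start in enumerate(range(0, len(seq), max_len), 1):
        name = f"{seq_id}_part{part}_offset{start}"
        if desc:
            name += " " + desc
        pieces.append((name, seq[start:start + max_len]))
    return pieces


def split_fasta(in_handle, out_handle, max_len, width=80):
    """Copy a FASTA file, splitting any record longer than max_len."""
    for header, seq in read_fasta(in_handle):
        for name, piece in split_record(header, seq, max_len):
            out_handle.write(f">{name}\n")
            for i in range(0, len(piece), width):
                out_handle.write(piece[i:i + width] + "\n")
```

For example, calling `split_fasta(open("Lancer.fixed.fasta"), open("Lancer.split.fasta", "w"), max_len=500_000_000)` would write a copy in which no record exceeds 500 Mbp; the output filename and the exact threshold here are assumptions.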

     
    • Mattias

      Mattias - 2020-03-23

      I am running into the same issue. Wheat chromosomes go up to 830 Mbp. Would it be possible to increase the limit with a command-line option?
      A lot of plant genomes have chromosomes of that size, see https://en.wikipedia.org/wiki/List_of_sequenced_plant_genomes and https://www.researchgate.net/publication/321833590_List_of_plant_genome_sequenced_with_genome_size_and_chromosome_numbers
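To check whether a given reference is affected, one could list the sequences that exceed the limit before indexing. The sketch below is a hedged illustration: it assumes the effective limit is the `maxlen=536670912` value printed in the logs in this thread (the reply above only says "500Mbp"), and the function and constant names are hypothetical. It streams the file and keeps only running lengths, so it does not need to hold a chromosome in memory.

```python
# Assumption: taken from the "maxlen=536670912" value shown in the BBMap log.
ASSUMED_MAX_LEN = 536_670_912


def oversized_sequences(handle, limit=ASSUMED_MAX_LEN):
    """Return [(header, length)] for FASTA records longer than limit."""
    results = []
    header, length = None, 0
    for line in handle:
        if line.startswith(">"):
            # A new record starts: report the previous one if it was too long.
            if header is not None and length > limit:
                results.append((header, length))
            header, length = line[1:].strip(), 0
        elif header is not None:
            length += len(line.strip())
    # Do not forget the last record in the file.
    if header is not None and length > limit:
        results.append((header, length))
    return results
```

For a gzipped reference, passing a text-mode handle such as `gzip.open(path, "rt")` should work the same way.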

       
    • Samuel Ruiz-Pérez

      Hi!

      Is this still the current chromosome size limit? I'm trying to use BBSplit for a dual RNA-seq analysis (human contamination), and the indexing step fails with the same “The reference file appears to be empty” error:

      BBSplit Error:
      
      Creating merged reference file ~/ref/genome/1/merged_ref_6286725898541.fa.gz
      Ref merge time:         133.394 seconds.
      Executing align2.BBMap [ow=t, fastareadlen=500, minhits=1, minratio=0.56, maxindel=20, qtrim=rl, untrim=t, trimq=6, in1=R1_001.fastq.gz, in2=R2_001.fastq.gz, ambiguous2=all, minratio=0.56, minhits=1, maxindel=16000, outu1=clean1.fq, outu2=clean2.fq, ref=ref/genome/1/merged_ref_6286725898541.fa.gz, out_x=outx_#.fq, out_y=outy_#.fq]
      Version 38.96
      
      Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.560
      Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.560
      Retaining first best site only for ambiguous mappings.
      NOTE:   Deleting contents of ref/genome/1 because reference is specified and overwrite=true
      Writing reference.
      Executing dna.FastaToChromArrays2 [ref/genome/1/merged_ref_6286725898541.fa.gz, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stoppad=8000, nodisk=false]
      
      Set genScaffoldInfo=true
      Set genome to 1
      
      Loaded Reference:   0.002 seconds.
      Exception in thread "main" java.lang.AssertionError: 1, 0
      The reference file appears to be empty.
          at align2.BBIndex.loadIndex(BBIndex.java:96)
          at align2.BBMap.loadIndex(BBMap.java:371)
          at align2.BBMap.main(BBMap.java:32)
          at align2.BBSplitter.main(BBSplitter.java:48)
      

      Some chromosomes of another genome I'm using exceed the 1,500 Mbp mark. Should I use a different aligner, then?

      Thanks.

       

      Last edit: Samuel Ruiz-Pérez 2022-05-17
