[Bio-bwa-help] bwa index -a is: Details on database 2GB limit?
From: Henrik B. <hb...@bi...> - 2013-11-18 00:28:43
Hi.

At http://bio-bwa.sourceforge.net/bwa.shtml one can read for 'bwa index -a is' that:

"IS linear-time algorithm for constructing suffix array. It requires 5.37N memory where N is the size of the database. IS is moderately fast, but does not work with database larger than 2GB. IS is the default algorithm due to its simplicity. The current codes for IS algorithm are reimplemented by Yuta Mori."

I have a few questions I hope someone can help me with:

Q. If one still runs it "with [a] database larger than 2GB", what happens? What is the error message, and when/how soon does it appear?

Q. I'd like to be able to test for this before launching 'bwa index -a is' (I am trying to write a "bullet-proof" wrapper used by others). Exactly what does "database larger than 2GB" refer to? Is "database" referring to the FASTA file, the generated BWA index, or the genome sequence? Also, does "GB" refer to gigabases (=giganucleotides) or gigabytes (of a file)?

Thank you,

Henrik

(bwa 0.7.5a-r405)
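For illustration, a wrapper pre-check of the kind described above might start from something like the following Python sketch. It does not settle which quantity the documented 2GB limit refers to; it only gathers the two candidates named in the question (FASTA file size in bytes, and total number of sequence characters) and applies the 5.37N figure quoted from the manual. The function names `fasta_stats` and `estimate_is_memory` are hypothetical, and treating N as the base count is an assumption, not something the manual states.

```python
import os


def fasta_stats(path):
    """Return (file_size_in_bytes, total_sequence_characters) for a FASTA file.

    Header lines (starting with '>') are skipped; surrounding whitespace
    and line endings are not counted as sequence characters.
    """
    total_bases = 0
    with open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                continue
            total_bases += len(line.strip())
    return os.path.getsize(path), total_bases


def estimate_is_memory(n):
    """Apply the '5.37N' figure quoted from the bwa manual to a size n.

    Assumption: the result is in the same units as n; the manual does not
    say whether N counts bases or bytes.
    """
    return 5.37 * n
```

A wrapper could call `fasta_stats("genome.fa")` (a hypothetical file name) and compare either number against whichever threshold turns out to be the right one.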