gap5 empty databases?

  • Hello again,

    firstly, thanks for all the help so far. Staden now seems to be installed, and I can launch gap4 and gap5. Now I have another problem though…

    I've made some databases using tg_index, which seemed to work. The output is:

    *** I/O stats (type, write count/size read count/size) ***
    GT_RecArray         174          44758       2              0
    GT_Bin            10139         179601    3256          61993
    GT_Range           8645      202669925    3219       89057269
    GT_BTree           7672       81027754     976       20495927
    GT_Track              0              0       0              0
    GT_Contig          1239          17763     286           1716
    GT_Seq                0              0       0              0
    GT_Anno               0              0       0              0
    GT_AnnoEle            0              0       0              0
    GT_SeqBlock       14002      742843222   13334              0
    GT_AnnoEleBlock   31005       20952399       0              0

    However, when I try to load this into gap5, there doesn't seem to be anything there:

    GT_RecArray           0              0       2            835
    GT_Bin                0              0       0              0
    GT_Range              0              0       0              0
    GT_BTree              0              0       2           1776
    GT_Track              0              0       0              0
    GT_Contig             0              0       0              0
    GT_Seq                0              0       0              0
    GT_Anno               0              0       0              0
    GT_AnnoEle            0              0       0              0
    GT_SeqBlock           0              0       0              0
    GT_AnnoEleBlock       0              0       0              0

    All options under the view menu are pale grey and unclickable…

    Thanks in advance for your help,


  • James Bonfield

    Those stats list the number and total size of read and write calls for each type of data in the database. It is natural for tg_index to give large figures while gap5 gives small ones: tg_index writes every record at least once, whereas gap5 does only minimal I/O unless you ask for something complex, like recomputing the entire consensus for all contigs.
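
    To compare the two runs more easily, the stats can be tallied per record type. The following is only a rough sketch: it assumes each data row is a `GT_` name followed by four integers (write count, write size, read count, read size), as in the output quoted above. The names `parse_io_stats` and `total_calls` are made up for illustration and are not part of Staden.

```python
# Abbreviated excerpts copied from the stats quoted earlier in this thread.
TG_INDEX_STATS = """\
*** I/O stats (type, write count/size read count/size) ***
GT_RecArray         174          44758       2              0
GT_Bin            10139         179601    3256          61993
GT_Contig          1239          17763     286           1716
"""

GAP5_STATS = """\
GT_RecArray           0              0       2            835
GT_Bin                0              0       0              0
GT_BTree              0              0       2           1776
GT_Contig             0              0       0              0
"""

FIELDS = ("write_count", "write_size", "read_count", "read_size")


def parse_io_stats(text):
    """Map each record type (e.g. 'GT_Bin') to its four I/O counters."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Assumed row layout: a GT_ name followed by exactly four integers.
        if len(parts) != 5 or not parts[0].startswith("GT_"):
            continue
        try:
            values = [int(p) for p in parts[1:]]
        except ValueError:
            continue  # skip anything that is not a plain data row
        stats[parts[0]] = dict(zip(FIELDS, values))
    return stats


def total_calls(stats, field):
    """Sum one counter (e.g. 'write_count') over all record types."""
    return sum(row[field] for row in stats.values())


if __name__ == "__main__":
    for label, text in (("tg_index", TG_INDEX_STATS), ("gap5", GAP5_STATS)):
        stats = parse_io_stats(text)
        print(label,
              "writes:", total_calls(stats, "write_count"),
              "reads:", total_calls(stats, "read_count"))
```

    On these excerpts the tg_index run shows thousands of write calls and the gap5 run shows none, which matches the explanation above: opening a database in gap5 is not expected to write anything.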

    However, you should be seeing something when you open gap5. I've seen the contig selector appear blank on some X11 servers (e.g. Mac OS X based ones). I haven't yet worked out why, but using Tk 8.5 instead of Tk 8.4 solved that issue. The Contig List is therefore a better starting point for seeing what contigs you have and how many reads are in them. That said, the fact that the menu options are greyed out seems to imply it finds no contigs, which is odd!

    Is this data public, or can you test it on some known public data to see if that works for you? I'd like to try to recreate the problem. Also, which version is this?


  • I think it's using tk8.5 - at least that's what is in my /opt/local/lib (which I set as the path to tk during compile).

    The data is from a mira assembly. I've tried converting both the ace file to a database, and also the caf file. Out of interest, how large should you expect these files to be? This is what I have:

    -rw-r--r--  1 kerensa  staff   5.4G  8 Nov 12:54 18A_out.ace
    -rw-r--r--  1 kerensa  staff   309M  8 Nov 13:19 18Aace.g5d
    -rw-r--r--  1 kerensa  staff   934K  8 Nov 13:19 18Aace.g5x
    -rw-r--r--  1 kerensa  staff   860M  8 Nov 13:09 18Adb.g5d
    -rw-r--r--  1 kerensa  staff   1.7M  8 Nov 13:09 18Adb.g5x
    -rw-r--r--  1 kerensa  staff   9.4G  8 Nov 11:38 tmp.caf.caf

    Maybe to start with, would it be possible to send me a test gap5 database that you know you can open, and I'll see if I can open it here? That way we will know if it is the installation or the data. Also I'm happy to try it on some public data, but it would be helpful to know that it is data someone has already successfully turned into a gap5 database.
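
    As a first sanity check before comparing against a known-good database, one could confirm that both files of a database are present and non-empty. This is only a sketch: it assumes, based on the listing above, that a gap5 database is a `.g5d`/`.g5x` pair sharing a prefix, and `check_gap5_files` is a made-up helper name. Passing this check says nothing about whether the contents are valid.

```python
import os


def check_gap5_files(prefix):
    """Return a list of problems with the prefix.g5d / prefix.g5x pair.

    Assumption: the database consists of exactly these two files, as
    suggested by the directory listing in this thread.
    """
    problems = []
    for ext in (".g5d", ".g5x"):
        path = prefix + ext
        if not os.path.isfile(path):
            problems.append(path + ": missing")
        elif os.path.getsize(path) == 0:
            problems.append(path + ": empty")
    return problems
```

    For example, `check_gap5_files("18Aace")` run in the directory listed above would be expected to return an empty list, since both files exist and have non-zero sizes.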


  • James Bonfield

    Those file sizes don't look totally out of whack.

    I put a small subset of ERR051768 on our ftp site. See:

    The BAM was exported from gap5 (to SAM, and then to BAM via samtools), and I verified that it can be loaded back in again via tg_index.