#88 tg_index -C -> gap5 "export as CAF" renames reads

open
Gap5 (15)
5
2011-03-30
2011-03-30
Bastien Chevreux
No

Hi James,

assuming one has a file "bla.caf" and does the following:

tg_index -C bla.caf
gap5 bla.0.g5d
In gap5: File -> Export sequences -> as CAF

then all the reads in the resulting "bla.0.caf" file will have been renamed by appending a dot and then some (gap5 internal?) number. I'm not sure whether this behaviour is intentional, but I filed it as a "bug" as I think this should not happen. Export as SAM, for example, does not do that, while export as ACE does something else (it appends ".f" and ".r").

Best,
Bastien

Discussion

  • James Bonfield
    2011-04-18

    Looking at this code, it appears a bit dumb at present. If your read name doesn't contain a dot, it assumes the name lacks suffixes and that fwd/rev pairs will therefore share the same name, so it just appends ".<record_id>".

    Ideally it'd spot duplicates and resolve the conflicts, but that may be very slow to implement. I'll experiment. I've already found a related bug, as contig names can clash with reading names too.
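
The difference between the two approaches discussed above can be sketched roughly as follows. This is only an illustration in Python, not Gap5's actual export code: the function names, the `(name, record_id)` tuples and the `reserved` parameter (standing in for contig names) are all hypothetical, chosen just to show the renaming rule as described and a duplicate-aware alternative.

```python
from collections import Counter


def rename_current(reads):
    """Behaviour as described in the comment: any read name without a
    dot gets '.<record_id>' appended; names with a dot are kept."""
    return [name if "." in name else f"{name}.{rec}" for name, rec in reads]


def rename_on_clash(reads, reserved=()):
    """Hypothetical alternative: append '.<record_id>' only when the
    name occurs more than once or collides with a reserved name
    (for example a contig name)."""
    counts = Counter(name for name, _ in reads)  # first pass: count names
    reserved = set(reserved)
    renamed = []
    for name, rec in reads:                      # second pass: rename clashes
        if counts[name] > 1 or name in reserved:
            renamed.append(f"{name}.{rec}")
        else:
            renamed.append(name)
    return renamed


if __name__ == "__main__":
    # Made-up example input: (read name, record id) pairs.
    reads = [("r1", 10), ("r1", 11), ("r2.f", 12), ("c1", 13), ("r3", 14)]
    print(rename_current(reads))
    print(rename_on_clash(reads, reserved={"c1"}))
```

With the second function, a read such as "r3" that is unique and does not clash with anything keeps its original name. The sketch does not check whether a newly generated name collides with an existing one, and it needs every name to be counted before anything is written out.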