#1 Lucy deals poorly with long sequence headers

closed
None
5
2008-09-03
2008-04-18
Anonymous
No

It appears that Lucy is unable to cope with FASTA header lines longer than some limit - typically it complains about illegal characters in the sequence data.

This implies that instead of raising an error, truncating the line, or doing something else sensible, it goes on to treat the excess length of the header line as actual sequence data.

A workaround is to truncate headers to only include the sequence label (first word), but it would be preferable if Lucy correctly handled long headers, or at least had a robust way to deal with them.

-k
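The workaround described above (keeping only the sequence label, i.e. the first word of each header) could be sketched roughly as follows. This is an illustrative pre-processing filter, not part of Lucy; the function name `truncate_headers` and the sample data are hypothetical.

```python
import io


def truncate_headers(lines):
    """Yield FASTA lines with each header cut down to '>' plus its first word."""
    for line in lines:
        if line.startswith(">"):
            # Drop the '>' marker, then keep only the first whitespace-separated word.
            fields = line[1:].split()
            label = fields[0] if fields else ""
            yield ">" + label + "\n"
        else:
            # Sequence lines pass through unchanged.
            yield line


# Hypothetical sample input with a descriptive header.
fasta = io.StringIO(">seq1 a very long description of this read\nACGT\n")
shortened = list(truncate_headers(fasta))
```

Running the input through such a filter before handing it to Lucy keeps header lines short, at the cost of losing the descriptive text.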

Discussion

  • Michael Holmes
    2008-04-22

    • status: open --> pending
     
  • Michael Holmes
    2008-04-22

    Logged In: YES
    user_id=2030343
    Originator: NO

    This is a legitimate bug, caused by a buffer that is too small (256 bytes) for reading lines from the FASTA file. I will fix it and post a new version within a few days.

     
  • Michael Holmes
    2008-04-22

    • status: pending --> open
     
  • Michael Holmes
    2008-09-03

    • assigned_to: nobody --> thermodog
    • status: open --> closed
     
  • Michael Holmes
    2008-09-03

    Logged In: YES
    user_id=2030343
    Originator: NO

    This is fixed in version 1.20. I allocated a larger line buffer for reading from the input files.
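To illustrate the failure mode discussed in this ticket, here is a small simulation of how a bounded line read can split a long header. It is not Lucy's actual code (Lucy is not written in Python); it assumes a reader that takes at most a fixed number of characters per read, and the names `read_records` and `BUF_SIZE` are hypothetical.

```python
import io

BUF_SIZE = 256  # buffer size named in the discussion above


def read_records(handle, limit=-1):
    """Parse FASTA into [header, sequence] pairs.

    limit is the maximum number of characters taken per read;
    -1 means whole lines are read regardless of length.
    """
    records = []
    while True:
        chunk = handle.readline(limit)
        if not chunk:
            break
        if chunk.startswith(">"):
            records.append([chunk.rstrip("\n"), ""])
        elif records:
            # Anything not starting with '>' is appended as sequence data,
            # including the leftover tail of a header that was cut short.
            records[-1][1] += chunk.rstrip("\n")
    return records


header = ">seq1 " + "x" * 300  # 306 characters, longer than the buffer
data = header + "\nACGT\n"

# A 256-byte buffer holds 255 characters plus a terminator.
broken = read_records(io.StringIO(data), limit=BUF_SIZE - 1)
fixed = read_records(io.StringIO(data))
```

With the bounded read, the last 51 characters of the header end up at the front of the sequence, which matches the "illegal characters in the sequence data" symptom in the report. Reading whole lines, or using a buffer larger than any header, avoids the split.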