abi write support

2013-01-21
2013-04-18
  • Will Stokes

    Will Stokes - 2013-01-21

    Are there any plans to add support for writing ABI files? I personally prefer the SCF/ZTR file formats, but when editing a sequence trace to fix mistakes made by the base caller, saving to SCF or ZTR loses the associated raw trace data. It would be nice to be able to write to ABI files to avoid this data loss. Other programs based on iolib (4Peaks) silently convert your data from ab1 to scf without changing the file extension, and you end up losing the raw data without knowing it.

    Similarly, it would be nice if the ab1 reader supported reading back information on how to align the raw trace data with the processed trace data. The documentation for abi_set_data_counts() could be improved to indicate that:

    1,2,3,4 - used to load raw data
    5,6,7,8 - used to load gel voltage (volts / 10), gel current, electrophoretic power, and gel temperature
    9,10,11,12 - used to load processed data (default)
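
The grouping above can be sketched as a small lookup. This is only an illustration of the three ranges listed in the post; `data_tag_kind` and `data_tag_kind_of` are hypothetical names and are not part of iolib.

```c
/* Hypothetical helper, not part of iolib: classifies the numbers
 * passed to abi_set_data_counts() using the grouping listed above. */
typedef enum {
    DATA_TAG_RAW,        /* 1-4: raw trace data */
    DATA_TAG_RUN_STATS,  /* 5-8: gel voltage, current, power, temperature */
    DATA_TAG_PROCESSED,  /* 9-12: processed trace data (the default) */
    DATA_TAG_UNKNOWN     /* anything outside 1-12 */
} data_tag_kind;

data_tag_kind data_tag_kind_of(int tag_number)
{
    if (tag_number >= 1 && tag_number <= 4)
        return DATA_TAG_RAW;
    if (tag_number >= 5 && tag_number <= 8)
        return DATA_TAG_RUN_STATS;
    if (tag_number >= 9 && tag_number <= 12)
        return DATA_TAG_PROCESSED;
    return DATA_TAG_UNKNOWN;
}
```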

     
  • Will Stokes

    Will Stokes - 2013-01-21

    Minor comment: my questions are in regard to the iolib package. Also note that when loading raw data, the peak values are signed but are stored as 2-byte unsigned integers in the Read object, which is misleading and threw me for a loop at first.
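
A minimal sketch of the reinterpretation described here, assuming the stored 16-bit value is a two's-complement signed sample. `sample_as_signed` is a hypothetical helper, not an iolib function. It uses arithmetic rather than a plain cast, since converting an out-of-range unsigned value to a signed type is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical helper, not part of iolib: reinterprets a sample that
 * was stored as an unsigned 16-bit value as a signed 16-bit value.
 * Assumes the original data was two's-complement signed. */
int16_t sample_as_signed(uint16_t stored)
{
    if (stored >= 0x8000u) {
        /* High bit set: the signed value is stored - 65536. */
        return (int16_t)((int32_t)stored - 0x10000);
    }
    return (int16_t)stored;
}
```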

     
  • James Bonfield

    James Bonfield - 2013-01-21

    Currently we don't have time to do this, although I wouldn't be against it in principle. Obviously we would be happy to accept any working patches.

    The original code was implemented by reverse engineering the file format, so no attempt was made to write the files: the authors worked out what they needed to decode and ignored the fields we didn't use. As you point out, the AB1 specification has since been published openly, so there is no real reason not to support it fully other than time.

    Regarding the raw peak data, it's complicated! Some instruments produced AB1 format files using the full dynamic range from 0 to 65535, forcing it to be unsigned. I'm not quite sure what the proper method is for dealing with this. Since the publication of the spec it appears the correct approach is indeed to use signed instead, but I'm unsure of how many things it would break now. (Testing could take a considerable time!)
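
The ambiguity only affects samples above 32767, since those are the values that read differently as signed and unsigned. As a rough sketch, a caller could count such samples to see whether a given trace is affected at all. `count_ambiguous_samples` is a hypothetical helper, not part of iolib, and a nonzero count does not say which interpretation is the right one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of iolib: counts samples whose value
 * differs depending on whether the 16 bits are read as signed or
 * unsigned (that is, samples with the high bit set). */
size_t count_ambiguous_samples(const uint16_t *samples, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] > INT16_MAX)
            count++;
    }
    return count;
}
```

A count of zero means the signed and unsigned readings agree for every sample, so the choice does not matter for that trace.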

    James

     
  • Will Stokes

    Will Stokes - 2013-01-21

    Thanks for writing back, James. I've been using iolib in my trace viewer in part to save myself time, since writing a file importer/exporter for all these sequence trace file formats would be a giant pain and iolib does most of this already. :-) Some day I may have time to look into writing ab1 files, but unfortunately I don't expect that any time soon. I have been successfully reading and displaying raw data using abi_set_data_counts() and casting the values to signed shorts without problems, although it sounds like in theory there might be some files out there that would give me trouble. I simply ignore the max trace value variable, since I've never come across a file with unsigned values.

     
