From: Mitch S. <mit...@be...> - 2009-10-21 03:36:44
Martin A. Hansen wrote:
> I have studied the GFF3 format, and it is quite clear to me that it
> should be possible to parse GFF3 entries in a step-wise manner using
> very little memory - including parent/childs

Not in the general case, no. Imagine a file with 2.3 million features, all of which are children of one parent feature. Or that those 2.3 million features are children of a bunch of parent features, with all the parents at the end (or at some unpredictable place within the file).

If you impose the right kind of (topological) ordering constraint on the GFF, you could improve your average-case memory usage, at the cost of making your GFF parser much less general. Or you could build the ordering into your parser, but then you'd have to do a fairly memory-hungry topological sort at the beginning.

> Does Bio::DB support MySQL?

The bioperl documentation talks all about this.

Mitch
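To make the ordering point concrete, here is a rough Python sketch (not how BioPerl or any existing parser does it) that streams GFF3 lines and counts how many features have to be held back because their parent has not shown up yet. The names `peak_orphans` and `make_feature` are made up for illustration, and the sketch assumes each feature has at most one Parent ID, which real GFF3 does not guarantee.

```python
def peak_orphans(gff_lines):
    """Stream GFF3 lines and return the peak number of features that had
    to be buffered because their Parent ID had not been seen yet.

    Assumption: at most one Parent per feature (no "Parent=a,b").
    Note that the set of seen IDs still grows with the file; this only
    measures the buffered children.
    """
    seen_ids = set()
    waiting = {}   # parent ID -> number of children buffered for it
    buffered = 0
    peak = 0
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        column9 = line.rstrip("\n").split("\t")[8]
        attrs = dict(p.split("=", 1) for p in column9.split(";") if "=" in p)
        fid = attrs.get("ID")
        if fid is not None:
            seen_ids.add(fid)
            # Children that were waiting for this ID can now be released.
            buffered -= waiting.pop(fid, 0)
        parent = attrs.get("Parent")
        if parent is not None and parent not in seen_ids:
            waiting[parent] = waiting.get(parent, 0) + 1
            buffered += 1
            peak = max(peak, buffered)
    return peak


def make_feature(fid=None, parent=None):
    """Build one made-up GFF3 feature line (hypothetical test data)."""
    attrs = []
    if fid is not None:
        attrs.append("ID=" + fid)
    if parent is not None:
        attrs.append("Parent=" + parent)
    cols = ["chr1", "sketch", "exon", "1", "10", ".", "+", ".",
            ";".join(attrs) or "."]
    return "\t".join(cols) + "\n"


if __name__ == "__main__":
    n = 1000
    parent_first = [make_feature(fid="p")]
    parent_first += [make_feature(parent="p") for _ in range(n)]
    parent_last = list(reversed(parent_first))
    print(peak_orphans(parent_first))  # 0: nothing is buffered
    print(peak_orphans(parent_last))   # 1000: every child is buffered
```

With the parent first, nothing is buffered; with the parent last, every child is. Even in the parent-first case, a parser that wants to hand back the parent together with all of its children still cannot let go of the parent until it knows no more children are coming, which is the first scenario described above.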