From: Pablo N. M. <pa...@pa...> - 2005-06-10 14:06:18
Hi folks,

I find working with tab-delimited files quite uncomfortable and sometimes dangerous. We have no way to check well-formedness or schema compliance (as we can in XML with XSDs or DTDs). This can cause execution to halt after a long run or, worse, load wrong data into the database.

I support the idea of having a similarly generic plugin for loading XML into GUS, also based on a data description file. I've noticed that NCBI already offers XML as a download format, and other data sources tend to do the same. Any thoughts on this?

As for the GUS XML effort, I find it very interesting. I'll check the material to get to know it better.

Best,
Pablo

----- Original Message -----
From: "Terry Clark" <tc...@it...>
To: "Eric E. Snyder" <es...@vb...>
Cc: <gus...@li...>
Sent: Thursday, June 09, 2005 7:43 PM
Subject: Re: [GUSDEV] Generic GUS data loader for tab delimited files

> Dear Eric,
> We have such an effort underway using XML formatted input data.
> Here's a pointer to the project
> http://flora.ittc.ku.edu/xmlgus/
> This method requires
> some_format -> GUS' XML -> GUS object layer
>
> The system, running as a plugin, reads input in a GUS XML format that
> is formatted to correspond with relational tables and GUS objects.
> The mapping is instantiated in the XMLGUS framework as a YACC grammar
> chosen for structure and the declarative approach for the plugin.
> We're adding automation to some of the intermediate steps presently.
> I'd be happy to help you try this out if you are interested.
>
> all the best,
>
> Terry
>
> On 0, "Eric E. Snyder" <es...@vb...> wrote:
>> Dear GUSdev,
>>
>> We have been having some trouble loading DNA annotation data via the
>> gbparser plugin. We have been able to get around the problem in this
>> instance by using addrow, which is quite general but impossibly slow. I
>> cannot help but think there must be a generic tool for loading
>> tab-delimited data files into GUS.
>>
>> Assuming there isn't, I think it would be time well spent if someone
>> wrote a plugin for GUS that would *efficiently* load data in
>> tab-delimited format based on instructions described in a
>> general-purpose data description file. This file would identify the
>> tables and fields corresponding to each column in the input file. It
>> would also need to define the rules for associating data from records
>> stored in multiple tables and probably do other things as well.
>>
>> Any takers? I would be happy to spend whatever time is necessary to
>> define the requirements for such a system. If it doesn't already exist
>> somewhere in the GUS community, I certainly think it would be useful.
>>
>> I apologize in advance if this is a recent or frequent topic for this
>> list. I just subscribed and wasn't able to access sourceforge to check
>> the archives.
>>
>> Thanks!
>> eesnyder
>> --
>> Eric E. Snyder, Ph.D.
>> Virginia Bioinformatics Institute
>> Washington Street Phase 1 (0447)
>> Virginia Polytechnic Institute and State University
>> Blacksburg, VA 24061
>> USA
>>
>> Office: (540) 231-5428
>> Mobile: (540) 230-5225
>> Fax: (540) 231-2891
>> Email: ees...@vb...
>> JDAM: N 37 12'01.6", W 80 24'26.9"
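
The two ideas in this thread, a data description file that maps each input column to a table and field, and a well-formedness check run before anything is loaded, can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the description format, the `Example.*` table and field names, and the `validate` and `group_by_table` helpers are all hypothetical and are not part of GUS, XMLGUS, or any existing plugin.

```python
import csv
import io

# Hypothetical description: one entry per input column, in file order.
# Table and field names are illustrative only, not real GUS schema names.
DESCRIPTION = [
    {"table": "Example.Gene", "field": "source_id", "type": str, "required": True},
    {"table": "Example.Gene", "field": "name", "type": str, "required": False},
    {"table": "Example.Location", "field": "start_pos", "type": int, "required": True},
    {"table": "Example.Location", "field": "end_pos", "type": int, "required": True},
]


def validate(handle, description):
    """Return a list of (line_number, message) problems; empty if the file is clean."""
    problems = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=1):
        # Structural check: every row must have exactly one value per described column.
        if len(row) != len(description):
            problems.append((line_no, f"expected {len(description)} columns, got {len(row)}"))
            continue
        for value, col in zip(row, description):
            name = f"{col['table']}.{col['field']}"
            if value == "":
                if col["required"]:
                    problems.append((line_no, f"{name} is required"))
                continue
            # Type check: try to convert the text with the declared type.
            try:
                col["type"](value)
            except ValueError:
                problems.append((line_no, f"{name}: bad {col['type'].__name__} {value!r}"))
    return problems


def group_by_table(row, description):
    """Split one validated row into {table: {field: value}} for a loader to consume."""
    grouped = {}
    for value, col in zip(row, description):
        if value != "":
            grouped.setdefault(col["table"], {})[col["field"]] = col["type"](value)
    return grouped
```

Running `validate` over the whole file first means a malformed row is reported before a long load starts, rather than halting it partway through. The rules for associating records across tables, which the original request also asks for, are not covered by this sketch.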