Re: [XMLPipeDB-developer] 499 - PROBLEM - M tuberculosis xml tag importation


Right, schema issues are unlikely.  Most count discrepancies like this that I've seen have boiled down to forming the right query.  Then, knowing the right query (in both XML and SQL), it's a matter of making sure that TallyEngine asks that same query.
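
As a rough sketch of what "asking the same query both ways" could look like (this is not TallyEngine's actual code): the sample XML, the gene_name table, and its column names are all made up for illustration, and SQLite stands in for PostgreSQL so the example runs on its own.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up sample shaped loosely like UniProt XML. Real files also declare
# an XML namespace, which is omitted here to keep the sketch short.
SAMPLE_XML = """<uniprot>
  <entry>
    <gene>
      <name type="primary">dnaA</name>
      <name type="ordered locus">Rv0001</name>
    </gene>
  </entry>
  <entry>
    <gene><name type="ordered locus">Rv0002</name></gene>
  </entry>
  <entry>
    <gene><name type="ordered locus">Rv0002</name></gene>
  </entry>
</uniprot>"""


def count_xml_names(xml_text, name_type):
    """Count <name> elements whose type attribute equals name_type."""
    root = ET.fromstring(xml_text)
    return sum(1 for el in root.iter("name") if el.get("type") == name_type)


def load_names(conn, xml_text):
    """Copy every <name> element into a hypothetical gene_name table."""
    conn.execute("CREATE TABLE gene_name (name_type TEXT, value TEXT)")
    root = ET.fromstring(xml_text)
    rows = [(el.get("type"), el.text) for el in root.iter("name")]
    conn.executemany("INSERT INTO gene_name VALUES (?, ?)", rows)


def count_sql_names(conn, name_type, distinct=False):
    """Count matching rows, optionally collapsing duplicate values."""
    target = "DISTINCT value" if distinct else "*"
    sql = f"SELECT COUNT({target}) FROM gene_name WHERE name_type = ?"
    return conn.execute(sql, (name_type,)).fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_names(conn, SAMPLE_XML)
    print("XML elements:   ", count_xml_names(SAMPLE_XML, "ordered locus"))
    print("SQL rows:       ", count_sql_names(conn, "ordered locus"))
    print("SQL distinct:   ", count_sql_names(conn, "ordered locus", distinct=True))
```

With this sample, the XML element count and the plain row count agree, while the distinct count is lower because one locus name appears in two entries. That kind of gap is what I mean by needing to pin down the right query before comparing tallies.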

John David N. Dionisio, PhD
Associate Professor, Computer Science
Loyola Marymount University

On Feb 7, 2011, at 5:48 PM, Richard Brous wrote:

> OK, so based on your approach:
>  
> 1. I'll start by reviewing the xmlpipedb-match queries and the SQL queries needed for the respective results, as you requested.
>  
> I was also thinking I may need to review the schema mapping from XML into Postgres, but the issue isn't likely a schema error. Does the error most likely lie in how xmlpipedbutils queries the data from the XML source and writes what it returns to the tables?
>  
> 2. I'll review the code: trace where TallyEngine is entered from the gmbuilder code, then follow it through xmlpipedbutils.
>  
> Richard
> 
> On Sat, Feb 5, 2011 at 10:28 AM, John David N. Dionisio <do...@lm...> wrote:
> Just wanted to confirm (since I wasn't sure in the first e-mail) --- the XMLPipeDB Utilities source code is in trunk/xmlpipedbutils in SourceForge's Subversion repo.
> 
> John David N. Dionisio, PhD
> Associate Professor, Computer Science
> Loyola Marymount University
> 
> 
> 
> On Feb 5, 2011, at 10:02 AM, Richard Brous wrote:
> 
> > Hi Dondi,
> >
> > So I'm at the point in working with M tuberculosis where I can exactly reproduce Dr. Dahlquist's problematic TallyEngine results.
> >
> > gmb2b60 Results
> >
> >
> >
> > Now the proverbial question - What next to solve the Ordered Locus import/count issue?
> >
> > **********************************************
> > Here is my thought process:
> >
> > Step 1: How does the import process work at the high level? (obviously correct me if I'm wrong)
> >
> > I believe that, basically, as each XML tag is read, it is placed in the proper Postgres table(s) based on some criteria. There is also likely some sort of check that each individual tag is valid XML, unless we don't care at this stage (and only care at export), or maybe the parser just skips over a bad tag and goes on to the next one.
> >
> > Step 2: What could be the problem?
> >
> > Either -
> > a. XML tags are being parsed incorrectly (ignored/skipped)?
> > b. The decision criteria for which table they should be added to are wrong?
> >
> > **********************************************
> >
> > I read on the sourceforge wiki:
> >
> > XMLPipeDB has a modular architecture with three components that may be used separately or together. XSD-to-DB reads an XSD (XML Schema Definition) and automatically generates an SQL schema, Java classes, and Hibernate mappings. XMLPipeDB Utilities provides functionality for configuring the database, importing data, and performing queries. GenMAPP Builder is based on the XMLPipeDB Utilities and exports GenMAPP-compatible Gene Databases based on data from UniProt and Gene Ontology (GO).
> >
> > So I should probably start with the XMLPipeDB Utilities, but where are they? I don't see them in the basic distribution. Or are they not a standalone tool that is called from the command line?
> >
> > Thanks!
> >
> > Richard
> 
> 