[XMLPipeDB-developer] Info table
From: Kam D. <kda...@lm...> - 2011-08-12 19:08:43
Hi,

I think we had better leave the info table with only one record (option 2). The species names can be separated by pipes " | " as they are in the Systems table where there are multiple species. To my knowledge, the only time GenMAPP needs to access the info table is for the "DisplayOrder" field, I don't know what would happen if there were multiple records there. I know that the spec for the table says that it should only be one record, but I don't know if it would crash if there were multiple records. To be on the safe side, I think we should just keep it to the one record.

Best,
Kam

At 09:45 PM 8/10/2011, John David N. Dionisio wrote:
>Greetings,
>
>I think we have to turn to Dr. Dahlquist's GenMAPP knowledge here to get the definitive answer. I see two choices:
>
>- The Info table should have one record for each species that the .gdb holds, in which case the change you need is to wrap that single submit call inside a loop, so that submit is called once for each chosen species.
>
>- The Info table should always have one record, and if the .gdb holds multiple species, the "Species" column should be some concatenation of multiple species names. In this case, you would still call submit only once, but the value you send into the "Species" column is some accumulation of all chosen species names.
>
>Admittedly I don't know which way is right (I assumed the former as of our Tuesday meeting, but on further examination I'm no longer quite so sure).
>
>For Kam --- what does GenMAPP expect to see in the Info table if the opened .gdb contains multiple species?
>
>John David N. Dionisio, PhD
>Associate Professor, Computer Science
>Loyola Marymount University
>
>On Aug 10, 2011, at 9:36 PM, Richard Brous wrote:
>
> > OK, continued to review ExportToGenMAPP and dug into the creation of the first TableManager tmA on line 118.
> >
> > In reading through the method, my understanding is that it creates a new TableManager based on the selectedDatabaseProfile (which is UniProt).
> >
> > This is performed by the method getInfoTableManager() which then calls method submit(String tableName, QueryType queryType, String[][] columnNamesToValues);
> >
> > the code is as follows:
> >
> > tableManager.submit("Info", QueryType.insert, new String[][] {
> >     { "Owner", owner },
> >     { "Version", new SimpleDateFormat("yyyyMMdd").format(version) },
> >     { "MODSystem", modSystem },
> >     { "Species", speciesProfile.getSpeciesName() },
> >     { "Modify", new SimpleDateFormat("yyyyMMdd").format(modify) },
> >     { "DisplayOrder", displayOrder },
> >     { "Notes", notes } });
> >
> > The modification of this line centers on { "Species", speciesProfile.getSpeciesName() }, since it originally processed a single species.
> >
> > So now I need to populate the arguments with the species contained within selectedDatabaseprofile.selectedSpeciesProfiles.
> >
> > I think I'll start with the baseArgument up to MODSystem, then append as many species as necessary, and then cap off the end with the rest starting at Modify. (similar to your approach in ExportGoData, populateUniprotGoTableFromSQL(char chosenAspect, List<Integer> taxonIds) line 513
> >
> > Please let me know if this approach or analysis is off track.
> >
> > Thanks!
> >
> > Richard
> >
> > On Wed, Aug 10, 2011 at 5:50 PM, Richard Brous <rbr...@gm...> wrote:
> > Updated repository to include all Gene Ontology changes discussed during our meeting yesterday.
> >
> > Digging into TableManager next.
> >
> > Richard
> >
> > On Fri, Aug 5, 2011 at 10:06 AM, Richard Brous <rbr...@gm...> wrote:
> > whew... thanks for the detailed reply. I will digest this a bit and get back to you with further questions.
> >
> > rb
> >
> > On Thu, Aug 4, 2011 at 11:18 PM, John David N. Dionisio <do...@lm...> wrote:
> > Greetings,
> >
> > Sorry for the delay.
> > I wasn't able to walk through the relevant code until this evening.
> >
> > As Kam said, GOA serves as the link between the UniProt and GO IDs. It essentially determines which GO IDs get exported by using GOA to see which GO IDs are associated with an exported UniProt ID. The populateUniprotGoTableFromSQL, in its current form, extracts the GO association records that match the given taxon ID then exports, as UniProt-GO pairs, the GO and UniProt IDs referenced within that GO association record. Processing that follows this is then based on the GO IDs that got exported --- and that's how the current code avoids exporting the entire list of GO terms.
> >
> > The operative query is on the second line of populateUniprotGoTableFromSQL:
> >
> > String uniProtAndGOIDSQL = "select db_object_id, go_id, evidence_code, with_or_from from goa where db like '%UniProt%' and taxon = 'taxon:" + taxon + "'";
> >
> > In plain English, this selects the GOA records whose database is UniProt and whose taxon ID is the given taxon. An additional condition is added for the "aspect" (All, Component, Function, or Process) that is to be exported. This is another reduction filter, to further shrink the number of exported GO terms and thus avoid MAPPFinder issues later on.
> >
> > Given this, the proper expansion here is to change the taxon predicate to a multiple predicate. That is, this method can be changed to now accept a collection or array of taxon IDs, and the base query should then be changed so that it accepts any taxon from that collection.
> > More or less, you want:
> >
> > private void populateUniprotGoTableFromSQL(char chosenAspect, int[] taxons) throws SQLException {
> >
> > ...then, instead of the single string, you want to iterate through the taxon IDs:
> >
> > StringBuilder baseQueryBuilder = new StringBuilder("select db_object_id, go_id, evidence_code, with_or_from from goa where db like '%UniProt%'");
> > boolean first = true;
> > for (int taxon: taxons) {
> >     baseQueryBuilder.append(first ? " and (" : " or ");
> >     baseQueryBuilder
> >         .append("taxon = 'taxon:")
> >         .append(taxon).append("'");
> >     first = false;
> > }
> > baseQueryBuilder.append(")");
> >
> > ...and so on. I just sort of rattled this off so there may be little glitches, but anyway this is just to give you an overall idea.
> >
> > Put another way, no, you do not need to iterate this method for each taxon ID. Instead, you can still call this method once, with the multiplicity of taxon IDs emerging in terms of the actual condition used for selecting the GO terms to be exported (based on the available GOA records, which as you may recall are loaded from .goa files).
> >
> > As a side note, right here you have an opportunity for a little sanity check regarding the content of the relational database: GO terms will only be exported if GOA records for the desired taxon IDs have been imported into the database. So, as a pre-flight check, one can see if there are any GOA records at all for each chosen taxon ID. If there are none, then the .goa file for that species needs to be imported into the relational database.
> >
> > Hope this helps...
> >
> > John David N. Dionisio, PhD
> > Associate Professor, Computer Science
> > Loyola Marymount University
> >
> > On Aug 4, 2011, at 1:00 PM, Kam Dahlquist wrote:
> >
> > > Hi,
> > >
> > > Dondi will have to chime in on this, but I think this is where things are going to get tricky.
> > >
> > > The final gdb does not actually contain the entire GO, it gets trimmed somehow based on the GO associations for a particular species. This is because MAPPFinder cannot handle loading the entire GO. Since there is some type of species-specific trimming going on, it's quite possible that this will need to iterate.
> > >
> > > However, I don't have the foggiest idea of how this works, so Dondi will have to chime in.
> > >
> > > Best,
> > > Kam
> > >
> > > At 12:09 AM 8/4/2011, you wrote:
> > >> Wednesday 8/3/11 progress:
> > >>
> > >> 1. After following the ExportPanel1.java ground zero code of: databaseProfile.setSelectedSpeciesProfile( selectedProfile );
> > >>
> > >> I found the method in DatabaseProfile.java plus a getter method;
> > >> SpeciesProfile setSelectedSpeciesProfile( speciesProfile ) and SpeciesProfile getSelectedSpeciesProfile( speciesProfile )
> > >>
> > >> I created two new methods that each handle List<Object> of SpeciesProfiles argument instead of a single SpeciesProfile; setSelectedSpeciesProfiles and getSelectedSpeciesProfiles.
> > >>
> > >> This enabled the ExportPanel1 ground zero code to become: databaseProfile.setSelectedSpeciesProfiles(selectedSpecies);
> > >>
> > >> 2. public static void export() on line 104 in ExportToGenMAPP.java
> > >>
> > >> On line 107 ExportGoData is instantiated which I found in ExportGoData.java and calls a method: public void export(char chosenAspect, int taxon).
> > >>
> > >> Within export, taxon id is required for another method: private void populateGoTables(char chosenAspect, int taxon).
> > >>
> > >> Within populateGoTables, taxon id is required for another method: private void populateUniprotGoTableFromSQL( char chosenAspect, int taxon).
> > >>
> > >> But, if the export to GDB process starts off with exporting GO data, doesn't it only need to do that once no matter how many species are selected?
> > >> As you probably realize, I'm leading towards not having to iterate through this for each taxon id if possible.
> > >>
> > >> Also, how does the export actually work? How are GO ids and UniProt ids related within the table?
> > >>
> > >> Thanks!
> > >>
> > >> Richard
> >
> > _______________________________________________
> > xmlpipedb-developer mailing list
> > xml...@li...
> > https://lists.sourceforge.net/lists/listinfo/xmlpipedb-developer
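
For reference, the two changes discussed in this thread (a single Info record whose "Species" value accumulates all chosen species names, and a single GOA query that accepts any of several taxon IDs) might be sketched roughly as below. This is only an illustration under stated assumptions, not the project's actual code: the class InfoTableSketch and the methods joinSpeciesNames and buildGoaQuery are hypothetical names, the species names in main are example values, and the exact separator is a guess, since the thread says only that the names are "separated by pipes" and does not say whether GenMAPP expects surrounding spaces or leading and trailing pipes. Rejecting an empty taxon list is also a choice made here, so that the query never ends in an unbalanced ")".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; none of these names exist in XMLPipeDB.
final class InfoTableSketch {

    // Assumption: a bare pipe. The thread only says "separated by pipes".
    static final String SPECIES_SEPARATOR = "|";

    // Builds the single value for the Info table's "Species" column,
    // so that submit(...) is still called only once.
    static String joinSpeciesNames(List<String> speciesNames) {
        List<String> cleaned = new ArrayList<>();
        for (String name : speciesNames) {
            // Skip null or blank entries and trim stray whitespace.
            if (name != null && !name.trim().isEmpty()) {
                cleaned.add(name.trim());
            }
        }
        return String.join(SPECIES_SEPARATOR, cleaned);
    }

    // Builds one GOA query that matches any of the given taxon IDs,
    // following the "and (... or ...)" shape described in the thread.
    static String buildGoaQuery(List<Integer> taxonIds) {
        if (taxonIds.isEmpty()) {
            // Choice made for this sketch: refuse rather than emit a stray ")".
            throw new IllegalArgumentException("at least one taxon ID is required");
        }
        StringBuilder query = new StringBuilder(
            "select db_object_id, go_id, evidence_code, with_or_from"
            + " from goa where db like '%UniProt%'");
        boolean first = true;
        for (int taxonId : taxonIds) {
            query.append(first ? " and (" : " or ");
            query.append("taxon = 'taxon:").append(taxonId).append("'");
            first = false;
        }
        return query.append(")").toString();
    }

    public static void main(String[] args) {
        // Example values only.
        System.out.println(joinSpeciesNames(
            Arrays.asList("Escherichia coli", "Vibrio cholerae")));
        System.out.println(buildGoaQuery(Arrays.asList(9606, 10090)));
    }
}
```

With something like this, the { "Species", speciesProfile.getSpeciesName() } pair in the quoted submit call would instead receive the joined string, and the rest of the column list would stay as it is. The taxon IDs are integers, so concatenating them into the SQL text is safe in this sketch; free-text values would call for a parameterized statement instead.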