Re: [XMLPipeDB-developer] Info table
From: John D. N. D. <do...@lm...> - 2011-08-16 19:21:43
Hi Rich,

Yeah, relations is pretty complicated. I suggest you skip that and go to just the system tables for now (i.e., the tables that hold the IDs themselves). The relationship tables hold pairs of IDs (i.e., which ID from one system corresponds to the ID from another) and so are an additional level of complexity, I think.

Regarding your prior question about the Info table --- I'm not or'ing the species names. I'm merely concatenating the names with pipes ("|") in between, as Kam specified. While the final value may look like an "or," it is ultimately just a string. The "|" could just as easily have been a comma, semicolon, or other separator.

Your prior guess as to how that would look, with the multiple { "Species", speciesName } pairs, would actually be equivalent to multiple values for the single Species column, which does not fit the relational model. The Species column can have only one value, and in this case it is a single string that is the concatenation of all selected species names, separated by "|" characters.

Hope that clears things up...

John David N. Dionisio, PhD
Associate Professor, Computer Science
Loyola Marymount University

On Aug 16, 2011, at 12:11 PM, Richard Brous wrote:

> OK, have moved onto getRelationsTableManager() ... this one seems pretty complicated...
>
> Have reviewed the method and submethods called and have the following questions:
>
> 1. What are the RelationshipTables that are stored in relationshipTables? Aren't they species dependent? Can't seem to find where they were created? If not, then:
>
> 2. How should the if/else conditional be handled?
>
>     if (speciesProfile.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) |
>             speciesProfile.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) {
>         tableManager = speciesProfile.getRelationsTableManagerCustomizations(stp.systemTable1,
>                 stp.systemTable2, templateDefinedSystemToSystemCode, tableManager);
>
> This obviously needs to be made aware of Lists of SpeciesProfiles... to build the correct tableManager, do I need this conditional to run through every species or should we enclose all the code (if and else) within a loop through each species?
>
> Thanks.
>
> Richard
>
> On Tue, Aug 16, 2011 at 11:02 AM, Richard Brous <rbr...@gm...> wrote:
> Sorry I typed and sent that out too quickly before getting my thoughts completely together...
>
> First off I thought that the array[][] portion of the submit needed to be in the following format: { "Species", speciesName1 }, { "Species", speciesName2 }, { "Species", speciesName3 },...
> but I see you specified it as: { "Species", speciesName1 | speciesName2 | speciesName3 }, ... OR'ing each of the species names?
>
> I see what you did regarding the date object... yes no reason to recreate a second object...
>
> As mentioned previously I will continue through the next methods, submitting (well thought out) changes and emailing out updates...
>
> Richard
>
> On Tue, Aug 16, 2011 at 10:36 AM, Richard Brous <rbr...@gm...> wrote:
> Took a look at your code and realized what I had done wrong. I should have just broken out the loop as the second argument of the 3 in the submit method. That was a rookie move on my part.
>
> I am moving on to the next method and will keep my mind on the syntactic solution that I'm not thinking through prior to moving forward with an implementation.
>
> Richard
>
> On Mon, Aug 15, 2011 at 1:30 AM, John David N. Dionisio <do...@lm...> wrote:
> OK, everything is committed. I got up to the tweaks on the second panel in the wizard (Save As/GO Aspects), as well as the query change for exporting any combination of C, F, or P.
>
> I still have to do the GO OBO format check, plus the UI work on the remaining two wizard panels. But meanwhile, hope these latest changes work out well.
>
> John David N. Dionisio, PhD
> Associate Professor, Computer Science
> Loyola Marymount University
>
> On Aug 14, 2011, at 7:31 PM, Richard Brous wrote:
>
> > OK, using option 2 I have made changes to DatabaseProfile.java to allow for all species names to be included in the submit argument.
> >
> > But I'm stuck on how to change my StringBuilder object to a type that submit wants. Help please!!
> >
> > Submitted the above and some comment changes to SourceForge this evening.
> >
> > Richard
> >
> > On Fri, Aug 12, 2011 at 12:08 PM, Kam Dahlquist <kda...@lm...> wrote:
> > Hi,
> >
> > I think we had better leave the info table with only one record (option 2). The species names can be separated by pipes " | " as they are in the Systems table where there are multiple species. To my knowledge, the only time GenMAPP needs to access the info table is for the "DisplayOrder" field; I don't know what would happen if there were multiple records there. I know that the spec for the table says that it should only be one record, but I don't know if it would crash if there were multiple records. To be on the safe side, I think we should just keep it to the one record.
> >
> > Best,
> > Kam
> >
> > At 09:45 PM 8/10/2011, John David N. Dionisio wrote:
> > > Greetings,
> > >
> > > I think we have to turn to Dr. Dahlquist's GenMAPP knowledge here to get the definitive answer. I see two choices:
> > >
> > > - The Info table should have one record for each species that the .gdb holds, in which case the change you need is to wrap that single submit call inside a loop, so that submit is called once for each chosen species.
> > >
> > > - The Info table should always have one record, and if the .gdb holds multiple species, the "Species" column should be some concatenation of multiple species names. In this case, you would still call submit only once, but the value you send into the "Species" column is some accumulation of all chosen species names.
> > >
> > > Admittedly I don't know which way is right (I assumed the former as of our Tuesday meeting, but on further examination I'm no longer quite so sure).
> > >
> > > For Kam --- what does GenMAPP expect to see in the Info table if the opened .gdb contains multiple species?
> > >
> > > John David N. Dionisio, PhD
> > > Associate Professor, Computer Science
> > > Loyola Marymount University
> > >
> > > On Aug 10, 2011, at 9:36 PM, Richard Brous wrote:
> > >
> > > > OK, continued to review ExportToGenMAPP and dug into the creation of the first TableManager tmA on line 118.
> > > >
> > > > In reading through the method, my understanding is that it creates a new TableManager based on the selectedDatabaseProfile (which is UniProt).
> > > >
> > > > This is performed by the method getInfoTableManager() which then calls method submit(String tableName, QueryType queryType, String[][] columnNamesToValues);
> > > >
> > > > the code is as follows:
> > > >
> > > >     tableManager.submit("Info", QueryType.insert, new String[][] {
> > > >         { "Owner", owner },
> > > >         { "Version", new SimpleDateFormat("yyyyMMdd").format(version) },
> > > >         { "MODSystem", modSystem },
> > > >         { "Species", speciesProfile.getSpeciesName() },
> > > >         { "Modify", new SimpleDateFormat("yyyyMMdd").format(modify) },
> > > >         { "DisplayOrder", displayOrder },
> > > >         { "Notes", notes } });
> > > >
> > > > The modification of this line centers on { "Species", speciesProfile.getSpeciesName() }, since it originally processed a single species.
> > > >
> > > > So now I need to populate the arguments with the species contained within selectedDatabaseprofile.selectedSpeciesProfiles.
> > > >
> > > > I think I'll start with the baseArgument up to MODSystem, then append as many species as necessary, and then cap off the end with the rest starting at Modify (similar to your approach in ExportGoData, populateUniprotGoTableFromSQL(char chosenAspect, List<Integer> taxonIds) line 513).
> > > >
> > > > Please let me know if this approach or analysis is off track.
> > > >
> > > > Thanks!
> > > >
> > > > Richard
> > > >
> > > > On Wed, Aug 10, 2011 at 5:50 PM, Richard Brous <rbr...@gm...> wrote:
> > > > Updated repository to include all Gene Ontology changes discussed during our meeting yesterday.
> > > >
> > > > Digging into TableManager next.
> > > >
> > > > Richard
> > > >
> > > > On Fri, Aug 5, 2011 at 10:06 AM, Richard Brous <rbr...@gm...> wrote:
> > > > whew... thanks for the detailed reply. I will digest this a bit and get back to you with further questions.
> > > >
> > > > rb
> > > >
> > > > On Thu, Aug 4, 2011 at 11:18 PM, John David N. Dionisio <do...@lm...> wrote:
> > > > Greetings,
> > > >
> > > > Sorry for the delay. I wasn't able to walk through the relevant code until this evening.
> > > >
> > > > As Kam said, GOA serves as the link between the UniProt and GO IDs. It essentially determines which GO IDs get exported by using GOA to see which GO IDs are associated with an exported UniProt ID. The populateUniprotGoTableFromSQL, in its current form, extracts the GO association records that match the given taxon ID then exports, as UniProt-GO pairs, the GO and UniProt IDs referenced within that GO association record. Processing that follows this is then based on the GO IDs that got exported --- and that's how the current code avoids exporting the entire list of GO terms.
> > > >
> > > > The operative query is on the second line of populateUniprotGoTableFromSQL:
> > > >
> > > >     String uniProtAndGOIDSQL = "select db_object_id, go_id, evidence_code, with_or_from from goa where db like '%UniProt%' and taxon = 'taxon:" + taxon + "'";
> > > >
> > > > In plain English, this selects the GOA records whose database is UniProt and whose taxon ID is the given taxon. An additional condition is added for the "aspect" (All, Component, Function, or Process) that is to be exported. This is another reduction filter, to further shrink the number of exported GO terms and thus avoid MAPPFinder issues later on.
> > > >
> > > > Given this, the proper expansion here is to change the taxon predicate to a multiple predicate. That is, this method can be changed to now accept a collection or array of taxon IDs, and the base query should then be changed so that it accepts any taxon from that collection. More or less, you want:
> > > >
> > > >     private void populateUniprotGoTableFromSQL(char chosenAspect, int[] taxons) throws SQLException {
> > > >
> > > > ...then, instead of the single string, you want to iterate through the taxon IDs:
> > > >
> > > >     StringBuilder baseQueryBuilder = new StringBuilder("select db_object_id, go_id, evidence_code, with_or_from from goa where db like '%UniProt%'");
> > > >     boolean first = true;
> > > >     for (int taxon: taxons) {
> > > >         baseQueryBuilder.append(first ? " and (" : " or ");
> > > >         baseQueryBuilder
> > > >             .append("taxon = 'taxon:")
> > > >             .append(taxon).append("'");
> > > >         first = false;
> > > >     }
> > > >     baseQueryBuilder.append(")");
> > > >
> > > > ...and so on. I just sort of rattled this off so there may be little glitches, but anyway this is just to give you an overall idea.
> > > >
> > > > Put another way, no, you do not need to iterate this method for each taxon ID. Instead, you can still call this method once, with the multiplicity of taxon IDs emerging in terms of the actual condition used for selecting the GO terms to be exported (based on the available GOA records, which as you may recall are loaded from .goa files).
> > > >
> > > > As a side note, right here you have an opportunity for a little sanity check regarding the content of the relational database: GO terms will only be exported if GOA records for the desired taxon IDs have been imported into the database. So, as a pre-flight check, one can see if there are any GOA records at all for each chosen taxon ID. If there are none, then the .goa file for that species needs to be imported into the relational database.
> > > >
> > > > Hope this helps...
> > > >
> > > > John David N. Dionisio, PhD
> > > > Associate Professor, Computer Science
> > > > Loyola Marymount University
> > > >
> > > > On Aug 4, 2011, at 1:00 PM, Kam Dahlquist wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Dondi will have to chime in on this, but I think this is where things are going to get tricky.
> > > > >
> > > > > The final gdb does not actually contain the entire GO, it gets trimmed somehow based on the GO associations for a particular species. This is because MAPPFinder cannot handle loading the entire GO. Since there is some type of species-specific trimming going on, it's quite possible that this will need to iterate.
> > > > >
> > > > > However, I don't have the foggiest idea of how this works, so Dondi will have to chime in.
> > > > >
> > > > > Best,
> > > > > Kam
> > > > >
> > > > > At 12:09 AM 8/4/2011, you wrote:
> > > > >> Wednesday 8/3/11 progress:
> > > > >>
> > > > >> 1. After following the ExportPanel1.java ground zero code of: databaseProfile.setSelectedSpeciesProfile( selectedProfile );
> > > > >>
> > > > >> I found the method in DatabaseProfile.java plus a getter method; SpeciesProfile setSelectedSpeciesProfile( speciesProfile ) and SpeciesProfile getSelectedSpeciesProfile( speciesProfile )
> > > > >>
> > > > >> I created two new methods that each handle List<Object> of SpeciesProfiles argument instead of a single SpeciesProfile; setSelectedSpeciesProfiles and getSelectedSpeciesProfiles.
> > > > >>
> > > > >> This enabled the ExportPanel1 ground zero code to become: databaseProfile.setSelectedSpeciesProfiles(selectedSpecies);
> > > > >>
> > > > >> 2. public static void export() on line 104 in ExportToGenMAPP.java
> > > > >>
> > > > >> On line 107 ExportGoData is instantiated which I found in ExportGoData.java and calls a method: public void export(char chosenAspect, int taxon).
> > > > >>
> > > > >> Within export, taxon id is required for another method: private void populateGoTables(char chosenAspect, int taxon).
> > > > >>
> > > > >> Within populateGoTables, taxon id is required for another method: private void populateUniprotGoTableFromSQL( char chosenAspect, int taxon).
> > > > >>
> > > > >> But, if the export to GDB process starts off with exporting GO data, doesn't it only need to do that once no matter how many species are selected? As you probably realize, I'm leading towards not having to iterate through this for each taxon id if possible.
> > > > >>
> > > > >> Also, how does the export actually work? How are GO ids and UniProt ids related within the table?
> > > > >>
> > > > >> Thanks!
> > > > >>
> > > > >> Richard
> > > >
> > > > _______________________________________________
> > > > xmlpipedb-developer mailing list
> > > > xml...@li...
> > > > https://lists.sourceforge.net/lists/listinfo/xmlpipedb-developer
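
For readers following the thread, two ideas in it can be sketched in Java: the single Info record whose Species value is one pipe-joined string, and the GOA query that accepts any of several taxon IDs. This is a minimal sketch under stated assumptions, not XMLPipeDB code. The class InfoTableSketch and its methods joinSpeciesNames, infoRow, and goaQuery are hypothetical names made up for illustration, only three of the Info columns are shown, and the taxon IDs in main are placeholder values. The thread does not settle whether the separator is "|" or " | ", so it is kept in one constant.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration; none of these names exist in XMLPipeDB.
final class InfoTableSketch {

    // Assumption: a bare "|" separator, as in John's reply. Kam's message writes " | ",
    // so the exact spacing should be checked against an existing Systems table.
    static final String SEPARATOR = "|";

    // Collapse all selected species names into the one string the Species column holds.
    static String joinSpeciesNames(List<String> speciesNames) {
        return String.join(SEPARATOR, speciesNames);
    }

    // Build column/value pairs in the String[][] shape quoted in the thread.
    // Only three of the Info columns are shown; "Species" appears exactly once.
    static String[][] infoRow(String owner, String modSystem, List<String> speciesNames) {
        return new String[][] {
            { "Owner", owner },
            { "MODSystem", modSystem },
            { "Species", joinSpeciesNames(speciesNames) }
        };
    }

    // Build the GOA query with one "taxon = ..." term per taxon ID, OR'ed together.
    // Rejecting an empty array avoids emitting a ")" that has no matching "(".
    static String goaQuery(int[] taxons) {
        if (taxons.length == 0) {
            throw new IllegalArgumentException("at least one taxon ID is required");
        }
        StringBuilder query = new StringBuilder(
            "select db_object_id, go_id, evidence_code, with_or_from from goa"
            + " where db like '%UniProt%' and (");
        for (int i = 0; i < taxons.length; i++) {
            if (i > 0) {
                query.append(" or ");
            }
            query.append("taxon = 'taxon:").append(taxons[i]).append("'");
        }
        return query.append(")").toString();
    }

    public static void main(String[] args) {
        // Sample values only; the real names come from the selected species profiles.
        List<String> species = Arrays.asList("Escherichia coli", "Saccharomyces cerevisiae");
        for (String[] pair : infoRow("owner", "UniProt", species)) {
            System.out.println(pair[0] + " = " + pair[1]);
        }
        System.out.println(goaQuery(new int[] { 83333, 559292 }));
    }
}
```

Because the Species value is built before the row is assembled, submit would still be called once with a plain String in that slot, which is the point of John's reply. The empty-array guard in goaQuery is an addition of this sketch; the version rattled off in the thread would append a stray ")" if no taxon IDs were passed.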