Re: [XMLPipeDB-developer] System table subquery
From: John D. N. D. <do...@lm...> - 2011-10-01 20:14:29
Hi Rich,

First, let's phrase out in English what gets done here overall. Then, still at the conceptual level, I'll talk about what needs to be done. Finally, we'll look at what needs to change in the code. In that part, we will also look at why you're getting syntax errors in your current working version (assuming that what you included is exactly the code that you currently have).

At all points, be conscious of whether what I'm writing matches your current understanding, or not. If something clears up for you, then great; if not, let me know what isn't clear.

1. What is being done

The method goes through the full list of relationship tables then, depending on those tables, chooses different algorithms for producing their data. The if conditions generally categorize these tables --- GeneOntology relationships are handled one way; UniProt relationships are handled another way; relationships that use other ID systems are handled yet another way.

The condition on which you're stuck involves relationships involving an ID system that is *specific* to a particular species profile. In that situation, the code essentially defers all work to the specific species profile --- which makes sense, because, as a species-specific ID system, it is fair to assume that only the species profile knows how to handle its relationships to other ID systems.

2. What needs to be done, conceptually

The change you are performing involves handling multiple species profiles. In other words, for every species profile that you are exporting, you need to give each of them an opportunity to handle the export to the currently chosen relationship table. That involves first checking if the current relationship table (e.g., "Blattner-GeneID") even applies to the species profile. If the relationship table does not involve one of that profile's species-specific ID systems, then that species profile effectively skips that relationship table. Otherwise, it then builds up the ID pairs to be exported, via the getSpeciesSpecificRelationshipTable method.

3. What needs to be done, in code

So, given this view, let's look at the code:

> else if ((selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>     selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>     !stp.systemTable2.equals("GeneOntology")) {

Note that, as written, this code checks *only one* species profile, and that is after all prior cases (GeneOntology, UniProt, two other ID systems) have already been checked. Your proposed change now is this:

> else if(
>     for(SpeciesProfile species : selectedSpeciesProfiles) {
>         (species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>         species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>         !stp.systemTable2.equals("GeneOntology") }
> ) {

You got the loop here, as described in part 2. You are indeed supposed to iterate through the selected species profiles, "ask" them if they need to do any exports with the current relationship table, then have them do so if they say "yes."

Now, as to the problem with the actual code, look at how the process is phrased in English --- you *iterate first*, and *then* check if the relationship table matches. The source of your syntax error is that you are putting the for statement inside the if condition. If you step back for a moment, you'll see that this does not make semantic sense --- the if condition expects an expression that evaluates to a boolean value. The preface to a for statement is *not* such an expression --- in fact, it doesn't evaluate to anything. That is the heart of your syntax error.

So, in reality, this last clause simply has to become a pure "else," since the condition has to be applied to *every single species profile*.
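To make the expression-versus-statement point concrete, here is a minimal sketch; it is not code from the project. SketchSpeciesProfile is a hypothetical stand-in for the real SpeciesProfile class, the Map<String, String> value type is an assumption, and anyProfileUses is an invented helper name. It shows that a loop can only feed an if condition once it is wrapped in something that returns a boolean:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the real SpeciesProfile; only the one
// method used in this thread is modeled, and the value type is assumed.
class SketchSpeciesProfile {
    private final Map<String, String> speciesSpecificSystemTables;

    SketchSpeciesProfile(Map<String, String> speciesSpecificSystemTables) {
        this.speciesSpecificSystemTables = speciesSpecificSystemTables;
    }

    Map<String, String> getSpeciesSpecificSystemTables() {
        return speciesSpecificSystemTables;
    }
}

class ConditionSketch {
    // Invented helper: a method call IS an expression, so the loop inside
    // it can legally appear in an if condition, unlike a bare for statement.
    static boolean anyProfileUses(List<SketchSpeciesProfile> profiles,
                                  String systemTable1, String systemTable2) {
        for (SketchSpeciesProfile species : profiles) {
            Map<String, String> tables = species.getSpeciesSpecificSystemTables();
            if ((tables.containsKey(systemTable1) || tables.containsKey(systemTable2))
                    && !"GeneOntology".equals(systemTable2)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<SketchSpeciesProfile> profiles = Arrays.asList(
                new SketchSpeciesProfile(Collections.singletonMap("Blattner", "placeholder")));

        if (anyProfileUses(profiles, "Blattner", "GeneID")) {
            System.out.println("At least one profile handles Blattner-GeneID");
        }
    }
}
```

Note that such a helper only answers whether *some* profile applies. It does not say *which* profile should do the export, so it would not remove the need to visit each selected profile in turn.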
The for loop can then take place within that else, and the if check happens *inside that*, since it has to happen for each selected species profile:

    else {
        for (SpeciesProfile species : selectedSpeciesProfiles) {
            if ((species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
                    species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
                    !"GeneOntology".equals(stp.systemTable2)) {
                // Have the species profile do the relationship table export.
            }
        }
    }

That's the overall structure. Now, some housecleaning: there is already a "pure else" clause at the bottom. However, if you look at that code, it effectively does nothing: it just emits a single record with blanks for its fields. I think you can safely skip that. Or, if you really do want to make sure that an inapplicable relationship table does have at least this dummy record, you can have a boolean in the code given above that indicates whether or not the relationship table was handled:

    else {
        boolean relationshipTableWasHandled = false;
        /* Same for loop and if statement. */ {
            // This would be in the case that a species profile does perform an export.
            relationshipTableWasHandled = true;
        }

        if (!relationshipTableWasHandled) {
            // Do the single-blank-row export here.
        }
    }

So, that's the run of it. Hope this walkthrough and breakdown clears up the issues.

John David N.
Dionisio, PhD
Associate Professor, Computer Science
Associate Director, University Honors Program
Loyola Marymount University

On Oct 1, 2011, at 11:34 AM, Richard Brous wrote:

> I'm hung up on the last else if conditional within getRelationshipTableManager()
>
> This is the "Species-X or X-Species" conditional, excluding GeneOntology
>
> The original single species conditional is:
>
> else if ((selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>     selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>     !stp.systemTable2.equals("GeneOntology")) {
>
> Now it needs to be multispecies aware obviously so...
>
> Can I add a for loop within the else if and change the following tablemanager creation logic, such as:
>
> else if(
>     for(SpeciesProfile species : selectedSpeciesProfiles) {
>         (species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>         species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>         !stp.systemTable2.equals("GeneOntology") }
> ) {
>
> Adding the for loop is fraught with syntax errors, but the real question is ... am I missing a much simpler solution to solve this?
>
> Richard
>
>
> On Sun, Sep 25, 2011 at 9:20 PM, Richard Brous <rbr...@gm...> wrote:
> Worked on the code during the past few days and have committed some changes tonight.
> Also successfully exported without error.
>
> UniProtDatabaseProfile.java
>
>   • clean up of some comments and old code
>   • cleaned up logging code for getSystemTableManager()
>   • Worked on getRelationshipTableManager()
>       • [Uniprot - X conditional]
>           • rewrote SQL programmatically
>           • used looping to create correct setStrings
>           • added logging to surface details
>       • [X - X conditional]
>           • rewrite of programmatic SQL is in progress (created stringbuilder etc. in preparation)
>           • added logging
>       • [Species - X or X - Species]
>           • added minimal logging to be expanded upon
>
> DatabaseProfile.java
>
>   • cleaned up comments
>   • reviewed getRelationsTableManager() and added some logging to it
>
> ExportToGenMAPP.java
>
>   • cleaned up code and comments for readability
>
> Appreciate any feedback as usual =D
>
> Richard
>
>
> On Mon, Sep 19, 2011 at 6:22 PM, John David N. Dionisio <do...@lm...> wrote:
> Very cool; looks like we can move on now. Dr. Dahlquist and I suspect that the relationship tables may not actually be as hard as they seem; just a matter of tweaking the initial queries (which you're more comfortable doing now!) so that they return the corresponding records for all of the requested species. Onward we go :)
>
> John David N. Dionisio, PhD
> Associate Professor, Computer Science
> Associate Director, University Honors Program
> Loyola Marymount University
>
>
> On Sep 19, 2011, at 11:09 AM, Kam Dahlquist wrote:
>
> > Hi,
> >
> > LMU can accept 20 MB attachments now; I don't know how big your file is zipped, but that's an option. You could also use LionShare.
> >
> > Glad to see success!
> >
> > Kam
> >
> > At 06:05 PM 9/18/2011, you wrote:
> >> Tried to post my latest gdb export to the biodb wiki but receiving log in error (per separate email)
> >>
> >> But, I wanted to let you know I verified that each improper system table does in fact contain the id and species name (samples of each):
> >>
> >>
> >> Pfam
> >> ID         Species                                   Date
> >> PF02866    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> PF07050    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> PF03479    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> PF02317    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> PF03279    |Pseudomonas aeruginosa|                  9/14/2011
> >> PF09922    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> PF04205    |Pseudomonas aeruginosa|                  9/14/2011
> >> PF03379    |Pseudomonas aeruginosa|                  9/14/2011
> >> PF05389    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >>
> >>
> >> RefSeq
> >> ID         Species                                   Date
> >> YP_039973  |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> YP_040411  |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> YP_039521  |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> NP_253798  |Pseudomonas aeruginosa|                  9/14/2011
> >> NP_254101  |Pseudomonas aeruginosa|                  9/14/2011
> >> NP_248930  |Pseudomonas aeruginosa|                  9/14/2011
> >> NP_251503  |Pseudomonas aeruginosa|                  9/14/2011
> >> NP_249960  |Pseudomonas aeruginosa|                  9/14/2011
> >> NP_253220  |Pseudomonas aeruginosa|                  9/14/2011
> >> YP_040877  |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> NP_253802  |Pseudomonas aeruginosa|                  9/14/2011
> >>
> >>
> >> GeneId
> >> ID         Species                                   Date
> >> 879140     |Pseudomonas aeruginosa|                  9/14/2011
> >> 2859243    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> 879337     |Pseudomonas aeruginosa|                  9/14/2011
> >> 2860668    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> 881899     |Pseudomonas aeruginosa|                  9/14/2011
> >> 882312     |Pseudomonas aeruginosa|                  9/14/2011
> >> 2860009    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> 881520     |Pseudomonas aeruginosa|                  9/14/2011
> >> 2861152    |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> 881596     |Pseudomonas aeruginosa|                  9/14/2011
> >>
> >>
> >> InterPro
> >> ID         Species                                   Date
> >> IPR022522  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR003538  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR016379  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR016920  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR000477  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR008948  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR005415  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR009651  |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> IPR007895  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR008231  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR004558  |Pseudomonas aeruginosa|                  9/14/2011
> >> IPR011067  |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> IPR006358  |Pseudomonas aeruginosa|                  9/14/2011
> >>
> >>
> >> PDB
> >> ID         Species                                   Date
> >> 2ZWS       |Pseudomonas aeruginosa|                  9/14/2011
> >> 2F9I       |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> 2EXV       |Pseudomonas aeruginosa|                  9/14/2011
> >> 3L34       |Pseudomonas aeruginosa|                  9/14/2011
> >> 1EZM       |Pseudomonas aeruginosa|                  9/14/2011
> >> 2WYB       |Pseudomonas aeruginosa|                  9/14/2011
> >> 2F1L       |Pseudomonas aeruginosa|                  9/14/2011
> >> 1D7L       |Pseudomonas aeruginosa|                  9/14/2011
> >> 1Y12       |Pseudomonas aeruginosa|                  9/14/2011
> >> 2IXH       |Pseudomonas aeruginosa|                  9/14/2011
> >> 1XAG       |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> 2IXI       |Pseudomonas aeruginosa|                  9/14/2011
> >>
> >>
> >> EMBL
> >> ID         Species                                   Date
> >> AJ003006   |Pseudomonas aeruginosa|                  9/14/2011
> >> BX571856   |Staphylococcus aureus (strain MRSA252)|  9/14/2011
> >> AJ633619   |Pseudomonas aeruginosa|                  9/14/2011
> >> AJ633602   |Pseudomonas aeruginosa|                  9/14/2011
> >> U07359     |Pseudomonas aeruginosa|                  9/14/2011
> >> X54201     |Pseudomonas aeruginosa|                  9/14/2011
> >> AY899300   |Pseudomonas aeruginosa|                  9/14/2011
> >> AB085582   |Pseudomonas aeruginosa|                  9/14/2011
> >> AB075926   |Pseudomonas aeruginosa|                  9/14/2011
> >> AF306766   |Pseudomonas aeruginosa|                  9/14/2011
> >> X99471     |Pseudomonas aeruginosa|                  9/14/2011
> >> M21093     |Pseudomonas aeruginosa|                  9/14/2011
> >>
> >>
> >> Richard
> >>
> >> On Wed, Sep 14, 2011 at 6:50 PM, Richard Brous <rbr...@gm...> wrote:
> >> OK, going to commit my changes to UniProtDatabaseProfile: getSystemTableManager()
> >>
> >> Here is some logging info to confirm result.next() within the while loop is both id and species name:
> >>
> >> 3070712 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR006314 Species:: Staphylococcus aureus (strain MRSA252)
> >> 3070712 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR013840 Species:: Staphylococcus aureus (strain MRSA252)
> >> 3070728 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR015887 Species:: Pseudomonas aeruginosa
> >> 3070728 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR016148 Species:: Pseudomonas aeruginosa
> >>
> >> Richard
> >>
> >> On Wed, Sep 14, 2011 at 3:20 PM, Richard Brous <rbr...@gm...> wrote:
> >> Solid progress made and I now have a compilable working copy.
> >>
> >> I'm now working through how to plug the results of the query into the proper slots under the while loop. I had stumbled on an error where I was supplying the column name instead of the actual data from the tuple. Solved that and am hoping the current export will be the last before I commit to sourceforge.
> >>
> >> Richard
> >>
> >>
> >> On Mon, Sep 12, 2011 at 10:59 PM, Richard Brous <rbr...@gm...> wrote:
> >> Spent the weekend reviewing sql and I have achieved some clarity.
> >>
> >> I'm still working through things but now at least I have better context and can ask intelligent questions.
> >>
> >> The sub query was a big help, I'm not sure how long it would have taken me to do all the joins using ON to return "hjid | species name"
> >>
> >> Dondi - I'll stop by after theory tomorrow to discuss further.
> >>
> >> Thanks.
> >>
> >> Richard
> >>
> >>
> >> On Thu, Sep 8, 2011 at 5:24 PM, John David N. Dionisio <do...@lm...> wrote:
> >> Hi Rich,
> >>
> >> As discussed in our meeting, here is a first step toward the new system table query:
> >>
> >> SELECT entrytype.hjid, organismnametype.value
> >> FROM entrytype
> >> INNER JOIN organismtype ON (entrytype.organism = organismtype.hjid)
> >> INNER JOIN organismnametype ON (organismtype.hjid = organismnametype.organismtype_name_hjid)
> >> INNER JOIN dbreferencetype ON (dbreferencetype.organismtype_dbreference_hjid = organismtype.hjid)
> >> WHERE dbreferencetype.type = 'NCBI Taxonomy' AND (id = '90371');
> >>
> >> (substitute the "id = " clause accordingly)
> >>
> >> John David N. Dionisio, PhD
> >> Associate Professor, Computer Science
> >> Associate Director, University Honors Program
> >> Loyola Marymount University
> >>
> >> _______________________________________________
> >> xmlpipedb-developer mailing list
> >> xml...@li...
> >> https://lists.sourceforge.net/lists/listinfo/xmlpipedb-developer
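
A note on the system table query from the Sep 8 message at the bottom of this thread: one way to "substitute the id clause accordingly" for several species at once is to generate an IN list with one placeholder per taxon ID. The following is only a sketch under assumptions, not the project's code. SystemTableQuerySketch and buildQuery are invented names, and the unqualified id column is kept exactly as it appears in the original query.

```java
import java.util.Collections;

class SystemTableQuerySketch {
    // Builds the Sep 8 query with "id IN (?, ?, ...)" in place of "id = '90371'".
    // taxonIdCount is the number of NCBI Taxonomy IDs that will be bound later.
    static String buildQuery(int taxonIdCount) {
        if (taxonIdCount < 1) {
            throw new IllegalArgumentException("at least one taxon ID is required");
        }
        // One "?" per taxon ID, comma-separated.
        String placeholders = String.join(", ", Collections.nCopies(taxonIdCount, "?"));
        return "SELECT entrytype.hjid, organismnametype.value "
                + "FROM entrytype "
                + "INNER JOIN organismtype ON (entrytype.organism = organismtype.hjid) "
                + "INNER JOIN organismnametype ON (organismtype.hjid = organismnametype.organismtype_name_hjid) "
                + "INNER JOIN dbreferencetype ON (dbreferencetype.organismtype_dbreference_hjid = organismtype.hjid) "
                + "WHERE dbreferencetype.type = 'NCBI Taxonomy' AND id IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // Two selected species means two placeholders.
        System.out.println(buildQuery(2));
    }
}
```

Each placeholder would then be bound in a loop with PreparedStatement.setString(position, taxonId), with positions starting at 1. Binding as strings is an assumption that follows the quoted literal '90371' in the original query.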