Re: [XMLPipeDB-developer] System table subquery
From: Richard B. <rbr...@gm...> - 2011-10-01 22:29:09
Wow, thanks for the detailed reply and for anticipating the questions that might follow!

Glad my understanding of what was going on was solid, even though my nested "else if for" code didn't actually make semantic sense. I realized it was doing as you described but wanted to keep it in that form as a vehicle to clearly relay my thinking. And given my inexperience with looping over objects, I wasn't sure if there was a slick way to fool the if/else into treating the inner for loop as a boolean.

That leads to combining my "else if for" conditional and the existing else code into a single else. I didn't believe I had the option of dumping the existing else block, since it was already doing something (which other code might also reference), but I should have thought about combining them in the else block.

I'm sitting here thinking: Duh, it should have occurred to me. =/

That said, I have my head wrapped around the solution and will go ahead and code it out so that getRelationshipTableManager() is completely transitioned.

Richard

On Sat, Oct 1, 2011 at 1:14 PM, John David N. Dionisio <do...@lm...> wrote:
> Hi Rich,
>
> First, let's phrase out in English what gets done here overall. Then,
> still at the conceptual level, I'll talk about what needs to be done.
> Finally, we'll look at what needs to change in the code. In that part, we
> will also look at why you're getting syntax errors in your current working
> version (assuming that what you included is exactly the code that you
> currently have). At all points, be conscious of whether what I'm writing
> matches your current understanding, or not. If something clears up for you,
> then great; if not, let me know what isn't clear.
>
> 1. What is being done
>
> The method goes through the full list of relationship tables then,
> depending on those tables, chooses different algorithms for producing their
> data.
>
> The if conditions generally categorize these tables --- GeneOntology
> relationships are handled one way; UniProt relationships are handled another
> way; relationships that use other ID systems are handled yet another way.
> The condition on which you're stuck involves relationships involving an ID
> system that is *specific* to a particular species profile. In that
> situation, the code essentially defers all work to the specific species
> profile --- which makes sense, because, as a species-specific ID system, it
> is fair to assume that only the species profile knows how to handle its
> relationships to other ID systems.
>
> 2. What needs to be done, conceptually
>
> The change you are performing involves handling multiple species profiles.
> In other words, for every species profile that you are exporting, you need
> to give each of them an opportunity to handle the export to the currently
> chosen relationship table. That involves first checking if the current
> relationship table (e.g., "Blattner-GeneID") even applies to the species
> profile. If the relationship table does not involve species-specific ID
> systems, then that species profile effectively skips that relationship
> table. Otherwise, it then builds up the ID pairs to be exported, via the
> getSpeciesSpecificRelationshipTable method.
>
> 3. What needs to be done, in code
>
> So, given this view, let's look at the code:
>
> > else if ((selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
> >          selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
> >          !stp.systemTable2.equals("GeneOntology")) {
>
> Note that, as written, this code checks *only one* species profile, and
> that is after all prior cases (GeneOntology, UniProt, two different ones)
> have already been checked.
> Your proposed change now is this:
>
> > else if(
> >     for(SpeciesProfile species : selectedSpeciesProfiles) {
> >         (species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
> >          species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
> >         !stp.systemTable2.equals("GeneOntology") }
> > ) {
>
> You got the loop here, as described in part 2. You are indeed supposed to
> iterate through the selected species profiles, "ask" them if they need to do
> any exports with the current relationship table, then have them do so if
> they say "yes."
>
> Now, as to the problem with the actual code, look at how the process is
> phrased in English --- you *iterate first*, and *then* check if the
> relationship table matches. The source of your syntax error is that you are
> putting the for statement inside the if condition. If you step back for a
> moment, you'll see that this does not make semantic sense --- the if
> condition expects an expression that evaluates to a boolean value. The
> preface to a for statement is *not* such an expression --- in fact, it
> doesn't evaluate to anything. That is the heart of your syntax error.
>
> So, in reality, this last clause simply has to become a pure "else," since
> the condition has to be applied to *every single species profile*. The for
> loop can then take place within, and the if check happens *inside that*,
> since it has to happen for each selected species profile:
>
>     else {
>         for (SpeciesProfile species: selectedSpeciesProfiles) {
>             if ((species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>                     species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>                     !"GeneOntology".equals(stp.systemTable2)) {
>
>                 // Have the species profile do the relationship table export.
>
>             }
>         }
>     }
>
> That's the overall structure. Now, some housecleaning: there is already a
> "pure else" clause at the bottom.
> However, if you look at that code, it
> effectively does nothing: it just emits a single record with blanks for its
> fields. I think you can safely skip that. Or, if you really do want to
> make sure that an inapplicable relationship table does have at least this
> dummy record, you can have a boolean in the code given above that indicates
> whether or not the relationship table was handled:
>
>     else {
>         boolean relationshipTableWasHandled = false;
>         /* Same for loop and if statement. */ {
>             // This would be in the case that a species profile does perform an export.
>             relationshipTableWasHandled = true;
>         }
>
>         if (!relationshipTableWasHandled) {
>             // Do the single-blank-row export here.
>         }
>     }
>
> So, that's the run of it. Hope this walkthrough and breakdown clears up
> the issues.
>
> John David N. Dionisio, PhD
> Associate Professor, Computer Science
> Associate Director, University Honors Program
> Loyola Marymount University
>
>
> On Oct 1, 2011, at 11:34 AM, Richard Brous wrote:
>
> > I'm hung up on the last else if conditional within getRelationshipTableManager()
> >
> > This is the "Species-X or X-Species" conditional, excluding GeneOntology
> >
> > The original single species conditional is:
> >
> > else if ((selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
> >          selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
> >          !stp.systemTable2.equals("GeneOntology")) {
> >
> > Now it needs to be multispecies aware obviously so...
> >
> > Can I add a for loop within the else if and change the following tablemanager creation logic, such as:
> >
> > else if(
> >     for(SpeciesProfile species : selectedSpeciesProfiles) {
> >         (species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
> >          species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
> >         !stp.systemTable2.equals("GeneOntology") }
> > ) {
> >
> >
> > Adding the for loop is fraught with syntax errors, but the real question is ... am I missing a much simpler solution to solve this?
> >
> > Richard
> >
> >
> > On Sun, Sep 25, 2011 at 9:20 PM, Richard Brous <rbr...@gm...> wrote:
> > Worked on the code during the past few days and have committed some changes tonight.
> > Also successfully exported without error.
> >
> > UniProtDatabaseProfile.java
> >
> > • clean up of some comments and old code
> > • cleaned up logging code for getSystemTableManager()
> > • Worked on getRelationshipTableManager()
> >     • [Uniprot - X conditional]
> >         • rewrote SQL programmatically
> >         • used looping to create correct setStrings
> >         • added logging to surface details
> >     • -[X - X conditional]
> >         • rewrite of programmatic SQL is in progress (created stringbuilder etc. in preparation)
> >         • added logging
> >     • -[Species - X or X - Species]
> >         • added minimal logging to be expanded upon
> >
> > DatabaseProfile.java
> >
> > • cleaned up comments
> > • reviewed getRelationsTableManager() and added some logging to it
> >
> > ExportToGenMAPP.java
> >
> > • cleaned up code and comments for readability
> >
> >
> >
> > Appreciate any feedback as usual =D
> >
> > Richard
> >
> >
> > On Mon, Sep 19, 2011 at 6:22 PM, John David N. Dionisio <do...@lm...> wrote:
> > Very cool; looks like we can move on now. Dr. Dahlquist and I suspect that the relationship tables may not actually be as hard as they seem; just a matter of tweaking the initial queries (which you're more comfortable doing now!) so that they return the corresponding records for all of the requested species. Onward we go :)
> >
> > John David N. Dionisio, PhD
> > Associate Professor, Computer Science
> > Associate Director, University Honors Program
> > Loyola Marymount University
> >
> >
> >
> > On Sep 19, 2011, at 11:09 AM, Kam Dahlquist wrote:
> >
> > > Hi,
> > >
> > > LMU can accept 20 MB attachments now; I don't know how big your file is zipped, but that's an option. You could also use LionShare.
> > >
> > > Glad to see success!
> > >
> > > Kam
> > >
> > > At 06:05 PM 9/18/2011, you wrote:
> > >> Tried to post my latest gdb export to the biodb wiki but am receiving a login error (per separate email)
> > >>
> > >> But, I wanted to let you know I verified that each improper system table does in fact contain the id and species name (samples of each):
> > >>
> > >>
> > >> Pfam
> > >> ID Species Date
> > >> PF02866 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> PF07050 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> PF03479 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> PF02317 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> PF03279 |Pseudomonas aeruginosa| 9/14/2011
> > >> PF09922 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> PF04205 |Pseudomonas aeruginosa| 9/14/2011
> > >> PF03379 |Pseudomonas aeruginosa| 9/14/2011
> > >> PF05389 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >>
> > >>
> > >> RefSeq
> > >> ID Species Date
> > >> YP_039973 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> YP_040411 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> YP_039521 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> NP_253798 |Pseudomonas aeruginosa| 9/14/2011
> > >> NP_254101 |Pseudomonas aeruginosa| 9/14/2011
> > >> NP_248930 |Pseudomonas aeruginosa| 9/14/2011
> > >> NP_251503 |Pseudomonas aeruginosa| 9/14/2011
> > >> NP_249960 |Pseudomonas aeruginosa| 9/14/2011
> > >> NP_253220 |Pseudomonas aeruginosa| 9/14/2011
> > >> YP_040877 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> NP_253802 |Pseudomonas aeruginosa| 9/14/2011
> > >>
> > >>
> > >> GeneId
> > >> ID Species Date
> > >> 879140 |Pseudomonas aeruginosa| 9/14/2011
> > >> 2859243 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> 879337 |Pseudomonas aeruginosa| 9/14/2011
> > >> 2860668 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> 881899 |Pseudomonas aeruginosa| 9/14/2011
> > >> 882312 |Pseudomonas aeruginosa| 9/14/2011
> > >> 2860009 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> 881520 |Pseudomonas aeruginosa| 9/14/2011
> > >> 2861152 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> 881596 |Pseudomonas aeruginosa| 9/14/2011
> > >>
> > >>
> > >> InterPro
> > >> ID Species Date
> > >> IPR022522 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR003538 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR016379 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR016920 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR000477 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR008948 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR005415 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR009651 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> IPR007895 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR008231 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR004558 |Pseudomonas aeruginosa| 9/14/2011
> > >> IPR011067 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> IPR006358 |Pseudomonas aeruginosa| 9/14/2011
> > >>
> > >>
> > >> PDB
> > >> ID Species Date
> > >> 2ZWS |Pseudomonas aeruginosa| 9/14/2011
> > >> 2F9I |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> 2EXV |Pseudomonas aeruginosa| 9/14/2011
> > >> 3L34 |Pseudomonas aeruginosa| 9/14/2011
> > >> 1EZM |Pseudomonas aeruginosa| 9/14/2011
> > >> 2WYB |Pseudomonas aeruginosa| 9/14/2011
> > >> 2F1L |Pseudomonas aeruginosa| 9/14/2011
> > >> 1D7L |Pseudomonas aeruginosa| 9/14/2011
> > >> 1Y12 |Pseudomonas aeruginosa| 9/14/2011
> > >> 2IXH |Pseudomonas aeruginosa| 9/14/2011
> > >> 1XAG |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> 2IXI |Pseudomonas aeruginosa| 9/14/2011
> > >>
> > >>
> > >> EMBL
> > >> ID Species Date
> > >> AJ003006 |Pseudomonas aeruginosa| 9/14/2011
> > >> BX571856 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
> > >> AJ633619 |Pseudomonas aeruginosa| 9/14/2011
> > >> AJ633602 |Pseudomonas aeruginosa| 9/14/2011
> > >> U07359 |Pseudomonas aeruginosa| 9/14/2011
> > >> X54201 |Pseudomonas aeruginosa| 9/14/2011
> > >> AY899300 |Pseudomonas aeruginosa| 9/14/2011
> > >> AB085582 |Pseudomonas aeruginosa| 9/14/2011
> > >> AB075926 |Pseudomonas aeruginosa| 9/14/2011
> > >> AF306766 |Pseudomonas aeruginosa| 9/14/2011
> > >> X99471 |Pseudomonas aeruginosa| 9/14/2011
> > >> M21093 |Pseudomonas aeruginosa| 9/14/2011
> > >>
> > >>
> > >> Richard
> > >>
> > >> On Wed, Sep 14, 2011 at 6:50 PM, Richard Brous <rbr...@gm...> wrote:
> > >> OK, going to commit my changes to UniProtDatabaseProfile: getSystemTableManager()
> > >>
> > >> Here is some logging info to confirm result.next() within the while loop is both id and species name:
> > >>
> > >> 3070712 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR006314 Species:: Staphylococcus aureus (strain MRSA252)
> > >> 3070712 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR013840 Species:: Staphylococcus aureus (strain MRSA252)
> > >> 3070728 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR015887 Species:: Pseudomonas aeruginosa
> > >> 3070728 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR016148 Species:: Pseudomonas aeruginosa
> > >>
> > >> Richard
> > >>
> > >> On Wed, Sep 14, 2011 at 3:20 PM, Richard Brous <rbr...@gm...> wrote:
> > >> Solid progress made and I now have a compilable working copy.
> > >>
> > >> I'm now working through how to plug the results of the query into the proper slots under the while loop. I had stumbled on an error where I was supplying the column name instead of the actual data from the tuple. Solved that and am hoping the current export will be the last before I commit to sourceforge.
> > >>
> > >> Richard
> > >>
> > >>
> > >> On Mon, Sep 12, 2011 at 10:59 PM, Richard Brous <rbr...@gm...> wrote:
> > >> Spent the weekend reviewing sql and I have achieved some clarity.
> > >>
> > >> I'm still working through things but now at least I have better context and can ask intelligent questions.
> > >>
> > >> The sub query was a big help, I'm not sure how long it would have taken me to do all the joins using ON to return "hjid | species name"
> > >>
> > >> Dondi - I'll stop by after theory tomorrow to discuss further.
> > >>
> > >> Thanks.
> > >>
> > >> Richard
> > >>
> > >>
> > >> On Thu, Sep 8, 2011 at 5:24 PM, John David N. Dionisio <do...@lm...> wrote:
> > >> Hi Rich,
> > >>
> > >> As discussed in our meeting, here is a first step toward the new system table query:
> > >>
> > >>     SELECT entrytype.hjid, organismnametype.value
> > >>     FROM entrytype
> > >>     INNER JOIN organismtype ON (entrytype.organism = organismtype.hjid)
> > >>     INNER JOIN organismnametype ON (organismtype.hjid = organismnametype.organismtype_name_hjid)
> > >>     INNER JOIN dbreferencetype ON (dbreferencetype.organismtype_dbreference_hjid = organismtype.hjid)
> > >>     WHERE dbreferencetype.type = 'NCBI Taxonomy' AND (id = '90371');
> > >>
> > >> (substitute the "id = " clause accordingly)
> > >>
> > >> John David N. Dionisio, PhD
> > >> Associate Professor, Computer Science
> > >> Associate Director, University Honors Program
> > >> Loyola Marymount University
> > >>
> > >> _______________________________________________
> > >> xmlpipedb-developer mailing list
> > >> xml...@li...
> > >> https://lists.sourceforge.net/lists/listinfo/xmlpipedb-developer
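For anyone reading the archive later, the structure discussed at the top of this thread (a plain else, a for loop over the selected species profiles, the if check inside the loop, and a flag recording whether any profile handled the table) can be sketched as a small standalone program. This is only an illustration under assumptions: the SpeciesProfile and SystemTablePair classes below are hypothetical stand-ins rather than the project's real classes, the profile names are made up, and the export step is replaced by adding a string to a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RelationshipExportSketch {

    /** Hypothetical stand-in for a species profile; not the project's class. */
    static class SpeciesProfile {
        final String name;
        private final Map<String, String> speciesSpecificSystemTables = new HashMap<>();

        SpeciesProfile(String name, String... systemTables) {
            this.name = name;
            for (String table : systemTables) {
                speciesSpecificSystemTables.put(table, table);
            }
        }

        Map<String, String> getSpeciesSpecificSystemTables() {
            return speciesSpecificSystemTables;
        }
    }

    /** Hypothetical stand-in for the "stp" object referenced in the thread. */
    static class SystemTablePair {
        final String systemTable1;
        final String systemTable2;

        SystemTablePair(String systemTable1, String systemTable2) {
            this.systemTable1 = systemTable1;
            this.systemTable2 = systemTable2;
        }
    }

    /** The per-species condition from the thread, as a boolean expression. */
    static boolean handles(SpeciesProfile species, SystemTablePair stp) {
        Map<String, String> tables = species.getSpeciesSpecificSystemTables();
        return (tables.containsKey(stp.systemTable1) || tables.containsKey(stp.systemTable2))
                && !"GeneOntology".equals(stp.systemTable2);
    }

    /** Iterate first, then check; fall back to a blank record if nobody handled it. */
    static List<String> export(List<SpeciesProfile> selectedSpeciesProfiles, SystemTablePair stp) {
        List<String> records = new ArrayList<>();
        boolean relationshipTableWasHandled = false;
        for (SpeciesProfile species : selectedSpeciesProfiles) {
            if (handles(species, stp)) {
                // Placeholder for the real species-specific export work.
                records.add(species.name + ": " + stp.systemTable1 + "-" + stp.systemTable2);
                relationshipTableWasHandled = true;
            }
        }
        if (!relationshipTableWasHandled) {
            // Placeholder for the single-blank-row export.
            records.add("(blank)");
        }
        return records;
    }

    public static void main(String[] args) {
        List<SpeciesProfile> profiles = Arrays.asList(
                new SpeciesProfile("ProfileA", "Blattner"),
                new SpeciesProfile("ProfileB"));
        System.out.println(export(profiles, new SystemTablePair("Blattner", "GeneID")));
        System.out.println(export(profiles, new SystemTablePair("Pfam", "GeneID")));
    }
}
```

Pulling the condition into a method that returns boolean is also the usual answer to the "can the if treat a loop as a boolean" question: an if condition needs an expression, and a method call is one, while a for statement is not.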