Re: [XMLPipeDB-developer] System table subquery
From: Richard B. <rbr...@gm...> - 2011-10-01 22:36:14
Reread my last line and of course was reminded to proof any quickly typed,
edited and rewritten sentences before hitting send lol
OLD AND CONFUSING SENTENCE: "But that said I have my head wrapped around the
solution and will go ahead and code it out and complete
getRelationshipTableManager() completely transitioned."
NEW IMPROVED SENTENCE: "But that said, I have my head wrapped around the
solution and will go ahead and complete the changes to
getRelationshipTableManager(). These last changes will complete the
transition from single species to a multiple species aware method."
rb
On Sat, Oct 1, 2011 at 3:29 PM, Richard Brous <rbr...@gm...> wrote:
> Wow thanks for the detailed reply and anticipating questions that might
> follow!
>
> Glad my understanding of what was going on was solid even though my nested
> "else if for" code didn't actually make semantic sense. I realized it was
> doing as you described but wanted to keep it in that form as a vehicle to
> clearly relay my thinking. And given my inexperience with object looping, I wasn't
> sure if there was a slick way to fool the if else loop into thinking the
> inner for loop was boolean.
>
> That leads to the transition of somehow combining my "else if for"
> conditional and the existing else code into a combined else:
>
> I didn't believe I had the option of dumping the existing else block since
> it was doing something already (which also may be referenced by other code)
> but I should have thought about combining them in the else block. I'm
> sitting here thinking: Duh, it should have occurred to me. =/
>
> But that said I have my head wrapped around the solution and will go ahead
> and code it out and complete getRelationshipTableManager() completely
> transitioned.
>
> Richard
>
>
> On Sat, Oct 1, 2011 at 1:14 PM, John David N. Dionisio <do...@lm...> wrote:
>
>> Hi Rich,
>>
>> First, let's phrase out in English what gets done here overall. Then,
>> still at the conceptual level, I'll talk about what needs to be done.
>> Finally, we'll look at what needs to change in the code. In that part, we
>> will also look at why you're getting syntax errors in your current working
>> version (assuming that what you included is exactly the code that you
>> currently have). At all points, be conscious of whether what I'm writing
>> matches your current understanding, or not. If something clears up for you,
>> then great; if not, let me know what isn't clear.
>>
>> 1. What is being done
>>
>> The method goes through the full list of relationship tables then,
>> depending on those tables, chooses different algorithms for producing their
>> data.
>>
>> The if conditions generally categorize these tables --- GeneOntology
>> relationships are handled one way; UniProt relationships are handled another
>> way; relationships that use other ID systems are handled yet another way.
>> The condition on which you're stuck involves relationships involving an ID
>> system that is *specific* to a particular species profile. In that
>> situation, the code essentially defers all work to the specific species
>> profile --- which makes sense, because, as a species-specific ID system, it
>> is fair to assume that only the species profile knows how to handle its
>> relationships to other ID systems.
>>
>> 2. What needs to be done, conceptually
>>
>> The change you are performing involves handling multiple species profiles.
>> In other words, for every species profile that you are exporting, you need
>> to give each of them an opportunity to handle the export to the currently
>> chosen relationship table. That involves first checking if the current
>> relationship table (e.g., "Blattner-GeneID") even applies to the species
>> profile. If the relationship table does not involve species-specific ID
>> systems, then that species profile effectively skips that relationship
>> table. Otherwise, it then builds up the ID pairs to be exported, via the
>> getSpeciesSpecificRelationshipTable method.
>>
>> 3. What needs to be done, in code
>>
>> So, given this view, let's look at the code:
>>
>> > else if ((selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>> >          selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>> >         !stp.systemTable2.equals("GeneOntology")) {
>>
>>
>> Note that, as written, this code checks *only one* species profile, and
>> that is after all prior cases (GeneOntology, UniProt, two different ones)
>> have already been checked. Your proposed change now is this:
>>
>> > else if(
>> >     for(SpeciesProfile species : selectedSpeciesProfiles) {
>> >         (species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>> >          species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>> >         !stp.systemTable2.equals("GeneOntology") }
>> > ) {
>>
>>
>> You got the loop here, as described in part 2. You are indeed supposed to
>> iterate through the selected species profiles, "ask" them if they need to do
>> any exports with the current relationship table, then have them do so if
>> they say "yes."
>>
>> Now, as to the problem with the actual code, look at how the process is
>> phrased in English --- you *iterate first*, and *then* check if the
>> relationship table matches. The source of your syntax error is that you are
>> putting the for statement inside the if condition. If you step back for a
>> moment, you'll see that this does not make semantic sense --- the if
>> condition expects an expression that evaluates to a boolean value. The
>> preface to a for statement is *not* such an expression --- in fact, it
>> doesn't evaluate to anything. That is the heart of your syntax error.
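
To illustrate the expression-versus-statement point: a loop can only feed an if condition indirectly, for example by moving it into a helper method whose call evaluates to a boolean. The following is a minimal sketch, not code from the project. SpeciesProfile here is a stripped-down stand-in, and anySpeciesUses and the table name "ExampleLocusTag" are hypothetical. Note that it only answers "does any selected profile match?", which is a different question from letting each matching profile do its own export.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BooleanHelperSketch {

    // Hypothetical stand-in for the project's SpeciesProfile; only the table lookup is modeled.
    static class SpeciesProfile {
        private final Map<String, String> speciesSpecificSystemTables = new HashMap<>();

        SpeciesProfile(String... systemTableNames) {
            for (String name : systemTableNames) {
                speciesSpecificSystemTables.put(name, name);
            }
        }

        Map<String, String> getSpeciesSpecificSystemTables() {
            return speciesSpecificSystemTables;
        }
    }

    // The for statement lives inside a method; the method call is a boolean expression.
    static boolean anySpeciesUses(List<SpeciesProfile> profiles,
                                  String systemTable1, String systemTable2) {
        if ("GeneOntology".equals(systemTable2)) {
            return false;
        }
        for (SpeciesProfile species : profiles) {
            Map<String, String> tables = species.getSpeciesSpecificSystemTables();
            if (tables.containsKey(systemTable1) || tables.containsKey(systemTable2)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<SpeciesProfile> selected = Arrays.asList(
                new SpeciesProfile("ExampleLocusTag"), new SpeciesProfile());
        // Legal: the condition is a method call, not a for statement.
        if (anySpeciesUses(selected, "ExampleLocusTag", "GeneID")) {
            System.out.println("at least one species profile uses this relationship table");
        }
    }
}
```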
>>
>> So, in reality, this last clause simply has to become a pure "else," since
>> the condition has to be applied to *every single species profile*. The for
>> loop can then take place within, and the if check happens *inside that*,
>> since it has to happen for each selected species profile:
>>
>> else {
>>     for (SpeciesProfile species : selectedSpeciesProfiles) {
>>         if ((species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>>              species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>>             !"GeneOntology".equals(stp.systemTable2)) {
>>
>>             // Have the species profile do the relationship table export.
>>
>>         }
>>     }
>> }
>>
>> That's the overall structure. Now, some housecleaning: there is already a
>> "pure else" clause at the bottom. However, if you look at that code, it
>> effectively does nothing: it just emits a single record with blanks for its
>> fields. I think you can safely skip that. Or, if you really do want to
>> make sure that an inapplicable relationship table does have at least this
>> dummy record, you can have a boolean in the code given above that indicates
>> whether or not the relationship table was handled:
>>
>> else {
>>     boolean relationshipTableWasHandled = false;
>>     /* Same for loop and if statement. */ {
>>         // This would be in the case that a species profile does perform an export.
>>         relationshipTableWasHandled = true;
>>     }
>>
>>     if (!relationshipTableWasHandled) {
>>         // Do the single-blank-row export here.
>>     }
>> }
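
The flag-based structure above can be sketched as a small self-contained program. This is only an illustration under stated assumptions: SpeciesProfile is a simplified stand-in, exportRows and its string rows are hypothetical placeholders for the real export work, and the species and table names ("species A", "TableA") are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HandledFlagSketch {

    // Hypothetical stand-in for the project's SpeciesProfile class.
    static class SpeciesProfile {
        final String name;
        final Set<String> speciesSpecificSystemTables;

        SpeciesProfile(String name, String... systemTableNames) {
            this.name = name;
            this.speciesSpecificSystemTables = new HashSet<>(Arrays.asList(systemTableNames));
        }
    }

    // Returns placeholder rows for one relationship table (systemTable1-systemTable2).
    static List<String> exportRows(List<SpeciesProfile> selectedSpeciesProfiles,
                                   String systemTable1, String systemTable2) {
        List<String> rows = new ArrayList<>();
        boolean relationshipTableWasHandled = false;

        // Iterate first, then check each species profile individually.
        for (SpeciesProfile species : selectedSpeciesProfiles) {
            if ((species.speciesSpecificSystemTables.contains(systemTable1)
                    || species.speciesSpecificSystemTables.contains(systemTable2))
                    && !"GeneOntology".equals(systemTable2)) {
                // Placeholder for the species profile's real export.
                rows.add(species.name + " exports " + systemTable1 + "-" + systemTable2);
                relationshipTableWasHandled = true;
            }
        }

        // No profile claimed the table: emit the single blank row.
        if (!relationshipTableWasHandled) {
            rows.add("");
        }
        return rows;
    }

    public static void main(String[] args) {
        List<SpeciesProfile> selected = Arrays.asList(
                new SpeciesProfile("species A", "TableA"),
                new SpeciesProfile("species B", "TableB"));
        System.out.println(exportRows(selected, "TableA", "GeneID"));
        System.out.println(exportRows(selected, "Pfam", "GeneID"));
    }
}
```

The flag is set inside the loop and read only after it, so the blank row appears at most once per relationship table, no matter how many species profiles were checked.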
>>
>> So, that's the run of it. Hope this walkthrough and breakdown clears up
>> the issues.
>>
>> John David N. Dionisio, PhD
>> Associate Professor, Computer Science
>> Associate Director, University Honors Program
>> Loyola Marymount University
>>
>>
>> On Oct 1, 2011, at 11:34 AM, Richard Brous wrote:
>>
>> > I'm hung up on the last else if conditional within getRelationshipTableManager()
>> >
>> > This is the "Species-X or X-Species" conditional, excluding GeneOntology
>> >
>> > The original single species conditional is:
>> >
>> > else if ((selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>> >          selectedSpeciesProfiles.get(0).getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>> >         !stp.systemTable2.equals("GeneOntology")) {
>> >
>> > Now it needs to be multispecies aware obviously so...
>> >
>> > Can I add a for loop within the else if and change the following tablemanager creation logic, such as:
>> >
>> > else if(
>> >     for(SpeciesProfile species : selectedSpeciesProfiles) {
>> >         (species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable1) ||
>> >          species.getSpeciesSpecificSystemTables().containsKey(stp.systemTable2)) &&
>> >         !stp.systemTable2.equals("GeneOntology") }
>> > ) {
>> >
>> >
>> > adding the for loop is fraught with syntax errors, but the real question is ... am I missing a much simpler solution to solve this?
>> >
>> > Richard
>> >
>> >
>> > On Sun, Sep 25, 2011 at 9:20 PM, Richard Brous <rbr...@gm...> wrote:
>> > Worked on the code during the past few days and have committed some changes tonight.
>> > Also successfully exported without error.
>> >
>> > UniProtDatabaseProfile.java
>> >
>> > • clean up of some comments and old code
>> > • cleaned up logging code for getSystemTableManager()
>> > • Worked on getRelationshipTableManager()
>> >     • [UniProt - X conditional]
>> >         • rewrote SQL programmatically
>> >         • used looping to create correct setStrings
>> >         • added logging to surface details
>> >     • [X - X conditional]
>> >         • rewrite of programmatic SQL is in progress (created StringBuilder etc. in preparation)
>> >         • added logging
>> >     • [Species - X or X - Species]
>> >         • added minimal logging to be expanded upon
>> >
>> > DatabaseProfile.java
>> >
>> > • cleaned up comments
>> > • reviewed getRelationsTableManager() and added some logging to it
>> >
>> > ExportToGenMAPP.java
>> >
>> > • cleaned up code and comments for readability
>> >
>> >
>> >
>> > Appreciate any feedback as usual =D
>> >
>> > Richard
>> >
>> >
>> > On Mon, Sep 19, 2011 at 6:22 PM, John David N. Dionisio <do...@lm...> wrote:
>> > Very cool; looks like we can move on now. Dr. Dahlquist and I suspect that the relationship tables may not actually be as hard as they seem; just a matter of tweaking the initial queries (which you're more comfortable doing now!) so that they return the corresponding records for all of the requested species. Onward we go :)
>> >
>> > John David N. Dionisio, PhD
>> > Associate Professor, Computer Science
>> > Associate Director, University Honors Program
>> > Loyola Marymount University
>> >
>> >
>> >
>> > On Sep 19, 2011, at 11:09 AM, Kam Dahlquist wrote:
>> >
>> > > Hi,
>> > >
>> > > LMU can accept 20 MB attachments now; I don't know how big your file is zipped, but that's an option. You could also use LionShare.
>> > >
>> > > Glad to see success!
>> > >
>> > > Kam
>> > >
>> > > At 06:05 PM 9/18/2011, you wrote:
>> > >> Tried to post my latest gdb export to the biodb wiki but receiving log in error (per separate email)
>> > >>
>> > >> But, I wanted to let you know I verified that each improper system table does in fact contain the id and species name (samples of each):
>> > >>
>> > >>
>> > >> Pfam
>> > >> ID Species Date
>> > >> PF02866 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> PF07050 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> PF03479 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> PF02317 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> PF03279 |Pseudomonas aeruginosa| 9/14/2011
>> > >> PF09922 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> PF04205 |Pseudomonas aeruginosa| 9/14/2011
>> > >> PF03379 |Pseudomonas aeruginosa| 9/14/2011
>> > >> PF05389 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >>
>> > >>
>> > >> RefSeq
>> > >> ID Species Date
>> > >> YP_039973 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> YP_040411 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> YP_039521 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> NP_253798 |Pseudomonas aeruginosa| 9/14/2011
>> > >> NP_254101 |Pseudomonas aeruginosa| 9/14/2011
>> > >> NP_248930 |Pseudomonas aeruginosa| 9/14/2011
>> > >> NP_251503 |Pseudomonas aeruginosa| 9/14/2011
>> > >> NP_249960 |Pseudomonas aeruginosa| 9/14/2011
>> > >> NP_253220 |Pseudomonas aeruginosa| 9/14/2011
>> > >> YP_040877 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> NP_253802 |Pseudomonas aeruginosa| 9/14/2011
>> > >>
>> > >>
>> > >> GeneId
>> > >> ID Species Date
>> > >> 879140 |Pseudomonas aeruginosa| 9/14/2011
>> > >> 2859243 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> 879337 |Pseudomonas aeruginosa| 9/14/2011
>> > >> 2860668 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> 881899 |Pseudomonas aeruginosa| 9/14/2011
>> > >> 882312 |Pseudomonas aeruginosa| 9/14/2011
>> > >> 2860009 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> 881520 |Pseudomonas aeruginosa| 9/14/2011
>> > >> 2861152 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> 881596 |Pseudomonas aeruginosa| 9/14/2011
>> > >>
>> > >>
>> > >> InterPro
>> > >> ID Species Date
>> > >> IPR022522 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR003538 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR016379 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR016920 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR000477 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR008948 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR005415 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR009651 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> IPR007895 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR008231 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR004558 |Pseudomonas aeruginosa| 9/14/2011
>> > >> IPR011067 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> IPR006358 |Pseudomonas aeruginosa| 9/14/2011
>> > >>
>> > >>
>> > >> PDB
>> > >> ID Species Date
>> > >> 2ZWS |Pseudomonas aeruginosa| 9/14/2011
>> > >> 2F9I |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> 2EXV |Pseudomonas aeruginosa| 9/14/2011
>> > >> 3L34 |Pseudomonas aeruginosa| 9/14/2011
>> > >> 1EZM |Pseudomonas aeruginosa| 9/14/2011
>> > >> 2WYB |Pseudomonas aeruginosa| 9/14/2011
>> > >> 2F1L |Pseudomonas aeruginosa| 9/14/2011
>> > >> 1D7L |Pseudomonas aeruginosa| 9/14/2011
>> > >> 1Y12 |Pseudomonas aeruginosa| 9/14/2011
>> > >> 2IXH |Pseudomonas aeruginosa| 9/14/2011
>> > >> 1XAG |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> 2IXI |Pseudomonas aeruginosa| 9/14/2011
>> > >>
>> > >>
>> > >> EMBL
>> > >> ID Species Date
>> > >> AJ003006 |Pseudomonas aeruginosa| 9/14/2011
>> > >> BX571856 |Staphylococcus aureus (strain MRSA252)| 9/14/2011
>> > >> AJ633619 |Pseudomonas aeruginosa| 9/14/2011
>> > >> AJ633602 |Pseudomonas aeruginosa| 9/14/2011
>> > >> U07359 |Pseudomonas aeruginosa| 9/14/2011
>> > >> X54201 |Pseudomonas aeruginosa| 9/14/2011
>> > >> AY899300 |Pseudomonas aeruginosa| 9/14/2011
>> > >> AB085582 |Pseudomonas aeruginosa| 9/14/2011
>> > >> AB075926 |Pseudomonas aeruginosa| 9/14/2011
>> > >> AF306766 |Pseudomonas aeruginosa| 9/14/2011
>> > >> X99471 |Pseudomonas aeruginosa| 9/14/2011
>> > >> M21093 |Pseudomonas aeruginosa| 9/14/2011
>> > >>
>> > >>
>> > >> Richard
>> > >>
>> > >> On Wed, Sep 14, 2011 at 6:50 PM, Richard Brous <rbr...@gm...> wrote:
>> > >> OK, going to commit my changes to UniProtDatabaseProfile: getSystemTableManager()
>> > >>
>> > >> Here is some logging info to confirm result.next() within the while loop is both id and species name:
>> > >>
>> > >> 3070712 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR006314 Species:: Staphylococcus aureus (strain MRSA252)
>> > >> 3070712 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR013840 Species:: Staphylococcus aureus (strain MRSA252)
>> > >> 3070728 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR015887 Species:: Pseudomonas aeruginosa
>> > >> 3070728 [Thread-4] INFO edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtDatabaseProfile - getSystemTableManager(): while loop: ID:: IPR016148 Species:: Pseudomonas aeruginosa
>> > >>
>> > >> Richard
>> > >>
>> > >> On Wed, Sep 14, 2011 at 3:20 PM, Richard Brous <rbr...@gm...> wrote:
>> > >> Solid progress made and I now have a compilable working copy.
>> > >>
>> > >> I'm now working through how to plug the results of the query into the proper slots under the while loop. I had stumbled on an error where I was supplying the column name instead of the actual data from the tuple. Solved that and am hoping the current export will be the last before I commit to sourceforge.
>> > >>
>> > >> Richard
>> > >>
>> > >>
>> > >> On Mon, Sep 12, 2011 at 10:59 PM, Richard Brous <rbr...@gm...> wrote:
>> > >> Spent the weekend reviewing sql and I have achieved some clarity.
>> > >>
>> > >> I'm still working through things but now at least I have better context and can ask intelligent questions.
>> > >>
>> > >> The sub query was a big help, I'm not sure how long it would have taken me to do all the joins using ON to return "hjid | species name"
>> > >>
>> > >> Dondi - I'll stop by after theory tomorrow to discuss further.
>> > >>
>> > >> Thanks.
>> > >>
>> > >> Richard
>> > >>
>> > >>
>> > >> On Thu, Sep 8, 2011 at 5:24 PM, John David N. Dionisio <do...@lm...> wrote:
>> > >> Hi Rich,
>> > >>
>> > >> As discussed in our meeting, here is a first step toward the new system table query:
>> > >>
>> > >> SELECT entrytype.hjid, organismnametype.value
>> > >> FROM entrytype
>> > >> INNER JOIN organismtype ON (entrytype.organism = organismtype.hjid)
>> > >> INNER JOIN organismnametype ON (organismtype.hjid = organismnametype.organismtype_name_hjid)
>> > >> INNER JOIN dbreferencetype ON (dbreferencetype.organismtype_dbreference_hjid = organismtype.hjid)
>> > >> WHERE dbreferencetype.type = 'NCBI Taxonomy' AND (id = '90371');
>> > >>
>> > >> (substitute the "id = " clause accordingly)
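
One possible way to substitute the "id = " clause for several species at once is to generate an IN list with one placeholder per taxon ID. The sketch below is an assumption, not the query the project ended up with: buildQuery and the class name are hypothetical, and the thread does not show the final multi-species SQL. It reuses the joins from the query above unchanged.

```java
class TaxonQuerySketch {

    // Joins copied from the single-species query; only the final id clause differs.
    static final String BASE_QUERY =
            "SELECT entrytype.hjid, organismnametype.value FROM entrytype "
            + "INNER JOIN organismtype ON (entrytype.organism = organismtype.hjid) "
            + "INNER JOIN organismnametype ON (organismtype.hjid = organismnametype.organismtype_name_hjid) "
            + "INNER JOIN dbreferencetype ON (dbreferencetype.organismtype_dbreference_hjid = organismtype.hjid) "
            + "WHERE dbreferencetype.type = 'NCBI Taxonomy' AND ";

    // Hypothetical helper: appends "id IN (?, ?, ...)" with one placeholder per taxon ID.
    static String buildQuery(int taxonCount) {
        if (taxonCount < 1) {
            throw new IllegalArgumentException("at least one taxon ID is required");
        }
        StringBuilder sql = new StringBuilder(BASE_QUERY).append("id IN (");
        for (int i = 0; i < taxonCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append("?");
        }
        return sql.append(")").toString();
    }

    public static void main(String[] args) {
        // With JDBC, each placeholder would then be bound via PreparedStatement.setString(index, taxonId).
        System.out.println(buildQuery(2));
    }
}
```

Placeholders keep the taxon IDs out of the SQL text itself, so the same builder works for any number of selected species.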
>> > >>
>> > >> John David N. Dionisio, PhD
>> > >> Associate Professor, Computer Science
>> > >> Associate Director, University Honors Program
>> > >> Loyola Marymount University
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> _______________________________________________
>> > >> xmlpipedb-developer mailing list
>> > >> xml...@li...
>> > >> https://lists.sourceforge.net/lists/listinfo/xmlpipedb-developer
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>>
>
>