From: Chris S. <sto...@pc...> - 2006-02-07 02:39:10
Hi Josef,

AlgorithmInvocation has a comment_string attribute that can be used to hold your convenient name. AnalysisAlgorithm can be used to tie that to Similarity rows through the softlinks table_id, row_id. Note that there is also a record-keeping attribute, row_alg_invocation_id, that could be used as well; this typically stores a record of the plugin used to load the data.

The semantics of an ExternalDatabaseRelease have been broadened to include data files, so that would be OK too if you wanted to record details of who, when, where, what, etc. in a more structured way. What you would want, I guess, is a linking table that says these similarities came from this external database release. Evidence could be used (target = similarity; fact = external database release).

We are looking into altering Similarity or adding a table to better capture alignments (an attribute to indicate gaps). We can also consider providing a link to external database release as part of this if it makes sense.

Cheers.
Chris

On Feb 6, 2006, at 12:13 PM, Josef Jurek wrote:

> Any thoughts on this?
>
> Does implementing a "SRes.ExternalDatabaseRelease"-like way to group
> rows in DoTS.Similarity together sound useful to anybody,
> or do people find the use of Core.AnalysisAlgorithm to accomplish
> such a thing completely satisfactory?
>
> Thanks, Josef
>
> Daphne Preuss Laboratory
> Molecular Genetics and Cell Biology
> The University of Chicago
> ju...@cs...
>
> voice: (773) 834-3985
> fax: (773) 702-6648
>
> I wrote:
>>
>> In the recent past, some of us needed a way to
>> distinguish between blast results in DoTS.Similarity by
>> the blast parameters used. For example, I might
>> blast the same sets of sequences several times
>> with different parameters and put all the results
>> of all these blast searches into DoTS.Similarity.
>>
>> I was, of course, able to jury-rig a way to do
>> this, though now with GUS 3.5, an officially
>> sanctioned method has been implemented
>> by using the Core.AnalysisAlgorithm table.
>>
>> Below is a crude entity relationship diagram
>> of how Core.AnalysisAlgorithm fits in with other
>> relevant tables (be sure to view this file with
>> the courier font):
>>
>>                   Core.Algorithm (name)
>>                             |
>>                             |
>>                Core.AlgorithmImplementation     Core.AlgorithmParamKeyType
>>       |                     |                \               /  (string, float, int, ...)
>>       |                     |                  \           /
>> Core.TableInfo   Core.AlgorithmInvocation   Core.AlgorithmParamKey
>>       |     \               |         \               /  (description of parameter)
>>       |       \             |           \           /
>> DoTS.Similarity  Core.AnalysisAlgorithm   Core.AlgorithmParam
>>                 (parameters as a string)  (individual parameter)
>>
>> And so, to insert into GUS a list of blast parameters such as:
>>
>> -p blastp -FD -W2 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM62 -b 1000000 -v 1000000
>>
>> one would need to:
>>
>> have a row for every flag in Core.AlgorithmParamKey
>> have a row for every value after a flag in Core.AlgorithmParam
>>
>> which is very complicated to both insert and query. I suppose if one
>> wrote a GUS plugin which is a wrapper around blast, inserting
>> parameters into all of these rows/fields could be easily taken care
>> of, though we at the Preuss lab just are not going to do things
>> that way. We may get blast data from a collaborator that took
>> days to run on a multi-node cluster and we just want to dump
>> this data into DoTS.Similarity/DoTS.SimilaritySpan and query
>> it. We can't run this blast search again with a plugin.
>>
>> And again, querying blast results by blast parameter between the
>> same set of sequences looks to be very complex with the
>> above tables. Imagine writing SQL trying to distinguish between
>> blast results based on these three sets of parameters.
>>
>> -p blastp -FD -W2 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM62 -b 1000000 -v 1000000
>> -p blastp -FD -W3 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM62 -b 1000000 -v 1000000
>> -p blastp -FD -W2 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM80 -b 1000000 -v 1000000
>>
>> Perhaps people on the list can let me know if there are
>> any labs outside of CBIL that are depositing and querying
>> blast search parameters with the above tables.
>>
>> What we at the Preuss lab really need is a simple way
>> to group rows in DoTS.Similarity together, much like the
>> way one groups rows in DoTS.ExternalNASequence together with
>> the table SRes.ExternalDatabaseRelease. Then a set of blast
>> results could be labeled with a convenient name
>> such as "ME vs Ath, W9" or "Jim's lab, -FF".
>>
>> I will go ahead and implement something locally to do
>> this, but I would think such a thing would not only
>> be valuable, but necessary for others too.
>> Does implementing such a table (perhaps calling it
>> DoTS.SimilaritySet) in the official distribution make
>> sense?
>>
>> Or perhaps I am wrong in my understanding of how
>> the Core.AnalysisAlgorithm table can be used, and
>> there is a simpler way to do this. If so, I hope
>> someone can enlighten me.
>>
>> Thank you for reading,
>>
>> Josef
>
> _______________________________________________
> Gusdev-gusdev mailing list
> Gus...@li...
> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
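
For illustration only: the "one row per flag, one row per value" decomposition described in Josef's message, and the job of telling his three parameter sets apart, can be sketched outside the database in a few lines of Python. This is a rough sketch, not part of GUS. The function names parse_blast_flags and differing_flags are hypothetical, and the parser assumes single-letter flags whose value is either glued to the flag (-W2) or given as the next token (-G 11), which is how the parameter strings in this thread are written.

```python
import shlex


def parse_blast_flags(command_line):
    """Split a blast argument string into a {flag: value} dict.

    Assumption: every flag is a dash plus one letter, and its value is
    either glued to the flag ("-W2") or is the next token ("-G 11").
    """
    params = {}
    tokens = shlex.split(command_line)
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("-") or len(token) < 2:
            raise ValueError(f"unexpected token: {token!r}")
        flag, value = token[:2], token[2:]
        if not value:
            # The value was not glued on, so take the following token.
            if i + 1 >= len(tokens):
                raise ValueError(f"flag {flag} has no value")
            i += 1
            value = tokens[i]
        params[flag] = value
        i += 1
    return params


def differing_flags(*command_lines):
    """Return {flag: [value per command line]} for flags whose values differ."""
    parsed = [parse_blast_flags(c) for c in command_lines]
    all_flags = sorted(set().union(*parsed))
    return {
        flag: [p.get(flag) for p in parsed]
        for flag in all_flags
        if len({p.get(flag) for p in parsed}) > 1
    }


# The three parameter sets quoted in the message above.
runs = [
    "-p blastp -FD -W2 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM62 -b 1000000 -v 1000000",
    "-p blastp -FD -W3 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM62 -b 1000000 -v 1000000",
    "-p blastp -FD -W2 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM80 -b 1000000 -v 1000000",
]

if __name__ == "__main__":
    # Only -M and -W vary across the three runs.
    print(differing_flags(*runs))
```

Each key/value pair in the parsed dict corresponds to what the message describes as a row in Core.AlgorithmParamKey plus a row in Core.AlgorithmParam. A single label per run, stored for example in comment_string as Chris suggests, avoids having to reassemble that comparison in SQL.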