From: <ju...@cs...> - 2006-02-06 17:13:44
Any thoughts on this?

Does implementing a "SRes.ExternalDatabaseRelease"-like way to group
rows in DoTS.Similarity together sound useful to anybody, or do people
find the use of Core.AnalysisAlgorithm to accomplish such a thing
completely satisfactory?

Thanks,

Josef

Daphne Preuss Laboratory
Molecular Genetics and Cell Biology
The University of Chicago
ju...@cs...
voice: (773) 834-3985
fax: (773) 702-6648

I wrote:
>
> In the recent past, some of us needed a way to
> distinguish between blast results in DoTS.Similarity by
> the blast parameters used. For example, I might
> blast the same sets of sequences several times
> with different parameters and put all the results
> of all these blast searches into DoTS.Similarity.
>
> I was, of course, able to jury-rig a way to do
> this, though now with GUS 3.5, an officially
> sanctioned method has been implemented
> by using the Core.AnalysisAlgorithm table.
>
> Below is a crude entity relationship diagram
> of how Core.AnalysisAlgorithm fits in with other
> relevant tables (be sure to view this file with
> the courier font):
>
>                      Core.Algorithm (name)
>                            |
>                            |
>               Core.AlgorithmImplementation        Core.AlgorithmParamKeyType
>                       |         |           \               /   (string, float, int, ...)
>                       |         |              \          /
> Core.TableInfo      Core.AlgorithmInvocation      Core.AlgorithmParamKey
>       |       \               |             \               /   (description of parameter)
>       |          \            |                \          /
> DoTS.Similarity     Core.AnalysisAlgorithm        Core.AlgorithmParam   (parameters as a string)
>                                                   (individual parameter)
>
> And so, to insert into GUS a list of blast parameters such as:
>
>   -p blastp -FD -W2 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM62 -b 1000000 -v 1000000
>
> one would need to:
>
>   have a row for every flag in Core.AlgorithmParamKey
>   have a row for every value after a flag in Core.AlgorithmParam
>
> which is very complicated to both insert and query.
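
To make the bookkeeping concrete, here is a rough Python sketch that
splits the parameter string quoted above into one (flag, value) pair
per option, which is roughly the number of key and value rows one
search would need. The function name split_blast_params is made up for
illustration; it is not part of GUS or of any plugin, and it assumes
every flag is a single letter followed by exactly one value.

```python
import shlex


def split_blast_params(param_string):
    """Split a blastall-style option string into (flag, value) pairs.

    Accepts both the "-W 2" and the "-W2" spelling. Assumption: each flag
    is one letter and takes exactly one value, as in the example string.
    """
    tokens = shlex.split(param_string)
    pairs = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("-") or len(token) < 2:
            raise ValueError(f"expected a flag, got {token!r}")
        flag = token[:2]
        if len(token) > 2:
            # Value is glued to the flag, e.g. "-W2" or "-FD".
            value = token[2:]
            i += 1
        else:
            # Value is the next token, e.g. "-G 11".
            if i + 1 >= len(tokens):
                raise ValueError(f"flag {flag!r} has no value")
            value = tokens[i + 1]
            i += 2
        pairs.append((flag, value))
    return pairs


params = ("-p blastp -FD -W2 -G 11 -E 1 -e 0.5 -f 11 "
          "-M BLOSUM62 -b 1000000 -v 1000000")
pairs = split_blast_params(params)
for flag, value in pairs:
    print(flag, value)
print(len(pairs), "flags, so that many key rows and value rows")
```

Every distinct parameter set repeats this for each of its flags, which
is where the insert and query complexity comes from.
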
> I suppose if one wrote a GUS plugin that is a wrapper around blast,
> inserting parameters into all of these rows/fields could be easily
> taken care of, though we at the Preuss lab are just not going to do
> things that way. We may get blast data from a collaborator that took
> days to run on a multi-node cluster, and we just want to dump
> this data into DoTS.Similarity/DoTS.SimilaritySpan and query
> it. We can't run this blast search again with a plugin.
>
> And again, querying blast results by blast parameter between the
> same set of sequences looks to be very complex with the
> above tables. Imagine writing SQL trying to distinguish between
> blast results based on these three sets of parameters:
>
>   -p blastp -FD -W2 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM62 -b 1000000 -v 1000000
>   -p blastp -FD -W3 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM62 -b 1000000 -v 1000000
>   -p blastp -FD -W2 -G 11 -E 1 -e 0.5 -f 11 -M BLOSUM80 -b 1000000 -v 1000000
>
> Perhaps people on the list can let me know if there are
> any labs outside of CBIL that are depositing and querying
> blast search parameters with the above tables.
>
> What we at the Preuss lab really need is a simple way
> to group rows in DoTS.Similarity together, much like the
> way one groups rows in DoTS.ExternalNASequence together with
> the table SRes.ExternalDatabaseRelease. Then a set of blast
> results could be labeled with a convenient name
> such as "ME vs Ath, W9" or "Jim's lab, -FF".
>
> I will go ahead and implement something locally to do
> this, but I would think such a thing would be not only
> valuable, but necessary for others too.
> Does implementing such a table (perhaps calling it
> DoTS.SimilaritySet) in the official distribution make
> sense?
>
> Or perhaps I am wrong in my understanding of how
> the Core.AnalysisAlgorithm table can be used, and
> there is a simpler way to do this. If so, I hope
> someone can enlighten me.
>
> Thank you for reading,
>
> Josef
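
For anyone who wants to see the grouping idea in miniature, here is a
small Python sketch using an in-memory SQLite database. It is only an
illustration under stated assumptions: the tables similarity_set and
similarity, their columns, the helper names, the sequence names, the
scores and the parameter strings are all invented for the example and
are not the GUS schema. Only the two set names come from the message
above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the proposed DoTS.SimilaritySet table.
CREATE TABLE similarity_set (
    similarity_set_id INTEGER PRIMARY KEY,
    name              TEXT NOT NULL UNIQUE,
    parameters        TEXT
);
-- Hypothetical, heavily simplified stand-in for DoTS.Similarity.
CREATE TABLE similarity (
    similarity_id     INTEGER PRIMARY KEY,
    similarity_set_id INTEGER NOT NULL
        REFERENCES similarity_set (similarity_set_id),
    query_id          TEXT NOT NULL,
    subject_id        TEXT NOT NULL,
    score             REAL
);
""")


def load_set(conn, name, parameters, hits):
    """Insert one named set and attach every hit row to it."""
    cur = conn.execute(
        "INSERT INTO similarity_set (name, parameters) VALUES (?, ?)",
        (name, parameters),
    )
    set_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO similarity "
        "(similarity_set_id, query_id, subject_id, score) "
        "VALUES (?, ?, ?, ?)",
        [(set_id, q, s, score) for q, s, score in hits],
    )
    return set_id


def hits_in_set(conn, name):
    """Return all hits that belong to the set with the given name."""
    rows = conn.execute(
        "SELECT s.query_id, s.subject_id, s.score "
        "FROM similarity s "
        "JOIN similarity_set ss "
        "  ON ss.similarity_set_id = s.similarity_set_id "
        "WHERE ss.name = ? "
        "ORDER BY s.similarity_id",
        (name,),
    )
    return rows.fetchall()


# Made-up hits and placeholder parameter strings, for illustration only.
load_set(conn, "ME vs Ath, W9", "placeholder parameters A",
         [("seqA", "seqB", 120.0), ("seqA", "seqC", 87.5)])
load_set(conn, "Jim's lab, -FF", "placeholder parameters B",
         [("seqA", "seqB", 131.0)])

print(hits_in_set(conn, "ME vs Ath, W9"))
```

The point of the design is that a query selects one search by a single
name and a single join, and the parameter string is stored once per set
as free text instead of one row per flag.
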