From: Sucheta T. <su...@vb...> - 2004-07-19 16:04:21
Got it.

Thanks

Sucheta

At 11:50 AM 7/19/2004 -0400, Jonathan Schug wrote:
>Sucheta:
>
>That info is in the NA (or AA) Sequence table. Remember that the query
>and subject sequences can be in different external_database_releases so
>that info must be on the individual sequences, not the similarity.
>
>Jonathan
>
>On Jul 19, 2004, at 11:26 AM, Sucheta Tripathy wrote:
>
>>Hi Jonathan,
>>
>>Our databases are not really having any sequence overlap. So as you
>>suggested the external_database_release_id should be fine. But when I
>>look into the Dots.similarity and the dots.similarityspan I don't see
>>anything pointing to externaldatabase/externaldatabaserelease. I was
>>wondering if I missed out something or there is a linker.
>>
>>Thanks
>>
>>Sucheta
>>
>>At 10:00 AM 7/19/2004 -0400, Jonathan Schug wrote:
>>>Sucheta, and All:
>>>
>>>If the blast libraries do not overlap, i.e., contain different sets of
>>>sequences, then there is probably no problem. You can simply
>>>distinguish the Similarity rows by the target sequence's
>>>external_database_release_id.
>>>
>>>If the libraries overlap, then the issue is more difficult. We don't
>>>have the notion of a library in this sense, and of course the library
>>>size affects the p-values for matches. There is a DoTS::Library table
>>>that holds clone information that could be hacked to provide what you
>>>want, but I do *not* recommend it as a long term solution. You could
>>>also use the DoTS::DbRefNaSequence or DoTS::AASequenceDbRef as
>>>appropriate to link to a DoTS::DbRef which links to an
>>>ExternalDatabase. These tables could easily be used to gather
>>>sequences into multiple BLAST database files.
>>>
>>>The best solution is probably to create new tables.
>>>
>>>I propose, then the following changes to the Similarity table and the
>>>addition of new table to track search libraries:
>>>
>>>DoTS::Similarity
>>> - add search_algorithm_invocation_id link to stably point to
>>>parameter values for the search.
>>>
>>>SRes::SearchLibrary
>>> - contains a description of the search library including entry count
>>>etc.
>>>
>>>SRes::SearchLibraryMember
>>> - uses a soft link, i.e., table_id, row_id to indicate membership.
>>> - link is soft so that SearchLibrary can also be used for motifs,
>>>etc. that may not be in sequence table.
>>> - SearchLibrary might contain a table_id to record what kind of
>>>entries are in the library.
>>>
>>>Thoughts?
>>>
>>>Jonathan
>>>
>>>On Jul 19, 2004, at 9:31 AM, Sucheta Tripathy wrote:
>>>
>>>>Hi Jonathan,
>>>>
>>>>Thanks for your reply. I also have a similar concern which I posted
>>>>some time back. My concern is to do with multiple databases rather
>>>>than the parameters for blast search. Currently we want to store
>>>>blast results against 23 different databases. Any suggestions which
>>>>table may be suitable?
>>>>
>>>>Thanks
>>>>
>>>>Sucheta
>>>>
>>>>At 12:18 AM 7/19/2004 -0400, Jonathan Schug wrote:
>>>>>Josef:
>>>>>
>>>>>The two columns in DoTS::Similarity that might be useful are
>>>>>algorithm and row_alg_invocation_id. Algorithm is not recommended;
>>>>>it is meant to be used to distinguish, say, BLAST hits from FASTA
>>>>>hits. Row_alg_invocation_id is better and will work. You can use the
>>>>>AlgorithmInvocation parameters. However, this will not work if the
>>>>>rows are modified in some way later on. Later updates will change
>>>>>the row_alg_invocation_id, ruining this scheme. You could also
>>>>>consider linking Similarity rows to the invocation via an Evidence
>>>>>row. This is more stable.
>>>>>
>>>>>PlasmoDB faced this when tuning BLAST parameters to avoid the
>>>>>problems with the high AT content of the Pf genome. They may have
>>>>>tuned the parameters outside of the DB.
>>>>>
>>>>>You might also consider running the BLAST searches with the most
>>>>>lenient parameters, then recreating the more stringent searches with
>>>>>query parameters if this is possible
>>>>>
>>>>>Jonathan
>>>>>
>>>>>
>>>>>--------------------------------------------------------------------
>>>>>Jonathan Schug            Center for Bioinformatics
>>>>>js...@pc...               Computational Biology and Informatics Lab
>>>>>(215) 573-3113 voice      University of Pennsylvania,
>>>>>(215) 573-3111 fax        1413 Blockley Hall, Philadelphia, PA
>>>>>                          19014-6021
>>>>>
>>>>>-------------------------------------------------------
>>>>>This SF.Net email is sponsored by BEA Weblogic Workshop
>>>>>FREE Java Enterprise J2EE developer tools!
>>>>>Get your free copy of BEA WebLogic Workshop 8.1 today.
>>>>>http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click
>>>>>_______________________________________________
>>>>>Gusdev-gusdev mailing list
>>>>>Gus...@li...
>>>>>https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
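
For anyone reading this thread later, the advice above (distinguish Similarity rows by the subject sequence's external_database_release_id, which lives on the sequence rather than on the similarity) can be sketched roughly as follows. This is only an illustration in Python with sqlite3, not GUS code: the tables `na_sequence` and `similarity` are toy stand-ins for the real NA Sequence and DoTS::Similarity tables, and every column name other than external_database_release_id, the sample rows, and both function names are assumptions made up for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with toy stand-ins for the GUS tables.

    Assumption: the real schema has many more columns; only the ones needed
    to show the join are modelled here.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE na_sequence (
            na_sequence_id               INTEGER PRIMARY KEY,
            external_database_release_id INTEGER NOT NULL
        );
        CREATE TABLE similarity (
            similarity_id INTEGER PRIMARY KEY,
            query_id      INTEGER NOT NULL REFERENCES na_sequence(na_sequence_id),
            subject_id    INTEGER NOT NULL REFERENCES na_sequence(na_sequence_id),
            score         REAL
        );
        """
    )
    # Hypothetical sample data: one query sequence (release 100) with hits
    # against subject sequences from releases 200 and 300.
    conn.executemany(
        "INSERT INTO na_sequence VALUES (?, ?)",
        [(1, 100), (2, 200), (3, 200), (4, 300)],
    )
    conn.executemany(
        "INSERT INTO similarity VALUES (?, ?, ?, ?)",
        [(10, 1, 2, 55.0), (11, 1, 3, 80.5), (12, 1, 4, 42.0)],
    )
    conn.commit()
    return conn


def hits_against_release(conn, release_id):
    """Return similarity rows whose subject sequence is in the given release.

    The release id is read from the subject's sequence row, because the
    similarity row itself carries no pointer to the external database.
    """
    cursor = conn.execute(
        """
        SELECT s.similarity_id, s.query_id, s.subject_id, s.score
        FROM similarity AS s
        JOIN na_sequence AS subj
          ON subj.na_sequence_id = s.subject_id
        WHERE subj.external_database_release_id = ?
        ORDER BY s.similarity_id
        """,
        (release_id,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in hits_against_release(demo, 200):
        print(row)
```

The same join against the query side (query_id instead of subject_id) would give the release of the query sequence, since the two can differ. This only separates results cleanly when the BLAST libraries do not overlap; the overlapping case is what the proposed SearchLibrary tables are meant to address.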