From: Chris S. <sto...@pc...> - 2003-12-17 12:58:00
John does state that these are for DoTS:

>> Summary: Two new tables to support motif rejection
>> We need two new tables in the DoTS schema to support
>> marking a motif as a bad motif.

I think they do belong in DoTS because these are sequence annotations that we have generated for our purposes. As Joan indicated, I think your idea for programmatically helping her identify these is a good one, but the calls would still need to be reviewed and a place is still needed to store those calls.

Chris

On Dec 16, 2003, at 4:20 PM, Angel Pizarro wrote:

> John,
> In the table defs you don't state what schema space you are slating
> for these tables. I will assume that you mean DoTS.
>
> It is debatable whether this is the proper place for these tables.
> Here are my reasons:
>
> 1) ProDom is an external resource that we are mirroring and we do not
> want to know anything about their algorithms for identifying motifs,
> we want to import them wholesale. This is slightly different than
> saying we are rejecting a ProDom motif. (e.g. It is still a valid
> ProDom motif, although not a very informative one)
>
> 2) What you really want is to make quality assessment calls on the
> motifs prior to running some learning algorithm for GO assignments. Is
> this application-specific information? Genome annotation that is
> useful for other folks? I do not know. GUSDEVers, please pipe in here.
> Unless it is useful for folks to have this information tied to the
> motif itself, I would place these tables in some application-specific
> space.
>
> If there is utility to a quality assessment for imported motifs, then we
> should also track "good" quality motifs, etc, not just rejected or
> misleading motifs. Information content algorithms come to mind...
>
> Angel
>
> John Iodice wrote:
>
>>
>> GUS folks,
>>
>> I'm working on the GO term predictor, which uses BLAST similarities
>> between proteins of known function and domains from CDD or ProDom to
>> automatically assign functions to novel proteins. This system is
>> built around rules, which are domain - GO-term pairs. In some
>> cases, bad, repeat-rich domains have given rise to bad rules. We
>> want to create the ability to mark these domains so they are not used
>> for the generation or application of rules (and possibly other
>> functions unrelated to the GO term predictor).
>>
>> We propose to do this by means of two new tables. A motif will be
>> marked as rejected by its addition to the rejectedMotif table. It
>> will be identified by a source_id/external_database_id pair. The
>> record will also include, for documentation, an
>> external_database_release_id and a motif_rejection_reason_id. The
>> latter will be the primary key of the motifRejectionReason table,
>> which will store a name and description for each reason.
>>
>> The request is number 854957. Here's a link. I include the text
>> below:
>> https://sourceforge.net/tracker/?func=detail&aid=854957&group_id=54213&atid=479181
>>
>> Thanks in advance for any comments.
>> John
>>
>> ----------------------------------------------------------------------
>>
>> Summary: Two new tables to support motif rejection
>> We need two new tables in the DoTS schema to support
>> marking a motif as a bad motif.
>> They look like this:
>>
>> rejectedMotif
>> (
>>   rejected_motif_id             number(10)    not null,
>>   source_id                     varchar2(32)  not null,
>>   external_database_id          number(10)    not null,
>>   external_database_release_id  number(10)    not null,
>>   motif_rejection_reason_id     number(10)    not null,
>>   {plus housekeeping columns}
>> )
>>
>> motifRejectionReason
>> (
>>   motif_rejection_reason_id     number(10)    not null,
>>   name                          varchar2(255) not null,
>>   description                   varchar2(255),
>>   {plus housekeeping columns}
>> )
>>
>> rejectedMotif should have an index on (source_id,
>> external_database_id).
>>
>> rejectedMotif rows will number in the hundreds at most
>> for the foreseeable future. There will likely never be more
>> than 20 rows in motifRejectionReason.
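
For anyone who wants to try the proposed tables out, here is a minimal sketch of them plus the lookup that the (source_id, external_database_id) index is meant to serve. It is not the GUS implementation: it assumes SQLite in place of Oracle (number(10) becomes INTEGER, varchar2 becomes TEXT), leaves out the housekeeping columns, and assumes a foreign key from rejectedMotif to motifRejectionReason, which the proposal implies but does not spell out. The index name, the helper functions, and the sample row values are all made up for illustration.

```python
import sqlite3

# Assumption: SQLite types stand in for the Oracle types in the proposal,
# and the housekeeping columns are omitted.
SCHEMA = """
CREATE TABLE motifRejectionReason (
    motif_rejection_reason_id INTEGER PRIMARY KEY,
    name                      TEXT NOT NULL,
    description               TEXT
);
CREATE TABLE rejectedMotif (
    rejected_motif_id            INTEGER PRIMARY KEY,
    source_id                    TEXT    NOT NULL,
    external_database_id         INTEGER NOT NULL,
    external_database_release_id INTEGER NOT NULL,
    -- Assumed foreign key; the proposal only says this is the other table's key.
    motif_rejection_reason_id    INTEGER NOT NULL
        REFERENCES motifRejectionReason (motif_rejection_reason_id)
);
-- Index requested in the proposal; the index name is hypothetical.
CREATE INDEX rejectedMotif_source_idx
    ON rejectedMotif (source_id, external_database_id);
"""


def open_db():
    """Create an in-memory database holding the two proposed tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def is_rejected(conn, source_id, external_database_id):
    """Return True if the motif has been marked as rejected."""
    row = conn.execute(
        "SELECT 1 FROM rejectedMotif"
        " WHERE source_id = ? AND external_database_id = ? LIMIT 1",
        (source_id, external_database_id),
    ).fetchone()
    return row is not None


def rejection_reason(conn, source_id, external_database_id):
    """Return the reason name for a rejected motif, or None if not rejected."""
    row = conn.execute(
        "SELECT r.name FROM rejectedMotif m"
        " JOIN motifRejectionReason r"
        "   ON r.motif_rejection_reason_id = m.motif_rejection_reason_id"
        " WHERE m.source_id = ? AND m.external_database_id = ?",
        (source_id, external_database_id),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = open_db()
    # Hypothetical sample rows, not real GUS data.
    conn.execute(
        "INSERT INTO motifRejectionReason VALUES (1, 'repeat-rich', NULL)"
    )
    conn.execute("INSERT INTO rejectedMotif VALUES (1, 'PD000001', 10, 100, 1)")
    print(is_rejected(conn, "PD000001", 10))
    print(rejection_reason(conn, "PD000001", 10))
```

A rule generator could call is_rejected before creating or applying a domain - GO-term rule. If "good" quality calls are also tracked, as Angel suggests, the same lookup would need a quality column rather than mere presence in the table.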