From: John I. <io...@pc...> - 2003-12-16 18:40:43
GUS folks,

I'm working on the GO term predictor, which uses BLAST similarities between proteins of known function and domains from CDD or ProDom to automatically assign functions to novel proteins. This system is built around rules, which are domain - GO-term pairs. In some cases, bad, repeat-rich domains have given rise to bad rules. We want to create the ability to mark these domains so they are not used for the generation or application of rules (and possibly other functions unrelated to the GO term predictor).

We propose to do this by means of two new tables. A motif will be marked as rejected by its addition to the rejectedMotif table. It will be identified by a source_id/external_database_id pair. The record will also include, for documentation, an external_database_release_id and a motif_rejection_reason_id. The latter will be the primary key of the motifRejectionReason table, which will store a name and description for each reason.

The request is number 854957. Here's a link. I include the text below:

https://sourceforge.net/tracker/?func=detail&aid=854957&group_id=54213&atid=479181

Thanks in advance for any comments.

John

------------------------------------------------------------------------

Summary: Two new tables to support motif rejection

We need two new tables in the DoTS schema to support marking a motif as a bad motif. They look like this:

rejectedMotif (
    rejected_motif_id             number(10)    not null,
    source_id                     varchar2(32)  not null,
    external_database_id          number(10)    not null,
    external_database_release_id  number(10)    not null,
    motif_rejection_reason_id     number(10)    not null,
    {plus housekeeping columns}
)

motifRejectionReason (
    motif_rejection_reason_id     number(10)    not null,
    name                          varchar2(255) not null,
    description                   varchar2(255),
    {plus housekeeping columns}
)

rejectedMotif should have an index on (source_id, external_database_id).
rejectedMotif rows will number in the hundreds at most for the foreseeable future. There will likely never be more than 20 rows in motifRejectionReason.
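
To make the intended lookup concrete, here is a rough sketch of how a rule generator might check a motif against the proposed tables. It is only an illustration: it uses an in-memory SQLite database in place of the Oracle instance GUS runs on, leaves out the housekeeping columns, and assumes rejected_motif_id is the primary key of rejectedMotif. The index name, the is_rejected and build_demo helpers, and all sample values are made up for the example.

```python
import sqlite3

# SQLite stand-in for the two proposed tables (GUS itself uses Oracle).
# Housekeeping columns are omitted; the index name is hypothetical.
SCHEMA = """
CREATE TABLE motifRejectionReason (
    motif_rejection_reason_id INTEGER PRIMARY KEY,
    name                      VARCHAR(255) NOT NULL,
    description               VARCHAR(255)
);
CREATE TABLE rejectedMotif (
    rejected_motif_id            INTEGER PRIMARY KEY,  -- assumed primary key
    source_id                    VARCHAR(32) NOT NULL,
    external_database_id         INTEGER NOT NULL,
    external_database_release_id INTEGER NOT NULL,
    motif_rejection_reason_id    INTEGER NOT NULL
        REFERENCES motifRejectionReason (motif_rejection_reason_id)
);
CREATE INDEX rejectedMotif_ind01
    ON rejectedMotif (source_id, external_database_id);
"""


def is_rejected(conn, source_id, external_database_id):
    """Return True if the source_id/external_database_id pair is marked rejected."""
    row = conn.execute(
        "SELECT 1 FROM rejectedMotif "
        "WHERE source_id = ? AND external_database_id = ? LIMIT 1",
        (source_id, external_database_id),
    ).fetchone()
    return row is not None


def build_demo():
    """Create the tables in memory and load one made-up rejection."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO motifRejectionReason VALUES (?, ?, ?)",
        (1, "repeat-rich", "Domain is dominated by repeats"),
    )
    # Sample values only: not real source ids or database ids.
    conn.execute(
        "INSERT INTO rejectedMotif VALUES (?, ?, ?, ?, ?)",
        (1, "PD000001", 10, 100, 1),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo()
    print(is_rejected(conn, "PD000001", 10))
    print(is_rejected(conn, "PD000001", 11))
```

The lookup filters on exactly the two columns covered by the proposed index, so the check stays cheap even if it runs once per motif during rule generation and again during rule application.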