From: Jonathan C. <cra...@pc...> - 2003-01-23 17:55:26
Debbie-

> "manually created" doesn't seem as if it is a review status and this
> situation could be covered with "manually reviewed correct" with evidence
> being that it was manually created.

It is a review status if you want to differentiate between "implicitly
reviewed" (i.e., the annotator created it, and he/she would not have
done so if he/she did not believe it to be correct) and "explicitly
reviewed" (i.e., the entry, which already exists in the database, was
retrieved and then examined to determine whether it's correct.)
However, it's not clear that this is a distinction (between two
different kinds of "reviewed, correct") that we should be making in
ReviewStatus. There are at least two separate questions here:

1. Do we want to track which entries in the database were created
   manually, versus those that were created automatically and then
   approved by an annotator?

I think that we're all in agreement that the answer to this question is
a resounding "yes". Given that, the second question is:

2. Where should this information be stored?

As you point out, we could record this information using the Evidence
table. And, as I mentioned in a previous e-mail, we *have* to do it
this way unless we change our ReviewStatus vocabulary so that each and
every term in the vocabulary records whether the entry was originally
created manually or automatically (so that we can track its original
status through one or more rounds of update/re-review.) I don't think
that this is a good idea, and after talking to Jonathan about it I
think we're in agreement that we should drop the term for "manually
created."

We also have to bear in mind that our current notion of ReviewStatus is
something that's fairly closely tied to the annotation process that we
use in DoTS. There's nothing wrong with that, but it's quite possible
that other sites will have different ideas about how ReviewStatus
should be used.
So at some point we should revisit this, but as long as the revised set
of terms (see below) is agreeable to everyone on the mailing list, I
think that we should stick with it for the time being.

> I have to agree with Joan, that it may be safer to stick as closely to the
> existing manually_reviewed values as possible, 0=unreviewed and 1=manually
> reviewed correct and add 2=manually reviewed incorrect as well as
> 4=updated.

Can you be more specific about why changing the actual ids would be
unsafe? (I hope you're not threatening me :)) I trust that you're not
planning to rely on having hard-coded review_status_ids in your GUS 3.0
programs and queries, right?

I myself have plenty of GUS 2.x scripts and queries that contain
hard-coded internal identifiers (e.g., sequence_type_ids and
external_db_ids, to name two of the most frequently-used ones.)
However, when I convert these scripts to GUS 3.0 I'm going to have to
rewrite them to be portable, meaning that I can't assume that other
copies of GUS (perhaps running at other sites) will have the same
internal ids. Unless we're willing to take these ids and publish them
(as, for example, the GO consortium has done with their GO IDs), we
can't rely on their being constant across different copies of GUS; it's
just not good programming practice.

Jonathan
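
A minimal sketch of the portability point above, in Python with an
in-memory SQLite database standing in for GUS. This is not actual GUS
code: the "name" column, the review_status_id() helper, the
make_demo_db() helper and the ids assigned at "site B" are all
assumptions made for illustration. Only the ReviewStatus and
review_status_id names and the term strings come from this thread. The
idea is that a script resolves a term to its internal id at run time
instead of embedding the number.

```python
import sqlite3


def review_status_id(conn, name):
    """Look up the internal id of a ReviewStatus term by its name.

    Hypothetical helper: assumes a ReviewStatus table with columns
    review_status_id and name, which may not match the real GUS schema.
    """
    row = conn.execute(
        "SELECT review_status_id FROM ReviewStatus WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        raise KeyError(f"no ReviewStatus term named {name!r}")
    return row[0]


def make_demo_db(terms):
    """Build a throwaway in-memory database holding the given (id, name) terms."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ReviewStatus ("
        " review_status_id INTEGER PRIMARY KEY,"
        " name TEXT UNIQUE NOT NULL)"
    )
    conn.executemany("INSERT INTO ReviewStatus VALUES (?, ?)", terms)
    return conn


# Two copies of the database that assign different internal ids
# to the same vocabulary terms (the site B ids are invented).
site_a = make_demo_db([(0, "unreviewed"), (1, "manually reviewed correct")])
site_b = make_demo_db([(10, "manually reviewed correct"), (11, "unreviewed")])

# The same lookup works against either copy; a hard-coded 0 would not.
for conn in (site_a, site_b):
    print(review_status_id(conn, "unreviewed"))
```

A query written against the term name keeps working when it is run on
another copy of the database, whereas one written against a literal id
silently selects the wrong rows (or none) as soon as the ids differ.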