From: Jonathan C. <cra...@pc...> - 2003-01-23 17:55:26
Debbie-
> "manually created" doesn't seem as if it is a review status and this
> situation could be covered with "manually reviewed correct" with evidence
> being that it was manually created.
It is a review status if you want to differentiate between "implicitly
reviewed" (i.e., the annotator created it, and he/she would not have done
so if he/she did not believe it to be correct) and "explicitly reviewed"
(i.e., the entry, which already exists in the database, was retrieved and
then examined to determine whether it's correct). However, it's not
clear that this is a distinction (between two different kinds of
"reviewed, correct") that we should be making in ReviewStatus. There are
at least two separate questions here:
1. Do we want to track which entries in the database were created manually,
versus those that were created automatically and then approved by an
annotator?
I think that we're all in agreement that the answer to this question is
a resounding "yes". Given that, the second question is:
2. Where should this information be stored?
As you point out, we could record this information using the Evidence
table. And, as I mentioned in a previous e-mail, we *have* to do it this
way unless we change our ReviewStatus vocabulary so that each and every
term in the vocabulary records whether the entry was originally created
manually or automatically (so that we can track its original status
through one or more rounds of update/re-review). I don't think that this
is a good idea, and after talking to Jonathan about it I think we're in
agreement that we should drop the term for "manually created."
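To make the Evidence option concrete, here is a rough sketch of what I have in mind. It is only an illustration: the Entry class, the tuple layout of the evidence records, and the status strings are all made up for this example and are not the actual GUS schema. The point is that the creation method is recorded once as evidence, so ReviewStatus can change freely through later rounds of update/re-review without losing it.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical stand-in for a database entry (not the real GUS schema)."""
    review_status: str = "unreviewed"
    # Each evidence record is a (kind, source) pair; this layout is assumed.
    evidence: list = field(default_factory=list)


def create_manually(annotator):
    """Create an entry by hand; the origin is stored as evidence, not as a status."""
    entry = Entry(review_status="reviewed, correct")
    entry.evidence.append(("manually created", annotator))
    return entry


def was_manually_created(entry):
    """Answer question 1 without consulting review_status at all."""
    return any(kind == "manually created" for kind, _ in entry.evidence)


entry = create_manually("annotator_1")
entry.review_status = "updated"            # entry changed, awaiting re-review
entry.review_status = "reviewed, correct"  # re-reviewed and approved
print(was_manually_created(entry))
```

With this arrangement the vocabulary does not need a "created manually" and a "created automatically" variant of every term.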
We also have to bear in mind that our current notion of ReviewStatus is
something that's fairly closely tied to the annotation process that we use
in DoTS. There's nothing wrong with that, but it's quite possible that
other sites will have different ideas about how ReviewStatus should be
used. So at some point we should revisit this, but as long as the revised
set of terms (see below) is agreeable to everyone on the mailing list, I
think that we should stick with it for the time being.
> I have to agree with Joan, that it may be safer to stick as closely to the
> existing manually_reviewed values as possible, 0=unreviewed and 1=manually
> reviewed correct and add 2=manually reviewed incorrect as well as
> 4=updated.
Can you be more specific about why changing the actual ids would be unsafe?
(I hope you're not threatening me :)) I trust that you're not planning to
rely on having hard-coded review_status_ids in your GUS 3.0 programs and
queries, right?
I myself have plenty of GUS 2.x scripts and queries that contain hard-coded
internal identifiers (e.g., sequence_type_ids and external_db_ids, to name
two of the most frequently used ones). However, when I convert these scripts
to GUS 3.0 I'm going to have to rewrite them to be portable, meaning that I
can't assume that other copies of GUS (perhaps running at other sites) will
have the same internal ids. Unless we're willing to take these ids and
publish them (as, for example, the GO consortium has done with their GO IDs),
we can't rely on their being constant across different copies of GUS; it's
just not good programming practice.
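As a rough sketch of what I mean by "portable": look the id up by the term's name at run time instead of pasting the number into the script. The table layout (a ReviewStatus table with review_status_id and name columns), the term names, and the use of an in-memory SQLite database are assumptions made purely for illustration; the real GUS 3.0 schema and database will differ.

```python
import sqlite3


def lookup_review_status_id(conn, name):
    """Return the id of a ReviewStatus term, found by name rather than hard-coded."""
    row = conn.execute(
        "SELECT review_status_id FROM ReviewStatus WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no ReviewStatus term named {name!r}")
    return row[0]


def make_demo_db(first_id):
    """Build an in-memory stand-in for one site's copy of the table (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ReviewStatus ("
        "review_status_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    # Placeholder term names; the ids start wherever this site happened to start.
    terms = ["unreviewed", "reviewed, correct", "reviewed, incorrect"]
    conn.executemany(
        "INSERT INTO ReviewStatus VALUES (?, ?)",
        [(first_id + i, term) for i, term in enumerate(terms)],
    )
    return conn


# Two sites whose copies assigned different internal ids to the same terms.
site_a = make_demo_db(0)
site_b = make_demo_db(100)

for site in (site_a, site_b):
    print(lookup_review_status_id(site, "reviewed, correct"))
```

The same script runs unchanged against both copies, which a query containing a literal id would not.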
Jonathan