For the next load, can you report any qualitative gene expression data that does not fit the format described here:
https://sourceforge.net/apps/trac/pombase/wiki/CapturingGeneExpression
i.e.
Example Artemis string: /controlled_curation="term=gene expression, RNA level; annotation_extension=during(GO:xxxxxxx); qualifier=increased; evidence=ECO:xxxx; db_xref=; date=YYYYMMDD"
val
In the first field, these are OK:
RNA level
protein level
transcription
translation
RNA degradation
protein degradation
Are any of the things mentioned on the wiki page optional?
For example it says:
Level allowed qualifiers: present, absent, unchanged, increased, decreased, constant
They're allowed but are they mandatory?
Does the annotation have to have an evidence?
What about the Relation, which I guess is in the annotation_extension= field? Is the relation optional?
Thanks!
For the annotations I have done, nothing is optional :)
Together they give you the what (protein or RNA), the when (during) and the how (increased, decreased, etc.),
as I wrote on the wiki page:
* Each annotation must consist of these 4 bits of data:
* '''Expression type''' Allowed types: RNA level, protein level, transcription, RNA degradation, translation, protein degradation
* '''Relation''' during(GO biological process ID) or in_presence_of(ChEBI ID)
* '''Level''' allowed qualifiers: present, absent, unchanged, increased, decreased, constant
* '''Evidence''' ECO:xxxxxxx
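A check for these four components could be sketched roughly as below. This is only an illustration, not the loader's actual code: the function names are hypothetical, the ID patterns (GO:, CHEBI: and ECO: followed by digits) are assumptions about how the IDs are written, and it assumes the expression type follows "gene expression," in the term field as in the example Artemis string.

```python
import re

# Allowed values copied from the wiki list above.
ALLOWED_TYPES = {
    "RNA level", "protein level", "transcription",
    "translation", "RNA degradation", "protein degradation",
}
ALLOWED_LEVELS = {
    "present", "absent", "unchanged", "increased", "decreased", "constant",
}
# Assumed ID shapes: a prefix followed by digits.
RELATION_RE = re.compile(r"^(during\(GO:\d+\)|in_presence_of\(CHEBI:\d+\))$")
EVIDENCE_RE = re.compile(r"^ECO:\d+$")


def parse_fields(value):
    """Split a controlled_curation value into a dict of key -> value."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:
            fields[key.strip()] = val.strip()
    return fields


def check_gene_expression(value):
    """Return a list of problems; an empty list means all four parts are OK."""
    fields = parse_fields(value)
    problems = []

    # Expression type is whatever follows "gene expression," in the term.
    _, _, expr_type = fields.get("term", "").partition(",")
    if expr_type.strip() not in ALLOWED_TYPES:
        problems.append("expression type not allowed: %r" % expr_type.strip())

    if not RELATION_RE.match(fields.get("annotation_extension", "")):
        problems.append("relation missing or malformed")

    if fields.get("qualifier", "") not in ALLOWED_LEVELS:
        problems.append("level qualifier missing or not allowed")

    if not EVIDENCE_RE.match(fields.get("evidence", "")):
        problems.append("evidence missing or malformed")

    return problems


# Placeholder IDs, used only to exercise the check.
good = ("term=gene expression, RNA level; "
        "annotation_extension=during(GO:0000080); qualifier=increased; "
        "evidence=ECO:0000006; db_xref=PMID:11376151; date=20060605")
bad = ("term=gene expression, alternative transcripts; "
       "qualifier=increased; db_xref=PMID:11376151; date=20060605")

for problem in check_gene_expression(bad):
    print("WARNING:", problem)
```

Returning a list of problems rather than a single pass/fail flag means each annotation can produce one log line per missing component.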
There are examples in the EMBL files of term names that aren't in Antonia's list. For example "alternative transcripts" here:
Should that be a warning in the log?
Yep, anything which does not fit the RNA/protein syntax should be reported. We will change them. I thought I had done them all, but I think I missed about 30.
(Antonia, I changed alternative transcript to "genome organization" for now, which already had some info of this type but is not controlled syntax.
Although I just noticed I boobed with this one:
FT /controlled_curation="term=genome organization,
FT alternative transcripts db_xref=PMID:11376151;
FT date=20060605"
(missing ";")
Kim, will this appear in the logs?)
I am going to do a global replace and commit for "transcripts db_xref" -> "transcripts; db_xref" but we should capture any missing....
val
There was only one FT /controlled_curation="term=genome organization,
FT alternative transcripts db_xref=PMID:11376151;
FT date=20060605"
...now fixed
but there could be others with different text.
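One way to catch other cases like this, whatever the free text is, would be to look for a known key turning up inside another field's value. The snippet below is a minimal sketch under that assumption: the key list only contains the keys seen in the examples in this ticket, and the function name is hypothetical.

```python
import re

# Keys seen in the examples in this ticket; the real set may be larger.
KNOWN_KEYS = ("term", "annotation_extension", "qualifier",
              "evidence", "db_xref", "date")

# A known key preceded by whitespace, i.e. not at the start of a field.
KEY_INSIDE_VALUE = re.compile(r"\s(" + "|".join(KNOWN_KEYS) + r")=")


def find_missing_separators(value):
    """Return (key, field) pairs where a key appears mid-field, which
    suggests that the ';' before it was left out."""
    suspects = []
    for part in value.split(";"):
        part = part.strip()
        match = KEY_INSIDE_VALUE.search(part)
        if match:
            suspects.append((match.group(1), part))
    return suspects


broken = ("term=genome organization, alternative transcripts "
          "db_xref=PMID:11376151; date=20060605")
for key, field in find_missing_separators(broken):
    print("WARNING: possible missing ';' before %s= in: %s" % (key, field))
```

Because each field is stripped before matching, a key at the start of a field is never flagged, only one that follows other text.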
I think most things are now being reported via other tickets, so I am lowering the priority of this one as it is not urgent, but it could do with a final check that all gene expression data has the specified components.