I didn't really want to go down the "curating features without coordinates" route for genes (as the info isn't so useful unless we know what object it actually refers to, and most are published).
However, we are likely to want to use promoters in extensions, in cases where we sometimes don't know the exact sequence of the promoter. We will add these with names like
SPAC1AX.01-promoter
Is it easy to store these features if we have no coordinates?
Val
Feature locations are optional in Chado so no problem there.
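For anyone following along, here is a rough illustration of why that works: in Chado, coordinates live in a separate featureloc table that points at feature, so a feature can simply have no location rows. The sketch below uses SQLite with heavily cut-down stand-ins for those two tables. The real schema is PostgreSQL with many more columns and constraints, and the coordinates are made up, so treat it as an illustration only.

```python
import sqlite3

# Simplified stand-ins for Chado's feature and featureloc tables.
# ASSUMPTION: only a tiny subset of columns is modelled here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE feature (
        feature_id INTEGER PRIMARY KEY,
        uniquename TEXT NOT NULL UNIQUE
    );
    CREATE TABLE featureloc (
        featureloc_id INTEGER PRIMARY KEY,
        feature_id INTEGER NOT NULL REFERENCES feature(feature_id),
        fmin INTEGER,
        fmax INTEGER
    );
""")

conn.execute("INSERT INTO feature (uniquename) VALUES ('SPAC1AX.01')")
conn.execute("INSERT INTO feature (uniquename) VALUES ('SPAC1AX.01-promoter')")

# Only the gene gets a location; the coordinates are invented for the example.
conn.execute("""
    INSERT INTO featureloc (feature_id, fmin, fmax)
    SELECT feature_id, 100, 200 FROM feature WHERE uniquename = 'SPAC1AX.01'
""")


def features_without_location(connection):
    """Return uniquenames of features that have no featureloc row."""
    rows = connection.execute("""
        SELECT f.uniquename
        FROM feature f
        LEFT JOIN featureloc l ON l.feature_id = f.feature_id
        WHERE l.featureloc_id IS NULL
        ORDER BY f.uniquename
    """).fetchall()
    return [row[0] for row in rows]


print(features_without_location(conn))
```

The promoter is stored like any other feature; it just never gets a row in the location table.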
ooh, one way we could use coordinate-less features for genes would be to capture those phenotypes for genes that haven't been cloned (PMID:7865880; cloned genes annotated in e8278d6fd6c40079)
not at all urgent, and kind of off on a tangent away from promoters, but could chat about it some time if there's any interest ...
Stumbling block: Artemis can't cope with features that don't have coordinates, so we need a different route to get them into Chado. What should we put where?
I've been ignoring this because of the low priority.
I think a good first pass would be to have a tab-delimited file with columns like (off the top of my head):
where "relations" (or whatever it should be called) would be something like:
promoter_of(SPAC1AX.01)
we need that in Chado to connect things like promoters to the appropriate gene.
We can add columns as we think of things.
The file should be put in pombe-embl somewhere initially.
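As a rough sketch of how a loader might read the "relations" value, the snippet below splits an expression such as promoter_of(SPAC1AX.01) into the relation name and the gene identifier. The function name and the regular expression are hypothetical, and it assumes one relation per value; nothing here reflects the actual loader code.

```python
import re

# ASSUMPTION: a relation looks like name(target), e.g. promoter_of(SPAC1AX.01),
# with exactly one relation per value.
RELATION_RE = re.compile(r"^\s*(?P<relation>\w+)\((?P<target>[^()\s]+)\)\s*$")


def parse_relation(text):
    """Split 'promoter_of(SPAC1AX.01)' into ('promoter_of', 'SPAC1AX.01')."""
    match = RELATION_RE.match(text)
    if match is None:
        raise ValueError(f"not a relation expression: {text!r}")
    return match.group("relation"), match.group("target")


relation, gene = parse_relation("promoter_of(SPAC1AX.01)")
print(relation, gene)
```

Rejecting anything that doesn't match, rather than guessing, would keep typos in the file from silently creating unconnected features.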
How about I create a file with a couple of promoters that we know about
from the canto load log, and you see if it works? We can also see what
else we think of adding once there's an actual file to futz with.
m
Last edit: Midori Harris 2013-12-18
Yes, Chado has to load the things represented by the IDs in
the extensions. For example, that's why we have the mini-chebi file, and
why there are occasionally log messages like "can't find term with ID:
CHEBI:17306".
Yep, an annotation will fail to load (with a warning) if a feature identifier in an extension isn't in Chado.
added to pombe-embl svn:
supporting_files/features_without_coordinates.txt
mini-ontologies/SO_feature_relations.obo
also sending email with more fluff ...
Kim, how much work is it to implement this?
This will stop the warnings
warning in d3c28a9773ee0a38: can't find feature using identifier: SPBC16G5.15c-promoter (error can't find feature for: SPBC16G5.15c-promoter)
warning in d3c28a9773ee0a38: can't find feature using identifier: SPAC23C11.16-promoter (error can't find feature for: SPAC23C11.16-promoter)
warning in d3c28a9773ee0a38: can't find feature using identifier: SPAC6G10.12c-promoter (error can't find feature for: SPAC6G10.12c-promoter)
warning in d3c28a9773ee0a38: can't find feature using identifier: SPAC821.08c-promoter (error can't find feature for: SPAC821.08c-promoter)
warning in d3c28a9773ee0a38: can't find feature using identifier: SPBC32F12.09-promoter (error can't find feature for: SPBC32F12.09-promoter)
warning in d3c28a9773ee0a38: can't find feature using identifier: SPAC821.08c-promoter (error can't find feature for: SPAC821.08c-promoter)
warning in d3c28a9773ee0a38: can't find feature using identifier: SPAC6G10.12c-promoter (error can't find feature for: SPAC6G10.12c-promoter)
in the logs. It isn't urgent, but as the file is ready we should increase the priority if it is quick/easy.
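In case it helps with keeping the features file up to date, here is a small sketch for pulling the distinct missing identifiers out of warnings like the ones above. It assumes the warning wording stays exactly as shown in this log; the function name is made up for illustration.

```python
import re

# ASSUMPTION: warnings keep the wording seen in the load log above.
MISSING_RE = re.compile(r"can't find feature using identifier: (\S+)")


def missing_feature_ids(log_lines):
    """Return the distinct missing identifiers, in order of first appearance."""
    seen = {}
    for line in log_lines:
        match = MISSING_RE.search(line)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


sample = [
    "warning in d3c28a9773ee0a38: can't find feature using identifier: "
    "SPAC821.08c-promoter (error can't find feature for: SPAC821.08c-promoter)",
    "warning in d3c28a9773ee0a38: can't find feature using identifier: "
    "SPAC6G10.12c-promoter (error can't find feature for: SPAC6G10.12c-promoter)",
    "warning in d3c28a9773ee0a38: can't find feature using identifier: "
    "SPAC821.08c-promoter (error can't find feature for: SPAC821.08c-promoter)",
]
print(missing_feature_ids(sample))
```

Repeated warnings for the same promoter collapse to one entry, so the output can be compared directly against what is already in the file.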
It shouldn't be a big job. It would take less than half a day I think. I've been ignoring it because of the low priority.
Thanks Midori, I've added that to the build script and it loads fine.
hooray!