From: SourceForge.net <no...@so...> - 2012-11-26 14:28:53
Chado item #3367967, was opened at 2011-07-15 06:41
Message generated for change (Comment added) made by val_wood
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=2096276&aid=3367967&group_id=65526

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: Accepted
>Priority: 5
Private: No
Submitted By: Valerie Wood (val_wood)
Assigned to: Kim Rutherford (kim_rutherford)
Summary: GO/ FYPO export from Chado

Initial Comment:
when we export the GO data we need to make sure that only synonyms are in the synonym column, not like current GAF

----------------------------------------------------------------------

>Comment By: Valerie Wood (val_wood)
Date: 2012-11-26 06:28

Message:
Lowering priority, we can raise when the FYPO export is more urgent

val

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-11-06 02:25

Message:
I think the GAF part is done, but I haven't started on the FYPO export yet.

----------------------------------------------------------------------

Comment By: Valerie Wood (val_wood)
Date: 2012-10-09 00:46

Message:
I think so, it is the newer way, and I'm not sure but I think GPAD expects it, so it would make both the same

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-10-08 17:04

Message:
Do you think a SO ID would be better? It's easy enough to change.

----------------------------------------------------------------------

Comment By: Valerie Wood (val_wood)
Date: 2012-10-08 16:37

Message:
You probably know this but column 12 (the feature type) can contain the SO feature ID.
val

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-10-07 21:40

Message:
The ND rows are now in, with no complaints from filter-gene-association.pl

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-10-07 16:42

Message:
This is mostly done. The GO filter-gene-association.pl script now reports only 20-ish errors, which we are looking at. I haven't added the ND annotations. I'll do that next.

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-09-30 21:35

Message:
There's another problem with dates. Some are stored in the GO way "20121022" (from GAF files) and some are stored in the ISO standard way "2012-10-22" (from the curation tool). The GAF files I'm writing have both formats, which the GAF file checker disapproves of. I'll rationalise things.

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-09-29 02:28

Message:
It turns out that I had implemented the column 17 stuff months ago and then forgot. So it's already done.

----------------------------------------------------------------------

Comment By: Midori Harris (gomidori)
Date: 2012-09-27 04:09

Message:
might as well do it then (and see if it freaks 'em out as much as it did when we put stuff in column 16 ;) )

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-09-27 03:57

Message:
Thanks. It should be easy to add it where there is a "column_17" property in Chado.

----------------------------------------------------------------------

Comment By: Midori Harris (gomidori)
Date: 2012-09-27 03:55

Message:
> do we need to put anything in column 17?
It's only desirable for the tiny number of annotations where there's a tag like "column_17=PR:000027503;". Even for them, it's optional on GO's end, and the reason to include it is to provide a bit more specificity about what form of a gene product is doing the business. If you do include it, just write out the "PR:000027503" part. There will probably be more column 17 entries in future, but they'll accumulate slowly.

m

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-09-27 03:51

Message:
The GAF writing is coming along, there are still some problems to fix but it now puts things like "happens_during(GO:0071276),has_regulation_target(SPBC660.07)" in column 16. You've probably told me before, but do we need to put anything in column 17? This page implies that it's optional:
http://www.geneontology.org/GO.format.gaf-2_0.shtml

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-09-25 02:37

Message:
The Chado database was missing some dates too - all dates from the input GAF files weren't being stored. That's fixed too and I'm now re-loading. I don't think that's a problem for Ensembl as the dates aren't shown on pombase.org.

----------------------------------------------------------------------

Comment By: Valerie Wood (val_wood)
Date: 2012-09-05 06:46

Message:
This fits with what I would say. At present I delete pseudos before submission. If a pseudo warrants any kind of annotation we would probably remove the pseudo tag and make another feature like a non coding RNA with a regulatory role

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-09-05 03:14

Message:
Right so. Thanks for that. I won't include the pseudogenes in the GAF output. Unfortunately I've meanwhile found a worse anomaly in Chado - there are lots of annotations without evidence code.
I'm trying to track down how that happened.

----------------------------------------------------------------------

Comment By: Midori Harris (gomidori)
Date: 2012-09-05 03:09

Message:
Your question has prompted me to trawl through a fairly old (March 2006) exchange on GO & SO mailing lists, which can be scary. It starts here:
https://mailman.stanford.edu/pipermail/go-discuss/2006-March/001782.html

... goes on and on, and some of it comes out in separate threads in the archive:
https://mailman.stanford.edu/pipermail/go-discuss/2006-March/thread.html

... and it boils down to "it's complicated". But I think some usable simple answers would be ...

> Does it make sense for pseudogenes to have GO annotation?

Probably not; maybe with some rare exceptions, but if we did just say "never" we probably wouldn't lose much. The exceptions would come up if someone finds that a "pseudogene" is transcribed, and the transcript does something, usually regulatory, e.g. acts as antisense RNA for a "live" copy of the gene. But the email exchange included arguments that if that happens the feature shouldn't be called a pseudogene anyway ... but what if the community expect to see it called a pseudogene ... argh argh aaargh. That leads me to the second simple answer:

> If so, should the annotation be exported to the GAF file?

No. As long as we call something a pseudogene, we should not export any GO annotations for it to the GAF, even if we want to have the annotations for one of the exceptional circumstances. We don't want to piss them off, and we *really* don't want to reopen the can of pseudo-worms!

m

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-09-04 21:06

Message:
Does it make sense for pseudogenes to have GO annotation?
If so, should the annotation be exported to the GAF file?:
http://www.pombase.org/spombe/result/SPAC23D3.05c

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-08-02 04:24

Message:
Only add NDs for protein coding genes. Add an ND annotation for each aspect that has no other annotation.

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-07-09 03:46

Message:
What's the format of the "ND mapping" file?

----------------------------------------------------------------------

Comment By: Valerie Wood (val_wood)
Date: 2012-07-09 03:16

Message:
When we do the export we need to also create a ND mapping for when a gene product is missing a particular aspect. I checked the weekend before I went away that these were all valid (most are for conserved unknowns and sequence orphans, and as far as I am aware there are no outstanding papers for any of these). The numbers will be quite low (see Venn attached) (in fact the numbers will be slightly less than this because I managed to squeeze a few more ISS annotations in)

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-07-08 02:32

Message:
I've added a separate tracker item for the GeneDB style mapping files:
https://sourceforge.net/tracker/?func=detail&aid=3541336&group_id=65526&atid=2096276

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2012-07-06 03:23

Message:
Apparently I implemented some of this back in January. I don't remember doing it, but it probably wasn't someone else. The code as is doesn't write out the annotation extensions, so that still needs doing.
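
The ND rule from the 2012-08-02 comment (only protein coding genes, one ND annotation per aspect that has no other annotation) could be sketched roughly as below. This is not the exporter's actual code. The function name, the input shapes and the "protein_coding" type label are made up for illustration, and annotating to the three GO root terms is the usual GO convention for ND rather than something stated in this thread.

```python
# GAF aspect codes: P = biological process, F = molecular function,
# C = cellular component.
ASPECTS = ("P", "F", "C")

# Assumption: ND annotations point at the GO root term of the aspect
# (GO convention, not taken from this thread).
ROOT_TERMS = {"P": "GO:0008150", "F": "GO:0003674", "C": "GO:0005575"}


def missing_nd_annotations(genes, annotations):
    """Return (gene_id, aspect, root_term, "ND") rows that need adding.

    genes: dict mapping gene ID -> feature type (hypothetical labels)
    annotations: iterable of (gene_id, aspect) pairs already present
    """
    # Collect the aspects each gene is already annotated to.
    annotated = {}
    for gene_id, aspect in annotations:
        annotated.setdefault(gene_id, set()).add(aspect)

    rows = []
    for gene_id in sorted(genes):
        # Only protein coding genes get ND rows; pseudogenes etc. are skipped.
        if genes[gene_id] != "protein_coding":
            continue
        for aspect in ASPECTS:
            if aspect not in annotated.get(gene_id, set()):
                rows.append((gene_id, aspect, ROOT_TERMS[aspect], "ND"))
    return rows
```

For example, a protein coding gene with only a process annotation would get two ND rows (function and component), and a pseudogene would get none.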
----------------------------------------------------------------------

Comment By: Valerie Wood (val_wood)
Date: 2012-02-07 08:41

Message:
Also we need to replace some "Mapping files" generated from GeneDB. They are all in simple tab-delimited format.
Mapping files described here:
http://www.pombase.org/downloads/data-mapping

----------------------------------------------------------------------

Comment By: Valerie Wood (val_wood)
Date: 2012-01-23 14:31

Message:
GPAD/GPI documentation
http://www.geneontology.org/GO.format.gpi.shtml
http://www.geneontology.org/GO.format.gpad.shtml

Not sure this documentation is up to date, check before use

----------------------------------------------------------------------

Comment By: Valerie Wood (val_wood)
Date: 2012-01-06 14:05

Message:
extending this item to cover phenotype annotations
Sent this to Chris M when done

----------------------------------------------------------------------

Comment By: Kim Rutherford (kim_rutherford)
Date: 2011-12-30 02:32

Message:
I'll do the GAF export first as it's easier and probably more immediately useful. It's also easy to test because we can compare the output to the current GeneDB GAF files.

----------------------------------------------------------------------
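
On the mixed date formats reported in the 2012-09-30 comment ("20121022" from GAF files, "2012-10-22" from the curation tool): one way to rationalise them at export time is to coerce both into the YYYYMMDD form that the GAF 2.0 date column uses. The thread does not say how this was actually fixed, so the helper below (its name included) is only an illustrative sketch.

```python
from datetime import datetime


def to_gaf_date(value):
    """Return a date string as YYYYMMDD.

    Accepts the GO style ("20121022") or the ISO style ("2012-10-22");
    anything else raises ValueError instead of being guessed at.
    """
    text = value.strip()
    if len(text) == 8 and text.isdigit():
        fmt = "%Y%m%d"
    elif len(text) == 10:
        fmt = "%Y-%m-%d"
    else:
        raise ValueError(f"unrecognised date format: {value!r}")
    # strptime also rejects impossible dates such as month 13.
    return datetime.strptime(text, fmt).strftime("%Y%m%d")
```

Failing loudly on unknown formats seems safer here than passing them through, since the GAF file checker would flag them later anyway.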