#438 some "disease" annotations are really to "genes"

hierarchy
closed-fixed
None
5
2015-05-03
2015-03-27
No

There are some annotations made to OMIM loci. Is this intentional?
The following are some "+" entries in OMIM. Each is actually some kind of genomic locus (possibly a gene, possibly a broader range such as a cytogenetic band) that has one or more phenotypes associated with it.
100650,107680,107730,107741,109270,114835,116790,124060,132810,138300,141800,141900,147892,151430,152200,152780,159555,168820,173470,177400,182870,211100,222745,309850,314200

Should these annotations be migrated to the relevant disease instead?

The first few examples are:
100650 --> 610251
=> DONE

107680 --> 105200 or 604091 (but there are also other disease/phenotypes here that don't have omim ids, "ApoA-I and apoC-III deficiency, combined" and "Corneal clouding, autosomal recessive")
=> Currently difficult because the entry 107680 refers to the gene but also two different diseases.

107730 --> 144010, 615558
=> DONE

107741 --> 104310, 611771, 269600, 603075 (plus Hyperlipoproteinemia, type III and {Myocardial infarction susceptibility} )
109270 --> A LOT OF THINGS.
=> Here, 107741 still refers to hyperlipidemia type III (there is no phenotype entry for this disease)

This remains difficult because the disease definitions are not clean in all cases. We probably need to keep track of some of these nuances separately, but that is not easy to maintain. Please let me know about any other entries like this; I had thought that we had gotten rid of most of them.
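
The migration discussed above could be sketched roughly as follows. This is only an illustration, not the HPO project's actual pipeline: the `(omim_id, hpo_term)` tuple format, the names `LOCUS_TO_DISEASES`, `NEEDS_REVIEW`, and `migrate`, and the choice to copy an annotation to every mapped disease are all assumptions. The mapping contains only the two examples marked DONE in this ticket, and the HPO term in the usage line is a placeholder.

```python
# Hypothetical sketch; names and data layout are assumptions, not HPO tooling.

# OMIM "+" (locus) entry -> OMIM phenotype entries, limited to the
# examples marked DONE in this ticket.
LOCUS_TO_DISEASES = {
    "100650": ["610251"],
    "107730": ["144010", "615558"],
}

# "+" entries from this ticket that cannot be migrated cleanly yet.
NEEDS_REVIEW = {"107680", "107741", "109270"}


def migrate(annotations):
    """Split (omim_id, hpo_term) pairs into migrated and unresolved lists."""
    migrated, unresolved = [], []
    for omim_id, hpo_term in annotations:
        if omim_id in LOCUS_TO_DISEASES:
            # Assumption: copy the annotation to every mapped disease.
            # In practice a curator would decide which disease applies.
            for disease_id in LOCUS_TO_DISEASES[omim_id]:
                migrated.append((disease_id, hpo_term))
        elif omim_id in NEEDS_REVIEW:
            # Keep the original ID and flag it for manual curation.
            unresolved.append((omim_id, hpo_term))
        else:
            # Not a known "+" entry: leave the annotation unchanged.
            migrated.append((omim_id, hpo_term))
    return migrated, unresolved


# Usage with a placeholder HPO term:
done, todo = migrate([("100650", "HP:0000001"), ("107680", "HP:0000001")])
```

Keeping the unresolved entries in a separate list is one way to track the nuances mentioned above without losing the original annotations.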

Discussion

  • Peter N. Robinson

    • status: unread --> pending
     
  • Peter N. Robinson

    I think these issues partially reflect the way OMIM has developed over the last five years. Many of the "+" entries were originally combined gene and disease entries, equivalent to a "#" entry plus a "*" entry. Some of the entries mentioned here are really complex, with some of the disease phenotypes being part of the "+" entry and others being part of separate "#" entries.
    A major problem for modelling is the susceptibility entries. As the HPO project moves into the field of common disease, this data becomes really valuable. Thus, we are going to need to find a way of ingesting the susceptibility entries from OMIM and many other sources. We definitely need funding for this.

     
  • Peter N. Robinson

    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -6,7 +6,18 @@
    
     the first few examples are:
     100650 --> 610251
    +=> DONE
    +
     107680 --> 105200 or 604091 (but there are also other disease/phenotypes here that don't have omim ids, "ApoA-I and apoC-III deficiency, combined" and "Corneal clouding, autosomal recessive")
    +=> Currently difficult because the entry 107680 refers to the gene but also two different diseases.
    +
    +
     107730 --> 144010, 615558
    +=> DONE
    +
     107741 --> 104310, 611771, 269600, 603075 (plus Hyperlipoproteinemia, type III and {Myocardial infarction susceptibility} )
     109270 --> A LOT OF THINGS.
    +=> Here, 107741 still refers to hyperlipidemia type III (there is no phenotype entry for this disease)
    +
    +
    +This remains difficult as the disease definitions are not clean in all cases. We need to probably keep track of some of these nuances separately, but it is not easy to maintain. Please let me know about any other entries like this, I had thought that we had gotten rid of most of them.
    
    • status: pending --> closed-fixed
    • assigned_to: Peter N. Robinson
     
