#386 New Terms: several new terms relating to splice sites and splice site variants

open
nobody
None
5
2013-06-20
2013-06-04
No

Hi

I would like some extra terms describing splice sites and splice site variants please.

The first case should hopefully be straightforward.

Name : extended intronic splice region (Just a suggestion, could probably be improved?)
Definition : Region of intronic sequence within 10 bases of an exon
Parent : Probably SO:0000835 (primary_transcript_region) as it won't nest under the existing splice_region term (SO:0001902)

I would also like a variant term for a sequence change overlapping the same region.

Name : extended intronic splice region variant
Definition : A sequence variant occurring in the intron within 10 bases of an exon
Parent : Probably SO:0001568 (splicing_variant); again, this doesn't really fit under the existing splice_region_variant (SO:0001630) term
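
To make the two definitions above concrete, here is a minimal sketch of how the proposed region and its variant term might be evaluated. The function names, the 1-based inclusive coordinates, and the choice to count all 10 intronic bases next to each exon (including the canonical splice site bases) are assumptions made for illustration; none of this comes from SO itself.

```python
def extended_intronic_splice_regions(intron_start, intron_end, flank=10):
    """Return the parts of an intron lying within `flank` bases of an exon.

    Coordinates are 1-based and inclusive (an assumption of this sketch).
    The result is a list of (start, end) tuples, one per intron edge.
    """
    length = intron_end - intron_start + 1
    if length <= 0:
        raise ValueError("intron_end must not be smaller than intron_start")
    if length <= 2 * flank:
        # Short intron: the two flanks meet or overlap, so the whole intron qualifies.
        return [(intron_start, intron_end)]
    return [
        (intron_start, intron_start + flank - 1),  # flank next to the upstream exon
        (intron_end - flank + 1, intron_end),      # flank next to the downstream exon
    ]


def variant_overlaps(var_start, var_end, regions):
    """True if the variant interval shares at least one base with any region."""
    return any(var_start <= r_end and var_end >= r_start
               for r_start, r_end in regions)


regions = extended_intronic_splice_regions(101, 200)
print(regions)
print(variant_overlaps(110, 110, regions), variant_overlaps(111, 111, regions))
```

A variant at intron position 110 (the 10th base) would receive the proposed variant term under these assumptions, while one at 111 would not.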

The second is a little more complicated: I want to define an extra position as part of the 5' cis splice site. The current five_prime_cis_splice_site term (SO:0000163) covers the 1st and 2nd intronic positions after the end of the exon; I want to include the 5th intronic position too. The only way I can see to do this without breaking the existing definitions is to add 2 new terms: one to define the extra position, and another, inserted into the inheritance tree, to aggregate the new term with the existing ones.

The 2 new terms are as follows:

Name : extended cis splice site
Definition : Intronic positions associated with cis-splicing. Contains the first and second positions immediately before the exon and the first, second and fifth positions immediately after.

Name : non-continuous five prime cis splice site position
Definition : Fifth intronic position after the intron exon boundary, close to the 5' edge of the intron.
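
As a rough illustration of which bases the two definitions above would cover, the following sketch lists the positions for a single intron. The function name, the strand handling, and the 1-based inclusive genomic coordinates are assumptions of this example, not part of the proposal.

```python
def extended_cis_splice_site_positions(intron_start, intron_end, strand="+"):
    """List the intronic positions of the proposed extended cis splice site.

    Coordinates are 1-based, inclusive genomic positions with
    intron_start <= intron_end (an assumption of this sketch).
    Positions are returned in transcript order.
    """
    if intron_end - intron_start + 1 < 5:
        raise ValueError("intron is too short to have a +5 position")
    if strand == "+":
        # 5' edge of the intron is at intron_start.
        five_prime = [intron_start, intron_start + 1, intron_start + 4]  # +1, +2, +5
        three_prime = [intron_end - 1, intron_end]                       # -2, -1
    elif strand == "-":
        # 5' edge of the intron is at intron_end on the reverse strand.
        five_prime = [intron_end, intron_end - 1, intron_end - 4]        # +1, +2, +5
        three_prime = [intron_start + 1, intron_start]                   # -2, -1
    else:
        raise ValueError("strand must be '+' or '-'")
    return {"five_prime": five_prime, "three_prime": three_prime}


print(extended_cis_splice_site_positions(101, 200, "+"))
print(extended_cis_splice_site_positions(101, 200, "-"))
```

The third entry of `five_prime` is the single base that the "non-continuous five prime cis splice site position" term would describe.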

Integrating this into the existing term tree is tricky. We could try the following.

Current structure

SO:0000162 (splice_site)
        -> SO:0001419 (cis_splice_site)
        -> SO:0001533 (cryptic_splice_site)
        -> SO:0001420 (trans_splice_site)

Suggested new structure

SO:0000162 (splice_site)
        -> SO:NEW (extended_cis_splice_site)
                -> SO:0001419 (cis_splice_site) + children
                -> SO:NEW (non-continuous five prime cis splice site position)
        -> SO:0001533 (cryptic_splice_site)
        -> SO:0001420 (trans_splice_site)
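
The suggested structure can also be written as a small child-to-parent map, which makes it easy to see what each new term would inherit from. This is only a sketch: the underscore spellings of the new term names are hypothetical, and every edge is treated as a plain parent link without deciding which relation type it should be.

```python
# Child -> parent links for the suggested structure (new terms have no SO id yet).
SUGGESTED_PARENTS = {
    "extended_cis_splice_site": "splice_site",
    "cis_splice_site": "extended_cis_splice_site",
    "non_continuous_five_prime_cis_splice_site_position": "extended_cis_splice_site",
    "cryptic_splice_site": "splice_site",
    "trans_splice_site": "splice_site",
}


def ancestors(term, parents):
    """Walk up the parent links and return the ancestors, nearest first."""
    chain = []
    while term in parents:
        term = parents[term]
        chain.append(term)
    return chain


print(ancestors("non_continuous_five_prime_cis_splice_site_position", SUGGESTED_PARENTS))
print(ancestors("cis_splice_site", SUGGESTED_PARENTS))
```

Walking up from the new +5 position term reaches splice_site (SO:0000162), so the +5 term ends up under a parent whose definition only mentions positions adjacent to the junction.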

This seems to be the most suitable given the term names; however, it nests the new +5 position term beneath SO:0000162 (splice_site). Currently SO:0000162 specifies 2 positions immediately adjacent to the splice junction, so this would alter its definition, which is not ideal. However, SO:0000162 is also the parent of the trans splice site terms, which don't abide by these 2 positions either.

Finally, I would also like a variant term defined as a change to the new extended_cis_splice_site term.

Name : extended cis splice site variant
Definition : A sequence variant occurring in, or altering the spacing of, the extended cis splice site.
Parent : SO:0001568 (splicing_variant)

Thank you

Andy

Discussion

  • Karen Eilbeck

    Karen Eilbeck - 2013-06-04

    Hi Andy
    These terms look sensible to me. I had thought we had covered the +5 position, but maybe we just discussed it.
    The proposed hierarchy may be a little off. It looks like the relation between cis splice and extended cis splice is a "part of" or "contained by" rather than an is_a. I will work through the logic before I make the edits.
    I'll let you know when SO is updated. I may not get to it this week.

    --K

     
  • Andy Menzies

    Andy Menzies - 2013-06-20

    Hi Karen

    You may have talked about it with the Ensembl guys while they were developing Variant Effect Predictor. I was talking to them at that time, which is why it may have come up.

    I've looked through the 2.5.1 release and can't find a term covering the +5 position. It's possible you've already added it to the main development branch, but I don't know how to access that to look for myself.

    My suggested hierarchy was a best guess and a starting point; I didn't think I'd get it right the first time.

    Cheers

    Andy

     
