
#207 QC: check whether allele names are unique genome wide

next-load
open
nobody
None
1
2014-04-23
2013-08-05
No

There might not be any, but we can disambiguate if we find out how many there are.

Discussion

  • Kim Rutherford

    Kim Rutherford - 2013-08-10

    Here's the list of duplicated allele names, with the number of times each name is used:

         name          count
    git6-261                  2
    wee1-50                   2
    end4-507                  2
    T42A                      2
    abc1delta                 2
    crm1-N1                   2
    prp1                      2
    heterozygous diploid      3
    cdc22-M45                 3
    K56R                      3
    homozygous diploid        4
                            159
    no_name                 559
    

    And here are the uniquenames/identifiers of the alleles with duplicated names:

            name                 uniquename
    heterozygous diploid   SPAC17H9.09c:allele-4
    heterozygous diploid   SPAC4A8.12c:allele-2
    heterozygous diploid   SPBC1289.03c:allele-2
    crm1-N1                SPAC1805.17:allele-6
    crm1-N1                SPCC663.03:allele-4
    T42A                   SPBC428.16c:allele-31
    T42A                   SPAC1565.06c:allele-4
    cdc22-M45              SPAC1F7.05:allele-3
    cdc22-M45              SPBC582.03:allele-7
    cdc22-M45              SPBC11B10.09:allele-14
    homozygous diploid     SPAC1D4.13:allele-3
    homozygous diploid     SPAC22H10.07:allele-3
    homozygous diploid     SPAC16E8.09:allele-3
    homozygous diploid     SPBC21.05c:allele-3
    git6-261               SPBC106.10:allele-13
    git6-261               SPAC926.04c:allele-6
    wee1-50                SPAC144.13c:allele-2
    wee1-50                SPCC18B5.03:allele-2
    abc1delta              SPBC2D10.18:allele-1
    abc1delta              SPAC9E9.12c:allele-3
    prp1                   SPBC6B1.07:allele-3
    prp1                   SPAC29E6.02:allele-2
    end4-507               SPAC688.11:allele-2
    end4-507               SPAC4F10.15c:allele-3
    K56R                   SPAC1834.04:allele-2
    K56R                   SPBC8D2.04:allele-2
    K56R                   SPBC1105.11c:allele-2
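A check like the one behind the two listings above could be sketched as follows. This is only an illustration, not the query that was actually run: the function name `find_duplicate_names` is hypothetical, the input is assumed to be plain (name, uniquename) pairs, and the last row of the sample data is made up to show a name that is not duplicated.

```python
from collections import defaultdict


def find_duplicate_names(alleles):
    """Group allele uniquenames by allele name and keep names used more than once.

    `alleles` is an iterable of (name, uniquename) pairs (an assumed input shape).
    """
    by_name = defaultdict(list)
    for name, uniquename in alleles:
        by_name[name].append(uniquename)
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}


# A few rows copied from the listing above, plus one hypothetical unique allele.
alleles = [
    ("wee1-50", "SPAC144.13c:allele-2"),
    ("wee1-50", "SPCC18B5.03:allele-2"),
    ("K56R", "SPAC1834.04:allele-2"),
    ("K56R", "SPBC8D2.04:allele-2"),
    ("K56R", "SPBC1105.11c:allele-2"),
    ("example-1", "SPAC0000.00:allele-1"),  # hypothetical, not from the database
]

duplicates = find_duplicate_names(alleles)

# Print name and count, least-duplicated first, like the first listing.
for name, ids in sorted(duplicates.items(), key=lambda item: len(item[1])):
    print(f"{name:<24}{len(ids)}")
```

Keeping the uniquenames alongside each name means the same result serves both the count listing and the per-allele listing.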
    
     
  • Valerie Wood

    Valerie Wood - 2013-08-12

    Good. Not too many.

    I think we can resolve the K56R type by always including the gene name in front of these name types (hht1-K56R etc.).

    Some appear to be mistakes (I don't think wee1-50 is a true allele of swr1).

    The only one I can see which might be a problem right now is abc1delta... I think that has been used twice as an allele name. Maybe in this instance abc1 could be made a synonym of allele ybt1(delta) (the approved name for SPAC9E9.12c).

    Looks like a Friday afternoon task for us to do the fixes first and take it from there.
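The gene-name prefix convention suggested above for names like K56R could be sketched like this. It is a minimal illustration under stated assumptions: `qualify_allele_name` and `GENE_NAMES` are hypothetical names, and the systematic-ID-to-gene-name mapping is assumed for the example and should be checked against the database before any real use.

```python
# Assumed mapping for illustration only; verify against the database.
GENE_NAMES = {
    "SPAC1834.04": "hht1",
    "SPBC8D2.04": "hht2",
    "SPBC1105.11c": "hht3",
}


def qualify_allele_name(name, uniquename, gene_names):
    """Return the allele name prefixed with its gene name, e.g. K56R -> hht1-K56R.

    The systematic ID is taken from the part of the uniquename before ':'.
    If no gene name is known, the systematic ID itself is used as the prefix.
    Names that already carry the prefix are returned unchanged.
    """
    systematic_id = uniquename.split(":", 1)[0]
    prefix = gene_names.get(systematic_id, systematic_id) + "-"
    if name.startswith(prefix):
        return name
    return prefix + name


for uniquename in ("SPAC1834.04:allele-2", "SPBC8D2.04:allele-2", "SPBC1105.11c:allele-2"):
    print(qualify_allele_name("K56R", uniquename, GENE_NAMES))
```

Falling back to the systematic ID keeps the result unique even for genes without an approved name.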

     
  • Valerie Wood

    Valerie Wood - 2014-04-23

    crm1-N1 SPAC1805.17:allele-6
    crm1-N1 SPCC663.03:allele-4
    cdc22-M45 SPAC1F7.05:allele-3
    cdc22-M45 SPBC582.03:allele-7
    cdc22-M45 SPBC11B10.09:allele-14
    git6-261 SPBC106.10:allele-13
    git6-261 SPAC926.04c:allele-6
    wee1-50 SPAC144.13c:allele-2
    wee1-50 SPCC18B5.03:allele-2
    prp1 SPBC6B1.07:allele-3
    prp1 SPAC29E6.02:allele-2
    end4-507 SPAC688.11:allele-2
    end4-507 SPAC4F10.15c:allele-3

    appear to be typos and will be fixed.

    I guess these are not an issue anyway, as each allele still has its own uniquename.
    I will lower the priority to 1, and we can have another look in the future to see whether any more have accumulated.

     
  • Valerie Wood

    Valerie Wood - 2014-04-23
    • Priority: 3 --> 1
     
