
ParentCall: what is it doing, and do I need it?

2021-06-16
2021-06-17
  • Lorenzo Bertola

    Lorenzo Bertola - 2021-06-16

    Hello Pasi,

    I am starting a new linkage mapping project, and this time I'm looking at ParentCall2 in more detail. I'm wondering why it makes certain changes to the data, and whether they are needed or not.

    I have genotype data from double-digest RAD-seq-like sequencing. I obtained FASTQ files and called genotypes with the Stacks de novo pipeline. I have large families (100-300 F1 offspring). I have already filtered the data to retain only informative markers. My input data looks something like this (2 markers, displaying only the 2 parents and 2 offspring; the first marker is an AB x AA cross, the second marker is an AB x AB cross).
    NOTE: I separate the genotype likelihoods of each individual with dashes here to make them easier to read.

    51697_25 POS 0 1.0 0 0 0 0 0 0 0 0---1.0 0 0 0 0 0 0 0 0 0---1 1 1 1 1 1 1 1 1 1---0 1.0 0 0 0 0 0 0 0 0
    1152_28 POS 0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0---1 1 1 1 1 1 1 1 1 1

    After running ParentCall2, the same data looks like this:

    51697_25 POS 0 1.0 0 0 0 0 0 0 0 0---1.0 0 0 0 0 0 0 0 0 0---1.0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0
    1152_28 POS 0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0---0.5 1.0 0 0 0.5 0 0 0 0 0

    From my understanding, the offspring with no available data have been given genotype likelihoods that match the expected segregation. For the AB x AA cross they are equally likely to be AA or AB (represented by 1.0 1.0 0 0 0 0 0 0 0 0), and for the AB x AB cross they are twice as likely to be AB as to be AA or BB (represented by 0.5 1.0 0 0 0.5 0 0 0 0 0).
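The expected-segregation reading above can be sketched in a few lines of Python. This is only an illustration of the interpretation given in this post, not ParentCall2's actual implementation. It assumes that the 10 likelihood columns are ordered AA, AC, AG, AT, CC, CG, CT, GG, GT, TT (with A and C playing the roles of A and B), and the function name `expected_offspring_likelihoods` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed order of the 10 likelihood columns (not stated in the post).
GENOTYPES = ["AA", "AC", "AG", "AT", "CC", "CG", "CT", "GG", "GT", "TT"]


def expected_offspring_likelihoods(parent1, parent2):
    """Mendelian offspring expectations, scaled so the most likely genotype is 1.0.

    Each parent is a two-letter genotype such as "AC". Hypothetical helper,
    written only to reproduce the vectors quoted in the post.
    """
    counts = {g: 0 for g in GENOTYPES}
    # Each parent passes on one of its two alleles with equal probability.
    for allele1, allele2 in product(parent1, parent2):
        counts["".join(sorted(allele1 + allele2))] += 1
    top = max(counts.values())
    return [float(Fraction(counts[g], top)) for g in GENOTYPES]


# AB x AA cross (written here as AC x AA)
print(expected_offspring_likelihoods("AC", "AA"))
# AB x AB cross (written here as AC x AC)
print(expected_offspring_likelihoods("AC", "AC"))
```

Under these assumptions the two calls give the same vectors as the imputed offspring in the ParentCall2 output above: AA and AB equally likely for the first cross, and AB twice as likely as AA or BB for the second.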

    I have compared the input and output files of the ParentCall2 module, and the only difference I find is the imputation of the offspring with missing data, as shown above.
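A comparison like this can be automated. The sketch below is a minimal example that works on the dash-separated display format used in this post (marker, position, then one 10-value field per individual separated by `---`); a real input file would need a different split. The helper names `parse_display_line` and `changed_individuals` are hypothetical.

```python
def parse_display_line(line):
    """Split one display line into marker, position and per-individual vectors."""
    marker, pos, rest = line.split(maxsplit=2)
    individuals = [
        tuple(float(value) for value in field.split())
        for field in rest.split("---")
    ]
    return marker, pos, individuals


def changed_individuals(before, after):
    """Return 0-based indices of individuals whose likelihoods differ."""
    _, _, vectors_before = parse_display_line(before)
    _, _, vectors_after = parse_display_line(after)
    return [
        i
        for i, (x, y) in enumerate(zip(vectors_before, vectors_after))
        if x != y
    ]


# First marker from this post, before and after ParentCall2.
before = ("51697_25 POS 0 1.0 0 0 0 0 0 0 0 0---1.0 0 0 0 0 0 0 0 0 0"
          "---1 1 1 1 1 1 1 1 1 1---0 1.0 0 0 0 0 0 0 0 0")
after = ("51697_25 POS 0 1.0 0 0 0 0 0 0 0 0---1.0 0 0 0 0 0 0 0 0 0"
         "---1.0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0")

print(changed_individuals(before, after))
```

For this marker only the third individual (index 2, the offspring with missing data) is reported as changed.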

    Furthermore, I also tried coding missing data as all 0s, like so:

    51697_25 POS 0 1.0 0 0 0 0 0 0 0 0---1.0 0 0 0 0 0 0 0 0 0---0 0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0
    1152_28 POS 0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0---0 0 0 0 0 0 0 0 0 0

    In that case the output of ParentCall2 is as follows:

    51697_25 POS 0 1.0 0 0 0 0 0 0 0 0---1.0 0 0 0 0 0 0 0 0 0---1 1 1 1 1 1 1 1 1 1 ---0 1.0 0 0 0 0 0 0 0 0
    1152_28 POS 0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0---0 1.0 0 0 0 0 0 0 0 0---1 1 1 1 1 1 1 1 1 1

    That is, missing data coded as all 0s has been converted to all 1s.

    My questions are:
    - What is the difference between missing data coded as all 0s and missing data coded as all 1s?
    - How does Lep-MAP use or interpret the three encodings (0 0 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1 1 1, 0.5 1.0 0 0 0.5 0 0 0 0 0)?
    - Is it necessary to use ParentCall2 for this type of data, considering that the only change I detect after running it is how missing data is coded?

    Thanks,
    Lorenzo

     

    Last edit: Lorenzo Bertola 2021-06-16
  • Pasi Rastas

    Pasi Rastas - 2021-06-17

    Dear Lorenzo,

    ParentCall2 should not make a difference in this case. This module is meant to do the filtering you mention in your message (e.g. removing Mendelian errors and non-informative markers).

    Cheers,
    Pasi

     
