Genetic map format

hfan
2026-01-20
2026-01-22
  • hfan

    hfan - 2026-01-20

    Hi Pasi and dear users,

    I want to phase my VCF using SHAPEIT, which requires a genetic map. The first few lines of the example file look like this:

    position COMBINED_rate(cM/Mb) Genetic_Map(cM)
    61795 0.3516610754 0
    63231 0.3500036909 0.0005026053001324
    63244 0.3494018702 0.000507147524445
    63799 0.3501262382 0.000701467586646
    64150 0.3558643956 0.0008263759895016
    64934 0.3567249058 0.0011060483156488
    65288 0.3633379498 0.001234669949878
    66370 10.0482361599 0.0121068614748898

    The genetic map constructed using Lep-MAP3 provides column 1 (position) and column 3 (Genetic_Map(cM)), and I believe column 2 is just the rate over the preceding interval, (cM_i - cM_{i-1}) / (pos_i - pos_{i-1}), with positions converted to Mb.
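    As a rough check of that guess, here is a minimal Python sketch that recomputes column 2 from columns 1 and 3. The function name add_rate_column is hypothetical and not part of Lep-MAP3 or SHAPEIT; it assumes positions are in bp, strictly increasing, and that each rate describes the interval ending at that row. The first row has no preceding interval, so its rate is left as None.

```python
def add_rate_column(positions, cumulative_cm):
    """Recompute rates (cM/Mb) from bp positions and cumulative cM.

    Assumption: the rate on row i covers the interval from row i-1 to row i.
    """
    rates = [None]  # no preceding interval for the first marker
    for i in range(1, len(positions)):
        delta_bp = positions[i] - positions[i - 1]
        delta_cm = cumulative_cm[i] - cumulative_cm[i - 1]
        if delta_bp <= 0:
            raise ValueError("positions must be strictly increasing")
        rates.append(delta_cm / (delta_bp / 1e6))  # bp -> Mb
    return rates


# First four rows of the example file quoted above.
positions = [61795, 63231, 63244, 63799]
cumulative_cm = [0.0, 0.0005026053001324, 0.000507147524445, 0.000701467586646]
rates = add_rate_column(positions, cumulative_cm)
```

    On these rows the recomputed values agree with the listed COMBINED_rate column to about six decimal places, which supports the formula for this example file.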

    The example is also ordered by both physical position AND genetic distance, whereas my Lep-MAP3 output is sorted by accumulated genetic distance, and for some of my chromosomes the physical positions are all over the place (see the attached plot; different colors indicate different chromosomes). There is a clear winner (the teal chr), so I can remove the markers from other chrs, but I still need to remove the teal dots that are not on the main line.

    Is this a solved problem? I hope I am making sense.

    Best,
    Huan

     
  • Pasi Rastas

    Pasi Rastas - 2026-01-20

    Dear Huan,

    Thank you for your question.

    Having this kind of map data is normal. My understanding is that transposable elements and other repeats make some markers map to the wrong chromosomes and positions. This can be fixed by removing the markers on the wrong chromosomes. I don't know how much such markers affect other analyses, but they stand out in linkage maps.

    In addition to these "jumping" markers, within chromosomes the map data is not necessarily informative enough to put all markers into exactly the correct position, even with otherwise error-free data. Moreover, genotyping errors and missing data make the situation worse. To solve this, I have suggested two solutions:

    1) FitStepFunction in Lep-Anchor fits a monotonic function to the map. After this the map follows the "main line".

    2) Evaluating the map in the correct physical marker order. Then the map coordinates must also be increasing along the physical order.

    Both require (about) correct physical marker order.
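    To illustrate the general idea of a monotonic fit, here is a minimal Python sketch. It is NOT FitStepFunction's actual algorithm: it uses plain least-squares isotonic regression (pool-adjacent-violators) as a stand-in, and the function names and the (chromosome, position, cM) tuple layout are assumptions made for this example. It also assumes the + orientation, so cM should increase with position.

```python
from collections import Counter


def keep_main_chromosome(markers):
    """Keep markers on the most abundant chromosome, sorted by position."""
    counts = Counter(chrom for chrom, _, _ in markers)
    main = counts.most_common(1)[0][0]
    return sorted((m for m in markers if m[0] == main), key=lambda m: m[1])


def isotonic_fit(values):
    """Least-squares non-decreasing fit via pool-adjacent-violators."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted


# Toy data (made up): one chr2 marker and one out-of-line chr1 marker.
markers = [("chr1", 100, 0.0), ("chr1", 200, 1.0), ("chr2", 150, 0.5),
           ("chr1", 300, 5.0), ("chr1", 400, 2.0), ("chr1", 500, 3.0)]
kept = keep_main_chromosome(markers)
fitted = isotonic_fit([cm for _, _, cm in kept])
```

    Note that a least-squares fit averages off-line markers into their neighbours rather than discarding them, so for real data the dedicated tool is the safer choice.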

    Cheers,
    Pasi

     
  • hfan

    hfan - 2026-01-21

    Hi Pasi,

    The FitStepFunction worked really well! Thank you! I was reinventing the wheel by doing some iterative loess smoothing...

    One quick question: what is the last line of FitStepFunction's screen output saying about "Squeeze map"?

    E.g., here is the output for one of my chromosomes:

    java -cp ~/build/lep-anchor-code/bin FitStepFunction map=order.txt

    autodetecting noChromosome=1
    The most abundant contig is chr1 with 54888 markers
    +orientation 37811
    -orientation 3443
    Using + orientation
    Squeeze map 0->57 1377->1377

    Best,
    Huan

     

    Last edit: hfan 2026-01-21
  • Pasi Rastas

    Pasi Rastas - 2026-01-22

    Dear Huan,

    The squeeze removes map positions from the map ends if there is no support for them, basically making the map start at 0 cM.

    Cheers,
    Pasi

     