Hi Pasi,
I am using LEP-MAP3 to construct a genetic map for an interspecific Vitis
population (232 samples, 500 Mb genome size, 19 chromosomes). The
resulting map appears unusually long in genetic distance (cM), and I
suspect this may be due to the very large number of SNPs I am using (about
320,000).
I noticed that the example VCF file contains only ~2,000 SNPs. Could you
please advise on recommended approaches for filtering SNPs to retain only
the most informative markers for accurate mapping? Any guidance on
thresholds or strategies within LEP-MAP3 would be greatly appreciated.
Thank you very much for your time and help.
Best regards,
Cheng
Dear Cheng,
Thank you for your question.
Lep-MAP3 should be very robust to large numbers of markers. How long are the map distances in cM? Typically, map lengths are in the range of 50-200 cM per linkage group (LG). However, for some species LGs can be longer.
What I suggest first is to check the pedigree structure. For example, calculate IBD values (module IBD) between parents and offspring, or between offspring from the same families. Also check for outliers in the number of crossovers per individual; these counts are output by OrderMarkers2.
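As a rough illustration of the outlier check, here is a minimal Python sketch. It assumes the per-individual crossover counts have already been pulled out of the OrderMarkers2 output into a dictionary; the parsing step is not shown because it depends on the exact output layout. The function name, the median/MAD rule, the cutoff factor k and the example counts are all illustrative assumptions, not part of Lep-MAP3.

```python
import statistics


def flag_crossover_outliers(counts, k=3.0):
    """Return individuals whose crossover count is unusually high.

    counts: dict mapping individual name -> total crossover count
            (assumed to be extracted beforehand from OrderMarkers2 output).
    k:      hypothetical cutoff, in robust standard deviations above the median.
    """
    values = list(counts.values())
    med = statistics.median(values)
    # Median absolute deviation; 1.4826 scales it to a standard deviation
    # under a normal distribution.
    mad = statistics.median([abs(v - med) for v in values])
    cutoff = med + k * 1.4826 * mad
    # If mad is 0, every count above the median is flagged.
    return {name: n for name, n in counts.items() if n > cutoff}


# Invented counts, for illustration only.
example_counts = {
    "ind01": 21, "ind02": 19, "ind03": 23,
    "ind04": 20, "ind05": 22, "ind06": 95,
}
outliers = flag_crossover_outliers(example_counts)
print(outliers)
```

Only unusually high counts are flagged here, since those are the ones that would stretch the map. Individuals flagged this way would be candidates for a closer look at pedigree or genotype quality, not for automatic removal.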
If you want to prune markers, prune them by their physical distance. For example, remove one of each pair of markers with distance <= L. L could be as small as 1 (bp) or up to 1000-10000 (1 kb to 10 kb). Especially for multi-family data, the proximityScale parameter might work better than pruning markers.
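The distance-based pruning could be sketched in Python roughly as follows. This is one possible reading of "remove one of each pair with distance <= L": a greedy pass that keeps a marker only if it lies more than L bp after the last kept marker on the same chromosome. The function name and the (chromosome, position) tuple input are assumptions made for illustration; in practice the positions would come from the CHROM and POS columns of the VCF.

```python
from collections import defaultdict


def prune_by_distance(markers, min_gap):
    """Greedy distance pruning (illustrative, not a Lep-MAP3 function).

    markers: iterable of (chromosome, position_bp) tuples.
    min_gap: the threshold L in bp; a marker is dropped when it lies
             within min_gap bp of the previously kept marker.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in markers:
        by_chrom[chrom].append(pos)

    kept = []
    for chrom, positions in by_chrom.items():
        last_kept = None
        for pos in sorted(positions):
            # Keep only markers strictly more than min_gap bp from the last kept one.
            if last_kept is None or pos - last_kept > min_gap:
                kept.append((chrom, pos))
                last_kept = pos
    return kept


# Invented positions, for illustration only.
example_markers = [
    ("chr1", 100), ("chr1", 600), ("chr1", 1500), ("chr1", 1501),
    ("chr2", 50), ("chr2", 1050),
]
pruned = prune_by_distance(example_markers, 1000)
print(pruned)
```

With L = 1000 this keeps three of the six example markers. With L = 1 only the marker at position 1501 is dropped, which corresponds to the mildest setting mentioned above.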
Cheers,
Pasi