Run time

2018-04-17
2018-04-22
  • Roy Francis

    Roy Francis - 2018-04-17

    Hi,
    Could you please provide example run times for these tools?

    ParentCall2
    Filtering2
    SeparateChromosomes2
    JoinSingles2All
    OrderMarkers2

    Just a rough idea, e.g. is it 10 minutes or 10 hours for x markers and y samples?
    Thanks

     
  • Pasi Rastas

    Pasi Rastas - 2018-04-18

    Hi Roy,

    ParentCall2 can take some time; I typically run it in parallel on each contig/scaffold. If you run it on a single core with WGS data, it might take a few days. Filtering2 is a bit faster but can take some time as well.

    SeparateChromosomes2 scales as n^2 for n markers (2x the markers yields 4x the runtime). It can be run on 1-3 million markers in a few days, given enough cores (numThreads). Smaller datasets take much less time. JoinSingles2 is about the same: its runtime scales as m*n, where n is the number of single markers and m is the number of markers in the map.

    OrderMarkers2 also scales as n^2 in the worst case; however, since you run it on each linkage group separately, n is smaller than the total number of markers. If you have <1000 markers per linkage group (LG), it typically runs within minutes.
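The n^2 scaling above can be turned into a rough extrapolation from a small pilot run. The following is only a sketch: the function name and the pilot numbers (0.5 hours for 100,000 markers) are made up for illustration, and real run times also depend on the number of samples, cores, and the data itself.

```python
def scaled_runtime(measured_hours, measured_markers, target_markers, exponent=2):
    """Extrapolate a run time, assuming cost grows as markers**exponent.

    exponent=2 mirrors the n^2 scaling described for SeparateChromosomes2
    (and the worst case of OrderMarkers2); it is an assumption, not a
    measured property of any particular dataset.
    """
    if measured_markers <= 0 or target_markers <= 0:
        raise ValueError("marker counts must be positive")
    return measured_hours * (target_markers / measured_markers) ** exponent


if __name__ == "__main__":
    # Hypothetical pilot run: 0.5 h for 100,000 markers.
    pilot_hours, pilot_markers = 0.5, 100_000
    for n in (200_000, 1_000_000):
        hours = scaled_runtime(pilot_hours, pilot_markers, n)
        print(f"{n} markers: about {hours:.1f} h")
```

Doubling the marker count quadruples the estimate, which matches the "2x markers yields 4x runtime" rule of thumb.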

    Cheers,
    Pasi

     

    Last edit: Pasi Rastas 2018-04-18
  • Pasi Rastas

    Pasi Rastas - 2018-04-18

    And to add: sometimes I try a first round of mapping on a subset of markers. For example, SeparateChromosomes2 has a "subsample" parameter to use only a fraction of the markers. Sometimes you can even get a faster runtime by using SeparateChromosomes2 with subsample, followed by JoinSingles2. Moreover, SeparateChromosomes2 accepts a "map" parameter, which lets you create custom ways of "thinning" the data. OrderMarkers2 also runs faster with a map file created with subsample.
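To see why subsampling followed by JoinSingles2 can be faster, one can compare pairwise-comparison counts using the n^2 and m*n scalings mentioned earlier in the thread. This is a back-of-the-envelope sketch: the function name is hypothetical, and it assumes that every subsampled marker ends up in the map and that both tools have the same cost per comparison, neither of which is guaranteed.

```python
def pairwise_costs(n_markers, fraction):
    """Compare comparison counts: full run vs. subsample + join singles.

    Assumptions (for illustration only): SeparateChromosomes2 costs n^2,
    joining singles costs m*n, and all subsampled markers are mapped.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    mapped = round(n_markers * fraction)   # markers used in the first round
    singles = n_markers - mapped           # markers joined afterwards
    full = n_markers ** 2                  # all markers at once
    subsampled = mapped ** 2               # first round on the subset
    joining = mapped * singles             # m*n for joining the rest
    return {
        "full": full,
        "subsampled": subsampled,
        "joining": joining,
        "relative": (subsampled + joining) / full,
    }


if __name__ == "__main__":
    costs = pairwise_costs(2_000_000, 0.05)
    print(f"two-step cost relative to full run: {costs['relative']:.3f}")
```

Under these assumptions the two-step cost relative to the full run is simply the subsampled fraction, since (fn)^2 + fn(1-f)n = f*n^2.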

    Cheers,
    Pasi

     
  • Roy Francis

    Roy Francis - 2018-04-22

    I have too many markers (2 million or so). I am considering thinning them evenly down to 100,000 markers or so (1 SNP per kb, for example) and then creating the linkage map with the reduced set. In the end, can I interpolate the rest of the markers for higher density?

     
  • Pasi Rastas

    Pasi Rastas - 2018-04-22

    Dear Roy,

    I have only thinned data in order to increase the data quality or informativeness by combining information on physically nearby markers.

    Probably something like "1 SNP per kb" works well. Please note that it can be dangerous to filter markers only by some arbitrary measure of data "quality", as it can cause bias and gaps in the maps.
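Position-based thinning of the "1 SNP per kb" kind can be sketched as below. This is a minimal illustration, not part of Lep-MAP3: the function name and the (contig, position) tuple format are assumptions, and it simply keeps the first marker in each window rather than combining information from nearby markers.

```python
def thin_markers(markers, window=1000):
    """Keep at most one marker per `window` bp bin on each contig.

    `markers` is an iterable of (contig, position) tuples. The first
    marker (lowest position) in each bin is kept; selection depends only
    on position, not on any quality score.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seen_bins = set()
    kept = []
    for contig, pos in sorted(markers):
        bin_key = (contig, pos // window)
        if bin_key not in seen_bins:
            seen_bins.add(bin_key)
            kept.append((contig, pos))
    return kept


if __name__ == "__main__":
    # Hypothetical marker positions on two scaffolds.
    example = [
        ("scaf1", 120), ("scaf1", 950), ("scaf1", 1010),
        ("scaf1", 2500), ("scaf2", 40),
    ]
    for contig, pos in thin_markers(example):
        print(contig, pos)
```

Because the choice depends only on position, this kind of thinning avoids the quality-based filtering bias mentioned above, although it may still discard informative markers.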

    Cheers,
    Pasi

     
