I ran SeparateChromosomes2 on my dataset with the subsample argument; my idea was to use 25% of the total number of SNPs. But my output file has the same number of lines as the original number of SNPs. The output is mostly zeros; only 4% of the rows have an integer other than zero:
0
0
0
0
0
1513
0
0
0
...
What does this mean?
Dear Roy,
Yes, this is expected. With your subsample setting, at least 75% of the rows will be zeros on average. This way the map stays compatible with the data, and you can add the remaining markers to the map later if you want.
If you want to permanently reduce the number of markers, you could use simple scripting on the input data.
Cheers,
Pasi
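A minimal sketch of the "simple scripting" idea above, in Python. It is not part of Lep-MAP3. It assumes the input data is a text file whose comment lines start with `#`, followed by a fixed number of pedigree/header lines and then one line per marker; the default of 6 header lines and the function name `subsample_markers` are assumptions for illustration, so check them against your own file.

```python
import random


def subsample_markers(lines, fraction, header_lines=6, seed=0):
    """Keep comment and header lines, and a random fraction of marker lines.

    header_lines=6 is an assumption about the input layout, not something
    confirmed in this thread; adjust it to match your data file.
    """
    rng = random.Random(seed)  # fixed seed so the same subset is reproducible
    kept = []
    seen_header = 0
    for line in lines:
        if line.startswith("#"):
            # comment lines are always kept
            kept.append(line)
        elif seen_header < header_lines:
            # pedigree/header lines are always kept
            kept.append(line)
            seen_header += 1
        elif rng.random() < fraction:
            # each marker line is kept with probability `fraction`
            kept.append(line)
    return kept


def subsample_file(in_path, out_path, fraction, header_lines=6, seed=0):
    """Write a permanently reduced copy of the data file to out_path."""
    with open(in_path) as src:
        kept = subsample_markers(src, fraction, header_lines, seed)
    with open(out_path, "w") as dst:
        dst.writelines(kept)
```

For example, `subsample_file("data.call", "data_subsampled.call", 0.25)` (hypothetical file names) would write roughly a quarter of the markers to a new file. A map computed from the reduced file then has one row per remaining marker and no longer lines up row by row with the original data.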