Originally created by: sasig...@ucdavis.edu
What steps will reproduce the problem?
1. Using my data with the fst-sliding.pl
2. Using my data with the cmh-test script
3. perl popoolation2_1201-3/cmh-test.pl --input input.sync --output input.cmh --min-count 1 --min-coverage 2 --max-coverage 200000 --population 1-7,8-13
4. perl popoolation2_1201-3/fst-sliding.pl --input input.sync --output input_window.fst --min-count 1 --min-coverage 2 --max-coverage 108000 --min-covered-fraction 1 --window-size 500 --step-size 100 --pool-size 500
The max coverage may seem strange, but the sequencing was done in such a way that we expect very high coverage in certain regions; it is not a symptom of a problem.
What is the expected output? What do you see instead?
fst-sliding.pl gives me a file full of na, which I find very confusing. In the snp-frequency output there are also na values in fields that should have been a comparison between 0 and 1.
cmh-test.pl gives completely empty output files.
Here is a sample of the input sync file:
0:0:0:0:0:0
gi|354623106|gb|AFFE01008681.1| 1395 G 0:0:0:0:0:0 0:0:0:1:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0
gi|354623106|gb|AFFE01008681.1| 1396 T 0:0:0:0:0:0 0:1:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0
gi|354623106|gb|AFFE01008681.1| 1397 T 0:0:0:0:0:0 0:1:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0
gi|354623106|gb|AFFE01008681.1| 1398 A 0:0:0:0:0:0 1:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0
gi|354623106|gb|AFFE01008681.1| 1399 T 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0
gi|354623106|gb|AFFE01008681.1| 1400 G 0:0:0:0:0:0 0:1:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0
gi|354623106|gb|AFFE01008681.1| 1401 A 0:0:0:0:0:0 1:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0
gi|354623106|gb|AFFE01008681.1| 1402 A 0:0:0:0:0:0 1:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0
gi|354623106|gb|AFFE01008681.1| 1403 C 0:0:0:0:0:0 0:0:1:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0 0:0:0:0:0:0
What version of the product are you using? On what operating system?
popoolation2; the operating system is CentOS and/or OS X.
Please provide any additional information below.
Hello,
Was this problem ever solved? I am receiving the exact same problem.
Thanks!
I'm not sure I understand the problem. The sync entries in the example above are essentially empty (almost all allele counts are zero), so the cmh and fst output will be empty or na: you cannot compute allele frequency differences when the allele counts are zero.
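To make the point above concrete, here is a minimal Python sketch that parses one line of the posted sync sample and checks it against the --min-coverage 2 used in the report. It assumes the usual sync layout (contig, position, reference base, then one A:T:C:G:N:del field per population) and assumes that coverage is the sum of the A, T, C and G counts. The helper names are made up for illustration and are not part of PoPoolation2.

```python
# Assumption: sync columns are contig, position, reference base,
# then one A:T:C:G:N:del count field per population.
def parse_sync_line(line):
    fields = line.split()
    contig, pos, ref = fields[0], int(fields[1]), fields[2]
    counts = [tuple(int(c) for c in f.split(":")) for f in fields[3:]]
    return contig, pos, ref, counts

def coverage(count):
    # Assumption: coverage is the sum of A, T, C and G counts
    # (N and deletion counts are ignored here).
    return sum(count[:4])

# Position 1395 from the sample in the report (13 populations).
line = ("gi|354623106|gb|AFFE01008681.1| 1395 G "
        "0:0:0:0:0:0 0:0:0:1:0:0 " + " ".join(["0:0:0:0:0:0"] * 11))

contig, pos, ref, counts = parse_sync_line(line)
coverages = [coverage(c) for c in counts]

min_coverage = 2  # value given as --min-coverage in the report
passing = [cov >= min_coverage for cov in coverages]

print(coverages)     # per-population coverage at this site
print(any(passing))  # does any population reach the minimum coverage?
```

With at most one read in a single population, no population reaches the minimum coverage at this site, so there is nothing to compare between populations.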
Hi Robert,
Thanks for your reply. The count columns in my sync files are all zeros, but I know this is incorrect. What would cause the sync file to be full of zeros?
This is what my code for creating the sync file looks like:
perl popoolation2_1201/mpileup2sync.pl --fastq-type sanger --min-qual 20 --input all.mpileup --output all_perl.syn
Hello Joanna!
So I'm also having this problem of the sync file being completely empty. Looking at my mpileup file there seems to be data there, but none of it is being transferred to the sync file. Any advice on how you got past this problem?
Hi Kyle,
This issue was specific to the sliding window script. I managed to fix the
issue by changing some of the parameters. I decreased the min-count and
min-covered-fraction from the example in the tutorial. These were my final
parameters:
--min-count 2 --min-coverage 5 --max-coverage 200 --min-covered-fraction 0.2
I think the main issue was that the min-covered-fraction cannot be at 100%
(i.e. 1), so decreasing this should make it work. A sliding window will
rarely have sufficient coverage at every single base, so
--min-covered-fraction should be lower than 1.0.
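
The reasoning about --min-covered-fraction can be sketched in Python as follows. This is a simplified model of the filter as described above, not PoPoolation2's actual implementation: the function names and the toy coverage values are hypothetical, and a window is simply treated as unusable when its covered fraction falls below the threshold.

```python
def covered_fraction(site_coverage, window_start, window_size, min_cov, max_cov):
    """Fraction of positions in the window whose coverage lies in [min_cov, max_cov].

    site_coverage maps position -> coverage; missing positions count as 0.
    """
    covered = sum(
        1
        for pos in range(window_start, window_start + window_size)
        if min_cov <= site_coverage.get(pos, 0) <= max_cov
    )
    return covered / window_size

def window_is_usable(fraction, min_covered_fraction):
    # A window is kept only if enough of its positions are sufficiently covered.
    return fraction >= min_covered_fraction

# Toy window of 10 bases: 8 well covered, 2 below the minimum coverage.
cov = {pos: 20 for pos in range(1, 11)}
cov[4] = 1
cov[9] = 0

frac = covered_fraction(cov, window_start=1, window_size=10, min_cov=5, max_cov=200)
print(frac)
print(window_is_usable(frac, 1.0))  # threshold from the original report
print(window_is_usable(frac, 0.2))  # threshold from the final parameters
```

In this toy case the window is 80% covered, so it fails a threshold of 1 but passes 0.2, which matches the behaviour described above.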
On Tue, Nov 6, 2018 at 1:20 PM Kyle Turner kyevturn@users.sourceforge.net
wrote:
--
Joanna Griffiths | PhD candidate
https://joannasgriffiths.wordpress.com/
Morgan W. Kelly lab | Michael E. Hellberg lab
Department of Biological Sciences
Louisiana State University
Related
Tickets: #16
Thanks for the quick reply! I started a new ticket to address my problem, since it goes beyond the sliding window script. It seems that mpileup2sync did not produce any real results: my mpileup file has data, but the resulting sync file contains only zeros. (The attached picture shows a portion of the sync file; the whole file looks like this.)
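
One way to confirm the symptom described above is to compare the read depth in the mpileup file with the counts in the sync file for the same position. The Python sketch below assumes the standard samtools mpileup layout (contig, position, reference base, then depth, bases and qualities for each sample, tab-separated) and the A:T:C:G:N:del sync fields. The two input lines are invented toy data and the function names are hypothetical.

```python
def mpileup_depths(line):
    # Assumption: columns are contig, pos, ref, then (depth, bases, quals) per sample.
    fields = line.rstrip("\n").split("\t")
    return [int(depth) for depth in fields[3::3]]

def sync_totals(line):
    # Total of all six counts (A:T:C:G:N:del) for each population.
    fields = line.split()
    return [sum(int(c) for c in f.split(":")) for f in fields[3:]]

def dropped_samples(mpileup_line, sync_line):
    """1-based indices of samples with reads in the mpileup but only zeros in the sync."""
    depths = mpileup_depths(mpileup_line)
    totals = sync_totals(sync_line)
    return [i + 1 for i, (d, t) in enumerate(zip(depths, totals)) if d > 0 and t == 0]

# Toy data (not from the ticket): two samples at one position.
mp_line = "contig1\t100\tA\t3\t.,.\tIII\t2\t.T\tII"
sync_line = "contig1\t100\tA\t0:0:0:0:0:0\t0:0:0:0:0:0"

print(mpileup_depths(mp_line))
print(sync_totals(sync_line))
print(dropped_samples(mp_line, sync_line))
```

If most positions show samples with reads in the mpileup but zeros in the sync, the reads are being discarded somewhere in the conversion step (possibly by the --min-qual filter), although this check alone does not say why.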