popoolation2 / Wiki / Tutorial

Anonymous - 2013-03-07

Originally posted by: perrica...@hotmail.com

Hi, I have just tried to convert the synchronized file into a gene-based synchronized file. The command runs, but the output file is empty. Any ideas? The input sync file works fine with other commands. Perhaps the problem is the GTF file, but compared with the example GTF file on this page it looks fine. Thanks,

Anonymous - 2013-04-30

Originally posted by: braud.ma...@gmail.com

There is a column problem in the demo files. The command

samtools view -q 20 -bS map/pop1.sam | samtools sort - map/pop1

fails to create the BAM file because of the XA:Z: value in the SAM file. The error is:

Parse error at line 36: missing colon in auxiliary data

What is the problem?

Anonymous - 2013-05-06

Originally posted by: abh...@gmail.com

This is a great tool! I wanted to know if there is any way to get the SNPs in the "_rc" file in VCF format, to be used in outlier detection programs?

Anonymous - 2014-05-06

Originally posted by: elzedliu

I am wondering if you could tell me how to prepare an input file for BayeScan after PoPoolation?

Anonymous - 2014-12-02

Originally posted by: alicebde...@gmail.com

Hi, I'm trying to use the cmh-test above, but should something in the command below read p1_p2_p3_p4 instead? I'm confused about where the second half of the data comes into this!

perl <popoolation2-path>/cmh-test.pl --input p1_p2_p1_p2.sync --output p1_p2_p1_p2.cmh --min-count 12 --min-coverage 50 --max-coverage 200 --population 1-2,3-4

Thanks
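
For orientation, here is a minimal Python sketch of how the comparison pairs in --population 1-2,3-4 could line up with the columns of a sync file. It assumes the usual sync layout (chromosome, position, reference base, then one A:T:C:G:N:del count field per population) and that the indices in --population count those population fields starting from 1. The helper names and the sample line are made up for illustration and are not part of PoPoolation2.

```python
def parse_sync_line(line):
    """Split one sync line into (chrom, pos, ref, [counts per population])."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref = fields[0], int(fields[1]), fields[2]
    keys = ("A", "T", "C", "G", "N", "del")
    # Each remaining field is assumed to hold six colon-separated counts.
    pops = [dict(zip(keys, map(int, f.split(":")))) for f in fields[3:]]
    return chrom, pos, ref, pops


def parse_population_pairs(spec):
    """Turn '1-2,3-4' into [(1, 2), (3, 4)] (1-based population indices)."""
    return [tuple(int(i) for i in pair.split("-")) for pair in spec.split(",")]


# Hypothetical sync line with four population fields (not taken from the demo data).
line = "2R\t2302\tN\t0:7:0:26:0:0\t0:11:0:43:0:0\t0:7:0:26:0:0\t0:11:0:43:0:0"
chrom, pos, ref, pops = parse_sync_line(line)
pairs = parse_population_pairs("1-2,3-4")

for a, b in pairs:
    # Indices are 1-based in the option string, 0-based in the Python list.
    print(f"{chrom}:{pos} compare population {a} {pops[a - 1]} "
          f"with population {b} {pops[b - 1]}")
```

If p1_p2_p1_p2.sync holds four population fields, as its name suggests, then under these assumptions 1-2 and 3-4 each select one pair of columns from that same file.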

Anonymous - 2015-01-02

Originally posted by: amritaya...@gmail.com

Hello, I wanted to know if there is any way to get the SNPs in the rc file in VCF format?

Anonymous - 2015-01-05

Originally posted by: chengjie...@gmail.com

Hi, I am trying to calculate the Fst for SNPs between 2 populations using fst-sliding.pl. I am a bit confused about "min-coverage". For example, if I have 2 populations and set a min-coverage of 5, does that mean 5 reads are required for each population at this SNP, or 10 reads across the 2 populations, which could be split 8:2, 7:3, and so on?

Thank you!

Anonymous - 2015-02-10

Originally posted by: beatrizj...@gmail.com

Hi everyone, First, thanks for the tool; it has been a great find for my research. I'm trying to understand how fst-sliding.pl calculates the Fst value. The --help output describes the calculation, but when I try to do it myself I have not been able to reproduce it. I think it applies a correction for pool size, but I don't know how. I'm using the script with a step size of 1. Thanks
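
As a starting point for a by-hand check, here is a minimal Python sketch of the textbook heterozygosity-based estimate, Fst = (Pi_total - Pi_within) / Pi_total, for a single SNP in two populations. It is an assumption-laden illustration, not a reimplementation of fst-sliding.pl: it applies no pool-size or coverage correction, so its values need not match the script's output. The function names and the example counts are hypothetical.

```python
def allele_freqs(counts):
    """Convert allele counts (e.g. A, T, C, G) to frequencies; needs coverage > 0."""
    total = sum(counts)
    return [c / total for c in counts]


def heterozygosity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(f * f for f in freqs)


def fst_uncorrected(counts1, counts2):
    """Textbook Fst for two populations, WITHOUT any pool-size correction."""
    f1 = allele_freqs(counts1)
    f2 = allele_freqs(counts2)
    # Within-population diversity: average of the two heterozygosities.
    pi_within = (heterozygosity(f1) + heterozygosity(f2)) / 2
    # Total diversity: heterozygosity of the averaged allele frequencies.
    f_total = [(a + b) / 2 for a, b in zip(f1, f2)]
    pi_total = heterozygosity(f_total)
    if pi_total == 0:
        return 0.0  # site is monomorphic across both populations
    return (pi_total - pi_within) / pi_total


# Hypothetical A, T, C, G counts for one SNP in two populations.
print(fst_uncorrected([8, 2, 0, 0], [2, 8, 0, 0]))
```

Comparing this uncorrected value with the script's output for the same site can help isolate how large the effect of any additional correction is.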


Karol Cichewicz - 2017-01-25

Hi everyone. I'm trying to calculate Fst and Fisher's p-values. These scripts take hours to run. I let the fst script run for over 20 h on my university cluster, and it couldn't even get through one chromosome of data. fisher-test.pl takes similarly long, but it doesn't save any information in its output file. My pools/bulks are composed of 45 and 44 individuals with an average read depth of ~150. Is this the expected behavior, or am I doing something wrong?

Examples of my scripts:
perl fisher-test.pl \
  --input Low_vs_High_w_ref.mpileup.sync \
  --output Low_vs_High-fisher.fet \
  --min-count 5 --min-coverage 50 \
  --step-size 1000 --window-size 1000 \
  --max-coverage 800 --suppress-noninformative

perl fst-sliding.pl \
  --input /Low_vs_High_w_ref.mpileup.sync \
  --output /Low_vs_High.fst \
  --min-count 10 --min-coverage 50 --max-coverage 800 --window-size 1 --step-size 1 --pool-size 45:44

- Robert Kofler - 2017-01-25
  
  Well, that's a lot of data. The two scripts (fst and fet) compute all pairwise comparisons, and that's a lot for 45 and 44 individuals... I would try it with fewer individuals and, once this works, scale it up to more.
  
  - Karol Cichewicz - 2017-01-25
    
    Thank you for the suggestion, but I think I should explain this better. My data is pooled sequencing data. The reads of individuals, aligned separately for each individual, were merged with samtools merge, then realigned using GATK, and variants were called with samtools mpileup. Thus, the sync file does not contain variant calls for each individual, only for the two pools. How would that differ from running these computations for just two samples? Should I change the --pool-size parameter to 1:1 then? I ran Fisher's exact test using PLINK on exactly the same dataset, but preserving the information about individuals, and it was much faster.
    

James Reeve - 2018-05-29

I'm trying to access the data for the tutorial, but the link appears to be dead. I've tried accessing it with two separate browsers. How do I access the data?


David Rinker - 2019-03-20

Really appreciate having this software available! Currently I'm using it for BSA in fruit fly.

In computing Fst I'm confused as to what "pool-size" means. I've seen it described as referring both to the number of individuals and to the number of chromosomes (and can see arguments for both). Also, what is the correct syntax when pools are unequal? A poster above uses a ":" delimiter, so is that correct? And if so, am I safe in assuming that "x:y" applies to the pools in the order in which they were passed into the ".sync" file?


Maria Cortazar Chinarro - 2021-01-03

Please, could you upload the IGV PNG images so that I can check whether my results look OK?
I cannot access those pictures:

"Voila, the result should look something like this:

Note that the result shown above only contain the Fst for a single pairwise comparison (pop1 vs pop2). PoPoolation2 is per default calculating the Fst for all possible pairwise comparisons between populations (when several populations are present in the synchronized file), and these results may just as easily be converted into the .igv format. When loaded into IGV it may look like the following:"
Thank you very much,

Last edit: Maria Cortazar Chinarro 2021-01-03


Jonas A Kengne - 2021-03-02

Hi, I downloaded the online tutorial for PoPoolation2 and had some problems running it. Could anyone assist, please?

Problem:

- created mpileup successfully
- created sync file successfully
- fisher-test.pl fails (stops on lines 9 and 10)

Exact message from the terminal:

poolseq_test$: perl popoolation2_1201/fisher-test.pl --input p1_p2.sync --output p1_p2.fet --min-count 6 --min-coverage 50 --max-coverage 200 --suppress-noninformative

Can't locate Text/NSP/Measures/2D/Fisher/twotailed.pm in @INC (you may need to install the Text::NSP::Measures::2D::Fisher::twotailed module) (@INC contains: /home/Jako/Documents/poolseq_test/popoolation2_1201 /home/Jako/Documents/poolseq_test/popoolation2_1201/Modules /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.22.1 /usr/local/share/perl/5.22.1 /usr/lib/x86_64-linux-gnu/perl5/5.22 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.22 /usr/share/perl/5.22 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base .) at /home/Jako/Documents/poolseq_test/popoolation2_1201/Modules/FET.pm line 10.

BEGIN failed--compilation aborted at /home/Jako/Documents/poolseq_test/popoolation2_1201/Modules/FET.pm line 10.

Compilation failed in require at popoolation2_1201/fisher-test.pl line 9.

BEGIN failed--compilation aborted at popoolation2_1201/fisher-test.pl line 9.

Looking forward to hearing from you or any other person willing to help.

many thanks in advance

Last edit: Jonas A Kengne 2021-03-02
- Mr Schweitzer - 2021-04-04
  
  Hello Mr. Kengne,
  I had the same problem and I managed to get the perl script running by following this solution in the comments here:
  https://code.google.com/archive/p/popoolation2/issues/19
  
  I hope this helps!
  

Atikah - 2021-06-09

Hi everyone,

I am using the PoPoolation2 software to calculate allele frequencies and Fst. But when I calculate the allele frequency differences between my populations, it reports several loci with an allele count of 1 (only one allele state). SNPs are commonly biallelic, or in some cases they can have 3 or 4 alleles. Also, I set the minimum coverage to 10, but the rc file contains SNPs with coverage below 10. Do you think an error is occurring when I use PoPoolation2?
I would really appreciate it if you could help me figure out what is happening. Thank you.

- Atikah - 2021-06-09
  
  I just found out that these were rc-type SNPs, not pop-type SNPs.
  

Hafiz Muhammad Anas - 2023-05-07

Hello Community
Is it possible to annotate SNPs with the SnpEff eff tool after getting variants? Which file do I need to annotate?


Anonymous - 2023-09-13

Hi everyone,

I noticed that when converting an mpileup file to a sync file, positions on a genome are shifted.
For example, position 10,923,271 in a reference genome used for mapping was shifted to 10,923,753 in a sync file.
Does anyone know how to resolve this error?


Lautaro Ezequiel Bennardo - 2023-11-07

Hello everyone,
I am using PoPoolation2 to perform the CMH test between 4 pairs of populations. The issue is that it's giving me very low p-values in general, less than 10e-60. Does anyone know how I should choose the p-value cutoff? Or whether there's any correction I should apply to these values? Thank you!

- Mr Schweitzer - 2023-11-07
  
  Hello Mr. Bennardo,
  I think the two most common ways of addressing multiple testing in GWAS-like data are either using a Bonferroni-corrected p-value threshold or using a False Discovery Rate (FDR) approach. The documentation for the R package "qvalue" is very well written:
  https://www.bioconductor.org/packages/release/bioc/manuals/qvalue/man/qvalue.pdf
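
The two approaches mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the qvalue package: it uses the Benjamini-Hochberg procedure as one common FDR method (qvalue implements a different, q-value based estimator), and the function names and example p-values are made up.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, e.g. one per tested SNP.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
threshold = bonferroni_threshold(0.05, len(pvals))
bh = benjamini_hochberg(pvals)

print("Bonferroni threshold:", threshold)
print("Significant under Bonferroni:", [p for p in pvals if p < threshold])
print("BH-adjusted p-values:", bh)
```

With real data the number of tests is the number of SNPs actually tested, so the Bonferroni threshold becomes far stricter than in this five-value example.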
  
