sfscode-users Mailing List for sfs_code (Page 2)

Brought to you by: luricchio, ryan_hernandez

sfscode-users — General discussion about using SFS_CODE.

[sfscode-users] Males and females proportion

From: Sara G. <sar...@cr...> - 2014-04-16 10:09:40

Dear Dr Hernandez,

My question concerns the mating scheme of a population. I need males and females to contribute to the next generation in different proportions. Specifically, once mating occurs among the individuals with the highest fitness, they should produce equal numbers of males and females, but the proportion of best-fit males and females chosen for the next mating should differ from 0.5 (that is, from equal numbers of each sex). Could you please let me know if there is any way to simulate this? It seems to me that the option in SFS_code sets the proportion of males and females that mating produces, which is different from the proportion chosen to mate in the next generation.

Best regards and thank you for the help,

Sara.

-----------------

Sara Guirao-Rico
Centre for Research in Agricultural Genomics (CSIC-IRTA-UAB-UB)
Plant and Animal Genomics Research Program
Statistical and Population Genomics Group
Despatx 325, Edifici CRAG, Campus UAB
08193 Bellaterra, SPAIN

Phone: +34 93 563 6600 Ext 3349
Fax: +34 93 563 66 01

Re: [sfscode-users] question about simulation selection with sfs_code

From: Miguel N. <nav...@su...> - 2013-11-27 10:22:09

Hi Ryan,

So (if I got it right this time) the option -W P 1 1 5 1 0 makes 
mutations appear in population 1 with a selection coefficient (gamma=5) 
that is common to all populations. What about using the --neutPop option:

./sfs_code 2 1 -W P 0 1 5 1 0 --neutPop 0 -TS 1 0 1 -TE 1 0 -TE 5 -n 0 10

new non-synonymous mutations appear in population 0 and are advantageous 
(with gamma=5):
-W P 0 1 5 1 0

all mutations behave neutrally in population 0
--neutPop 0

create population 1 from population 0 at time 1 (*2N):
-TS 1 0 1
(new non-synonymous mutations in population 1 are neutral, old 
non-synonymous are advantageous)

terminate population 0 at time 1 (*2N):
-TE 1 0

terminate simulation at time 5 (*2N):
-TE 5

sample 10 individuals in population 1 (at time 5*2N) and none in 
population 0:
-n 0 10

Sorry if I insist, I just want to properly understand what the different 
options do and how they interact.

Best regards,

Miguel


On 11/27/2013 05:11 AM, Ryan Hernandez wrote:
> Hi Miguel,
>
> Unfortunately, I don't think your plan will do what you want.  The scenario you lay out below would still introduce new mutations under selection, there would just be a background level of diversity.  This is the same as doing the simulation with only a single population.  Selection on standing variation refers to the situation where an allele segregates in the population for a while (either neutral or even deleterious) until some time at which it becomes beneficial.  Sadly, SFS_CODE does not yet simulate selection on standing variation.  I have plans to implement such features soon, but have not yet started.
>
> Sorry!  Let me know if you have other questions.
>
> Best,
>
> Ryan
>
> On Nov 26, 2013, at 11:28 AM, Miguel Navascues wrote:
>
>> Dear Ryan and sfs_code users,
>>
>> I would like to simulate an adaptive process from standing variation in one population with sfs_code. My idea is to use a trick similar to the one used to obtain samples at several times. I will simulate one population in which all mutations are neutral; then, at a given time, I will create a second population in which mutations have some selection coefficient and terminate the first population. That is (values not necessarily realistic in any way):
>>
>> ./sfs_code 2 1 -W P 0 0 -W P 1 1 5 1 0 -TS 1 0 1 -TE 1 0 -TE 5 -n 0 10
>>
>> new non-synonymous mutations are neutral in population 0:
>> -W P 0 0
>>
>> new non-synonymous mutations are advantageous (with gamma=5) in population 1 :
>> -W P 1 1 5 1 0
>>
>> create population 1 from population 0 at time 1 (*2N):
>> -TS 1 0 1
>>
>> terminate population 0 at time 1 (*2N):
>> -TE 1 0
>>
>> terminate simulation at time 5 (*2N):
>> -TE 5
>>
>> sample 10 individuals in population 1 (at time 5*2N) and none in population 0:
>> -n 0 10
>>
>>
>> So my question is whether I got it right or I misunderstood how sfs_code works.
>>
>> Also, is there a way to do the same but with the option --mutation? The way I understood it with mutation the gamma is fixed for all populations, isn't it? Could it be done in combination with --neutPop?
>>
>> Sincerely,
>>
>> Miguel
>>
>>
>> --
>> Miguel NAVASCUÉS, PhD
>>
>> Chargé de Recherche (CR2) INRA
>>
>> UMR CBGP Centre de Biologie pour la Gestion des Populations
>> Institut National de la Recherche Agronomique
>> Campus International de Baillarguet, CS 30016
>> 34988 Montferrier-sur-Lez (France)
>>
>> phone:  +33(0)4.99.62.33.70
>> fax:    +33(0)4.99.62.33.45
>> e-mail: miguel.navascues AT supagro.inra.fr
>> e-mail: m.navascues AT gmail.com
>> Skype:  m.navascues
>> web:    http://www1.montpellier.inra.fr/cbgp/
>> web:    http://sites.google.com/site/navascuesresearch/
>
>
>


-- 
Miguel NAVASCUÉS, PhD

Chargé de Recherche (CR2) INRA

UMR CBGP Centre de Biologie pour la Gestion des Populations
Institut National de la Recherche Agronomique
Campus International de Baillarguet, CS 30016
34988 Montferrier-sur-Lez (France)

phone:  +33(0)4.99.62.33.70
fax:    +33(0)4.99.62.33.45
e-mail: miguel.navascues AT supagro.inra.fr
e-mail: m.navascues AT gmail.com
Skype:  m.navascues
web:    http://www1.montpellier.inra.fr/cbgp/
web:    http://sites.google.com/site/navascuesresearch/

Re: [sfscode-users] question about simulation selection with sfs_code

From: Ryan H. <rya...@uc...> - 2013-11-27 04:11:48

Hi Miguel,

Unfortunately, I don't think your plan will do what you want.  The scenario you lay out below would still introduce new mutations under selection, there would just be a background level of diversity.  This is the same as doing the simulation with only a single population.  Selection on standing variation refers to the situation where an allele segregates in the population for a while (either neutral or even deleterious) until some time at which it becomes beneficial.  Sadly, SFS_CODE does not yet simulate selection on standing variation.  I have plans to implement such features soon, but have not yet started.  

Sorry!  Let me know if you have other questions.

Best,

Ryan

On Nov 26, 2013, at 11:28 AM, Miguel Navascues wrote:

> Dear Ryan and sfs_code users,
> 
> I would like to simulate an adaptive process from standing variation in one population with sfs_code. My idea is to use a trick similar to the one used to obtain samples at several times. I will simulate one population in which all mutations are neutral; then, at a given time, I will create a second population in which mutations have some selection coefficient and terminate the first population. That is (values not necessarily realistic in any way):
> 
> ./sfs_code 2 1 -W P 0 0 -W P 1 1 5 1 0 -TS 1 0 1 -TE 1 0 -TE 5 -n 0 10
> 
> new non-synonymous mutations are neutral in population 0:
> -W P 0 0
> 
> new non-synonymous mutations are advantageous (with gamma=5) in population 1 :
> -W P 1 1 5 1 0
> 
> create population 1 from population 0 at time 1 (*2N):
> -TS 1 0 1
> 
> terminate population 0 at time 1 (*2N):
> -TE 1 0
> 
> terminate simulation at time 5 (*2N):
> -TE 5
> 
> sample 10 individuals in population 1 (at time 5*2N) and none in population 0:
> -n 0 10
> 
> 
> So my question is whether I got it right or I misunderstood how sfs_code works.
> 
> Also, is there a way to do the same but with the option --mutation? The way I understood it with mutation the gamma is fixed for all populations, isn't it? Could it be done in combination with --neutPop?
> 
> Sincerely,
> 
> Miguel
> 
> 
> -- 
> Miguel NAVASCUÉS, PhD
> 
> Chargé de Recherche (CR2) INRA
> 
> UMR CBGP Centre de Biologie pour la Gestion des Populations
> Institut National de la Recherche Agronomique
> Campus International de Baillarguet, CS 30016
> 34988 Montferrier-sur-Lez (France)
> 
> phone:  +33(0)4.99.62.33.70
> fax:    +33(0)4.99.62.33.45
> e-mail: miguel.navascues AT supagro.inra.fr
> e-mail: m.navascues AT gmail.com
> Skype:  m.navascues
> web:    http://www1.montpellier.inra.fr/cbgp/
> web:    http://sites.google.com/site/navascuesresearch/

[sfscode-users] question about simulation selection with sfs_code

From: Miguel N. <nav...@su...> - 2013-11-26 19:28:51

Dear Ryan and sfs_code users,

I would like to simulate an adaptive process from standing variation 
in one population with sfs_code. My idea is to use a trick similar to 
the one used to obtain samples at several times. I will simulate one 
population in which all mutations are neutral; then, at a given time, I 
will create a second population in which mutations have some selection 
coefficient and terminate the first population. That is (values not 
necessarily realistic in any way):

./sfs_code 2 1 -W P 0 0 -W P 1 1 5 1 0 -TS 1 0 1 -TE 1 0 -TE 5 -n 0 10

new non-synonymous mutations are neutral in population 0:
-W P 0 0

new non-synonymous mutations are advantageous (with gamma=5) in 
population 1 :
-W P 1 1 5 1 0

create population 1 from population 0 at time 1 (*2N):
-TS 1 0 1

terminate population 0 at time 1 (*2N):
-TE 1 0

terminate simulation at time 5 (*2N):
-TE 5

sample 10 individuals in population 1 (at time 5*2N) and none in 
population 0:
-n 0 10


So my question is whether I got it right or I misunderstood how sfs_code 
works.

Also, is there a way to do the same but with the option --mutation? The 
way I understood it with mutation the gamma is fixed for all 
populations, isn't it? Could it be done in combination with --neutPop?

Sincerely,

Miguel


-- 
Miguel NAVASCUÉS, PhD

Chargé de Recherche (CR2) INRA

UMR CBGP Centre de Biologie pour la Gestion des Populations
Institut National de la Recherche Agronomique
Campus International de Baillarguet, CS 30016
34988 Montferrier-sur-Lez (France)

phone:  +33(0)4.99.62.33.70
fax:    +33(0)4.99.62.33.45
e-mail: miguel.navascues AT supagro.inra.fr
e-mail: m.navascues AT gmail.com
Skype:  m.navascues
web:    http://www1.montpellier.inra.fr/cbgp/
web:    http://sites.google.com/site/navascuesresearch/

[sfscode-users] Unexpected nucleotide diversity variation according to ploidy levels

From: Benoit N. <ben...@gm...> - 2013-09-12 14:03:33

Hello,

I got an unexpected result when I ran sfs_code using the following command
lines :

sfs_code 1 100 --ploidy 2 -o out.txt
sfs_code 1 100 --ploidy 1 -o out.txt

The former gave me a mean nucleotide diversity p = 0.01 (as expected). The
second, however, gave me a mean p = 0.019 (almost twice as high as the
"expected" value).

Do you have an explanation for this result?

Best regards, and thank you for this nice software!

Benoit
-- 

Benoit Nabholz

Institut des Sciences de l'Evolution. CC64
Université Montpellier II
Place Eugène Bataillon
34095 Montpellier cedex 5
France
Tel : 0033 (0)4 67 14 36 97
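
For readers who want to reproduce the comparison above, nucleotide diversity is conventionally the mean number of pairwise differences per site among the sampled sequences. The message does not say how the mean values were computed, so the following is only a minimal sketch of that standard definition. The function name and the toy sequences are illustrative, and the sketch does not by itself explain the difference between the two ploidy settings.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)


# Toy alignment (hypothetical data, not sfs_code output).
sample = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGAACGTAT"]
pi = nucleotide_diversity(sample)
print(pi)
```

When comparing runs, it is worth checking that the same number of chromosomes (not individuals) is passed in for each ploidy setting.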

Re: [sfscode-users] SFS CODE

From: Filipe G. V. <fgv...@be...> - 2013-02-21 07:15:30

Hi Ryan,

and thanks for the quick reply. So each pair of seqs (0-1, 2-3, 4-5,
..) is an individual and in the males I can just pick the first one.

Also, just a couple of things regarding convertSFS --alignment:
- at the top of the FASTA file there is a strange string that I think
shouldn't be there. In my case: arg=13: 1
- why the option "P.I"? Couldn't (e.g.) "P.I 1 0.1" be set with "P 1 0 I 1 1"?
- the seq names could be a bit clearer. Instead of:
it0pop0ind0locus0
it0pop0ind1locus0
it0pop0ind2locus0
it0pop0ind3locus0

maybe something like:
it0pop0ind0locus0hap0
it0pop0ind0locus0hap1
it0pop0ind1locus0hap0
it0pop0ind1locus0hap1

and allow for another option at the command line for the haplotypes
(like there is for the P I L ITS).

thanks for your help,
FGV
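
The pairing described above (sequences 0-1, 2-3, ... belong to one individual, and only the first chromosome of each male is used) can be sketched as a small post-processing step. This is only an illustration, not part of convertSFS_CODE: the function names are made up, and the set of male individuals is a hypothetical input, since the thread does not say how sex is recorded in the output.

```python
import re

# Pattern for the sequence names shown above, e.g. "it0pop0ind3locus0".
NAME_RE = re.compile(r"^it(\d+)pop(\d+)ind(\d+)locus(\d+)$")


def relabel(name):
    """Rename a per-chromosome label to an individual/haplotype label.

    Assumes consecutive sequences (0-1, 2-3, ...) form one diploid individual.
    """
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised sequence name: {name!r}")
    it, pop, seq, locus = (int(g) for g in m.groups())
    ind, hap = divmod(seq, 2)
    return f"it{it}pop{pop}ind{ind}locus{locus}hap{hap}"


def x_chromosomes(names, males):
    """Keep both sequences for females and only the first one for males.

    `males` is a hypothetical set of individual indices (0, 1, 2, ...).
    """
    keep = []
    for name in names:
        m = NAME_RE.match(name)
        if m is None:
            raise ValueError(f"unrecognised sequence name: {name!r}")
        ind, hap = divmod(int(m.group(3)), 2)
        if ind in males and hap == 1:
            continue  # second chromosome of a male: skip it
        keep.append(name)
    return keep


names = [f"it0pop0ind{i}locus0" for i in range(4)]
relabelled = [relabel(n) for n in names]
kept = x_chromosomes(names, males={1})
print(relabelled)
print(kept)
```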


On Wed, Feb 20, 2013 at 9:38 PM, Ryan Hernandez <rh...@gm...> wrote:
> Hi Filipe,
>
> Males have an X and a Y.  They have the same ancestral nucleotide sequence,
> but don't recombine.  So you want to take just the first chromosome from
> males.  Make sense?
>
> Cheers,
>
> Ryan
>
>
> On Wed, Feb 20, 2013 at 7:09 PM, Filipe G. Vieira <fgv...@be...>
> wrote:
>>
>> Dear all,
>>
>> I'm trying to simulate a neutral X chromosome on one population of 200
>> individuals. So, I'm using the command line:
>> ./sfs_code 1 1 --popSize 10000 --sampSize 200 --length 1 1000000
>> --annotate N --sex 1 --theta 0.0015 --rho 0.001
>>
>> All seems ok, but when I get the sequences I get 400 of them; i guess
>> ">it0pop0ind0locus0" and ">it0pop0ind1locus0" correspond to individual
>> 1. However, if the males only have one X shouldn't there be just 300
>> seqs? Or can I just sample 100 seqs from the males?
>>
>> thanks,
>> FGV
>>
>>
>> _______________________________________________
>> sfscode-users mailing list
>> sfs...@li...
>> https://lists.sourceforge.net/lists/listinfo/sfscode-users
>
>

Re: [sfscode-users] SFS CODE

From: Ryan H. <rh...@gm...> - 2013-02-21 05:38:33

Hi Filipe,

Males have an X and a Y.  They have the same ancestral nucleotide sequence,
but don't recombine.  So you want to take just the first chromosome from
males.  Make sense?

Cheers,

Ryan


On Wed, Feb 20, 2013 at 7:09 PM, Filipe G. Vieira <fgv...@be...> wrote:

> Dear all,
>
> I'm trying to simulate a neutral X chromosome on one population of 200
> individuals. So, I'm using the command line:
> ./sfs_code 1 1 --popSize 10000 --sampSize 200 --length 1 1000000
> --annotate N --sex 1 --theta 0.0015 --rho 0.001
>
> All seems ok, but when I get the sequences I get 400 of them; i guess
> ">it0pop0ind0locus0" and ">it0pop0ind1locus0" correspond to individual
> 1. However, if the males only have one X shouldn't there be just 300
> seqs? Or can I just sample 100 seqs from the males?
>
> thanks,
> FGV
>
>
>

[sfscode-users] SFS CODE

From: Filipe G. V. <fgv...@be...> - 2013-02-21 03:09:33

Dear all,

I'm trying to simulate a neutral X chromosome on one population of 200
individuals. So, I'm using the command line:
./sfs_code 1 1 --popSize 10000 --sampSize 200 --length 1 1000000
--annotate N --sex 1 --theta 0.0015 --rho 0.001

All seems ok, but when I get the sequences I get 400 of them; i guess
">it0pop0ind0locus0" and ">it0pop0ind1locus0" correspond to individual
1. However, if the males only have one X shouldn't there be just 300
seqs? Or can I just sample 100 seqs from the males?

thanks,
FGV

[sfscode-users] convertSFS_CODE problems

From: Bruno N. <bru...@cr...> - 2013-02-12 18:43:14

Dear Dr Hernandez,

Hope this email finds you well!
I am using sfs_code to simulate 2 populations as they diverge in time, but I am encountering a few issues converting the output from sfs_code into a fasta or ms file for subsequent analysis.

A little background: I want to sample the 2 populations at different times, to compare pairs of populations at low (~0.1% seq. divergence) or relatively high (~1 and 2%) divergence values. I do this by following your recommendations for sampling from extinct lineages (i.e. duplicate each population at different time points and kill the duplicated population). I am simulating a single non-coding locus, low variability and recombination, 100 k base-pairs long, and sampling 21 diploid individuals from each population. I know easier ways to do this simulation, however this is just a starting point for a more complex model I want to implement.

The command line I am using (100 iterations) is

./sfs_code 6 100 -t 0.0004 -r 0.0004 -L 1 100000 -a N -N 1000 -TS 0 0 1 -TS 3 0 2 -TS 3 1 3 -TS 25 0 4 -TS 25 1 5 -TE 3 2 -TE 3 3 -TE 25 4 -TE 25 5 -TE 50 -n 21 -o output.sfs

The above command seems to run well (sfs_code runs without any error or warning message), but I am having problems running the output through convertSFS_CODE. It gives me several errors about memory allocation, which I am unable to decipher. The error messages actually depend on which computer I use (see the note at the end of this email for the exact error messages I am receiving). So my first question is whether there is a limit (in terms of individuals, populations, sequence length, or divergence times) on what convertSFS_CODE can handle. I noticed it works for smaller examples, but I cannot figure out exactly when it starts giving these errors.

A second question I have is about multiple hits. In an attempt to avoid bothering you with this issue, I have written a script to convert the output of sfs_code into a fasta file. This script returns the same fasta file as convertSFS_CODE, so it seems to work OK, except when there are multiple hits. Specifically, convertSFS_CODE seems to output only 1 mutation when there are several hits on the same site. For instance, the 2 lines below are part of the output of sfs_code, using 3 populations with the command line ./sfs_code_def 3 2 -TS 0 0 1 -TS 1 1 2 -TE 10 :

0,A,3474,7378,9054,TGT,T,1,V,F,0.0,1,0.-1;
0,A,3474,7981,10000,TTT,G,1,F,V,0.0,7,0.2,0.3,0.4,0.5,0.7,0.8,0.9;

They show 2 mutations, occurring on the same site. If I use convertSFS_CODE, only one mutation is present in the final fasta file (the first mutation, which is fixed in population 0). My script, however, naively outputs both mutations, so the output for this site looks like:

convertSFS_CODE > TTTTTTTTTTTT GGGGGGGGGGGG GGGGGGGGGGGG
myscript > TTGGGGTGGGTT GGGGGGGGGGGG GGGGGGGGGGGG

Could you please let me know which of these outputs you would consider "correct", and which rules convertSFS_CODE uses to decide which mutation to output?

Thank you very much for any help with these issues, and my apologies for the long email!

Best regards,
Bruno

* Errors received when running convertSFS_CODE. This first example occurs on a Mac running Mac OS X 10.6:

convertSFS_CODE(14634) malloc: *** error for object 0x100100fa8: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Abort trap

And a second example, from a Linux cluster (I am unsure which Linux version):

*** glibc detected *** /home/bnevado/myapps/sfs_code/bin/convertSFS_CODE: malloc(): memory corruption: 0x0000000001a7a330 ***
*** glibc detected *** /home/bnevado/myapps/sfs_code/bin/convertSFS_CODE: malloc(): memory corruption: 0x0000000001a7a330 ***
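
For the multiple-hits question above, a first step when comparing converters is simply to find the sites that carry more than one mutation record. The sketch below does only that and does not decide which FASTA output is correct. It assumes, from the two example lines in the message, that records are comma-separated, end with a semicolon, and carry the locus in the first field and the position in the third. The function name is illustrative.

```python
from collections import defaultdict


def multiple_hits(records):
    """Return the sites that appear in more than one mutation record.

    Assumption: field 1 is the locus and field 3 is the position within it.
    """
    by_site = defaultdict(list)
    for rec in records:
        fields = rec.strip().rstrip(";").split(",")
        locus, position = int(fields[0]), int(fields[2])
        by_site[(locus, position)].append(rec)
    return {site: recs for site, recs in by_site.items() if len(recs) > 1}


# The two records quoted in the message above.
records = [
    "0,A,3474,7378,9054,TGT,T,1,V,F,0.0,1,0.-1;",
    "0,A,3474,7981,10000,TTT,G,1,F,V,0.0,7,0.2,0.3,0.4,0.5,0.7,0.8,0.9;",
]
hits = multiple_hits(records)
for site, recs in hits.items():
    print(site, len(recs))
```

Listing these sites makes it easy to restrict a comparison between two conversion scripts to the positions where they can actually disagree.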

Re: [sfscode-users] (no subject)

From: Ryan H. <rh...@gm...> - 2012-12-11 23:31:59

Hi Andre,

Unfortunately, there is a problem with the command line that you've used,
and sfs_code is at fault for not doing a good enough job at checking the
order of the input parameters.  This is my fault, and I will certainly
update this in the next release.  You need to identify the number of loci
first, and then follow it with annotating the loci as non-coding.  If you
look at your output data from the multiple loci simulation, you will see
variants in loci 1-19 that are coding (they will have amino acid codes in
entries 9 and 10, whereas they should be X in a non-coding locus).  The
implication is that you end up having selection only operating on
non-synonymous mutations in 19/20 of your loci (synonymous variants are
neutral).  If you swap the order of your commands to put the -L 20 5000
command first, followed by -a N, then you will get patterns of diversity
that are identical to the situation where you have -L 1 100000.

Again, my apologies for sfs_code not catching this, I hope it has not
caused any harm.

Best,

Ryan
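
The ordering problem described above can be caught before a run with a small check on the argument list. This is a hedged sketch, not a feature of sfs_code: the helper name is made up, the long forms --annotate and --length are assumed to be synonyms of -a and -L, and the position of -L in the "reported" command is a guess, since the original message only says it was added to the command line.

```python
def annotates_before_loci(args):
    """Return True if an annotation flag precedes the locus-length flag."""
    annotate = [i for i, a in enumerate(args) if a in ("-a", "--annotate")]
    length = [i for i, a in enumerate(args) if a in ("-L", "--length")]
    if not annotate or not length:
        return False  # nothing to compare
    return min(annotate) < min(length)


base = ("./sfs_code 1 1 -A -a N -n 25 -N 250 -t 0.001 -r 0.001 "
        "-W 1 5 0.2 0.8 -o outfile").split()

# Hypothetical placement of -L after -a N, as in the problematic run.
reported = base + ["-L", "20", "5000"]

# Suggested order: number of loci first, then the annotation.
suggested = base[:4] + ["-L", "20", "5000"] + base[4:]

print(annotates_before_loci(reported))
print(annotates_before_loci(suggested))
print(" ".join(suggested))
```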


On Tue, Dec 11, 2012 at 6:40 AM, Aberer, Andre <And...@h-...> wrote:

> Dear Ryan, dear sfs_code users,
>
> as stated on the sfs_code main page, for runtime reasons it is
> preferable to simulate a sequence as a bunch of short loci (e.g. 5 Kbp)
> instead of one long monolithic locus. Still one obtains an equivalent
> result either way.
>
> My assumption was, that by default (no --linkage specification) these
> loci are fully linked and these loci are adjacent to each other. Thus,
> if a recombinant emerges from haplotypes A and B and the recombination
> occurs in locus 5 (out of 20 loci), then the recombinant will be
> composed of the genetic material of A for loci 1-4, of B for loci 6-20
> and a mixture depending on the break point for locus 5.
>
> I ran an example with strong selection, the command line was:
> ./sfs_code 1 1 -A -a N -n 25 -N 250 -t  0.001 -r 0.001  -W 1 5 0.2 0.8 -o
> outfile -s $RANDOM
> with either
> * -L 20 5000
> or
> * -L 1 100000
>
> I extracted  the summary statistics of 15,000 simulations with
> ./convert_SFS outfile --ms
> and ran a script that extracts number of segregating sites and the
> nucleotide diversity.
>
> To my surprise, I found that if the locus is fragmented, I get a
> higher number of segregating sites and higher nucleotide diversity
> (see plots attached). For neutrality however, both distributions are
> in accordance.
>
> Is there something wrong with selection or with my assumptions?
>
> --
> Best regards,
> Andre J. Aberer
>
> M.Sc. (Bioinformatics)
> Scientific Computing Group
>
> Heidelberg Institute for Theoretical Studies (HITS gGmbH)
> Schloss-Wolfsbrunnenweg 35
> D-69118 Heidelberg
>
> Tel.:   +49 6221 533 264
> Fax:    +49 6221 533 298
> Email:  and...@h-...
> WWW:    http://www.exelixis-lab.org
>         http://www.h-its.org/english/research/sco/index.php
>
> Amtgericht Mannheim / HRB 337446
> Managing Directors: Dr. h.c. Dr.-Ing. E.h. Klaus Tschira, Prof. Dr.-Ing.
> Andreas Reuter
>
>
>
>


-- 
Ryan D. Hernandez, Ph.D.
Assistant Professor
Department of Bioengineering and Therapeutic Sciences
University of California at San Francisco
UCSF MC 2552
Byers Hall Room 503C
1700 4th Street
San Francisco, CA  94158-2330

Phone:    (415) 514-9813
Email:    rya...@uc...
Web:      http://bts.ucsf.edu/hernandez_lab

[sfscode-users] (no subject)

From: Aberer, A. <And...@h-...> - 2012-12-11 14:40:49

Attachments: pies.pdf segSites.pdf

Dear Ryan, dear sfs_code users,

as stated on the sfs_code main page, for runtime reasons it is
preferable to simulate a sequence as a bunch of short loci (e.g. 5 Kbp)
instead of one long monolithic locus. Still one obtains an equivalent
result either way.   

My assumption was, that by default (no --linkage specification) these
loci are fully linked and these loci are adjacent to each other. Thus,
if a recombinant emerges from haplotypes A and B and the recombination
occurs in locus 5 (out of 20 loci), then the recombinant will be
composed of the genetic material of A for loci 1-4, of B for loci 6-20
and a mixture depending on the break point for locus 5.

I ran an example with strong selection, the command line was:
./sfs_code 1 1 -A -a N -n 25 -N 250 -t  0.001 -r 0.001  -W 1 5 0.2 0.8 -o outfile -s $RANDOM
with either 
* -L 20 5000 
or
* -L 1 100000 

I extracted  the summary statistics of 15,000 simulations with
./convert_SFS outfile --ms 
and ran a script that extracts number of segregating sites and the
nucleotide diversity. 

To my surprise, I found that if the locus is fragmented, I get a
higher number of segregating sites and higher nucleotide diversity
(see plots attached). For neutrality however, both distributions are
in accordance.

Is there something wrong with selection or with my assumptions? 

-- 
Best regards,
Andre J. Aberer

M.Sc. (Bioinformatics)
Scientific Computing Group

Heidelberg Institute for Theoretical Studies (HITS gGmbH)
Schloss-Wolfsbrunnenweg 35
D-69118 Heidelberg

Tel.:	+49 6221 533 264
Fax:	+49 6221 533 298
Email:  and...@h-...
WWW:	http://www.exelixis-lab.org
	http://www.h-its.org/english/research/sco/index.php

Amtgericht Mannheim / HRB 337446
Managing Directors: Dr. h.c. Dr.-Ing. E.h. Klaus Tschira, Prof. Dr.-Ing. Andreas Reuter

[sfscode-users] recessive selection in SFS_code

From: Ron Do <dr...@br...> - 2012-10-07 20:17:11

Dear Ryan,

First, I'd like to say that SFS_code is really great and I have really 
enjoyed using this software.

I was wondering if there is any functionality to add recessive selection 
in SFS_code.  In the documentation, I see only additive or 
multiplicative models of selection.  If there isn't functionality, is it 
possible to have this implemented?

Thanks,
Ron

Re: [sfscode-users] selection with migration

From: Ryan H. <rh...@gm...> - 2012-08-14 05:05:55

Hi Adam,

Thanks for the questions and for answering them!  sfs_code does not
natively allow for back mutations to reverse the selection coefficient.  I
have implemented such a version, though, and will make it available with
the next release.  Such a model was used in the recent PLoS Genetics paper
by Danny Wilson (I was not involved in the study you mention below).

Gamma (2Ns) varies by population, but if you specify a model of selection
in the ancestral population, then the distribution of s remains constant
across populations after their divergence.  If one population goes through
a bottleneck, then of course deleterious mutations can increase in
frequency, but whether this is the correct thing to do depends on your
model for selection.  You can also specify a different distribution of
selection coefficients for an individual population to make selection truly
population-specific.  Lastly, if you turn off selection in one population
(using the --neutPop (-w) <pop> command), then even migrants into the
population carrying deleterious alleles will not experience selection.
Note that this is a special command, which is different from setting the
distribution of selection coefficients to 0 using -W 0.

Let me know if you have any other questions!

Best,

Ryan
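
The point above about gamma (2Ns) changing with population size while s stays constant can be made concrete with a little arithmetic. The numbers below are hypothetical (an ancestral population of 1000 with gamma = 5, and a bottleneck down to 100), and the function is only a restatement of the definition gamma = 2Ns, not code from sfs_code.

```python
def gamma(pop_size, s):
    """Population-scaled selection coefficient, gamma = 2 * N * s."""
    return 2 * pop_size * s


# Hypothetical values: gamma = 5 specified in an ancestral population of N = 1000.
ancestral_n = 1000
s = 5 / (2 * ancestral_n)

ancestral_gamma = gamma(ancestral_n, s)
bottleneck_gamma = gamma(100, s)  # same s, ten-fold smaller population

print(ancestral_gamma, bottleneck_gamma)
```

With the same s, the ten-fold smaller population has a ten-fold smaller gamma (about 0.5 here), so selection is less effective there and deleterious mutations can drift to higher frequency.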

On Mon, Aug 13, 2012 at 1:25 PM, Adam Retchless <ada...@be...> wrote:

> Hi everyone,
>
> I think I answered my own question by running some simulations and
> looking at the result file.
>
> It looks like SFS_code is not appropriate for simulating migration
> between populations with different selective regimes.
>
> 1) Fitness effects are not preserved following fixation: I saw a
> situation where both a mutation and its reversal had negative selection.
>
> 2) Fitness effects of alleles are determined by the selective regime of
> the population within which they arise, not by the selective regime of
> the population that they currently exist in: I saw that substitutions
> were listed twice (one for each population), and they had the same
> fitness effect and starting generation for both populations, but
> different fixation generation for each population.
>
> have a good one,
> adam
>
> On 8/12/2012 3:24 PM, Adam Retchless wrote:
> > Dear SFS users,
> >
> > I am looking at how SFS_code has been used to simulate bacterial
> > evolution (e.g.
> > http://www.nature.com/nature/journal/v485/n7396/full/nature10995.html),
> > and cannot decide whether SFS_code is capable of addressing local
> > adaptation. I am unclear as to how the fitness effect of a mutation is
> > assigned, and what implications this has for simulations where selection
> > varies between populations.
> >
> > One general question is whether the fitness effect of a mutation is
> > preserved even after it goes extinct, such that a recurrent mutation
> > would have the same fitness effect each time. Or is the effect
> > randomized each time that the mutant arises?
> >
> > A related question is whether there is a single fitness effect for each
> > allele across all populations, or if the fitness effect is
> > population-specific. My impression is that all populations share the
> > same fitness effect, but this would produce odd behavior when there is
> > migration between populations with different selective regimes. For
> > instance, if there is negative selection in one population but no
> > selection in another, then the purifying selection would be negated by
> > migration from the neutral population.
> >
> > Do I understand the system correctly?
> >
> > Thank you,
> > Adam
> >
> >
>
>
> --
> Adam Retchless
> Miller Research Fellow, ESPM
> University of California, Berkeley
>
>
>
>



-- 
Ryan D. Hernandez, Ph.D.
Assistant Professor
Department of Bioengineering and Therapeutic Sciences
University of California at San Francisco
UCSF MC 2552
Byers Hall Room 503C
1700 4th Street
San Francisco, CA  94158-2330

Phone:    (415) 514-9813
Email:    rya...@uc...
Web:      http://bts.ucsf.edu/hernandez_lab

Re: [sfscode-users] selection with migration

From: Adam R. <ada...@be...> - 2012-08-13 20:25:26

Hi everyone,

I think I answered my own question by running some simulations and 
looking at the result file.

It looks like SFS_code is not appropriate for simulating migration 
between populations with different selective regimes.

1) Fitness effects are not preserved following fixation: I saw a 
situation where both a mutation and its reversal had negative selection.

2) Fitness effects of alleles are determined by the selective regime of  
the population within which they arise, not by the selective regime of 
the population that they currently exist in: I saw that substitutions 
were listed twice (one for each population), and they had the same 
fitness effect and starting generation for both populations, but 
different fixation generation for each population.

have a good one,
adam


-- 
Adam Retchless
Miller Research Fellow, ESPM
University of California, Berkeley

[sfscode-users] selection with migration

From: Adam R. <ada...@be...> - 2012-08-12 22:25:02

Dear SFS users,

I am looking at how SFS_code has been used to simulate bacterial 
evolution (e.g. 
http://www.nature.com/nature/journal/v485/n7396/full/nature10995.html), 
and cannot decide whether SFS_code is capable of addressing local 
adaptation. I am unclear as to how the fitness effect of a mutation is 
assigned, and what implications this has for simulations where selection 
varies between populations.

One general question is whether the fitness effect of a mutation is 
preserved even after it goes extinct, such that a recurrent mutation 
would have the same fitness effect each time. Or is the effect 
randomized each time that the mutant arises?

A related question is whether there is a single fitness effect for each 
allele across all populations, or if the fitness effect is 
population-specific. My impression is that all populations share the 
same fitness effect, but this would produce odd behavior when there is 
migration between populations with different selective regimes. For 
instance, if there is negative selection in one population but no 
selection in another, then the purifying selection would be negated by 
migration from the neutral population.

Do I understand the system correctly?

Thank you,
Adam


-- 
Adam Retchless
Miller Research Fellow, ESPM
University of California, Berkeley

Re: [sfscode-users] Gao Wang - question on simulating sequence under your 2008 paper

From: Wang, G. <wa...@gm...> - 2012-02-15 18:09:56

Dear Dr. Hernandez,

Thank you for such a detailed reply! The selection coefficient is quite
clear now. I would like to learn more about how the mutation rate and
recombination are used. In the command you recommended, "-t 0.0003773252":

1) It seems that \mu = 1.2E-8 and N_anc = 7895 give this \theta value. Do
we then also need to specify --popSize 7895?
2) Since \theta is specified, should I also specify "-r/--rho" based on
N_anc?
3) I assume \theta and \rho are updated as the population size changes. Is
that the case?
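
Question 1 can be checked with a few lines of arithmetic. The sketch below assumes the per-site relation theta = 4*N*mu, which is the formula another message on this list quotes from the user manual; the function name is only illustrative. With the numbers in question 1 it gives a value close to, but not identical to, the -t value in the recommended command, so this message alone does not settle the exact N and mu behind 0.0003773252.

```python
def theta_per_site(pop_size, mu):
    """Population-scaled mutation rate, assuming theta = 4 * N * mu."""
    return 4 * pop_size * mu


theta = theta_per_site(7895, 1.2e-8)   # values from question 1
target = 0.0003773252                  # -t value from the recommended command

print(f"theta from N and mu : {theta:.10f}")
print(f"-t in the command   : {target:.10f}")
print(f"relative difference : {abs(theta - target) / target:.2%}")
```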

Sorry for bothering you so much and thanks a lot!

Kindest regards,
Gao

Student in Statistical Genetics, Baylor College of Medicine
(the same Gao as from: gaow [at] bcm.edu / gaow [at] rice.edu)




Re: [sfscode-users] Gao Wang - question on simulating sequence under your 2008 paper

From: Ryan H. <rh...@gm...> - 2012-02-15 17:44:51

Hi Gao,

If you assume that the ancestral population size was the same for Africans
as for Europeans (they started out as a single population), and further
assume that the actual fitness effects (s) are the same across populations,
then the parameters of the selection model in the ancestral population of
both populations should be the same.  Now, the parameter estimates for
alpha and beta ended up being slightly different for European Americans vs
African Americans, but I don't think they were statistically different.
Their difference could in part be due to the African demographic model
fitting the African American data better than the European demographic
model fit the European American data.  Because the selection parameters are
sensitive to the demographic model, I decided to stick with the parameters
estimated from the African Americans.  You are more than welcome to use the
parameters from the European Americans; results will be highly concordant.

Note that the distribution of selection coefficients updates as the
population changes size, so you only need to introduce the distribution
once (in your command below, you entered it after each demographic change).

Ryan




Re: [sfscode-users] Gao Wang - question on simulating sequence under your 2008 paper

From: Wang, G. <wa...@gm...> - 2012-02-15 17:28:35

Dear Dr. Hernandez,

Thank you for your prompt reply and the instructions, very helpful! Still,
I wonder if it would be possible to further clarify 4):

1. It seems from alpha = 0.184 that the selection coefficient distribution
is based on the estimates for Africans. Could we use a different set of
parameters for the selection coefficients, and how would the choice of
parameters be justified?
2. For beta = 0.00040244, assuming it is the estimate from the African
model, I read it as beta = 1/(0.16*7778*2), where 7778 is the ancestral
population size. Is it true that in sfs_code \gamma = 2*N_anc*s when
simulations are run under a demographic model?
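
The arithmetic in question 2 is easy to check. The sketch below only evaluates the expression as written in the question and compares it with the beta in the suggested command. The variable names are illustrative, and reading 0.16 and 7778 this way is the interpretation proposed in the question, not something confirmed here. The mean uses the alpha/beta form that the earlier reply attributes to sfs_code.

```python
alpha = 0.184            # shape, from the suggested command
beta_cmd = 0.00040244    # beta, from the suggested command

# Expression proposed in question 2: beta = 1 / (0.16 * 7778 * 2)
beta_guess = 1 / (0.16 * 7778 * 2)

# Population size that would reproduce beta_cmd exactly under the same reading
n_implied = 1 / (beta_cmd * 0.16 * 2)

print(f"beta from the expression: {beta_guess:.8f}")
print(f"beta in the command     : {beta_cmd:.8f}")
print(f"implied N               : {n_implied:.1f}")
print(f"mean alpha/beta         : {alpha / beta_cmd:.1f}")
```

The two beta values agree to about two significant figures, so the expression is at best an approximate reading of where 0.00040244 comes from.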

Thank you so much and looking forward to hearing from you again!

Kindest regards,
Gao

Student in Statistical Genetics, Baylor College of Medicine
(the same Gao as from: gaow [at] bcm.edu / gaow [at] rice.edu)




Re: [sfscode-users] Gao Wang - question on simulating sequence under your 2008 paper

From: Ryan H. <rh...@gm...> - 2012-02-15 08:00:33

Hi Gao,

There are a few things that aren't quite right in your command line:

1)  The sample size should be much smaller than the final simulated
population size.

2)  The simulation ends with -TE 0.03624009, but you have demographic
events that occur after this time (-Td 0.328...).

3)  You have the wrong parameterization for the gamma distribution of
selection coefficients.  sfs_code uses the version with mean = alpha/beta.
Your beta parameters are inverted.

4)  Your demographic model corresponds to a 4-epoch model.  Such a model
has not been fit to the data as far as I am aware.  Here is the command you
probably want to use:

./sfs_code 1 20 -t 0.0003773252 -Td 0 0.7218 -Td 0.4878 5.2693 -TE 0.5432 \
    -o sfs_code.txt -W 2 0 0 0 0.184 0.00040244
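
Point 3 is a common source of confusion, because libraries differ on whether the second gamma parameter is a rate or a scale. The sketch below uses Python's standard random.gammavariate, whose second argument is a scale (mean = alpha * scale), and draws from the mean = alpha/beta form by passing 1/beta. The helper name, seed and sample size are illustrative choices, and this is not a description of how sfs_code itself draws values.

```python
import random


def draw_gamma_rate(alpha, beta, rng):
    """Draw from a gamma with shape alpha and rate beta (mean = alpha / beta).

    random.gammavariate takes a scale as its second argument, so the rate
    is converted with scale = 1 / beta.
    """
    return rng.gammavariate(alpha, 1.0 / beta)


alpha, beta = 0.184, 0.00040244      # values from the command above
rng = random.Random(1)               # fixed seed, only for repeatability
draws = [draw_gamma_rate(alpha, beta, rng) for _ in range(200_000)]

expected_mean = alpha / beta
sample_mean = sum(draws) / len(draws)
print(f"expected mean (alpha/beta): {expected_mean:.1f}")
print(f"sample mean               : {sample_mean:.1f}")

# Passing beta directly as the scale would give mean = alpha * beta instead,
# which is the kind of inversion point 3 describes.
print(f"mean if beta is inverted  : {alpha * beta:.8f}")
```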

Let me know if you have any questions!

Ryan





[sfscode-users] Gao Wang - question on simulating sequence under your 2008 paper

From: Wang, G. <wa...@gm...> - 2012-02-14 04:40:16

Dear Dr. Hernandez,

How are you? I am Gao Wang, a PhD student at Baylor College of Medicine. I
am writing with questions about simulating sequences with SFS CODE,
following your 2008 paper [1]. I wanted to follow the complex European
demographic model with purifying selection, and my command is as follows:

sfs_code 1 20 -L 1 1500 --popSize 7947 \
              -Td 0 0.032968 -Td 0.005285 26.79 -Td 0.328237 7.537683 \
              -TW 0 2 0 1 1 0.206 76.50 -TW 0.005285 2 0 1 1 0.206 2049.5 \
              -TW 0.328237 2 0 1 1 0.206 15400 \
              -TE 0.03624009 --sampSize 100000 --outfile out.txt \
              --errfile err.txt --popFreq freq.txt

Both time and population size are scaled based on instructions from the
manual. I am not sure about my specification of the selection coefficients.
Since the input should be "gamma=2Ne*S", I assumed the selection
coefficient should be rescaled as time goes forward, using the "-TW"
options. However, the selection coefficients I see in the output file
(which are S, not gamma, according to the documentation) are very small and
do not seem to follow a Gamma(0.206, 0.146) distribution as described in
the 2008 paper. I must have done something wrong, but I do not know what. I
would very much appreciate it if you could give me a hint. In particular, I
wonder whether you could share the SFS CODE commands for the models in the
paper, if you have them.
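
One thing that can be checked before running a long simulation is whether every timed event falls before the end time. The sketch below does this for the times in the command above. It is a hypothetical helper, not part of SFS_CODE, and it assumes that the first number after each -Td and -TW is the event time and that -TE gives the end time in the same units.

```python
def events_after_end(event_times, end_time):
    """Return the event times that fall after the end of the simulation."""
    return [t for t in event_times if t > end_time]


td_times = [0, 0.005285, 0.328237]   # first value after each -Td
tw_times = [0, 0.005285, 0.328237]   # first value after each -TW
end_time = 0.03624009                # value after -TE

late = events_after_end(td_times + tw_times, end_time)
print("events scheduled after -TE:", late)
```

Under these assumptions, any time the helper returns belongs to an event that would be scheduled after the simulation has already ended.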

Thank you so much in advance. Looking forward to hearing from you!

[1]
http://www.plosgenetics.org/article/info:doi%2F10.1371%2Fjournal.pgen.1000083

Kindest regards,
Gao

Student in Statistical Genetics, Baylor College of Medicine
(the same Gao as from: gaow [at] bcm.edu / gaow [at] rice.edu)

[sfscode-users] mutation rate

From: Qianqian Z. <qz...@bu...> - 2010-09-08 21:22:23

Hi SFS_CODE users,

I have some questions regarding the setting of the mutation rate in
SFS_CODE.

Generally the mutation rate in the human genome is about 2.5e-8. However,
the default value in SFS_CODE is 0.001. Could anyone please explain the
difference? I notice the equation in the user manual: theta=4*Ne*mu. Is mu
equal to 2.5e-8 and theta equal to 0.001? I couldn't find anywhere in the
user manual that explains Ne. Could anyone please give some information?

Furthermore, will the mutation rate be affected by a change in population
size, for example during a demographic event?
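
As a rough illustration of how these two numbers relate, the sketch below rearranges the manual's equation theta = 4*Ne*mu to find the Ne at which a mu of 2.5e-8 would correspond to theta = 0.001. Treating both values as per-site quantities is an assumption, and the function name is illustrative.

```python
def implied_ne(theta, mu):
    """Effective population size implied by theta = 4 * Ne * mu."""
    return theta / (4 * mu)


ne = implied_ne(theta=0.001, mu=2.5e-8)
print(f"Ne implied by theta = 0.001 and mu = 2.5e-8: {ne:.0f}")
```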

Thanks a lot!

Sincerely,
Qianqian

[sfscode-users] FAQ

From: Ryan H. <rhe...@uc...> - 2008-06-18 15:29:50

As questions regarding the use/misuse of SFS_CODE are posed, I will begin
compiling a frequently asked questions list.  This will either be on the
project website at http://sfscode.sourceforge.net or on the project wiki (I
haven't decided whether or not to use the wiki).

Happy Simulating!

Ryan

1 message has been excluded from this view by a project administrator.
