Re: [Pymc-user] generating toy data with PyMC???
From: David H. <dav...@gm...> - 2006-02-24 14:39:12
I doubt PyMC is the easiest way to do what you want. I would rather build a model from known statistical distributions and generate random variables from it. For instance:

1. Generate a random gender (0 or 1) from a Bernoulli distribution with parameter p, the proportion of men.
2. Generate age according to gender. Try to find data for a given population and simply draw randomly from it.
3. Generate income (dependent on gender and age). This is the tricky part: many distributions can model correlation (bivariate Gumbel, bivariate normal, copulas), but you will have to specify a model for the correlation anyway. I doubt a linear correlation would do...
4. Same problem for the outcome.

You'll find a wide array of distributions to generate randomly from in scipy.stats and in the random module.

Cheers,

David

2006/2/24, Hugo Koopmans <hug...@gm...>:
> Hi there,
>
> I have done some experiments with PyMC. It has been working very well so far,
> keep up the good work!!!
>
> Now I want to use PyMC to generate data for experiments with missing
> values. Therefore I need to generate toy data first.
> This toy data could, for example, consist of the following variables:
> age, income and gender, and an outcome (e.g. chance of buying product X).
> I would like to have an underlying model to generate data from, in which
> for instance age and income are correlated in some way and females like the
> product more than males. Also, the correlation between income and age is
> stronger for females than for males.
>
> Would this be possible using PyMC? Did anyone do something like this? Sample
> code would be appreciated very much!
>
> Regards,
>
> hugo
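
A minimal sketch of the four steps in David's reply, using NumPy's random generator rather than PyMC. Everything specific here is an assumption made for illustration and not part of the thread: the function name make_toy_data, every numeric parameter, the use of a normal distribution for age (the reply suggests drawing from real population data instead), the bivariate normal link between age and log income, and the logistic model for the outcome.

```python
import numpy as np


def make_toy_data(n=1000, p_male=0.5, seed=0):
    """Draw n toy records (gender, age, income, outcome).

    All distributions and parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Step 1: gender from a Bernoulli(p_male); 1 = male, 0 = female.
    male = rng.binomial(1, p_male, size=n)

    # Steps 2-3: age and log income as a bivariate normal on a standardised
    # scale, with a gender-specific correlation (stronger for females).
    rho = np.where(male == 1, 0.3, 0.7)           # assumed correlations
    z_age = rng.standard_normal(n)
    z_noise = rng.standard_normal(n)
    z_income = rho * z_age + np.sqrt(1.0 - rho**2) * z_noise

    # Map the standardised draws to plausible-looking units (assumed values).
    mean_age = np.where(male == 1, 40.0, 42.0)
    age = np.clip(mean_age + 12.0 * z_age, 18.0, 90.0)
    income = np.exp(10.0 + 0.5 * z_income)        # log-normal income

    # Step 4: outcome from a logistic model; females are more likely to buy.
    logit = -0.5 + 0.3 * z_income + 0.8 * (1 - male)
    p_buy = 1.0 / (1.0 + np.exp(-logit))
    buys = rng.binomial(1, p_buy)

    return {"male": male, "age": age, "income": income, "buys": buys}
```

Calling make_toy_data(n=5000) returns a dict of four arrays of equal length. Missing values for the experiments Hugo describes could then be introduced by masking entries at random. A copula or another bivariate family could replace the normal link in steps 2-3 if a linear correlation turns out to be too restrictive.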