
Wishlist

2016-02-24
2016-10-25
  • Roby Joehanes

    Roby Joehanes - 2016-02-24

    If you have wishes about which distributions to work on, let me know.
    Currently I am thinking about adding skewed distributions or multivariate distributions, but either of these takes a lot of work.

     
  • Dragan Kerkez

    Dragan Kerkez - 2016-10-20

    It would be nice to have a random number generator for arbitrary distributions. A probability density function (or several functions and their weights) could be passed in through an argument list.

    Mixing distributions: https://en.wikipedia.org/wiki/Mixture_distribution

     

    Last edit: Dragan Kerkez 2016-10-20
  • Roby Joehanes

    Roby Joehanes - 2016-10-21

    Hi Dragan, thanks for the input. I guess I don't understand what you mean. Do you mean random numbers generated from a hierarchical distribution, or generated from a given probability density function (pdf)? Or something else?

    Generating random numbers through a hierarchical distribution is already possible with JDistlib. You just need to specify the correct hierarchy of distributions (given properly set up priors) and generate random numbers accordingly.

    Generating random numbers from an arbitrary pdf specification is generally not feasible. At the very least, you will need to specify the quantile function (i.e., the inverse cumulative distribution function). Given the inverse cdf / quantile function F^-1(u), you can generate a random number for that distribution as F^-1(U), where U is a uniform random number on (0, 1). This is essentially the only strategy for several distributions here in JDistlib. For an arbitrary distribution, however... Well, it is theoretically possible to derive the inverse cdf from a pdf specification. There are two steps: first, construct the cdf (i.e., the cumulative function F) using numerical integration / summation; second, construct the inverse cdf using a numerical search. However, this loses a lot of precision in both steps and takes a huge amount of computing time to generate even a few numbers, so it is generally not feasible. Even when it is computationally feasible, the quality of the resulting random numbers is poor. The strategy for the second approach is basically discussed here:
    https://www.r-bloggers.com/making-random-draws-from-an-arbitrarily-defined-pdf/
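
    As a rough illustration of the two cases described above, here is a minimal Java sketch that uses only the JDK and no JDistlib calls. The class and method names (InverseTransformSketch, tabulateCdf, numericQuantile) are hypothetical, and the grid size and integration range are arbitrary assumptions. It shows the closed-form case for an exponential distribution, then the two-step numerical route: tabulate the cdf by trapezoidal integration, then invert it by binary search.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

final class InverseTransformSketch {
    /** Closed-form case: the exponential quantile is F^-1(u) = -ln(1 - u) / rate. */
    static double exponentialQuantile(double u, double rate) {
        return -Math.log1p(-u) / rate;
    }

    /** Step 1: tabulate the cdf on [lo, hi] by trapezoidal integration of the pdf. */
    static double[] tabulateCdf(DoubleUnaryOperator pdf, double lo, double hi, int n) {
        double[] cdf = new double[n + 1];
        double h = (hi - lo) / n;
        for (int i = 1; i <= n; i++) {
            double a = lo + (i - 1) * h;
            cdf[i] = cdf[i - 1] + 0.5 * h * (pdf.applyAsDouble(a) + pdf.applyAsDouble(a + h));
        }
        double total = cdf[n];
        for (int i = 0; i <= n; i++) {
            cdf[i] /= total; // renormalise so the table ends at exactly 1
        }
        return cdf;
    }

    /** Step 2: invert the tabulated cdf by binary search plus linear interpolation. */
    static double numericQuantile(double[] cdf, double lo, double hi, double u) {
        int n = cdf.length - 1;
        int left = 0;
        int right = n;
        while (right - left > 1) {
            int mid = (left + right) >>> 1;
            if (cdf[mid] <= u) {
                left = mid;
            } else {
                right = mid;
            }
        }
        double h = (hi - lo) / n;
        double span = cdf[right] - cdf[left];
        double frac = span > 0 ? (u - cdf[left]) / span : 0.0;
        return lo + (left + frac) * h;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        double u = rng.nextDouble();
        // Same uniform draw pushed through the exact and the tabulated quantile.
        double exact = exponentialQuantile(u, 1.0);
        double[] cdf = tabulateCdf(x -> Math.exp(-x), 0.0, 40.0, 40_000);
        double approx = numericQuantile(cdf, 0.0, 40.0, u);
        System.out.println("exact=" + exact + " approx=" + approx);
    }
}
```

    Even in this easy case the numerical route needs a 40,000-point table and a truncated support, which gives a feel for the cost and precision concerns mentioned above.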

    Or perhaps you mean something else?

     
  • Dragan Kerkez

    Dragan Kerkez - 2016-10-24

    Hi Roby,

    Sorry, I guess I confused you when I mentioned rng. I was not thinking about a hierarchical distribution, nor rng from an arbitrary pdf. So far you have implemented many common distributions. What I have in mind is a pdf that results from mixing any of these. For example, I want to have a pdf with 2 peaks. I could make that by mixing 2 Normal distributions with different (arbitrary) means and variances. I could also assign a certain weight to each of these. I believe this is not too exotic. I personally had a situation where I was expecting a certain random variable to be normally distributed (I was measuring how much time it takes for a database to complete a query), but I got a huge distortion in the left part of the histogram, which turned out to be another bell curve. In other words, my database was executing the same query significantly faster in one of four runs. Perhaps mixing Poisson and Normal doesn't make much sense, but mixing several Normal distributions does. RNG would be available anyhow through the GenericDistribution abstract method random.

    I hope you understand now.
    Kind regards, Dragan

     
  • Roby Joehanes

    Roby Joehanes - 2016-10-24

    Hi Dragan, just wondering... what do you want to do with the mixture distribution (let's say a normal mixture)? If you want to query what the mixture pdf is going to be, then it is easy: just take the weighted sum of the pdfs of the component distributions. You can do that with current JDistlib.
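
    A minimal sketch of that idea for a normal mixture, written in plain Java so that it does not assume any particular JDistlib signature. The class NormalMixtureDensity and its methods are hypothetical names, and the weights are assumed to be non-negative and to sum to 1.

```java
final class NormalMixtureDensity {
    /** Density of Normal(mu, sigma) at x. */
    static double normalPdf(double x, double mu, double sigma) {
        double z = (x - mu) / sigma;
        return Math.exp(-0.5 * z * z) / (sigma * Math.sqrt(2.0 * Math.PI));
    }

    /** Mixture density: sum over k of w[k] * normalPdf(x, mu[k], sigma[k]). */
    static double mixturePdf(double x, double[] w, double[] mu, double[] sigma) {
        double sum = 0.0;
        for (int k = 0; k < w.length; k++) {
            sum += w[k] * normalPdf(x, mu[k], sigma[k]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Illustrative two-peak mixture: a quarter of the mass near 1, the rest near 5.
        double[] w = {0.25, 0.75};
        double[] mu = {1.0, 5.0};
        double[] sigma = {0.3, 0.5};
        for (double x = 0.0; x <= 7.0; x += 1.0) {
            System.out.println(x + "\t" + mixturePdf(x, w, mu, sigma));
        }
    }
}
```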

    If you wish to generate a random number from the pdf, then it is a different story altogether. The two-step procedure I explained above really applies. But fortunately, for a normal mixture, there is an algorithm to generate random numbers from it. There are many different mixture distributions which have pdf, cdf, quantile, and random algorithms worked out. For other types of mixture, I am afraid it is not straightforward.
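
    One common way to draw from a finite normal mixture (not necessarily the algorithm meant above) is to pick a component with probability equal to its weight and then draw from that component. The sketch below uses only java.util.Random; the class name NormalMixtureSampler is hypothetical, and the weights are assumed to sum to 1.

```java
import java.util.Random;

final class NormalMixtureSampler {
    /** Draws one value: choose component k with probability w[k], then sample Normal(mu[k], sigma[k]). */
    static double sample(Random rng, double[] w, double[] mu, double[] sigma) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        int k = 0;
        // Walk the cumulative weights; the last component also absorbs any rounding slack.
        for (; k < w.length - 1; k++) {
            cumulative += w[k];
            if (u < cumulative) {
                break;
            }
        }
        return mu[k] + sigma[k] * rng.nextGaussian();
    }

    public static void main(String[] args) {
        Random rng = new Random(7);
        double[] w = {0.25, 0.75};
        double[] mu = {1.0, 5.0};
        double[] sigma = {0.3, 0.5};
        for (int i = 0; i < 5; i++) {
            System.out.println(sample(rng, w, mu, sigma));
        }
    }
}
```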

    Edit: There is a way to generate samples from an arbitrary pdf, I suppose, but that is by using sampling methods (such as MCMC). But I don't think this is what you were trying to get at.

     

    Last edit: Roby Joehanes 2016-10-24
  • Dragan Kerkez

    Dragan Kerkez - 2016-10-25

    Hi Roby,

    In my previous post I mentioned measuring the duration of a database query. Since I got a mixture distribution, I was curious to estimate its parameters. My idea was to use Kolmogorov-Smirnov to compare my sample against the approximated mixture distribution. However, I wasn't able to make one in JDistlib. I could have tried to make my own mixture distribution by inheriting GenericDistribution and implementing all the methods, but that seemed too complicated. So back then I was thinking how nice it would be to have Normal( double [] mu, double [] sigma). Anyway, I don’t need it anymore. I only wanted to contribute to the project by mentioning it.

    Thanks, Dragan
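
    For illustration, the comparison described in this post can be sketched without JDistlib: compute the mixture cdf as a weighted sum of normal cdfs and take the one-sample Kolmogorov-Smirnov statistic against it. Everything here is an assumption made for the example: the class MixtureKsSketch, the made-up sample values, the mixture parameters, and the use of the Abramowitz and Stegun 7.1.26 approximation for erf (absolute error around 1.5e-7). Note also that the usual KS p-values do not apply when the parameters were estimated from the same sample.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

final class MixtureKsSketch {
    /** erf via Abramowitz and Stegun formula 7.1.26 (approximation, not exact). */
    static double erf(double x) {
        double t = 1.0 / (1.0 + 0.3275911 * Math.abs(x));
        double poly = t * (0.254829592 + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double y = 1.0 - poly * Math.exp(-x * x);
        return x >= 0 ? y : -y;
    }

    static double normalCdf(double x, double mu, double sigma) {
        return 0.5 * (1.0 + erf((x - mu) / (sigma * Math.sqrt(2.0))));
    }

    /** Mixture cdf: weighted sum of the component cdfs. */
    static double mixtureCdf(double x, double[] w, double[] mu, double[] sigma) {
        double sum = 0.0;
        for (int k = 0; k < w.length; k++) {
            sum += w[k] * normalCdf(x, mu[k], sigma[k]);
        }
        return sum;
    }

    /** One-sample KS statistic D = sup |F_n(x) - F(x)| over the sorted sample. */
    static double ksStatistic(double[] sample, DoubleUnaryOperator cdf) {
        double[] x = sample.clone();
        Arrays.sort(x);
        int n = x.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(x[i]);
            d = Math.max(d, Math.max(f - (double) i / n, (double) (i + 1) / n - f));
        }
        return d;
    }

    public static void main(String[] args) {
        // Made-up query durations: roughly one in four runs is much faster.
        double[] durations = {1.1, 0.9, 5.2, 4.8, 5.1, 5.0, 4.9, 1.0};
        double[] w = {0.25, 0.75};
        double[] mu = {1.0, 5.0};
        double[] sigma = {0.3, 0.5};
        double d = ksStatistic(durations, x -> mixtureCdf(x, w, mu, sigma));
        System.out.println("KS statistic D = " + d);
    }
}
```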

     
  • Roby Joehanes

    Roby Joehanes - 2016-10-25

    Hi Dragan,

    I see. JDistlib already has a routine to test whether a distribution has multiple peaks. It is DistributionTest.diptest, which implements Hartigan's dip test D.

    cf.: http://stats.stackexchange.com/questions/156808/interpretation-of-hartigans-dip-test

     
