From: Marcus B. <mbr...@cs...> - 2009-03-03 18:10:20
As a lurker but someone with a strong interest in probability distribution code, I have a few thoughts.

1. Many useful distributions are not over vectors or scalars, e.g., distributions over quaternions and matrices. While practically anything can be shoehorned into a distribution over a vector, it's worth thinking about how you would implement a distribution over other types.

2. Not all distribution functions can be readily normalized, and many uses of distributions don't require the normalization constant. It may be worth allowing for both of these cases.

3. Computing the cdf, inverse cdf, etc. can be hard or practically impossible (except by approximation) for many distributions. Handling this gracefully is important. At the very least, when documenting the base class, give guidance on how to handle this, e.g., when is it appropriate to use an approximation? How should unimplemented functions be handled? Maybe there should be "approximate" versions of some of these functions, or an approximation parameter that lets the user specify whether approximations are allowed or how accurate things need to be.

Anyway, vpdl looks like a great start and I'm looking forward to seeing where it goes from here.

Cheers,
Marcus

Matthew Leotta wrote:
> Dear All,
>
> I've just checked in the first part of vpdl (probability distribution
> library) into the vxl core. I have not modified any CMakeLists.txt
> outside of vpdl. If you want to build it you'll need to manually add
> it to core/CMakeLists.txt. At some point I can check in a CMake build
> option that defaults to off.
>
> If you have time, please take a look at the code and let me know if
> the design looks reasonable. It is quite incomplete at this point,
> but it should give you the idea. I'm only working on the distribution
> classes and I welcome help in contributing the design of builders,
> samplers, or any other essential parts.
> When I get community approval for this design I'll add more
> distributions and write the book chapter.
>
> The general design is like this: vpdl_distribution<T,n> is the
> templated base class for distributions. The template parameters are
> T, the floating point type (float or double) and n, the dimension.
> For n > 1 the distributions work with vnl_vector_fixed<T,n> and
> vnl_matrix_fixed<T,n,n> types. For n == 1 they work with T directly
> for scalar computations. The special case of n == 0 (the default)
> works with vnl_vector<T> and vnl_matrix<T> for a dimension specified
> at run time. While vpdl_distribution<T,n> should be used as the base
> class, it inherits from vpdl_base_traits<T,n>. vpdl_base_traits
> is partially specialized to create typedefs and functions that reduce
> the need for specialization in later derived classes. For example,
> vpdl_distribution<T,n>::index(v,i) is a static member function that
> provides access to the i-th element of vector v even if v is really a
> scalar of type T (in which case it returns the scalar value).
>
> I currently have two working distributions with test cases:
> vpdl_gaussian_sphere and vpdl_gaussian_indep. Both are restricted
> versions of Gaussian distributions (with hyper-spherical and
> independent covariances). The general Gaussian is a work in progress,
> and I think it will use the eigenvector representation of mul/vpdfl.
>
> I've inlined everything in the Gaussians so that no .txx file is
> needed. The vpdl_distribution does use .txx with many instantiations
> in the Templates subdirectory. Does anyone have any preference on the
> use of .txx files here? It seems like a very large number of files
> would be needed in the Templates directory if I use .txx files. Many
> of these instantiations might rarely be used. However, if everything
> is inlined it could lead to more code bloat.
>
> This design integrates bsta more tightly than originally considered.
> If the virtual functions do not create too large of a performance hit,
> then I might not need to create separate classes with wrappers. I
> will need to do performance tests to know for sure.
>
> Matt
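
To make points 2 and 3 of the reply concrete, here is a minimal sketch of one way a base class could expose an unnormalized density, an optional normalization constant, and a cdf that may be missing or only available as an approximation. Every name in it (sketch_distribution, accuracy, norm_const, and so on) is hypothetical and is not part of vpdl. It is restricted to scalar double for brevity, and the Simpson's-rule cdf is only an illustration of an "approximation allowed" path, not a recommendation.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// Hypothetical names throughout; none of this is vpdl API.
enum class accuracy { exact_only, allow_approx };

class sketch_distribution {
 public:
  virtual ~sketch_distribution() = default;

  // Density up to a constant factor. Every distribution provides this.
  virtual double unnormalized_density(double x) const = 0;

  // Writes the normalizing factor to c and returns true, or returns
  // false when the constant is unknown or intractable (the default).
  virtual bool norm_const(double&) const { return false; }

  // Normalized density; fails loudly instead of returning a wrong value.
  double density(double x) const {
    double c = 0.0;
    if (!norm_const(c))
      throw std::logic_error("normalization constant unavailable");
    return c * unnormalized_density(x);
  }

  // Writes the cdf to p and returns true, or returns false when it is
  // not available under the requested policy (the default).
  virtual bool cdf(double, accuracy, double&) const { return false; }
};

class sketch_gaussian : public sketch_distribution {
 public:
  sketch_gaussian(double mean, double sigma) : mean_(mean), sigma_(sigma) {
    assert(sigma > 0.0);
  }
  double unnormalized_density(double x) const override {
    const double z = (x - mean_) / sigma_;
    return std::exp(-0.5 * z * z);
  }
  bool norm_const(double& c) const override {
    const double pi = 3.14159265358979323846;
    c = 1.0 / (sigma_ * std::sqrt(2.0 * pi));
    return true;
  }
  // Closed form via the complementary error function, so either policy works.
  bool cdf(double x, accuracy, double& p) const override {
    p = 0.5 * std::erfc(-(x - mean_) / (sigma_ * std::sqrt(2.0)));
    return true;
  }
 private:
  double mean_, sigma_;
};

// Density proportional to exp(-x^4): the cdf has no elementary closed form.
class sketch_quartic : public sketch_distribution {
 public:
  double unnormalized_density(double x) const override {
    return std::exp(-x * x * x * x);
  }
  bool norm_const(double& c) const override {
    c = 1.0 / (2.0 * std::tgamma(1.25));  // integral of exp(-x^4) is 2*Gamma(5/4)
    return true;
  }
  bool cdf(double x, accuracy a, double& p) const override {
    if (a == accuracy::exact_only) return false;  // refuse to approximate
    const double lo = -6.0;  // exp(-6^4) underflows to 0 in double
    if (x <= lo) { p = 0.0; return true; }
    if (x >= -lo) { p = 1.0; return true; }
    const int n = 2000;  // even number of Simpson intervals
    const double h = (x - lo) / n;
    double sum = unnormalized_density(lo) + unnormalized_density(x);
    for (int i = 1; i < n; ++i)
      sum += (i % 2 ? 4.0 : 2.0) * unnormalized_density(lo + i * h);
    double c = 0.0;
    norm_const(c);
    p = c * sum * h / 3.0;
    return true;
  }
};
```

With this shape, a caller that only needs relative likelihoods uses unnormalized_density() and never touches the constant, while a caller that needs a cdf has to state explicitly whether an approximation is acceptable. A numeric tolerance could replace the two-valued accuracy enum if "how accurate" needs to be expressed as well.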