From: Daniel P. <dp...@gm...> - 2013-04-08 19:24:08
It's the normal practice when dealing with Gaussians to get rid of small
counts.

Dan

On Mon, Apr 8, 2013 at 3:23 PM, Xavier Anguera <xan...@gm...> wrote:
> I agree my "hack" is not the solution.
> I see that when performing EM training there is a check for very small
> occupancy or weight that eliminates a Gaussian in that case. I am not
> happy with such an approach, though, and had commented out that line some
> time ago (I am implementing a MAP adaptation function that needs to deal
> with these cases).
>
> X.
>
> On Mon, Apr 8, 2013 at 9:19 PM, Daniel Povey <dp...@gm...> wrote:
>> Hm, thanks, but I don't think this is the right way to fix the problem.
>> Update code should always take into account the possibility that
>> occupancies will be zero. It's expected that exp() on very negative
>> values will produce zero.
>>
>> Dan
>>
>> On Mon, Apr 8, 2013 at 3:15 PM, Xavier Anguera <xan...@gm...> wrote:
>>> Hi Dan,
>>> The segmentation fault comes from a division by 0 when using the
>>> occupancy of the Gaussians, which is computed by summing the posterior
>>> probabilities of each Gaussian over a set of features. When the
>>> posteriors for a given Gaussian are 0 for all features, there is a
>>> division by 0.
>>> I am pasting the "hack" I wrote to prevent this. I believe, though,
>>> that maybe the exp() function should be revisited. Tell me what you
>>> think.
>>>
>>> template<typename Real>
>>> Real VectorBase<Real>::ApplySoftMax() {
>>>   Real max = this->Max(), sum = 0.0;
>>>   for (MatrixIndexT i = 0; i < dim_; i++) {
>>>     data_[i] = exp(data_[i] - max);
>>>     if (data_[i] < FLT_MIN)
>>>       data_[i] = FLT_MIN;  // floor to a very small value
>>>     sum += data_[i];
>>>   }
>>>   this->Scale(1.0 / sum);
>>>   return max + log(sum);
>>> }
>>>
>>> On Mon, Apr 8, 2013 at 8:58 PM, Daniel Povey <dp...@gm...> wrote:
>>>> Firstly, this should give you numerical problems but not a
>>>> segmentation fault.
>>>> You'll have to look in the code and see if it's behaving as expected.
>>>> E.g., is it due to a number so small that it cannot be represented in
>>>> floating point, or is it larger than that and unexpectedly becoming
>>>> zero? It might be an issue with your algorithm design.
>>>> Let me know if that function needs to be fixed.
>>>>
>>>> Dan
>>>>
>>>> On Mon, Apr 8, 2013 at 2:54 PM, Xavier Anguera <xan...@gm...> wrote:
>>>>> Hi,
>>>>> When using the function template<typename Real> Real
>>>>> VectorBase<Real>::ApplySoftMax() in the kaldi-vector.cc file, I
>>>>> noticed that very small likelihoods are rounded to a posterior
>>>>> probability of 0.0. Is this expected behavior? I am trying to perform
>>>>> EM training of a simple GMM and I keep running into a segmentation
>>>>> fault due to this.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Xavi Anguera
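A quick standalone illustration of the behavior Dan calls expected: in single
precision, exp() of a sufficiently negative shifted log-likelihood first
returns denormals and then exactly 0.0f, so the occupancy of a Gaussian whose
posterior underflows on every frame legitimately sums to 0.0. This is a
sketch, not Kaldi code; the thresholds in the comments assume IEEE-754 single
precision.

#include <cmath>
#include <cstdio>

// std::exp() on a float drops into denormals below about -87.3
// (i.e. log(FLT_MIN)) and returns exactly 0.0f below about -103.
int main() {
  float max = 0.0f;  // stands in for the per-frame maximum in ApplySoftMax()
  float loglikes[] = {-10.0f, -80.0f, -90.0f, -110.0f};
  for (float ll : loglikes)
    std::printf("exp(%g - %g) = %g\n", ll, max, std::exp(ll - max));
  // exp(-90) prints a denormal (~8.2e-40); exp(-110) prints exactly 0.
  return 0;
}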
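And a minimal sketch of the alternative Dan recommends: rather than flooring
posteriors inside the softmax, the M-step itself tolerates zero occupancy by
skipping the division and flagging low-count Gaussians for removal. All names
here (GaussAccs, UpdateMeans, min_count) are hypothetical, not Kaldi API.

#include <cstddef>
#include <vector>

// Hypothetical accumulator for one GMM, filled in during the E-step.
struct GaussAccs {
  std::vector<double> occupancy;               // gamma_i = sum_t p(i | x_t)
  std::vector<std::vector<double> > mean_sum;  // sum_t p(i | x_t) * x_t
};

// Mean update that never divides by a zero (or tiny) occupancy.
void UpdateMeans(const GaussAccs &accs, double min_count,
                 std::vector<std::vector<double> > *means,
                 std::vector<std::size_t> *to_remove) {
  for (std::size_t i = 0; i < accs.occupancy.size(); ++i) {
    double gamma = accs.occupancy[i];
    if (gamma < min_count) {     // also covers gamma == 0.0: no division
      to_remove->push_back(i);   // the usual practice: drop low-count Gaussians
      continue;                  // leave the old mean untouched meanwhile
    }
    for (std::size_t d = 0; d < (*means)[i].size(); ++d)
      (*means)[i][d] = accs.mean_sum[i][d] / gamma;  // safe: gamma >= min_count
  }
}

For the MAP adaptation case Xavier mentions, where zero-occupancy Gaussians
must be kept rather than removed, the same guard works without the removal
step: a standard MAP mean update divides by (tau + gamma) with tau > 0, so a
component with gamma == 0 simply keeps its prior parameters.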