From: Peter W. <pet...@ve...> - 2009-04-20 19:59:01
Hi Rita,

Yes, removing Dither would work. But then every system using dither
would be non-repeatable.

Is there any reason that Dither should be really random? Is there some
subtle problem that will be caused by every utterance being dithered by
the same sequence of numbers?

P

Rita Singh wrote:
> Hi Peter,
>
> The other alternative is to set log 0 to a reasonable noise floor
> value (such as -4), and disable dither when you do that. Could you try
> that out and let us know what you observe?
>
> -Rita
>
>
> On Mon, Apr 20, 2009 at 3:13 PM, Peter Wolf <pet...@ve...> wrote:
>
>> Well, there might be... but a few unit tests will find them. Floating
>> point numbers are wonderfully sensitive :-)
>>
>> BTW I have checked in the fix, and a corresponding unit test to GitHub.
>>
>> Joe Wölfel wrote:
>>
>>> I agree. It's difficult to test anything that includes random
>>> behavior. I guess my question is will this be enough? Are there
>>> other sources of randomness that may affect results?
>>>
>>>
>>> On 20 avr. 09, at 14:26, Peter Wolf wrote:
>>>
>>>
>>>> Hello Sphinx4 developers,
>>>>
>>>> I was just tracking down a bug where recognizing the same utterance
>>>> twice in a row did not produce identical scores. They should, since I
>>>> was using BatchCMN.
>>>>
>>>> I discovered that the problem was in Dither which uses a Random number
>>>> generator. The generator was not reset at every DataStartSignal, so
>>>> recognition behavior was determined by the utterances that came before.
>>>>
>>>> I am writing very sensitive tests that prove correctness in my app by
>>>> comparing scores. So I need atomic behavior from Sphinx4. I would
>>>> argue that repeatable behavior is required for many uses.
>>>>
>>>> So, here is my proposed fix to Dither (in bold).
>>>>
>>>> Comments, thoughts? Am I missing something important?
>>>>
>>>>     /**
>>>>      * Returns the next Data object being processed by this Dither, or
>>>>      * if it is a Signal, it is returned without modification.
>>>>      *
>>>>      * @return the next available Data object, returns null if no Data
>>>>      *         object is available
>>>>      * @throws DataProcessingException
>>>>      *             if there is a processing error
>>>>      * @see Data
>>>>      */
>>>>     public Data getData() throws DataProcessingException {
>>>>         Data input = getPredecessor().getData();
>>>>         getTimer().start();
>>>>         if (input != null && input instanceof DoubleData) {
>>>>             applyDither(((DoubleData) input).getValues());
>>>>         }
>>>>         *else if( input instanceof DataStartSignal ) {
>>>>             /**
>>>>              * reset the dither at the beginning of each utterance
>>>>              * this ensures that a given combination of
>>>>              * utterance/parameters/model will always behave the same way
>>>>              *
>>>>              * otherwise it will be affected by utterances that came before
>>>>              */
>>>>             random = new Random(0);
>>>>         }*
>>>>         getTimer().stop();
>>>>         return input;
>>>>     }
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Stay on top of everything new and different, both inside and
>>>> around Java (TM) technology - register by April 22, and save
>>>> $200 on the JavaOne (SM) conference, June 2-5, 2009, San Francisco.
>>>> 300 plus technical and hands-on sessions. Register today.
>>>> Use priority code J9JMT32. http://p.sf.net/sfu/p
>>>> _______________________________________________
>>>> Cmusphinx-devel mailing list
>>>> Cmu...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel
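
The re-seeding idea discussed in this thread can be illustrated outside Sphinx4 with a small, self-contained sketch. Everything in it is an assumption made for illustration: SeededDither, startUtterance() and the uniform noise formula are hypothetical stand-ins, not the actual Sphinx4 Dither class or its API. The only point it shows is that re-creating java.util.Random with a fixed seed at the start of each utterance makes the dither sequence independent of whatever was processed before.

```java
import java.util.Arrays;
import java.util.Random;

public class DitherRepeatabilityDemo {

    /** Hypothetical stand-in for a dither stage; NOT the Sphinx4 Dither class. */
    static final class SeededDither {
        private static final long SEED = 0L;  // fixed seed, as in the proposed fix
        private final double maxDither;       // assumed noise amplitude
        private Random random = new Random(SEED);

        SeededDither(double maxDither) {
            this.maxDither = maxDither;
        }

        /** Plays the role of handling a DataStartSignal: restart the sequence. */
        void startUtterance() {
            random = new Random(SEED);
        }

        /** Returns a copy of the samples with uniform noise in [-maxDither, maxDither). */
        double[] apply(double[] samples) {
            double[] out = new double[samples.length];
            for (int i = 0; i < samples.length; i++) {
                out[i] = samples[i] + (2.0 * random.nextDouble() - 1.0) * maxDither;
            }
            return out;
        }
    }

    /** Dithers the same utterance twice and reports whether both results match. */
    public static boolean repeatable(boolean resetPerUtterance) {
        SeededDither dither = new SeededDither(2.0);
        double[] utterance = {0.0, 0.0, 0.0, 0.0};

        if (resetPerUtterance) {
            dither.startUtterance();
        }
        double[] first = dither.apply(utterance);

        if (resetPerUtterance) {
            dither.startUtterance();
        }
        double[] second = dither.apply(utterance);

        return Arrays.equals(first, second);
    }

    public static void main(String[] args) {
        System.out.println("reset per utterance, identical: " + repeatable(true));
        System.out.println("no reset, identical: " + repeatable(false));
    }
}
```

With the reset, both passes draw the same numbers from the generator, so the dithered samples are bit-identical. Without it, the second pass continues the sequence where the first one stopped, which is the history dependence described in the original report.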