From: <fe...@in...> - 2014-03-19 12:59:40
Hi Daniel,

Thanks, we are working on that.

Best,
Felipe Espic

Quoting Daniel Povey <dp...@gm...>:

> The scoring is normally based on words, so the confusion matrix is output as
> a sequence of words. There are ways to do what you want, involving the
> program ali-to-phones. That would involve aligning the training data with
> steps/align_fmllr.sh or align_si.sh and comparing with the best alignment
> from the decode, then putting it into compute-wer and asking it to output
> the detailed information. But I don't have time right now to explain it in
> detail.
> Dan
>
> On Mon, Mar 17, 2014 at 1:39 PM, <fe...@in...> wrote:
>
>> Hi Daniel,
>>
>> Thanks for your quick reply.
>>
>> We want to use confusion matrices to see which phonemes (or types of
>> phonemes) are misclassified.
>>
>> Is there any other way you can suggest to do this?
>>
>> Thanks,
>>
>> Felipe Espic
>>
>> Quoting Daniel Povey <dp...@gm...>:
>>
>>> Hi,
>>> There is no explicit support for multi-stream ASR in Kaldi; you'll have to
>>> try to understand the codebase and code something yourself [although if you
>>> build separate models with the same tree, you can use the DecodableSum
>>> class to help you decode with scores summed over the models; you'll need to
>>> write code for this though.]
>>> Regarding a phone confusion matrix: if you build a system to decode phones,
>>> I think the program compute-wer has an option to output confusion data, but
>>> I doubt it is in the format you want. However, I would advise against
>>> this. Phone confusion matrices are a little old fashioned.
>>> Dan
>>>
>>> On Mon, Mar 17, 2014 at 9:20 AM, <fe...@in...> wrote:
>>>
>>>> Dear Sirs,
>>>>
>>>> I am with the Speech Processing and Transmission Lab at the University
>>>> of Chile.
>>>> We are working on multistream speech recognition in Kaldi, and we
>>>> have a couple of questions:
>>>>
>>>> - We want to create a per-phoneme confusion matrix to assess the
>>>> performance of the acoustic features alone. How could we address this in
>>>> Kaldi? I think we have to build a phoneme recognizer (w/o word position
>>>> dependency), so we read these posts from 2013,
>>>> http://sourceforge.net/p/kaldi/discussion/1355348/thread/51258bf4/
>>>> and http://sourceforge.net/p/kaldi/discussion/1355348/thread/2294d269/
>>>> but we did not find any specific solution.
>>>>
>>>> - Is there any recipe for multistream ASR in Kaldi? Any help with this?
>>>>
>>>> Best Regards,
>>>>
>>>> Felipe Espic
>>>>
>>>> _______________________________________________
>>>> Kaldi-developers mailing list
>>>> Kal...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/kaldi-developers
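
The comparison Dan outlines in the thread above (reference phone sequences from a forced alignment versus phone sequences from the decode) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not a Kaldi tool and not what compute-wer itself prints: it assumes both sets of phone sequences have already been written out as text lines of the form "utt-id phone1 phone2 ...", and every name in it (align, read_phone_file, confusion_counts, EPS) is made up for the example.

```python
from collections import Counter

# Placeholder for "no phone" on one side of an insertion or deletion.
# The symbol is an arbitrary choice for this sketch.
EPS = "<eps>"


def align(ref, hyp):
    """Levenshtein-align two phone sequences.

    Returns a list of (ref_phone, hyp_phone) pairs; EPS marks a deletion
    (phone only in ref) or an insertion (phone only in hyp).
    """
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Backtrace from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            pairs.append((ref[i - 1], hyp[j - 1]))  # match or substitution
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((ref[i - 1], EPS))         # deletion
            i -= 1
        else:
            pairs.append((EPS, hyp[j - 1]))         # insertion
            j -= 1
    pairs.reverse()
    return pairs


def read_phone_file(lines):
    """Parse 'utt-id p1 p2 ...' lines into {utt_id: [p1, p2, ...]}."""
    seqs = {}
    for line in lines:
        parts = line.split()
        if parts:
            seqs[parts[0]] = parts[1:]
    return seqs


def confusion_counts(ref_seqs, hyp_seqs):
    """Count (ref_phone, hyp_phone) pairs over utterances present in both sets."""
    counts = Counter()
    for utt, ref in ref_seqs.items():
        if utt in hyp_seqs:
            counts.update(align(ref, hyp_seqs[utt]))
    return counts
```

For actual files, read_phone_file can be passed an open file object, and the resulting Counter can be printed or reshaped into a matrix as needed. Phone symbols are treated as opaque strings, so integer phone IDs work just as well as names.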