> That is with a small vocabulary, right? Working with a small
> vocabulary is of course very different because the words are easier to
I have got the same results even with the large vocabulary, just with a
slightly increased number of errors (i.e. I used Rick's acoustic models
without training, the turtle LM, and the full dictionary instead of the small
turtle.dic). So this is not a problem, IMHO.
> Essentially the point of my message was to say that if we can't get
> something reasonable without help from the language model then
> probably a lot of work on the language model would not be worthwhile,
> because the final system would still be unusable.
Sure, but I think we are at that point. I have got even better results than
Jessica, even though I am not a native English speaker and my accent is
noticeable.
> There's a difference between a language model which doesn't help you
> and one which actively gets in the way.
Yes, but not from the technical point of view - if you use an LM with the
recognizer, it contributes to the resulting score of the generated sentence,
either positively or negatively. The system has no way to tell
"OK, these results from the LM are crap, let's ignore them". So it either
helps or gets in the way; there is no "neutral" position, unless you weight it
to zero with the appropriate switch. But in that case, the LM is not used at
all.
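To make the point concrete, here is a minimal sketch of how a decoder typically combines the two scores. This is illustrative only, not Sphinx's actual code; the names `acoustic_score`, `lm_score`, and `lm_weight` are hypothetical stand-ins for whatever the decoder uses internally.

```python
def hypothesis_score(acoustic_score, lm_score, lm_weight):
    """Combined log-score of a hypothesis: the LM term always
    contributes, positively or negatively, unless its weight is zero."""
    return acoustic_score + lm_weight * lm_score

# With a nonzero weight the LM shifts the ranking, for better or worse:
with_lm = hypothesis_score(-1200.0, -30.0, 9.5)

# With weight zero the LM term vanishes - equivalent to not using it:
without_lm = hypothesis_score(-1200.0, -30.0, 0.0)
print(without_lm)  # -1200.0
```

There is no setting between these two cases where the LM is consulted but guaranteed harmless, which is the point above: it either influences the score or is switched off entirely.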
> I don't know what structure
> of language model is used, but if it is impossible to run with no
> language model, then perhaps one could construct one which says that an
> utterances is an arbitrary number of words and a word is anything from
> the dictionary. In other words, it would say that anything goes,
> which would be better than only including certain utterances that are
> not the ones you're using. Another possibility would be to use any
> language model but give it zero weight.
You may try to run in all-phone mode; in that case only the phonemes are
recognized, and you can see whether your system matches the utterance at
least at the phoneme level, without using vocabularies or an LM. But it does
not work in continuous mode; you have to pre-record the file and then use it
with the recognizer.
Secondly, if you give the LM a weight of zero, it is not used, so yes, that's
another possibility for testing it.
I think there is even an option to not use an LM at all, but you have to
check the Sphinx switches to see how to do that.