Grammar as Language Model

Help
sprieto
2015-03-17
2015-03-27
  • sprieto

    sprieto - 2015-03-17

    Hi,

    We would like to use a grammar as the LM for speaker recognition. For example:

    $digit = one | two | three | four | five | six | seven | eight | nine | zero;
    $number = $digit | $digit $digit | $digit $digit $digit;

    Can Kaldi support this type of grammar?

    Thanks in advance.

     

    Last edit: sprieto 2015-03-17
  • sprieto

    sprieto - 2015-03-26

    Hi,

    We tried to use Thrax to create the grammar, but in the end we decided to create it manually.

    We already have a large recognizer with an acoustic model trained on 150 hours and a large lexicon of 120000 words, and it works very well. We also manually created a G.fst for the new grammar. We would like to combine this G.fst with the acoustic model to obtain a new recognizer based on the new grammar.
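For readers following along, here is a minimal sketch of what a hand-written grammar like the `$number` rule from the first post could look like in OpenFst text format. This is not the poster's actual G.fst; the function name `number_grammar_fst` and the unweighted arcs are assumptions made for illustration.

```python
# Sketch: emit an OpenFst text-format acceptor for "one to three digits".
# Assumption: arcs are unweighted and labels are the word strings themselves.

DIGITS = ["one", "two", "three", "four", "five",
          "six", "seven", "eight", "nine", "zero"]


def number_grammar_fst(words, max_len=3):
    """Return text-format lines for an acceptor of 1..max_len words."""
    lines = []
    for state in range(max_len):
        for word in words:
            # Arc line: source state, destination state, input label, output label.
            lines.append(f"{state} {state + 1} {word} {word}")
    # Every state after the start state is final, so 1, 2 or 3 digits are accepted.
    for state in range(1, max_len + 1):
        lines.append(str(state))
    return lines


if __name__ == "__main__":
    print("\n".join(number_grammar_fst(DIGITS)))
```

The printed text could then be compiled with something like `fstcompile --isymbols=words.txt --osymbols=words.txt`, where words.txt is the word symbol table of the lang directory in use; the exact pipeline depends on the setup.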

    First, we built HCLG.fst using all the files in data/lang of the large recognizer (including words.txt with 120000 words) together with the new grammar, and it works.

    After that, we tried to create all the lang files (words.txt, L.fst, phones.txt, ...) from only the words of the new grammar. We compiled HCLG.fst, and the results were not good.

    We also followed the steps from http://sourceforge.net/p/kaldi/discussion/1355348/thread/2539caaa/, replacing phones.txt, disambig.txt, nonsilence.txt, and silence.txt with the files from the large recognizer. The results were still not good.

    So the only way we have found to create a new recognizer based on the grammar requires the same word list as the large recognizer.

    Any help?

    Thanks in advance

     
    • Nagendra Kumar Goel

      Even if you start with a huge word list, what you eventually use depends on
      what is in the grammar. So there is no need to explicitly reduce the
      lexicon size. Just make sure it covers the words in the grammar.

      Nagendra
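A rough way to check the "covers the words in the grammar" condition is sketched below. It assumes the grammar is available in OpenFst text format and that words.txt uses the usual "symbol id" layout; the helper names are hypothetical and not part of Kaldi.

```python
# Sketch: list grammar words that are missing from a words.txt symbol table.
# Assumptions: the grammar is in OpenFst text format (arc lines have 4 or 5
# fields, final-state lines have 1 or 2) and words.txt lines are "symbol id".


def read_symbols(words_txt_lines):
    """Return the set of symbols found in 'symbol id' lines."""
    return {line.split()[0] for line in words_txt_lines if line.strip()}


def grammar_words(fst_text_lines):
    """Return the set of output labels used on the grammar's arcs."""
    words = set()
    for line in fst_text_lines:
        fields = line.split()
        if len(fields) >= 4:  # arc line; shorter lines mark final states
            words.add(fields[3])
    return words


def missing_from_lexicon(fst_text_lines, words_txt_lines):
    """Return the sorted grammar words that words.txt does not contain."""
    return sorted(grammar_words(fst_text_lines) - read_symbols(words_txt_lines))


if __name__ == "__main__":
    toy_grammar = ["0 1 one one", "0 1 two two", "1 2 three three", "1", "2"]
    toy_words = ["<eps> 0", "one 1", "two 2"]
    print(missing_from_lexicon(toy_grammar, toy_words))
```

An empty result means every word on the grammar's arcs has an entry in the symbol table, so the large word list can be kept as it is.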

       
      • Daniel Povey

        Daniel Povey - 2015-03-26

        The only thing that needs to be exactly the same is phones.txt. If
        it is different, the numbering of the phones gets messed up and you
        get nonsense.
        Dan
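One way to confirm that two phones.txt files agree is to compare them symbol by symbol. The sketch below assumes the usual "symbol id" layout of a symbol table; the function names are hypothetical and not part of Kaldi.

```python
# Sketch: report phones whose integer ids differ between two phones.txt files.
# Assumption: each line is "symbol id"; lines with another shape are skipped.


def parse_symbol_table(lines):
    """Map each symbol to its integer id."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            table[parts[0]] = int(parts[1])
    return table


def diff_phone_tables(old_lines, new_lines):
    """Return (symbol, old_id, new_id) for every mismatch; None means absent."""
    old = parse_symbol_table(old_lines)
    new = parse_symbol_table(new_lines)
    problems = []
    for symbol in sorted(set(old) | set(new)):
        if old.get(symbol) != new.get(symbol):
            problems.append((symbol, old.get(symbol), new.get(symbol)))
    return problems


if __name__ == "__main__":
    trained = ["<eps> 0", "SIL 1", "AA 2"]
    rebuilt = ["<eps> 0", "AA 1", "SIL 2"]
    for symbol, old_id, new_id in diff_phone_tables(trained, rebuilt):
        print(symbol, old_id, new_id)
```

Any reported mismatch means the new lang directory numbers its phones differently from the one the acoustic model was trained with, which is the situation described above.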


         
  • sprieto

    sprieto - 2015-03-27

    It works very well now. Thanks a lot.