The restriction of kaldi to recognize offline

Mao-Chang
2014-10-05
2014-10-14
  • Mao-Chang

    Mao-Chang - 2014-10-05

    Hi everyone,
    I'm a student in Taiwan. For my work, I have to use Kaldi. I'm working on a project on real-time and offline speech recognition with a DNN. Does anyone know whether offline speech recognition needs a lot of storage? I'd like to know whether I should bring a hard drive.

    Thank you for your help.

    Mao-Chang

     
    • Daniel Povey

      Daniel Povey - 2014-10-05

      I'm not sure exactly what you mean when you use the terms "off-line" and
      "real-time" here - perhaps you could clarify what these terms mean to you.
      And perhaps instead of "bring" you mean "buy"? You basically need a Linux
      system and you need some familiarity with UNIX in order to do this. The
      hard drive space requirement depends on how much data you want to train on,
      but for small databases, a few tens of gigabytes may be enough. For large
      databases, to run DNNs you'd need a cluster of computers with GPUs, and I'm
      guessing from your question that that is not something you have.

      Dan

       
      • Mao-Chang

        Mao-Chang - 2014-10-06

        Sorry! Let me clarify my project. I will carry a system with me that records sound and recognizes it with Kaldi in real time. The DNN has already been fully trained, so on the device it only needs to run recognition. The data the system collects won't be sent to a server for processing. I wonder whether this process needs a lot of storage.
        Some of your message was helpful to me.
        I hope this is clear enough.

        Mao-Chang

         
        • Daniel Povey

          Daniel Povey - 2014-10-06

          Hm. So you mean real-time decoding on a device without being connected to
          a server.
          It doesn't require very much storage, except that required to store the DNN
          (a few megabytes, maybe), and to store the audio if you want to store this
          for logging purposes.
          Building usable real-time systems is challenging.

          I suggest as a starting point you look at the program
          online2-wav-nnet2-latgen-faster
          and the corresponding example scripts in egs/*/s5/local/online/run_nnet2.sh

          Dan
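
Editorial note: the two storage items named in the reply above (the DNN itself and any logged audio) can be estimated with simple arithmetic. The Python sketch below is only an illustration. The layer sizes, the 4-byte parameter size, and the 16 kHz, 16-bit mono audio format are assumptions chosen for the example, not values taken from this thread or from any Kaldi recipe. It also leaves out everything else a decoder loads, such as the decoding graph, whose size depends on the vocabulary and language model.

```python
def dnn_bytes(layer_dims, bytes_per_param=4):
    """Approximate size of a fully connected network's weights and biases.

    layer_dims lists the width of each layer from input to output.
    bytes_per_param=4 assumes 32-bit floats (an assumption for this sketch).
    """
    params = sum(
        n_in * n_out + n_out  # weight matrix plus bias vector per layer
        for n_in, n_out in zip(layer_dims, layer_dims[1:])
    )
    return params * bytes_per_param


def pcm_bytes(seconds, sample_rate=16000, bytes_per_sample=2, channels=1):
    """Size of uncompressed PCM audio (defaults: 16 kHz, 16-bit, mono)."""
    return int(seconds * sample_rate * bytes_per_sample * channels)


if __name__ == "__main__":
    # Hypothetical topology: 360 inputs, three hidden layers of 512, 2000 outputs.
    dims = [360, 512, 512, 512, 2000]
    print(f"DNN parameters: {dnn_bytes(dims) / 1e6:.1f} MB")
    print(f"One hour of logged audio: {pcm_bytes(3600) / 1e6:.1f} MB")
```

With these assumed numbers the network comes to roughly 7 MB and an hour of logged audio to roughly 115 MB, so audio logging, if enabled, is likely to dominate over the DNN. Substitute the real layer sizes and audio format of your own system to get a meaningful figure.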


           
          • Mao-Chang

            Mao-Chang - 2014-10-13

            Thanks a lot! After a discussion with my group members, I have another question. The DNN model itself won't require much storage. But what about the output the DNN produces from training on large-vocabulary data, which is then compared against the test data during recognition? I am asking whether the output of a DNN trained on a large data set needs a lot of storage.

            Mao-Chang

             
  • Mao-Chang

    Mao-Chang - 2014-10-14

    Someone who uses HTK told me that it needs less than 50 MB to store all the models for a data set. So does Kaldi also need about 50 MB?

    Mao-Chang