
about distributed recognition

Help
chris
2007-07-19
2012-09-22
  • chris

    chris - 2007-07-19

    Hi all,

    I am interested in experimenting with distributed recognition. As far as I understand, it means that recording and feature extraction are performed on a client, and the features are sent to a server that does the actual recognition.
    Have you heard of any free API that does this? Is there a standard for this kind of system? Is it part of a VoIP standard? Are there any known experiments with HTK or Sphinx? Any pointers would be greatly appreciated!

    Cheers,

     
    • David Huggins-Daines

      Yes, Motorola and IBM did a bunch of work on this around the turn of the century. There are some ETSI standards that were published. Search for "ES 202 050", "ES 202 211", and "ES 202 212" on http://pda.etsi.org/pda/

      With respect to VoIP in particular, there is an RFC: http://www.rfc-editor.org/rfc/rfc4060.txt

      See also RFC 3557, which I think says mostly the same stuff as RFC 4060.

      Sphinx doesn't implement this stuff. It should not be difficult to do so based on the standards, though.

       
    • chris

      chris - 2007-07-20

      Oh my God, I really appreciate it. You have helped me a lot, and I will try it later. I like the Sphinx group. You are so kind, David!

       
    • Holger Brandl

      Holger Brandl - 2007-08-06

      Hi Chris,

      > Have you heard about any free API doing it?
      Yep. You could try using sphinx4 together with the cajo library. This gives you a free distributed speech recognition system with minimal effort. The crucial point is probably the selection of the proper interface: I would suggest doing the feature extraction on the client side and using cajo to distribute only the feature vectors.

      Cheers,
      Holger
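
To make the "distribute the feature vectors only" idea concrete, here is a minimal sketch of what a client could put on the wire for one frame. It is written in Python rather than Java and uses neither sphinx4 nor cajo. The 13-coefficient vector, the sequence number, and the names `pack_frame` and `unpack_frame` are assumptions chosen for illustration, not part of any Sphinx or cajo API.

```python
import struct

# Assumption: one feature vector holds 13 coefficients (a common cepstral
# front-end size); the real dimension depends on the front end in use.
NUM_COEFFS = 13

# Network byte order: a 32-bit sequence number followed by 32-bit floats.
FRAME_FORMAT = f"!I{NUM_COEFFS}f"
FRAME_SIZE = struct.calcsize(FRAME_FORMAT)  # bytes per packed frame


def pack_frame(seq, features):
    """Client side: serialise one feature vector into a fixed-size message."""
    if len(features) != NUM_COEFFS:
        raise ValueError(f"expected {NUM_COEFFS} coefficients, got {len(features)}")
    return struct.pack(FRAME_FORMAT, seq, *features)


def unpack_frame(payload):
    """Server side: recover the sequence number and the feature vector."""
    seq, *features = struct.unpack(FRAME_FORMAT, payload)
    return seq, features


if __name__ == "__main__":
    vector = [0.5 * i for i in range(NUM_COEFFS)]  # made-up feature values
    message = pack_frame(7, vector)
    print(len(message), unpack_frame(message))
```

Whatever transport carries these messages, the interface between client and server stays the same: fixed-size feature vectors in, recognition results out.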

       
      • Sylvain

        Sylvain - 2007-12-24

        Hi Holger,

        I am very interested in trying out your proposal to split Sphinx 4 into a feature extraction module on the client and decoding on the server.

        As far as I understand, with the current Sphinx 4 design CAJO would make a separate remote call for each frame. That would be heavy on bandwidth because of the additional information CAJO needs to complete a remote call, right? We could also put all the frames into one big data object, but then the server would have to wait until the end before it could start decoding.

        We could also send each frame over a TCP or UDP socket. What do you think would be the advantage of using CAJO?

        Thanks a lot for your help.

        Sylvain
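
The plain-socket option can be sketched as follows: the client writes each frame as soon as it is computed, and the server consumes frames one at a time, so decoding does not have to wait for the whole utterance. This is only an illustration in Python, not Sphinx 4 or CAJO code. The 13 float coefficients per frame, the 100 frames per second, and all function names are assumptions. Under those assumptions the raw payload is 52 bytes x 8 x 100 = 41,600 bit/s, before any TCP/IP or remote-call overhead; the overhead CAJO adds per call is not measured here.

```python
import socket
import struct
import threading

NUM_COEFFS = 13          # assumed feature dimension
FRAMES_PER_SECOND = 100  # assumed 10 ms frame shift
FRAME_FORMAT = f"!{NUM_COEFFS}f"            # 32-bit floats, network byte order
FRAME_SIZE = struct.calcsize(FRAME_FORMAT)  # bytes per frame


def payload_bits_per_second():
    """Raw feature payload rate, excluding any protocol overhead."""
    return FRAME_SIZE * 8 * FRAMES_PER_SECOND


def send_frames(sock, frames):
    """Client: send every frame as soon as it is available."""
    for features in frames:
        sock.sendall(struct.pack(FRAME_FORMAT, *features))
    sock.shutdown(socket.SHUT_WR)  # signal end of utterance


def recv_exact(sock, size):
    """Read exactly `size` bytes; return None if the stream ends first."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            return None
        buf.extend(chunk)
    return bytes(buf)


def receive_frames(sock):
    """Server: yield frames one by one so decoding can start immediately."""
    while True:
        payload = recv_exact(sock, FRAME_SIZE)
        if payload is None:
            return
        yield list(struct.unpack(FRAME_FORMAT, payload))


def demo(frames):
    """Run client and server over a local socket pair; return what arrived."""
    client, server = socket.socketpair()
    sender = threading.Thread(target=send_frames, args=(client, frames))
    sender.start()
    received = list(receive_frames(server))
    sender.join()
    client.close()
    server.close()
    return received


if __name__ == "__main__":
    print(payload_bits_per_second())
    print(len(demo([[float(i)] * NUM_COEFFS for i in range(5)])))
```

Because every frame has the same size, TCP needs no extra length prefix here. With UDP each frame would travel in its own datagram, and the receiver would also have to cope with lost or reordered frames.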

         
    • chris

      chris - 2007-09-04

      Hi Holger,
      Thank you very much. In fact, I want to use sphinx3, and I do not know much about Java. Is there any library written in C or C++? I will try cajo first. You have helped me a lot.

      Best wishes,
      Chris

       
