Phonemes in sphinx

alladi
2008-04-27
2012-09-22
  • alladi

    alladi - 2008-04-27

    Hi,

    I found a method in Sphinx 4, result.getBestPronunciationResult(), which lists the phonemes that were recognized (from the final result). Is it possible to retrieve the list of all matching phonemes? I am interested in the intermediate state rather than the final result: a list of all possible phonemes (optionally, with a score assigned to each).

    Thanks
    Sriram

     
    • Andre Lessa

      Andre Lessa - 2008-04-28

      Hi Sriram,

      Yes, that's possible. Here's how.

      // Best hypothesis, with the pronunciation of each word.
      System.out.println("final result: " + result.getBestPronunciationResult() + "\n");

      // Walk the result tokens and print each alternative path with its score.
      List nBest = result.getResultTokens();
      for (int i = 0; i < nBest.size(); i++) {
          Token nBestToken = (Token) nBest.get(i);
          System.out.println("partial result: " + nBestToken.getWordUnitPath());
          System.out.println(" word path: " + nBestToken.getWordPath(false, true) + "\n");
          System.out.println(" score: " + nBestToken.getScore() + "\n");
      }

      Check out the "Token" documentation to find out what other methods you can call.

      http://cmusphinx.sourceforge.net/sphinx4/javadoc/edu/cmu/sphinx/decoder/search/Token.html

      Cheers,
      Andre

       
    • alladi

      alladi - 2008-04-28

      Hi Andre,

      Thanks for the reply. If I am not wrong, your code displays all possible word combinations that were evaluated. Do you think it is possible to list the phonemes that were used to generate these words?

      Thanks for your help.
      Sriram

       
      • Andre Lessa

        Andre Lessa - 2008-04-28

        The phonemes are also listed. The output string looks like this:

        with[W,IH,TH] the[DH,AH] press[P,R,EH,S]

        You just need to parse it.
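
A minimal sketch of that parsing step, assuming the string has exactly the `word[P1,P2,...]` shape shown above, with entries separated by whitespace. The class and method names (`PronunciationParser`, `WordPhones`, `parse`) are hypothetical and not part of Sphinx; the code works on the printed string only and does not call the Sphinx API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PronunciationParser {

    /** One word together with the phonemes listed for it. */
    public static final class WordPhones {
        public final String word;
        public final List<String> phones;

        WordPhones(String word, List<String> phones) {
            this.word = word;
            this.phones = phones;
        }
    }

    // Assumed format: "word[P1,P2,...]". The word is any run of characters
    // that contains neither whitespace nor '['.
    private static final Pattern ENTRY =
            Pattern.compile("([^\\s\\[]+)\\[([^\\]]*)\\]");

    /** Splits a word path string into (word, phonemes) pairs, in order. */
    public static List<WordPhones> parse(String path) {
        List<WordPhones> out = new ArrayList<>();
        Matcher m = ENTRY.matcher(path);
        while (m.find()) {
            String body = m.group(2).trim();
            List<String> phones = new ArrayList<>();
            if (!body.isEmpty()) {
                for (String p : body.split(",")) {
                    phones.add(p.trim());
                }
            }
            out.add(new WordPhones(m.group(1), phones));
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "with[W,IH,TH] the[DH,AH] press[P,R,EH,S]";
        for (WordPhones wp : parse(sample)) {
            System.out.println(wp.word + " -> " + wp.phones);
        }
    }
}
```

A list of pairs is used instead of a map so that a word occurring twice in the same path keeps both entries in order.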

         
    • alladi

      alladi - 2008-04-28

      Hi Andre,

      Yeah, I saw those, but my understanding of the system probably isn't correct. So would Sphinx first create a single list of phonemes and then fit words to it? Or does it create multiple lists of phonemes, score them, and pick the best before proceeding with fitting the words?

      Thanks
      Sriram

       
      • Andre Lessa

        Andre Lessa - 2008-04-28

        Sphinx doesn't really have the concept of an initial list of phonemes. It builds the list of partial results as it walks through the search graph, considering mainly the acoustic model and language model scores. In other words, the vocabulary that's provided to the language model is what really drives what will become the final best result.

        So the bottom line is that the phonemes exposed by the function I gave you earlier are not really what the acoustic model has recognized. Instead, they're the phonetic representation of the word tokens that received the highest scores and almost became the best result.

        Without a language model helping drive the process, chances are your results would be very inaccurate.

        Long story shorter, your second statement is more in line with what really happens.

        Cheers,
        Andre

         
    • alladi

      alladi - 2008-04-28

      Thanks for the explanation. I'll try to use the list of phonemes and proceed.

      Thanks for your time and help.
      Sriram

       
