
sphinx3 script file

2006-11-15
2012-09-22
  • jackcuijarod

    jackcuijarod - 2006-11-15

    As we know, Sphinx2 and PocketSphinx have the script file "sphinx2-simple". When we run it, the system executes a program that can recognize words through the user's mic. But in Sphinx3 the files are different. Who can tell me how I can use Sphinx3 to recognize speech through the mic?

    Thank you very much.

     
    • Tanel Alumäe

      Tanel Alumäe - 2006-11-16

      Check scripts/sphinx3-simple

      Or just execute sphinx3_livedecode <arguments file>

      Hope this helps.

       
    • jackcuijarod

      jackcuijarod - 2006-11-18

      Thank you for your reply. I tried it, but it could not recognize any words.
      The screen displays:

      "starting recording" press enter to end recording

      When I press the Enter key, it displays ".." "...". When I press Enter again, the program exits.

      Can you tell me your email? Thank you again.

       
      • Tanel Alumäe

        Tanel Alumäe - 2006-11-21

        Try saying something like "one two three" after pressing the Enter key. At least it works for me, with limited accuracy. If it doesn't work for you, check with some other recording program that your speech actually reaches the system. Also, by default it seems to listen on the first sound device (/dev/dsp). If you have more than one sound device and the microphone is connected to the second one (/dev/dsp1), you might have to hack the source.

         
    • jackcuijarod

      jackcuijarod - 2006-11-22

      Dear sir, thank you for your help. Could I ask you another question? Maybe the problem is my Linux system's sound device. But I found that after I run sphinx3-livedecode and press Enter the second time, the program ends and produces the sound file "Out.raw". Do you see this file on your system? I think it is a recording function. Also, what is your screen output after you speak?

      This is my email: "jack-cui-jarod@163.com"

      Thanks again, you are a kind man.

       
      • Tanel Alumäe

        Tanel Alumäe - 2006-11-22

        Hello,

        This is what I get after pressing ENTER:

        press ENTER to start recording

        press ENTER to finish recording
        ad_oss.c 255: can't set input gain/recording level for this device.

        ....
        .... (lots of them)

        Backtrace( 1061022Z192556)
        FV: 1061022Z192556> WORD SFrm EFrm AScr(UnNorm) LMScore AScr+LScr AScale
        fv: 1061022Z192556> <sil> 0 108 3276723 -74100 3202623 3511827
        fv: 1061022Z192556> ONE 109 140 -939000 -148286 -1087286 -449778
        fv: 1061022Z192556> <sil> 141 157 -62537 -74100 -136637 79260
        fv: 1061022Z192556> Q 158 187 -1773374 -148286 -1921660 -1043005
        fv: 1061022Z192556> <sil> 188 204 -710048 -74100 -784148 -378463
        fv: 1061022Z192556> AREA 205 240 -1138601 -148286 -1286887 -449418
        fv: 1061022Z192556> <sil> 241 330 5068937 -74100 4994837 5414515
        FV: 1061022Z192556> TOTAL 3722100 -741258

        FWDVIT: ONE Q AREA (* 1061022Z192556)
        FWDXCT: * 1061022Z192556 S 6542531 T 3723665 A 3722100 L 1565 0 3276723 -8101 <sil> 109 -939000 -16344 ONE 141 -62537 -8101 <sil> 158 -1773374 -16344 Q 188 -710048 -8101 <sil> 205 -1138601 -16344 AREA 241 5068937 -8101 <sil> 331 0 -16344 </s> 331

        INFO: stat.c(154): 331 frm; 71 cdsen/fr, 144 cisen/fr, 227 cdgau/fr, 246 cigau/fr, Sen 0.22, CPU 0.22 Clk [Ovrhd 0.17 CPU 0.17 Clk]; 206 hmm/fr, 1 wd/fr, Search: 0.01 CPU 0.01 Clk (* 1061022Z192556)
        INFO: fast_algo_struct.c(398): HMMHist0..0: 331(100)
        INFO: lm.c(944): 0 tg(), 0 tgcache, 0 bo; 0 fills, 0 in mem (0.0%)
        INFO: lm.c(948): 1182 bg(), 1173 bo; 1 fills, 1 in mem (50.0%)
        Hypothesis:
        ONE Q AREA

        I said "one two three", but the decoder recognized "ONE Q AREA", which is OK as my English is not very good ;)
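To see when each word was recognized, the per-word "fv:" lines of the backtrace can be pulled apart with a short script. The following is only a rough sketch: the column order (WORD, SFrm, EFrm) is taken from the header line in the output above, while the rate of 100 frames per second is an assumption that should be checked against the decoder configuration. The names `parse_segments`, `SEGMENT_RE`, and `SAMPLE` are made up for this example.

```python
import re

# Assumption: 100 frames per second (10 ms frame shift); verify against
# the frame rate the decoder was actually run with.
FRAMES_PER_SECOND = 100

# Matches per-word lines such as:
#   fv: 1061022Z192556> ONE 109 140 -939000 ...
# The header and TOTAL lines start with upper-case "FV:" and are skipped.
SEGMENT_RE = re.compile(r"^fv:\s*\S+>\s+(\S+)\s+(\d+)\s+(\d+)\s")


def parse_segments(log_text):
    """Return (word, start_seconds, end_seconds) for each backtrace word line."""
    segments = []
    for line in log_text.splitlines():
        match = SEGMENT_RE.match(line.strip())
        if match:
            word = match.group(1)
            start_frame = int(match.group(2))
            end_frame = int(match.group(3))
            # The end frame is inclusive, so the segment ends one frame later.
            segments.append((word,
                             start_frame / FRAMES_PER_SECOND,
                             (end_frame + 1) / FRAMES_PER_SECOND))
    return segments


# Three lines copied from the backtrace in this post.
SAMPLE = """\
fv: 1061022Z192556> <sil> 0 108 3276723 -74100 3202623 3511827
fv: 1061022Z192556> ONE 109 140 -939000 -148286 -1087286 -449778
fv: 1061022Z192556> <sil> 141 157 -62537 -74100 -136637 79260
"""

for word, start, end in parse_segments(SAMPLE):
    if word != "<sil>":
        print(f"{word}: {start:.2f}-{end:.2f} s")
```

Long `<sil>` segments at the start and end, as in the output above, simply mean the recording contained silence before and after the speech.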

        Yes, I get the out.raw file; it is the recording of what I just said. You can open it with a sound editor (import as raw, mono, 16-bit, 16 kHz). If there isn't any visible or audible waveform, the sound isn't reaching the decoder.
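If no sound editor is at hand, the same check can be approximated with a few lines of Python. This is a minimal sketch, assuming the file is headerless 16-bit mono PCM at 16 kHz as described above; the little-endian byte order is an additional assumption (typical for x86 machines), and the helper names `load_raw_pcm16` and `signal_stats` are made up for this example.

```python
import array
import math
import os
import sys

SAMPLE_RATE = 16000  # Hz, as stated in the post


def load_raw_pcm16(path, little_endian=True):
    """Read headerless 16-bit mono PCM into an array of signed integers."""
    with open(path, "rb") as f:
        data = f.read()
    # Drop a trailing odd byte so the data splits into whole 16-bit samples.
    data = data[: len(data) - (len(data) % 2)]
    samples = array.array("h")
    samples.frombytes(data)
    # Assumption: the file is little-endian; swap if this machine is not.
    if (sys.byteorder == "little") != little_endian:
        samples.byteswap()
    return samples


def signal_stats(samples):
    """Return (duration in seconds, peak amplitude, RMS amplitude)."""
    if not samples:
        return 0.0, 0, 0.0
    duration = len(samples) / SAMPLE_RATE
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return duration, peak, rms


if __name__ == "__main__":
    path = "out.raw"
    if os.path.exists(path):
        duration, peak, rms = signal_stats(load_raw_pcm16(path))
        print(f"{duration:.2f} s, peak {peak} of 32767, RMS {rms:.1f}")
    else:
        print(f"{path} not found in the current directory")
```

A peak and RMS close to zero would suggest that the microphone signal is not reaching the decoder, which matches the "no visible waveform" case in a sound editor.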

         
