
s3_max_frames question

skatz_teyp
2007-06-18
2012-09-22
  • skatz_teyp

    skatz_teyp - 2007-06-18

    Hello, I'm using Sphinx 3.0.6 for my project. While using the engine, I noticed that s3_max_frames defaults to 15000, which means 150 seconds of audio input, right? So I changed it to 30000, and now I can decode audio files less than 5 minutes long. But if I make it larger, say 50000, I get an error: my application stops and exits. Why is this happening? What happens when I change the value of s3_max_frames? Is there a limit on it? As a side question, using live decode, if I change it to 50000, I get the error "Bad lw2 argument (294127368) to lm_tg_score" at about frame number 32769 (not exact). Why does this happen too?
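The frame arithmetic in the question can be checked with a short sketch. It assumes the usual rate of 100 frames per second (one frame every 10 ms), which is what the "15000 frames = 150 secs" figure implies; the helper names are hypothetical and not part of the Sphinx API.

```c
/* Assumed frame rate: 100 frames per second (one frame every 10 ms).
 * This matches the "15000 frames = 150 seconds" figure in the post. */
#define FRAMES_PER_SECOND 100

/* Hypothetical helper: duration in seconds covered by a frame count. */
double frames_to_seconds(long frames)
{
    return (double)frames / FRAMES_PER_SECOND;
}

/* Hypothetical helper: frame count needed for a given duration. */
long seconds_to_frames(double seconds)
{
    return (long)(seconds * FRAMES_PER_SECOND);
}
```

Under that assumption, 30000 frames cover 300 seconds (5 minutes), 50000 frames cover 500 seconds, and a 10-minute file would need 60000 frames.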

     
    • Nickolay V. Shmyrev

      Actually, it's not recommended to decode big chunks of audio at once. The algorithm will lag on such a large amount of data. For example, here you have an integer overflow. Of course you can make every int bigger, but that will not help you much.
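One possible reading of the failure near frame 32769 is that some frame index is kept in a signed 16-bit field, whose largest value is 32767. The thread does not confirm which variable overflows, so the sketch below only illustrates that kind of wrap-around; both helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if a frame index fits in a signed
 * 16-bit field, 0 otherwise. */
int fits_in_int16(long frame)
{
    return frame >= INT16_MIN && frame <= INT16_MAX;
}

/* Hypothetical helper: the value a two's-complement 16-bit field would
 * hold after storing `frame`. Computed with unsigned arithmetic so the
 * result does not depend on implementation-defined conversions. */
long wrap_to_int16(long frame)
{
    uint16_t low = (uint16_t)((unsigned long)frame & 0xFFFFu);
    return low >= 0x8000u ? (long)low - 0x10000L : (long)low;
}
```

With these definitions, frame 32767 still fits, while frame 32769 would wrap to -32767, the sort of value that could later show up as a nonsensical argument elsewhere in the decoder.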

      Split your big input into chunks and decode each one separately. Refer to the sphinx3_decode code or use it directly. A similar comment:

      https://sourceforge.net/forum/message.php?msg_id=4349959
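As a rough illustration of the "split into chunks" advice, here is a minimal sketch that divides a long recording into fixed-length pieces no larger than a frame limit. The type and function names are hypothetical, and fixed-length cuts are only the simplest option: they can land in the middle of a word, which is why splitting at silence is usually preferable.

```c
/* Hypothetical chunk descriptor: first frame and number of frames. */
typedef struct {
    long start;
    long length;
} chunk_t;

/* Number of chunks needed to cover `total` frames when each chunk may
 * hold at most `max_len` frames. Returns 0 for invalid input. */
long count_chunks(long total, long max_len)
{
    if (total <= 0 || max_len <= 0)
        return 0;
    return (total + max_len - 1) / max_len;
}

/* Fills `out` with the bounds of chunk number `index` (0-based).
 * Returns 0 on success, -1 if `index` is out of range. */
int get_chunk(long total, long max_len, long index, chunk_t *out)
{
    long n = count_chunks(total, max_len);
    long remaining;

    if (index < 0 || index >= n)
        return -1;
    out->start = index * max_len;
    remaining = total - out->start;
    out->length = remaining < max_len ? remaining : max_len;
    return 0;
}
```

For example, 50000 frames with a 15000-frame limit give four chunks, the last one starting at frame 45000 and holding 5000 frames.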

       
      • skatz_teyp

        skatz_teyp - 2007-06-19

        Ummm... if I split a file, say a 10-minute file into five 2-minute files, a problem could occur if the split falls at a point where a word is being spoken: the first few syllables of the word would end up in one file and the remaining syllables in the next, giving a wrong recognition. Or will sphinx3_continuous (as mentioned at the link you posted) handle that for me?

        Also, aside from sphinx3_decode, I used sphinx3_livedecode, slightly modified in how it gets its samples: instead of recording them from waveIn, I read the sample data directly from the wave file and pass it as input to the ld_process_raw functions. That also works well on small files but gives an error when I use my bigger files.

        Thanks for the reply.

         
        • Nickolay V. Shmyrev

          cont_ad will track silence regions and split your utterance during silence. sphinx3_continuous does exactly that. Nobody can speak for 10 minutes without pauses. cont_ad_read should return 0 if all the samples were silence, so you can check for that, and if it's 0 for several iterations you can start decoding the utterance. The rest will be handled later.

          So you should probably look at the main_continuous.c source. Neither livedecode nor livepretend is designed to work with long input.
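The "0 for several iterations" rule can be sketched without the real audio API. The function below takes the return values of successive read calls (the number of non-silence samples delivered each time) and reports where an utterance would be considered finished. The function name and the threshold parameter are hypothetical, and this is not the logic of main_continuous.c itself, only an illustration of the idea.

```c
#include <stddef.h>

/* Hypothetical helper: scans the sample counts returned by successive
 * reads and returns the index of the read at which the utterance is
 * considered finished, i.e. after some speech has been seen and then
 * `needed` consecutive reads delivered no samples. Counts that are
 * zero or negative are treated as "no speech". Returns -1 if no
 * utterance end is found. */
int find_utterance_end(const int *reads, size_t n, int needed)
{
    int in_speech = 0; /* set once any speech has been seen */
    int zeros = 0;     /* consecutive reads without speech */
    size_t i;

    for (i = 0; i < n; i++) {
        if (reads[i] > 0) {
            in_speech = 1;
            zeros = 0;
        } else if (in_speech && ++zeros >= needed) {
            return (int)i;
        }
    }
    return -1;
}
```

Leading silence is ignored because nothing has been spoken yet, and a short pause inside a sentence resets the counter as soon as speech resumes, so only a pause of at least `needed` reads closes the utterance.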

           
          • skatz_teyp

            skatz_teyp - 2007-06-19

            Thanks... I've just tried sphinx3_continuous and it works correctly. Since livedecode can't do the job, I'll switch to using sphinx3_continuous now.

            Thanks again. I'll try to look at the source code now. I can read C, though not that well; I'm just P/Invoking it from C#.

             
