Dear 'PocketSphinx' team,
I would like to transcribe speech files into text with PocketSphinx. I have observed that recognition quality is poor for the first speech section of a sound file but improves later. If a silence section follows the first speech section, that first section is still recognised poorly, but all sections after that first silence are recognised appropriately. Can I eliminate this problem with some parameter settings?
I would also like to ask what range of values the '-cmninit' parameter of PocketSphinx can take when the '-cmn' parameter is set to 'prior', and what effect it has on the poor decoding at the beginning of the file. After the first silence section, the decoding is excellent.
The -cmninit parameter allows you to configure the input levels. You can check the details in this discussion:
https://github.com/watsonbox/pocketsphinx-ruby/issues/10
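To illustrate why a starting value can matter, here is a minimal sketch of a running ("prior") mean normalization. It is not the PocketSphinx or sphinxbase code: the function name, the single coefficient, the update window, and all numbers are assumptions chosen only for illustration. It shows how a starting estimate far from the real level of the recording leaves the first frames badly normalized until the running mean catches up, which would match the symptom described above.

```python
# Simplified illustration of "prior" (running) cepstral mean normalization.
# NOT the sphinxbase implementation: one coefficient, a plain running
# average, and a hypothetical update window.

def live_cmn(frames, init_mean, window=4):
    """Subtract a running mean estimate from each frame value.

    frames    -- list of floats (think: first cepstral coefficient)
    init_mean -- starting estimate, the role played by -cmninit
    window    -- hypothetical number of frames between mean updates
    """
    mean = init_mean
    total = 0.0
    count = 0
    normalized = []
    for value in frames:
        # Each frame is normalized with the estimate available so far.
        normalized.append(value - mean)
        total += value
        count += 1
        # Refresh the estimate only every `window` frames.
        if count % window == 0:
            mean = total / count
    return normalized


# Hypothetical recording whose values sit around 40.
frames = [40.0, 41.0, 39.0, 40.0, 40.0, 41.0, 39.0, 40.0]

bad_start = live_cmn(frames, init_mean=8.0)    # estimate far from the data
good_start = live_cmn(frames, init_mean=40.0)  # estimate close to the data

print("init 8.0 :", bad_start)
print("init 40.0:", good_start)
```

With the far-off starting value, the first four outputs are around 32 instead of around 0, and only the later frames are centred. With a starting value close to the recording's level, every frame is centred from the beginning. If this is what happens with your files, choosing a -cmninit value closer to the level of your audio may help with the first section.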