Need help setting up PocketSphinx for an 8 kHz sample rate file with 4 kHz bandwidth

Forum: Help

Creator: PRAVEEN KUMAR

Created: 2020-08-23

Updated: 2020-08-23

PRAVEEN KUMAR - 2020-08-23

Dear Sphinx Team,

I need your help setting up PocketSphinx for an audio file that we are trying to transcribe to text.

The audio file format is as follows:
File Type : WAV
Channels : 1
Sample Rate : 8000
Precision : 14-bit
Bit rate : 64K
Sample Encoding : 8-bit u-law
Bandwidth : 4KHz

With this format, PocketSphinx cannot recognize or process the file.
I converted the file to a 16000 Hz sample rate using SoX and the SoundFile Python library. After the conversion, the file format changed as follows:

Channels : 1
Sample Rate : 16000
Precision : 16-bit
Bit rate : 256K
Sample Encoding : 16-bit Signed Integer PCM
Bandwidth : 4KHz
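
For reference, the conversion step can be sketched roughly as below. This is only a minimal illustration in pure Python, not what SoX or SoundFile do internally: it assumes the raw u-law bytes have already been pulled out of the WAV container (as far as I know, the standard `wave` module does not read u-law files), and it uses plain linear interpolation instead of a proper resampling filter. The function names are made up for this example. Note that doubling the sample rate this way adds no content above 4 kHz, which matches the unchanged "Bandwidth" value in the listing above.

```python
import struct
import wave


def ulaw_to_linear(u: int) -> int:
    """Decode one G.711 u-law byte into a signed sample in 16-bit range."""
    u = ~u & 0xFF                    # u-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def upsample_2x(samples):
    """Double the sample rate by linear interpolation (crude, illustrative)."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)                # original sample
        out.append((s + nxt) // 2)   # midpoint between neighbours
    return out


def ulaw_bytes_to_pcm16_16k(ulaw: bytes) -> bytes:
    """8 kHz u-law payload -> 16 kHz, 16-bit signed little-endian PCM."""
    upsampled = upsample_2x([ulaw_to_linear(b) for b in ulaw])
    return struct.pack("<%dh" % len(upsampled), *upsampled)


def write_wav_16k(path: str, pcm: bytes) -> None:
    """Write mono 16-bit PCM at 16000 Hz to a WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(16000)
        w.writeframes(pcm)
```

Usage would be something like `write_wav_16k("out.wav", ulaw_bytes_to_pcm16_16k(payload))`, where `payload` is the hypothetical raw u-law data.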

With this format, PocketSphinx can process the file and returns text for the speech in the audio. However, the text is completely different from what is actually spoken; I would call it 0% accuracy.
Could you kindly advise what my approach should be (changing the PocketSphinx settings or converting the file differently) to get accurate text from the audio file using PocketSphinx?

Please let me know if you need any information from my side to provide help on this.

Thanks!!
