Influence of codec and bitrate

Forum: Speech Recognition Theory

Creator: Alexander

Created: 2012-03-05

Updated: 2012-09-22

Alexander - 2012-03-05

Hello!
I have an idea to try Sphinx on audio produced by Asterisk, but I'm worried
about the following issue. The A-law codec has a sampling rate similar to that
of one of the Sphinx acoustic models, WSJ. Both are 8 kHz, but:
- they have different quality
- different codecs are used. If, say, the model and the audio channel have the same quality, will the codec matter in that case?
Any ideas?
Thank you in advance.

Nickolay V. Shmyrev - 2012-03-05

"they have different quality - different codecs are used. If, say, the model
and the audio channel have the same quality, will the codec matter in that case?"

The codec does matter. This is not so much about A-law, but most industrial
codecs use lossy compression, and they usually degrade ASR accuracy by a few
percent. There are also frame-drop issues in VoIP channels, which degrade
accuracy too.
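
To make the A-law part concrete, here is a minimal sketch of G.711 A-law companding in plain Python, assuming the Asterisk channel hands you raw A-law bytes (one byte per 8 kHz sample) and that the recognizer expects 16-bit linear PCM. The function names (alaw_decode_sample, alaw_encode_sample, alaw_to_pcm16) are made up for this illustration and are not part of Sphinx or Asterisk. It only shows the codec step; it says nothing about how well a given acoustic model matches the channel.

```python
import struct

# Upper bound of each of the 8 A-law segments, on the 13-bit magnitude scale.
_SEG_END = (0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF)


def alaw_decode_sample(code: int) -> int:
    """Expand one A-law byte (0..255) to a 16-bit linear PCM value."""
    code ^= 0x55                      # undo the even-bit inversion used on the wire
    mantissa = (code & 0x0F) << 4     # 4-bit step index inside the segment
    segment = (code & 0x70) >> 4      # 3-bit segment number
    if segment == 0:
        value = mantissa + 8
    else:
        value = (mantissa + 0x108) << (segment - 1)
    # After the XOR, a set top bit means a positive sample.
    return value if code & 0x80 else -value


def alaw_encode_sample(sample: int) -> int:
    """Compress one 16-bit linear PCM value to an A-law byte (lossy)."""
    sample = max(-32768, min(32767, sample)) >> 3   # keep 13 significant bits
    if sample >= 0:
        mask = 0xD5
    else:
        mask = 0x55
        sample = -sample - 1
    for segment, end in enumerate(_SEG_END):
        if sample <= end:
            shift = 1 if segment < 2 else segment
            return ((segment << 4) | ((sample >> shift) & 0x0F)) ^ mask
    return 0x7F ^ mask                # not reached for clamped 16-bit input


def alaw_to_pcm16(data: bytes) -> bytes:
    """Convert raw A-law bytes to little-endian signed 16-bit PCM."""
    samples = [alaw_decode_sample(b) for b in data]
    return struct.pack("<%dh" % len(samples), *samples)


if __name__ == "__main__":
    # Round-trip a few values to see the quantization step grow with amplitude.
    for original in (10, 1000, 20000):
        restored = alaw_decode_sample(alaw_encode_sample(original))
        print(original, "->", restored)
```

The round trip shows that A-law itself is a coarse quantizer whose error grows with amplitude, which is a different kind of loss from the perceptual compression and dropped frames mentioned above.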

eliasmajic - 2012-03-05

Take a look here:
http://cmusphinx.sourceforge.net/wiki/sphinxinaction

It lists some telephony implementations using both
pocketsphinx (ast-unimrcp): http://code.google.com/p/unimrcp/
and
sphinx4 (cairo): http://www.speechforge.org/

Alexander - 2012-03-07

Thank you for the information!
