Speech recognition theory defines a frame size of 25 ms and a frame shift of 10 ms. This means that every 10 ms I get a frame of 25 ms, which goes through MFCC extraction to create .mfc files. But in the sphinx_fe.exe configuration, I found only the following parameter:
-frate 100 Frame rate
Also, in the tutorial, I found the following:
"for each frame, typically of 10 milliseconds length, we extract 39 numbers that represent the speech."
As per the tutorial, frate means "number of frames per second", which is 100, so each frame would be 10 ms long.
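To make the two readings concrete, here is a rough sketch of the framing arithmetic as I understand it from theory. It assumes a 16 kHz sampling rate and a 25 ms window, neither of which comes from the sphinx_fe.exe output above, and frame_bounds is just a hypothetical helper name, not a Sphinx function.

```python
def frame_bounds(num_samples, sample_rate=16000, frate=100, window_ms=25.0):
    """Return (start, end) sample indices of each analysis frame.

    Assumptions (not taken from sphinx_fe): 16 kHz audio, 25 ms window.
    frate is the number of frames per second, so the shift is 1/frate s.
    """
    shift = sample_rate // frate                            # 160 samples = 10 ms
    window = int(round(sample_rate * window_ms / 1000.0))   # 400 samples = 25 ms
    if num_samples < window:
        return []  # not enough audio for even one full window
    n_frames = 1 + (num_samples - window) // shift
    return [(i * shift, i * shift + window) for i in range(n_frames)]


# One second of audio at the assumed 16 kHz rate.
bounds = frame_bounds(16000)
print(len(bounds), bounds[:3])
```

With these assumed values, consecutive frames start 160 samples (10 ms) apart, but each one spans 400 samples (25 ms), so neighbouring frames overlap by 15 ms.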
I am confused as to which one is correct. Can someone please clarify? Thank you.
Last edit: Balaji 2019-02-05