I'd like to get a solid understanding of the inner workings of Sphinx and similar software. So I'm looking for a good textbook that covers the following:
all the basics (FFT etc.)
MFCC
hidden Markov models (HMM)
dynamic time warping (DTW)
I saw Nickolay recommend Spoken Language Processing (Huang) and Fundamentals of Speech Recognition (Rabiner) on other occasions. But I can't find a table of contents for either of them, so I'm not sure whether they cover everything.
In addition, it would be great if the book was well-written, that is, not too hard to follow. I have a Master's degree in computer science but only rudimentary knowledge of signal processing.
Do you have any recommendations?
The SLP textbook by Huang should cover all the topics you listed.
Thank you! I'll buy that, then.