I want to do some simple speech segmentation, i.e. write software that detects the pauses (silence) between sentences, which are longer than the pauses between words.
I think short-time "energy" and "zero-crossing rate" would be accurate enough for this.
I found that much of the literature cites this paper:
L. Rabiner and M. Sambur. An Algorithm for Determining the Endpoints of Isolated Utterances, The Bell System Technical Journal, Vol. 54, No. 2, Feb. 1975, pp. 297--315
Unfortunately, I could not find a PDF version of that paper.
Could anyone give me a copy of that paper? Or do you know a better way to do that?
Thanks,
Nguyen
Although I don't know where to find the paper, you could try using the Sphinx-4 frontend for energy-based segmentation. Just plug in a SpeechMarker after a SpeechClassifier and you're ready to go. (If anything is unclear, read the frontend configuration pages on the Sphinx-4 website.)
Holger
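For anyone who wants to try the energy plus zero-crossing-rate idea from the question directly, here is a minimal sketch in Python. It is not the Rabiner and Sambur algorithm and not what Sphinx-4 does internally; it is just one plain way to mark long pauses. The function names (frame_features, find_pauses) and all threshold defaults are made up for illustration, and samples are assumed to be floats in the range -1.0 to 1.0. With a noisy background, the thresholds would probably have to be estimated from a stretch of the recording known to be silent.

```python
import math


def frame_features(samples, frame_len):
    """Return a list of (energy, zcr) pairs, one per non-overlapping frame."""
    if frame_len < 2:
        raise ValueError("frame_len must be at least 2 samples")
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of the squared samples in the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def find_pauses(samples, sample_rate, frame_ms=10,
                energy_thresh=1e-4, zcr_thresh=0.25, min_pause_ms=300):
    """Return (start_sec, end_sec) for each silent run of at least min_pause_ms.

    All thresholds are illustrative assumptions, not tuned values.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    min_frames = math.ceil(min_pause_ms / frame_ms)
    feats = frame_features(samples, frame_len)

    pauses = []
    run_start = None  # index of the first frame of the current silent run

    def close_run(end_frame):
        # Keep the run only if it is long enough to count as a sentence pause.
        if end_frame - run_start >= min_frames:
            pauses.append((run_start * frame_len / sample_rate,
                           end_frame * frame_len / sample_rate))

    for i, (energy, zcr) in enumerate(feats):
        # Low energy alone could be a weak fricative; requiring a low
        # zero-crossing rate as well makes the silence decision stricter.
        silent = energy < energy_thresh and zcr < zcr_thresh
        if silent and run_start is None:
            run_start = i
        elif not silent and run_start is not None:
            close_run(i)
            run_start = None
    if run_start is not None:
        close_run(len(feats))
    return pauses
```

The min_pause_ms parameter is what separates sentence pauses from the shorter gaps between words: silent runs shorter than that are simply ignored.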