From: Sunil V. <ve...@me...> - 2001-08-16 17:24:02
I'm writing some code to capture the confidence score during recognition, but I can't work out what the numbers I'm getting mean. I've searched the docs, the ViaVoice dev site, discussion forums, and the web for more information. The only thing I can find is:

> http://www-4.ibm.com/software/speech/dev/faq_windows.html
>
> What is Phrase Score?
>
> In brief, the score is not a confidence one, it is an average acoustic score per second. The acoustic score depends on the quality of the match and the length of the speech aligned with the word ... usually longer words get higher scores. There is no clamping of the range ... negative values are possible, as are values above 100. ISVs will have to experiment to determine what are reasonable/unreasonable values for the particular words in there applications.

However, this still doesn't tell me how to interpret the number. Ideally, I'd like a probability (in the range 0.0-1.0). Is there a way to convert these scores? Is there a different call I should be using?

FYI: the relevant code segment I'm using to get the score is:

    smCallback recognizedText {
        foreach w [smWord -spell -startTime -endTime -score [getFirmWords]] {
            puts [format "%s\t%d\t%d\t%d" \
                [lindex $w 0] \
                [expr {[lindex $w 1] - $startTime}] \
                [expr {[lindex $w 2] - $startTime}] \
                [lindex $w 3]]
        }
    }

and it produces output (word, start time, end time, score) which looks like:

    are      2983   3163    -5
    the      3582   3652    -6
    best     3652   4031   -11
    this     5108   5278   -10
    is       5278   5388     0
    tied     5388   5597   -16
    to       5597   5657    -7
    avoid    5657   6086   -14
    have    12621  12781   -17

Many thanks,
--Sunil
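The only concrete advice in the quoted FAQ is to experiment with the scores for a given application. Below is a minimal sketch of that kind of experiment, in plain Tcl with no ViaVoice calls. It takes the rows printed above as a hard-coded list, reports the minimum, maximum and mean score, and lists the words that fall below a cutoff. The procedure names `scoreSummary` and `lowScoring` and the cutoff of -12 are made up for illustration; none of them come from the ViaVoice API, and nothing here converts the score into a probability.

```tcl
# Rows copied from the sample output: word, start time, end time, score.
set rows {
    {are    2983  3163  -5}
    {the    3582  3652  -6}
    {best   3652  4031 -11}
    {this   5108  5278 -10}
    {is     5278  5388   0}
    {tied   5388  5597 -16}
    {to     5597  5657  -7}
    {avoid  5657  6086 -14}
    {have  12621 12781 -17}
}

# Return {min max mean} of the score column (hypothetical helper).
proc scoreSummary {rows} {
    set scores {}
    foreach r $rows {
        lappend scores [lindex $r 3]
    }
    set sorted [lsort -integer $scores]
    set sum 0
    foreach s $scores {
        incr sum $s
    }
    set n [llength $scores]
    return [list [lindex $sorted 0] [lindex $sorted end] \
                 [expr {double($sum) / $n}]]
}

# Return the words whose score is below the cutoff (hypothetical helper).
# The cutoff is an application-specific value to be found by experiment.
proc lowScoring {rows cutoff} {
    set out {}
    foreach r $rows {
        if {[lindex $r 3] < $cutoff} {
            lappend out [lindex $r 0]
        }
    }
    return $out
}

foreach {lo hi mean} [scoreSummary $rows] break
puts [format "min=%d max=%d mean=%.2f" $lo $hi $mean]
puts "below -12: [lowScoring $rows -12]"
```

Collecting the same summary over many utterances, separately for words that were recognized correctly and incorrectly, would show whether any cutoff separates the two groups for a particular vocabulary.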