Anonymous
-
2010-05-27
Hi,
I want to use Sphinx to search for one or more words in an audio file. The
search should give me the times at which each word appears in the file.
First, I tried putting only the search word in the grammar file and got
really bad results: the word was reported as being repeated constantly (which
was not the case!). So we thought that, by giving the grammar only this word,
we were somehow forcing the engine to find it even where it doesn't exist in
the audio. We thought that adding other words to the grammar file (about 20
words) would fix this. Although that yielded good results for some words and
files, it still did badly in other cases. To illustrate with an example:
let's say I want to search for "hello" in a given file. Instead of specifying
the grammar as (hello), we added other words so it became
(hello | how | morning | concert | politics | noon)**.
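For illustration, a grammar of the kind described above could be generated
with a small script. This is only a sketch: the function name build_jsgf, the
grammar name "search", and the use of a single `*` (JSGF repetition) around
the alternatives are assumptions made for the example, not something taken
from the original grammar file.

```python
def build_jsgf(keyword, fillers, grammar_name="search"):
    """Return JSGF text with one public rule that loops over all words.

    The keyword is listed first, followed by the filler words that give the
    recognizer something else to match when the keyword is not spoken.
    """
    # Deduplicate while preserving order, so the keyword stays first.
    words = list(dict.fromkeys([keyword, *fillers]))
    alternatives = " | ".join(words)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <{grammar_name}> = ({alternatives})*;\n"
    )


grammar = build_jsgf("hello", ["how", "morning", "concert", "politics", "noon"])
print(grammar)
```

The filler list here is the one from the example; which fillers work well is
exactly the open question in this post.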
How can this be solved? Is there a better way than adding random words to the
grammar file? Can we, for example, specify a threshold on the probability of
the best path found by the recognizer?
Thanks.
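To make the threshold idea concrete, here is a minimal sketch of what such
filtering could look like once per-word results are available. Everything in
it is hypothetical: the WordHit type, the confidence field, and the 0.8
cut-off are made up for illustration and are not part of the Sphinx API.
Whether the recognizer can supply a usable score is the actual question.

```python
from dataclasses import dataclass


@dataclass
class WordHit:
    """One recognized word (hypothetical structure, not a Sphinx type)."""
    word: str
    start: float       # seconds from the beginning of the file
    end: float         # seconds from the beginning of the file
    confidence: float  # assumed to lie in [0, 1]


def keyword_times(hits, keyword, min_confidence=0.8):
    """Return (start, end) pairs for confident occurrences of keyword."""
    return [
        (h.start, h.end)
        for h in hits
        if h.word == keyword and h.confidence >= min_confidence
    ]


hits = [
    WordHit("hello", 1.20, 1.65, 0.93),
    WordHit("how", 1.70, 1.95, 0.88),
    WordHit("hello", 7.40, 7.80, 0.31),  # likely a false alarm
]
print(keyword_times(hits, "hello"))
```

With these made-up numbers only the first "hello" survives; the second one is
dropped because its score falls below the cut-off.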
Hello asaleh,
I have the same problem and I have been looking through the forums a lot.
According to nShmyrev:
The default steps to improve accuracy are: add more pronunciation variants to
the dictionary, improve your language model to restrict the search space, tune
penalties, and use some online and offline adaptation technique (the most
important thing not implemented by sphinx4).
You can create a new language model using the online tool
http://www.speech.cs.cmu.edu/tools/lmtool.html, for which you need to provide
sentences, not single words, if you want accurate recognition. But this will
make you highly dependent on the content of the audio files you are using.
You can also change some of the properties in your config file, such as the
following:
<property name="absoluteBeamWidth" value="1000"/>
<property name="relativeBeamWidth" value="1E-90"/>
<property name="absoluteWordBeamWidth" value="20"/>
<property name="relativeWordBeamWidth" value="1E-60"/>
<property name="wordInsertionProbability" value=".7"/>
<property name="languageWeight" value="9.0"/>
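As a rough illustration of what the two beam settings above do, as far as I
understand them: the absolute beam width caps how many hypotheses are kept,
and the relative beam width drops hypotheses whose score falls too far below
the best one. The sketch below is not Sphinx code; the prune function, the use
of natural-log scores, and the sample scores are assumptions made for the
example.

```python
import math


def prune(log_scores, absolute_beam_width, relative_beam_width):
    """Return the surviving log scores, best first.

    relative_beam_width is a probability ratio (for example 1e-90);
    absolute_beam_width is the maximum number of hypotheses to keep.
    """
    if not log_scores:
        return []
    ranked = sorted(log_scores, reverse=True)
    # Relative beam: keep scores within log(relative_beam_width) of the best.
    floor = ranked[0] + math.log(relative_beam_width)
    survivors = [s for s in ranked if s >= floor]
    # Absolute beam: keep at most absolute_beam_width hypotheses.
    return survivors[:absolute_beam_width]


scores = [-10.0, -12.0, -50.0, -400.0]
wide = prune(scores, absolute_beam_width=1000, relative_beam_width=1e-90)
narrow = prune(scores, absolute_beam_width=2, relative_beam_width=1e-90)
print(wide, narrow)
```

Wider beams keep more candidates alive (slower, fewer search errors), while
narrower beams are faster but can throw away the correct path early.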
But you are still content dependent, so I cannot say whether it is
impossible to get accurate recognition of a single word in an arbitrary audio
file. Maybe some of the experts can help! :)
I hope this helps!
cgravier, do I know you from somewhere? ;)
As far as "Audio Search" goes, I would like to invite you to try out our Audio
Search web services: http://nexiwave.com. We offer exactly what you need
(well, down to the millisecond level ;))...
The API doc is here:
http://www.nexiwave.com/PC-NG-AA/api/nexiwave.audio.search.SaaS.api.pdf
You can also find the free trial link on the home page, which lets you upload
your audio and search within it from a web page (the requests are just
forwarded to the web service engine...)
Ben