audio search

asaleh
2010-05-04
2012-09-22
  • asaleh

    asaleh - 2010-05-04

    Hi,
    I want to use Sphinx to search for one or more words in an audio file. The
    search should return the times at which each word occurs in the file.
    First, I put only the search word in the grammar file and got very bad
    results: the recognizer reported the word as being repeated constantly
    (which was not the case!). We concluded that a grammar containing only this
    word somehow forces the engine to find it even when it is not in the audio
    file. We thought that adding other words to the grammar file (about 20
    words) would fix this problem. Although it yielded good results for some
    words and files, it still did poorly in other cases. To illustrate with an
    example: say I wanted to search for "hello" in a given file. Instead of
    specifying the grammar as (hello), we added other words so it became
    (hello | how | morning | concert | politics | noon).
    How can this be solved? Is there a better way than adding random words to
    the grammar file? Could we, for example, specify a threshold value for the
    probability of the best path found by the recognizer?
    Thanks.
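The threshold idea asked about above can be sketched roughly as follows. This is a minimal illustration, not Sphinx code: it assumes the recognizer output has already been converted into a list of word hits with start/end times and a confidence value in [0, 1]. The `WordHit` type, the `find_keyword_times` helper, the sample data, and the 0.5 default threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WordHit:
    """One recognized word (hypothetical structure, not a Sphinx type)."""
    word: str
    start_s: float     # start time in seconds
    end_s: float       # end time in seconds
    confidence: float  # assumed to be normalized to [0, 1]


def find_keyword_times(hits, keyword, min_confidence=0.5):
    """Return (start, end) times of hits that match the keyword
    and whose confidence is at least min_confidence."""
    keyword = keyword.lower()
    return [
        (h.start_s, h.end_s)
        for h in hits
        if h.word.lower() == keyword and h.confidence >= min_confidence
    ]


# Made-up hits for illustration only.
hits = [
    WordHit("hello", 0.42, 0.80, 0.91),
    WordHit("how", 1.10, 1.30, 0.75),
    WordHit("hello", 5.00, 5.35, 0.12),  # likely a forced false alarm
]

print(find_keyword_times(hits, "hello"))
```

With the default threshold only the first "hello" is kept; the low-confidence hit at 5.00 s is dropped. A suitable threshold would have to be tuned on real data.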

     
  • Carl Gravier

    Carl Gravier - 2010-05-09

    Hello asaleh,

    I have the same problem and I have been searching the forums a lot.

    According to nShmyrev:
    The default steps to improve accuracy are: add more pronunciation variants
    to the dictionary, improve your language model to restrict the search space,
    tune penalties, and use some online and offline adaptation technique (the
    most important thing not implemented by sphinx4).

    You can create a new language model using the online tool
    http://www.speech.cs.cmu.edu/tools/lmtool.html
    for which you need to provide sentences, not single words, if you want
    accurate recognition. But this will make you highly dependent on the
    content of the audio files you are using.
    You can also change some of the properties in your config file, such as the
    following:

    <property name="absoluteBeamWidth" value="1000"/>
    <property name="relativeBeamWidth" value="1E-90"/>
    <property name="absoluteWordBeamWidth" value="20"/>
    <property name="relativeWordBeamWidth" value="1E-60"/>
    <property name="wordInsertionProbability" value=".7"/>
    <property name="languageWeight" value="9.0"/>
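To give a feel for what the two beam-width properties control, here is a minimal sketch of generic beam pruning. It is an assumption about the general technique, not sphinx4's actual implementation: the relative beam is treated as a probability ratio against the best hypothesis, the absolute beam as a cap on how many hypotheses survive, and scores are natural-log probabilities. The `prune_beam` function and the sample scores are hypothetical.

```python
import math


def prune_beam(log_scores, absolute_beam_width, relative_beam_width):
    """Generic beam pruning over log-probability scores.

    Keeps hypotheses whose score is within log(relative_beam_width)
    of the best one, then keeps at most absolute_beam_width of them.
    """
    if not log_scores:
        return []
    ranked = sorted(log_scores, reverse=True)  # best score first
    floor = ranked[0] + math.log(relative_beam_width)
    survivors = [s for s in ranked if s >= floor]
    return survivors[:absolute_beam_width]


# Made-up scores for illustration only.
scores = [-10.0, -50.0, -300.0, -12.5]

# Wide beams, as in the values quoted above: only the far outlier is pruned.
print(prune_beam(scores, 1000, 1e-90))

# A smaller absolute beam keeps just the two best hypotheses.
print(prune_beam(scores, 2, 1e-90))
```

Wider beams keep more hypotheses alive, which can help accuracy at the cost of speed; narrower beams do the opposite.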

    But you are still content dependent, so I cannot say whether it is possible
    to get accurate recognition of a single word in an arbitrary audio file.
    Maybe some of the experts can help! :)
    I hope this helped!

     
  • asaleh

    asaleh - 2010-05-10

    cgravier, do I know you from somewhere? ;)

     
  • Anonymous

    Anonymous - 2010-05-27

    As far as "Audio Search" goes, I would like to invite you to try out our
    Audio Search web services: http://nexiwave.com. We offer exactly what you
    need (well, down to the millisecond level ;))...

    The API doc is here:
    http://www.nexiwave.com/PC-NG-AA/api/nexiwave.audio.search.SaaS.api.pdf

    You can also find the free trial link on the home page, which lets you upload
    your audio and search within the audio from a web page (the requests are just
    forwarded to the web service engine...)

    Ben

     
