Keyword spotting/ word spotting

Speech Recognition Toolkit

Brought to you by: air, arthchan2003, awb, bhiksha, and 5 others

Forum: Help

Created: 2011-11-12

Updated: 2012-09-22

sol - 2011-11-12

I am looking for a specific key word recognizer method/code. I need something
that will be able to detect just a few words - at most 5. It needs to be able
to listen and process continuously and work online; it needs to give real time
feedback to the user, which means that once the user says one of the keywords,
it needs to have a response time of less than a second and a half. Two of our
words are "uh" and "um", so we would need to add those to the dictionary if
they're not already included, which hopefully will not alter the program's
effectiveness. The keyword recognizer also needs to be speaker independent. It
needs to be able to work with a program on our computer. Once the keyword
spotting program recognizes that the speaker has uttered a keyword, it needs
to be able to send a signal to our other program. Our program will then power
a vibrating motor for about a second. Do you guys know of any software that
could do this for us? This is for a project we're working on in one of our
classes at Dartmouth College.

Thanks!
Solomon

Nickolay V. Shmyrev - 2011-11-16

Hello Solomon

> I am looking for a specific key word recognizer method/code.

Keyword spotting is a different task from speech recognition and requires
its own specific algorithms. We have a very early implementation for sphinx4
in the audio aligner branch:

http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/branches/long-audio-aligner/KeyWordSpotting/

But it will need more work. I suggest you check it out first and try to plug
it into your application. Then we could work on improving its performance.

There are some other keyword spotters on the net, for example from Brno:

http://speech.fit.vutbr.cz/software/kwsviewer-interactive-viewer-keyword-spotting-output

> Two of our words are "uh" and "um", so we would need to add those to the
> dictionary if they're not already included, which hopefully will not alter
> the program's effectiveness.

Those types of words are very hard to detect reliably. You will need to
develop a specific algorithm to detect them with a low false alarm rate. You
might initially go without them and add them later down the road.

> Once the keyword spotting program recognizes that the speaker has uttered a
> keyword, it needs to be able to send a signal to our other program.

It's just a technical detail.
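
For illustration only, here is a minimal sketch of one way a spotter could notify a separate program on the same machine: it sends the detected keyword as a small UDP datagram to the loopback address. The class name KeywordNotifier, the method keywordDetected, and port 9999 are all hypothetical and not part of sphinx4; the receiving program (the one driving the motor) is assumed to listen on that port.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: tells a local program which keyword was spotted. */
final class KeywordNotifier implements AutoCloseable {
    private final DatagramSocket socket;
    private final InetAddress target;
    private final int port;

    /** The port is an assumption: whatever the receiving program listens on. */
    KeywordNotifier(int port) throws IOException {
        this.socket = new DatagramSocket();               // any free local port
        this.target = InetAddress.getLoopbackAddress();   // same machine only
        this.port = port;
    }

    /** Sends the keyword text as one UTF-8 datagram; no reply is expected. */
    void keywordDetected(String keyword) throws IOException {
        byte[] payload = keyword.getBytes(StandardCharsets.UTF_8);
        socket.send(new DatagramPacket(payload, payload.length, target, port));
    }

    @Override
    public void close() {
        socket.close();
    }

    public static void main(String[] args) throws IOException {
        // Port 9999 is a placeholder chosen for this example.
        try (KeywordNotifier notifier = new KeywordNotifier(9999)) {
            notifier.keywordDetected("um");
        }
    }
}
```

UDP on loopback keeps the spotter from blocking if the other program is not running; a pipe or a TCP socket would work equally well if delivery has to be confirmed.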

sol - 2011-11-17

Thanks for your response. How would my group and I go about downloading the
audio aligner branch of the KeywordSpotting project that you directed us to?
Is there a direct download link in the browser? We don't have svn to check out
the project, so is there a way to get around using it or do we need to
download that first?

Nickolay V. Shmyrev - 2011-11-17

> Is there a direct download link in the browser?

http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/branches/long-audio-aligner/KeyWordSpotting/?view=tar

> We don't have svn to check out the project

It's better to install Subversion.

Yves Raimond - 2011-11-25

Hello!

I have been playing with this branch on some BBC audio, and everything seems
to work OK. Right now, it looks like it supports only one keyword, e.g.
'quantum number' or 'benjamin britten'.

I am guessing it needs an update to NoSkipGrammar to handle multiple keywords.
I guess the grammar it would need to support multiple keywords would look
like:

               /----> kw1_1 ... ---> kw1_n ----\
InitialNode --<                                 >----> FinalNode
               \----> kw2_1 ... ---> kw2_n ----/

Before I try to write a patch for that, is there something similar already
existing I could look at?

Kind regards,
Yves
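
As a rough illustration of the parallel-branch layout in the diagram above, the following sketch builds one chain of word nodes per keyword phrase between a shared initial and final node. It is a standalone toy graph, not the sphinx4 grammar API and not the actual NoSkipGrammar code; every class and method name here (KeywordGrammarSketch, Node, addKeyword, paths) is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified grammar graph; not the sphinx4 Grammar API. */
final class KeywordGrammarSketch {

    /** A node holding one word, or a marker for the initial/final node. */
    static final class Node {
        final String word;
        final List<Node> successors = new ArrayList<>();

        Node(String word) {
            this.word = word;
        }
    }

    final Node initial = new Node("<initial>");
    final Node end = new Node("<final>");

    /** Adds one keyword phrase as its own branch from initial to final node. */
    void addKeyword(String phrase) {
        Node previous = initial;
        for (String word : phrase.trim().split("\\s+")) {
            Node current = new Node(word);
            previous.successors.add(current);
            previous = current;
        }
        previous.successors.add(end);
    }

    /** Lists the word sequence of every branch, in insertion order. */
    List<String> paths() {
        List<String> result = new ArrayList<>();
        for (Node first : initial.successors) {
            StringBuilder sb = new StringBuilder();
            Node node = first;
            while (node != end) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(node.word);
                node = node.successors.get(0); // each branch is a simple chain
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        KeywordGrammarSketch grammar = new KeywordGrammarSketch();
        grammar.addKeyword("quantum number");
        grammar.addKeyword("benjamin britten");
        System.out.println(grammar.paths());
    }
}
```

Because the branches never share nodes, no path can skip from the middle of one keyword into another, which is the property the diagram is after. A real patch would still have to map this onto the grammar classes the branch actually uses.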

Nickolay V. Shmyrev - 2011-11-26

Hello

That would be a nice addition. You need to modify the NoSkipGrammar and its
public interface. If you could rework it to better fit the keyword spotting
task, that would be great.

Yves Raimond - 2011-11-28

Hello!

I may have missed something, but the following patch seems to make this branch
work for multiple keyword spotting:

http://pastebin.com/9vkLrEum

Best,
y
