CMU Sphinx / Forums / Help: Single Word Classification and Corpus

Gizmoguy - 2016-10-18

Hi all,

I would like to add limited vocabulary speech recognition to a larger project. For my purposes I need to classify a speech sample but it is not necessary to decode it to phonemes or a word. As I do not know much about speech recognition, my initial approach would be clustering MFCC features, but I need a speech corpus of single words with multiple speakers for each word, which I have so far been unable to find.

If anyone can provide information on a good technique for my purpose or a freely-available corpus, I would be grateful.

Thanks.

- Nickolay V. Shmyrev - 2016-10-18
  
  This is called voice activity detection, or VAD.
  
  You can download a database for training here: http://www.openslr.org/17/
  
  You can read a paper about it here: https://arxiv.org/pdf/1510.08484v1.pdf
  
  - Gizmoguy - 2016-10-18
    
    Thanks for the prompt reply.
    
    I do not need to classify speech versus non-speech. I need to recognize the same word when it is spoken by different speakers.
    
    - Nickolay V. Shmyrev - 2016-10-18
      
      This is called keyword spotting. You can use pocketsphinx for that:
      
      http://cmusphinx.sourceforge.net/wiki/tutoriallm#keyword_lists
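
For readers following the keyword-list suggestion, here is a minimal sketch of building such a list for a small vocabulary. It assumes the format described in the linked tutorial: one phrase per line, followed by a detection threshold between slashes. The words, the threshold values, and the helper name `format_keyword_list` are hypothetical placeholders and not part of pocketsphinx. Thresholds have to be tuned per phrase on real recordings.

```python
def format_keyword_list(thresholds):
    """Return keyword-list text with one 'phrase /threshold/' entry per line."""
    lines = []
    for phrase, threshold in thresholds.items():
        # ':g' prints small floats compactly, e.g. 1e-20 becomes '1e-20'.
        lines.append(f"{phrase} /{threshold:g}/")
    return "\n".join(lines) + "\n"


# Hypothetical vocabulary with placeholder thresholds (assumptions, not tuned values).
keywords = {"forward": 1e-20, "stop": 1e-10}
keyword_text = format_keyword_list(keywords)

# Writing the text to disk gives a file that can be handed to the decoder.
with open("keyphrase.list", "w", encoding="utf-8") as handle:
    handle.write(keyword_text)
```

If the tutorial's command-line usage applies to your pocketsphinx version, the resulting file would be passed with the `-kws` option, for example `pocketsphinx_continuous -inmic yes -kws keyphrase.list`. Each detected phrase then serves as the class label, which matches the original goal of classifying a sample without a full transcription.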
      
