CMU Sphinx / Forums / Help: complete newbie needs help

sour - 2007-08-29

hey guys,
i need some help. i installed and followed the whole robust group tutorial. now i would like to build up my own dictionary with 5 words but i have no idea what to do.
could you help me?!

i already have a new directory where all the files were copied and i changed the .dic, the .phone, the .train.transcription files. but i have no idea what i am doing, actually.

please help!!!

- sour - 2007-09-03
  
  i tried and the following failure appears:
  
  > Usage: ./sphinx2_batch
  > Segmentation fault (core dumped)
  
  help please...
  
  - Nickolay V. Shmyrev - 2007-09-03
    
    You should understand that we are not magicians; we can't help you with so little info. We need:
    
    Information about your OS
    
    Information about your compiler
    
    Exact version of sphinx2 you are using
    
    Command line options you are using
    
    Data you are trying with.
    
- Michael Shineberg - 2007-08-30
  
  I require help from a programming whiz out there to help me develop a tutorial that I think will eventually bring in big dollars.
  
  I already produce a CD set of lessons called "Speak Australian" (SA).
  It is very effective at helping Australian migrants modify their speech patterns & enabling them to get jobs or better jobs.
  
  I plan to make my tutorial interactive using speech recognition to evaluate enunciation & produce a score, just like "Typing Tutor".
  
  This may seem like a daunting project but when divided into its components it is not so:
  The main variants in speech are pitch modulation, speed of words & pauses, & accent (volume on a syllable), all of which (according to my limited understanding of audio recording) are quite measurable.
  
  The market for tutorials to improve spoken Australian is huge ...especially with foreign help desks used by many companies these days. The principle is also applicable to other versions of English & other languages...a really huge market.
  
  Is anyone out there???????
  I would like to meet with you.
  
  Best wishes,
  Michael (director, Shine Institute)
  email MS@shine-institute.com
  phone (australia) +61 (0)425 264 669 {delete zero for OS calls}
  
- sour - 2007-08-30
  
  i am trying to decode but it can't find the *.match file, although i thought it's created during the first decoding process. please help!!
  
  failure:
  
  /usr/src/tutorial/test$ sudo scripts_pl/decode/slave.pl
  MODULE: DECODE Decoding using models previously trained
  Decoding 20 segments starting at 0 (part 1 of 1)
  Using files: 0% Finished
  Can't open /usr/src/tutorial/test/result/test-1-1.match
  SENTENCE ERROR: 100.000% (20/20)
  
  - Nickolay V. Shmyrev - 2007-08-30
    
    Look into the decode log in logdir/decode. Upload it somewhere and give us a link.
    
- sour - 2007-08-30
  
  i already resolved this problem by copying sphinx2_batch into the bin folder, because it wasn't there. thank you anyway!
  
  another question: if i change my _train.transcription and _train.fileids by adding new soundfiles from different people, what do I have to do?
  I thought I just have to run makefeats (works), then runall.pl, and decode afterwards. but there is the problem: when i "runall", the following problem appears:
  
  MODULE: 10 Vector Quantization
  FATAL_ERROR: "corpus.c", line 262: input string too long. Truncated.
  Something failed: (/usr/src/tutorial/test/scripts_pl/10.vector_quantize/slave.VQ.pl)
  
  what can i do?! please help.
  
  and if it works, how can i open a "programm" like the demo which uses my dictionary?
  
  - Nickolay V. Shmyrev - 2007-08-30
    
    >sphinx2_batch
    
    Are you using an obsolete version? It should be sphinx3 instead.
    
    >Something failed: (/usr/src/tutorial/test/scripts_pl/10.vector_quantize/slave.VQ.pl)
    
    Look into the logs again for more information
    
    > and if it works, how can i open a "programm" like the demo which uses my dictionary?
    
    run sphinx2_batch with appropriate options to point to your acoustic and language model. Documentation on options is available.
    
- sour - 2007-08-31
  
  > Are you using an obsolete version? It should be sphinx3 instead.
  
  i am using sphinx2 not sphinx3
  
  > Look into the logs again for more information
  
  i cant find the problem in the log
  
  > Documentation on options is available.
  
  where can i find that documentation?
  
  - Nickolay V. Shmyrev - 2007-08-31
    
    >i cant find the problem in the log
    
    >10 Vector Quantization
    >FATAL_ERROR: "corpus.c", line 262: input string too long. Truncated.
    >Something failed: (/usr/src/tutorial/test/scripts_pl/10.vector_quantize/slave.VQ.pl)
    
    Ah, sorry, the description is already here. That "input string too long" means that the format of your .fileids or .transcription file is broken. Please check that the file ends with a newline and that there are no spaces before the end of any line. SphinxTrain is sensitive to that.
    
    > where can i find that documentation?
    
    http://cmusphinx.sourceforge.net/sphinx2/doc/sphinx2.html
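
The two checks described above (a final newline, and no spaces before the end of a line) can be automated before rerunning the training scripts. The following is only a minimal sketch in Python, not part of SphinxTrain: the function name `check_control_file` is made up for illustration, and it looks only for the two problems named in the reply, so a file that passes may still be malformed in other ways.

```python
import sys
from pathlib import Path


def check_control_file(path):
    """Return sorted (line_number, problem) tuples for a .fileids or
    .transcription file. Hypothetical helper, not a SphinxTrain tool."""
    data = Path(path).read_bytes()
    if not data:
        return [(0, "file is empty")]

    problems = []
    # The last line must be terminated by a newline.
    if not data.endswith(b"\n"):
        problems.append((data.count(b"\n") + 1, "missing final newline"))

    # No line may end in spaces or tabs.
    for number, line in enumerate(data.split(b"\n"), start=1):
        if line != line.rstrip(b" \t"):
            problems.append((number, "trailing whitespace"))

    return sorted(problems)


if __name__ == "__main__":
    # Paths are given on the command line, e.g. the train .fileids
    # and .transcription files in the task's etc directory.
    for name in sys.argv[1:]:
        for number, problem in check_control_file(name):
            print(f"{name}:{number}: {problem}")
```

Run against both the .fileids and the .transcription file, it prints one line per problem and nothing when neither of the two problems is found.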
    
- sour - 2007-08-31
  
  found the "failure". a newline was missing at the end of the $_train.fileids file....
  
- sour - 2007-08-31
  
  oh, i am stuck again. how do i get the decode folder into my new folder name with new vocab?
  i tried: perl ../sphinx2/scripts/setup_tutorial.pl name
  but i got:
  > Building task name
  > Task not previously defined. User has to provide language model and supporting variables
  
  but i have a language model in the name/etc folder...i made it with the LM tool on the website.
  but if i look at the previous an4 folder there is a .dmp file which i dont have at my new folder name....is it important?
  how am i able to proceed?
  
  thanks in advance for your help!
  
- sour - 2007-08-31
  
  ok, got it. after hours....:D
  
  but i still dont know how to try out my new vocab
  "run sphinx2_batch with appropriate options to point to your acoustic and language model" => how can i change the options? its an executable...
  
  - Nickolay V. Shmyrev - 2007-09-01
    
    Something like:
    
    sphinx2_batch -lmfn <your language model> -dictfn <yourdictionary> -hmmdir <yourhmmdir>
    
    no?
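
The options are not changed inside the executable; they are passed as arguments each time it is started. As a rough illustration, the Python sketch below assembles the command line from the reply above. The three flags are the ones quoted in that reply; the helper name `build_decode_command` and all file paths are placeholders invented for this example, so substitute the files from your own task directory.

```python
import shlex


def build_decode_command(lm_path, dict_path, hmm_dir, binary="sphinx2_batch"):
    """Assemble the argument list for a decoder run.

    Only the flags mentioned in the thread are used; a real run
    may need further options from the sphinx2 documentation.
    """
    return [binary, "-lmfn", lm_path, "-dictfn", dict_path, "-hmmdir", hmm_dir]


# Placeholder paths, assumed for illustration only.
cmd = build_decode_command("etc/name.lm", "etc/name.dic", "model_parameters/name")

# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

The same list can be handed to `subprocess.run(cmd)` to start the decoder from a script instead of typing the command by hand.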
    
