I have written a JSGF file and wish to autogenerate prompts from it. How could someone go about loading that file into (preferably) PocketSphinx and generating valid sequences of words for training/testing purposes? I am unafraid of diving into the code; however I do not wish to needlessly reinvent the wheel and would like a little guidance.
Your intention is not clear. Could you provide a little example of what you want in terms of input-processing-output?
Thank you for the quick response,
I wish to create a speaker-dependent acoustic model for use in a command system. Using the Finite State Grammar created from the .JSGF file itself, the intention is to dynamically create large lists of prompts for consumption by SphinxTrain without having to write them manually.
Pocketsphinx already properly parses the .JSGF file into an internal representation of allowable phrases. I wish to write a C function that traverses that internal representation a given number of times, collects the words encountered, and dumps out sentences to an output file. Those would be guaranteed to be the kind of sentences spoken during regular use, increasing overall accuracy. My hope is that Pocketsphinx's libraries could be used for that purpose. The API seems to contain functions for graph traversal, but they seem to be usually called during recognition.
Please pardon the double post. Here is a more concise illustration:
.JSGF file --> PocketSphinx loads it & turns it into an internal grammar graph --> Loop across the graph (with appropriate cutoff values for repeating elements) and collect word sequences --> Each pass across the graph adds a new sentence to the transcription file.
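As a rough sketch of that loop with a cutoff for repeating elements: the Python below enumerates every word sequence through a small hand-built graph, allowing each state to appear at most max_visits times on a path. The graph layout (a dict mapping a state to a list of (next_state, word) arcs, with None for a null transition) and all names are assumptions for illustration only; this is not the PocketSphinx internal representation.

```python
# Hypothetical toy graph for "go (left | right)+".
# state -> list of (next_state, word); word None = null transition.
LOOP_FSG = {
    0: [(1, "go")],
    1: [(2, "left"), (2, "right")],
    2: [(1, None)],  # loop back so the direction can repeat
}
LOOP_START, LOOP_FINAL = 0, 2


def enumerate_sentences(fsg, start, final, max_visits=2):
    """Depth-first enumeration of word sequences from start to final.

    A state may occur at most max_visits times on one path, which
    bounds the number of times any repeating element is expanded.
    """
    results = []

    def walk(state, words, visits):
        if state == final:
            results.append(" ".join(words))
        for next_state, word in fsg.get(state, []):
            if visits.get(next_state, 0) >= max_visits:
                continue  # cutoff reached for this state
            visits[next_state] = visits.get(next_state, 0) + 1
            walk(next_state, words + ([word] if word else []), visits)
            visits[next_state] -= 1  # undo on backtrack

    walk(start, [], {start: 1})
    return results


if __name__ == "__main__":
    for sentence in enumerate_sentences(LOOP_FSG, LOOP_START, LOOP_FINAL):
        print(sentence)
```

Raising max_visits lengthens the repeated part of the sentences; on a large grammar the full enumeration grows quickly, so random sampling may be the more practical choice.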
Unfortunately, there is no such tool in sphinxbase yet. You will need to create it yourself, but it is not very complex.
First, convert the JSGF grammar to a finite state graph (FSG) with the jsgf_build_fsg call. Then randomly traverse the graph, dumping the words along the path, using fsg_model_arcs and fsg_arciter_next. You can find descriptions of these functions in the sphinxbase API reference.
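The random traversal could look roughly like the sketch below. It is written in Python for brevity and walks a hand-built toy graph instead of calling sphinxbase, so the graph layout (state mapped to a list of (next_state, word) arcs) and every function name here are assumptions, not part of the library.

```python
import random

# Hypothetical toy graph: state -> list of (next_state, word).
# A word of None stands for a null transition that emits nothing.
TOY_FSG = {
    0: [(1, "turn"), (1, "switch")],
    1: [(2, "on"), (2, "off")],
    2: [(3, "the")],
    3: [(4, "light"), (4, "radio")],
    4: [],
}
START_STATE, FINAL_STATE = 0, 4


def random_sentence(fsg, start, final, rng, max_steps=50):
    """Follow one randomly chosen arc per step until the final state.

    Stops at the first arrival in the final state. Returns None when
    the walk hits a dead end or exceeds max_steps.
    """
    state, words = start, []
    for _ in range(max_steps):
        if state == final:
            return " ".join(words)
        arcs = fsg.get(state, [])
        if not arcs:
            return None  # dead end before reaching the final state
        state, word = rng.choice(arcs)
        if word is not None:
            words.append(word)
    return " ".join(words) if state == final else None


def generate_prompts(fsg, start, final, count, seed=0, max_attempts=1000):
    """Collect up to count sentences from independent random walks."""
    rng = random.Random(seed)
    prompts = []
    for _ in range(max_attempts):
        if len(prompts) >= count:
            break
        sentence = random_sentence(fsg, start, final, rng)
        if sentence:
            prompts.append(sentence)
    return prompts


if __name__ == "__main__":
    for line in generate_prompts(TOY_FSG, START_STATE, FINAL_STATE, count=5):
        print(line)
```

In a real tool the dict lookup would be replaced by iterating over the arcs that leave a state, and the step limit keeps a grammar with loops from producing an endless walk.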
If you submit such a tool, it would be appreciated.
I am planning a proper code layout. In libsphinxbase/jsgf.h, there is the function:
fsg_model_t *jsgf_build_fsg(jsgf_t *grammar, jsgf_rule_t *rule,
logmath_t *lmath, float32 lw);
What are the meanings and intended uses of the last two args, *lmath and lw?
Weights in the FSG are stored in log scale. The logmath object is used for the conversion; you can create it with logmath_init and free it later. Weights are prescaled by the language weight to avoid scaling at runtime. You can use 1.0 as the language weight if you are not going to use the FSG in the decoder together with an acoustic model.
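To illustrate the idea only: the Python sketch below converts a probability to an integer log value and prescales it by a language weight. The base of 1.0001, the rounding, and the helper names are assumptions made for this example; it does not reproduce the exact arithmetic inside sphinxbase.

```python
import math

LOG_BASE = 1.0001  # assumed base for this illustration


def to_log(prob, base=LOG_BASE):
    """Probability -> integer log value in the given base."""
    return int(round(math.log(prob) / math.log(base)))


def prescale(prob, lw, base=LOG_BASE):
    """Log-scale weight multiplied once, up front, by the language weight."""
    return int(to_log(prob, base) * lw)


if __name__ == "__main__":
    p = 0.5
    print("log value:", to_log(p))
    print("lw = 1.0 :", prescale(p, 1.0))  # unchanged by a weight of 1.0
    print("lw = 6.5 :", prescale(p, 6.5))  # stored already scaled
```

With a language weight of 1.0 the stored value equals the plain log value, which is why 1.0 is a reasonable choice when the graph is only traversed to generate text.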
I just wanted to let you know of a workable solution using a Python script that I found and modified last night. I will be testing it under different Python versions for robustness. I will post the GitHub link here once my current sysadmin workload has been dealt with.
Thank you very much for your time and your lovely software,
I have just finished much of my sysadmin work. As promised, here is the GitHub link. However, it only works on Python 2.7, which is not too much of an issue since everyone has it:
Enjoy, and thanks again!
Great, thanks a lot. This is very useful.