Folks,
The site and the computer were hacked. I'm in the process of
reconstituting the website and tool.
Unfortunately the damage to lmtool appears to have been done around 15
Sept 2021, while the computer itself was disabled in early February. But the
code has been recovered. What's missing are the web page interfaces. We're
still looking for them in the backups. It's really close at this point (he
claims).
I apologize for slow progress. My time had been taken by seemingly
more pressing commitments. But I'm now free of those and have been spending
much of my time on this issue.
I should note that the repair is not just rolling things back but it
means closely checking what's there to make sure that no lurking damage
exists.
Questions for you all:
Do you normally use the web page interface? Or do you have script(s)
that directly access the tool? If the latter, I will release that fix first.
Is there something that's been bugging you about lmtool? Now would be
a good time to tell us. If reasonable, it can be an upgrade.
Thank you all for your patience.
It's been gratifying to learn that people use this tool and that they
consider it as something that supports their work.
-- Alex Rudnicky
This article https://cmusphinx.github.io/wiki/tutoriallm/ lists two
other options for building language models. I haven't used them myself, but
it sounds like they will give you the same result, albeit with a bit more
effort.
It's been several years since I last used the tools, so I can't give you much feedback. I used them for two things:
1. I seem to remember that the LM generation scripts were somehow hosted on the site, allowing me to download and study them. It's surprisingly hard to find good (and simple!) code that generates language models, and those scripts served as a great starting point for writing my own, application-specific implementation.
2. I then used the output of the web interface as a kind of ground truth to test my own code.
I guess that's not a typical use case, so feel free to disregard this post! :-)
I was going to expand my dictionary but encountered a problem. The LM tool at http://www.speech.cs.cmu.edu/tools/lmtool.html seems to be down. Do you know when the site will be done with maintenance? Thanks!
I tried to generate an LM file but got the following error: [ERRO] Problems with your corpus; cannot continue. Please check diagnostics [0 0]
You can use other tools mentioned in the tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorial
For LM training see
http://cmusphinx.sourceforge.net/wiki/tutoriallm
The server hiccuped. It will happen; if you email the address on the page, it will be taken care of...
Last edit: Nickolay V. Shmyrev 2014-12-10
The new LM tool is working now.
Use this one:
http://www.speech.cs.cmu.edu/tools/lmtool-new.html
This new tool is now producing the same error reported above, even for the example corpus file on
http://www.speech.cs.cmu.edu/tools/lmtool.html
Any idea what is wrong?
Dear Jesse, the LM service does go offline from time to time; it will be back soon. Please be patient.
Meanwhile you can use an offline language modeling tool such as SRILM to create a language model from your corpus.
Dear Admin,
Has the LM tool service gone offline?
When will it be back?
Regards
The new LMTool link is working:
http://www.speech.cs.cmu.edu/tools/lmtool-new.html
Hi, is there a way to automate the generation of the tar files? E.g. using curl, or sending an HTTP POST request with my corpus? It would be great to have a RESTful API, or even better, instructions on how to do this locally. Needing an internet connection whenever my robot wants to extend its vocabulary would be painful; what's worse, though, is having to do it manually.
Thanks.
Marcus
The tutorial contains links to offline software that replaces lmtool: the quick_lm.pl script and Phonetisaurus.
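For anyone wondering what "doing it locally" involves at the lowest level, here is a minimal sketch of the first step any n-gram LM builder performs: collecting the vocabulary and counting n-grams over the corpus sentences. This is not quick_lm.pl's code, and the function name, the upper-casing of words, and the sample corpus are all assumptions made for illustration. A real tool goes on to turn these counts into smoothed, backed-off probabilities, which this sketch does not attempt.

```python
from collections import Counter


def count_ngrams(sentences, max_order=3):
    """Return (sorted vocabulary, {order: Counter of n-gram tuples})."""
    counts = {n: Counter() for n in range(1, max_order + 1)}
    vocab = set()
    for sentence in sentences:
        # Upper-casing is an arbitrary normalization choice for this sketch.
        words = sentence.upper().split()
        if not words:
            continue
        vocab.update(words)
        # Sentence boundary markers so that start/end contexts get counted.
        tokens = ["<s>"] + words + ["</s>"]
        for n in range(1, max_order + 1):
            for i in range(len(tokens) - n + 1):
                counts[n][tuple(tokens[i:i + n])] += 1
    return sorted(vocab), counts


# Hypothetical robot command corpus, one sentence per entry.
corpus = ["go forward", "go back", "turn left"]
vocab, counts = count_ngrams(corpus)
print(vocab)
print(counts[2].most_common(3))
```

The vocabulary list is also what you would feed to a dictionary lookup or pronunciation generator, since the language model alone does not give you pronunciations.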
The page that returns with the results from the compilation has tags that
should make it easy to pick out the tar file.
Apart from just regexing it, something like beautifulsoup should make it
easy.
I'm not sure I make that obvious in the description...
quick_lm will give you a language model but not the dictionary. I know.
It's an issue. But the (rather old) code is licensed.
And I haven't had the perceived time to stick in an open source
pronunciation generator.
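A rough sketch of the "pick out the tar file" step, for anyone scripting this. It uses only Python's standard library instead of BeautifulSoup, and it assumes the results page links to the archive with an ordinary `<a href>` ending in `.tgz`, `.tar.gz` or `.tar`. The sample markup, file names and base URL below are made up; the real page may look different, so check it before relying on this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ArchiveLinkFinder(HTMLParser):
    """Collect href values of <a> tags that point at tar archives."""

    ARCHIVE_SUFFIXES = (".tgz", ".tar.gz", ".tar")

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(self.ARCHIVE_SUFFIXES):
            # Resolve relative links against the page they came from.
            self.links.append(urljoin(self.base_url, href))


def find_archive_links(html, base_url):
    finder = ArchiveLinkFinder(base_url)
    finder.feed(html)
    return finder.links


# Hypothetical results page; the real markup may differ.
SAMPLE_PAGE = """
<html><body>
<p>Your files are ready.</p>
<a href="TAR1234.tgz">TAR1234.tgz</a>
<a href="1234.lm">1234.lm</a>
</body></html>
"""

links = find_archive_links(SAMPLE_PAGE, "http://example.org/results/")
print(links)
```

Submitting the corpus itself is deliberately left out, because the form's field names are not given in this thread; inspect the upload form on the page to find them.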
On Wed, Jul 3, 2019 at 4:06 AM Nickolay V. Shmyrev nshmyrev@users.sourceforge.net wrote:
The LMTool site (http://www.speech.cs.cmu.edu/tools/lmtool-new.html) appears to be down. It would be great if it could be got up again!
The machine was hacked (and stuff wiped). The system will be restored using
other copies, but it's taking time.
Apologies for the mess.
Alex
On Fri, Mar 25, 2022 at 8:52 AM Daniel Wolf lupomuc@users.sourceforge.net
wrote:
Folks,
Thank you all for your patience!
lmtool is back up and appears stable. Please try it at
http://www.speech.cs.cmu.edu/tools/lmtool-new.html.
lextool should be up shortly.
If you encounter any problems please post or email me directly.
Thank you,
Alex Rudnicky
Oh, that sucks. Good luck restoring the contents!
Is there a public repo with the source code? I did some searching, but all links pointed to the website only.
cmudict is on github; the lm compiler is not (I agree it should be). The
rest of the stuff is cgi code and also a binary for pronunciations. The
latter components are old and need to be replaced....
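Since cmudict is publicly available, the dictionary half can be approximated locally for words it already covers. The sketch below assumes a CMUdict-style text format (one entry per line, word followed by phones, alternate pronunciations written as `word(2)`, comment lines starting with `;;;` or `#`). The function names and the sample excerpt are hypothetical. Words that are not in the dictionary are only reported, since producing their pronunciations is exactly the job of the pronunciation generator mentioned above.

```python
def parse_cmudict(lines):
    """Map each word to its list of pronunciations (lists of phones)."""
    entries = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith(";;;"):
            continue
        head, *phones = line.split()
        word = head.split("(", 1)[0]  # "read(2)" -> "read"
        entries.setdefault(word.lower(), []).append(phones)
    return entries


def build_dictionary(vocab, entries):
    """Split vocab into words with known pronunciations and missing words."""
    found, missing = {}, []
    for word in vocab:
        prons = entries.get(word.lower())
        if prons:
            found[word] = prons
        else:
            missing.append(word)
    return found, missing


# Hypothetical excerpt in CMUdict style, not copied from the real file.
SAMPLE = """\
;;; hypothetical excerpt
go G OW1
left L EH1 F T
read R EH1 D
read(2) R IY1 D
""".splitlines()

entries = parse_cmudict(SAMPLE)
found, missing = build_dictionary(["go", "left", "read", "grok"], entries)
print(found)
print("needs a pronunciation generator:", missing)
```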
On Mon, Mar 28, 2022 at 1:48 AM Daniel Wolf lupomuc@users.sourceforge.net
wrote:
How can I build language model and dictionary if http://www.speech.cs.cmu.edu/tools/lmtool-new.html doesn't work?
This article (https://cmusphinx.github.io/wiki/tutoriallm/) lists two other options for building language models. I haven't used them myself, but it sounds like they will give you the same result, albeit with a bit more effort.
The code for quick_lm had been posted to the website. It's actually in the
cgi folder.
I will post the code to github as well.
Alex
On Tue, Apr 19, 2022 at 1:32 PM Daniel Wolf lupomuc@users.sourceforge.net
wrote:
We use the LM Tool from time to time when we add a new book to our app (articulate.xyz, an English pronunciation app). The LM and dictionary for the book get sent to the app for users to practice reading.
We use both the web tool and scripts (depending on what we happen to be generating).
Hope that’s useful.
Paul
http://www.speech.cs.cmu.edu/tools/lmtool-new.html is down down down.
@air when I click Compile Knowledge Base, it directs me to a Not Found page.