This list is closed, nobody may subscribe to it.
From: Kevin L. <kev...@gm...> - 2018-06-30 23:47:20
|
You have always been one of the good ones.

Kevin

> On Jun 30, 2018, at 6:23 PM, James Salsman <jsa...@gm...> wrote:
> [...]
|
From: Alexander R. <ai...@cs...> - 2018-06-30 22:40:37
|
Uh, I have been maintaining it, more or less. It's now at (I think) version 0.7b. The versions get bumped whenever I feel I've put in some big chunk of fixing. There's minor maintenance ongoing.

Apart from adding words (harvested from words that show up in lmtool uploads that are backed off to LtoS, making sure that the 10k most-frequent words are in there, cleaning out some of the crap, etc.) and fixing stuff people send in, I occasionally put time into regularizations of one kind or another. Also, trying to figure out how to deal with Nickolay's bootleg version. And so on.

I've had to fix your script at http://www.speech.cs.cmu.edu/cgi-bin/cmudict, since perl had evolved too much.

There are also ongoing requests for the same resource in other languages (e.g. Spanish). Someday.

It's been more or less on a hobby level, though at some point I did put together an NSF proposal to get more (people) resources for it. But then got busy with some other stuff. I should get back to that...

Alex

-----Original Message-----
From: Kevin Lenzo <kev...@gm...>
Sent: Saturday, June 30, 2018 5:15 PM
To: Alexander Rudnicky <ai...@cs...>
Cc: James Salsman <jsa...@gm...>; cmusphinx-devel <cmu...@li...>
Subject: Re: [Cmusphinx-devel] diphthong removal

[...]
|
From: James S. <jsa...@gm...> - 2018-06-30 22:23:33
|
Dr. Rudnicky,

If I had a command-and-control and/or dictation test suite I would measure the empirical question myself, but I don't, so I hope you, or anyone here who does have such test suites, will take a closer look at the question. I've never been interested in command-and-control or dictation. There are (more labor-intensive?) ways to measure the change for synthesis, too, which would be absolutely necessary, along with the other aspects Kevin mentioned, before changing it for everyone.

Dr. Horacio Franco at SRI, after having set the formula for "Goodness of Pronunciation" (often "GoP") around 1996 with Drs. Jared Bernstein et al., which was taken up in turn by Prof. Steve Young's group at Cambridge, e.g. in Dr. Silke Witt's 1999/2000 thesis, is now contemplating extraction of articulatory features early in the signal processing path:
https://www.researchgate.net/publication/325570699_Articulatory_Features_for_ASR_of_Pathological_Speech

We really need to help everyone who ever used that formula, which I suspect has a physical units' type mismatch error, because of larger societal effects such as these:

https://www.theguardian.com/australia-news/2017/aug/08/computer-says-no-irish-vet-fails-oral-english-test-needed-to-stay-in-australia

https://www.ft.com/content/dc7faee2-51e5-11e8-b3ee-41e0209208ec

These are very serious manifestations of a software bug, affecting the immigration status of tens of thousands of people.

Best regards,
Jim

On Sat, Jun 30, 2018 at 4:13 PM, Alexander Rudnicky <ai...@cs...> wrote:
> [...]
|
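The "Goodness of Pronunciation" formula referenced above is usually cited in the form given in Silke Witt's Cambridge thesis; a sketch of that commonly quoted definition (my reconstruction for context, not text from this thread) is:

```latex
\mathrm{GOP}(p) \;=\; \frac{1}{NF(p)}
\left|\, \log \frac{p\!\left(O^{(p)} \mid p\right)}
                   {\max_{q \in Q} p\!\left(O^{(p)} \mid q\right)} \,\right|
```

where \(O^{(p)}\) is the acoustic segment aligned to phone \(p\), \(Q\) is the phone inventory, and \(NF(p)\) is the number of frames in the segment; the per-frame normalization of a log-likelihood ratio is the step most relevant to the units concern raised here.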
From: Kevin L. <kev...@gm...> - 2018-06-30 21:29:52
|
Hi James —

> On Jun 30, 2018, at 9:54 AM, James Salsman <jsa...@gm...> wrote:
>
> Kevin,
>
> Thank you for your reply. Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise.

Once you’re gone, you can’t go back and edit the pages, so you can’t remove yourself :) I get a lot of contact on this and some other things even now, though I lost the ability to edit them a long time ago. Thanks for asking, though — my opinions are perhaps the least significant these days.

>> In IPA, the diphthongs are indicated by a tie marker. [Replacing them with consecutive phonemes] loses information
>
> I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate.

There is perhaps a place between what is measurable in this experiment and what is necessary to represent linguistic information for realization. While recognition, and even perhaps understanding as a consequence, may depend on hearing a variety of realizations and figuring out that they are likely to be one in particular, the generation of spoken output must be consistent and correct for a specific realization. If your application does better with this split, do the split — easy peasy.

> There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. Therefore at best, no information could be added. Such coding mistakes are common in natural languages.

Valid points for conference papers.

> However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. People on this list do that kind of thing all the time, right?

Aye.

> Best regards,
> Jim

Best,

oak

> On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote:
> [...]
|
From: Kevin L. <kev...@gm...> - 2018-06-30 21:15:11
|
Just following up here, I wish this really important resource would be updated. CMU did something really important here, and it is still used A LOT, but hasn’t seen a serious consideration and version update in almost 20 years. It would be nice if CMU recognized its leadership here and really pushed ahead with publicly available lexica.

oak

> On Jun 30, 2018, at 4:29 PM, Kevin Lenzo <kev...@gm...> wrote:
>
> Hi Alex,
>
> I’m sure you’re aware that the CMU Pronouncing Dictionary is used for more than ASR. For example, Festival and FestVox both use it, and many voices available on Android today come from that.
>
> If we are considering changes to the dictionary itself, it is worth considering the linguistic value, as well as the multiple clients of this data. To make such a change would have unnecessary impact on these downstream uses of the data, favoring one perhaps temporary result of a particular client’s application.
>
> Do whatever you want with the data — that’s the whole point of the liberal licensing scheme; but consider carefully the impact on the overall use and importance of the fundamental resource.
>
> oak
>
>> On Jun 30, 2018, at 10:16 AM, Alexander Rudnicky <ai...@cs...> wrote:
>>
>> Folks,
>>
>> Cmudict is intended for use in ASR. The set of phonetic units evolved to fit that need. The original phone set had stuff like DX, the dental flap, that eventually disappeared. An important reason was the realization that context is highly predictive of phonetic realization and that it's not necessary to explicitly code allophones (assuming you can even do it right).
>>
>> The diphthong issue is a bit different of course. I would hazard that the diphthong provides a better context scope for reco. So diphthongs might still be useful as a unit. To me this seems enough of a reason. But acoustic modeling has evolved. So who knows?
>>
>> I agree with James, though. It's an empirical question. Try it both ways and see what happens. There is also TTS and other applications that might benefit; ideally there's a deterministic mapping from the base ASR representation to the others.
>>
>> Alex
>>
>> -----Original Message-----
>> From: James Salsman <jsa...@gm...>
>> Sent: Saturday, June 30, 2018 9:54 AM
>> To: cmusphinx-devel <cmu...@li...>; Kevin Lenzo <kev...@gm...>
>> Subject: Re: [Cmusphinx-devel] diphthong removal
>>
>> Kevin,
>>
>> Thank you for your reply. Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise.
>>
>>> In IPA, the diphthongs are indicated by a tie marker. [Replacing them with consecutive phonemes] loses information
>>
>> I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate.
>>
>> There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. Therefore at best, no information could be added. Such coding mistakes are common in natural languages.
>>
>> However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. People on this list do that kind of thing all the time, right?
>>
>> Best regards,
>> Jim
>>
>>> On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote:
>>>
>>> James,
>>>
>>> I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary.
>>>
>>> Kevin
>>>
>>>> On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote:
>>>>
>>>> Kevin,
>>>>
>>>> I need to use diphones for learner analytics to complete http://j.mp/irslides
>>>>
>>>> How do you feel about eliminating diphthongs from CMUDICT?
>>>>
>>>> AW -> AA UH
>>>> AY -> AA IY
>>>> ER -> UH R (substitutes /ʊ/ for /ɜ/)
>>>> EY -> EH IY (substitutes /ɛ/ for /e/)
>>>> OW -> AO UH (substitutes /ɔ/ for /o/)
>>>> OY -> AO IY
>>>>
>>>> I'm going to do it, locally in something called Cloud Firestore anyway, but I wonder whether there are any good reasons to support diphthongs at all. Like ligatures in typography, a lot of simple algorithms don't stay simple if they need to process the composites. And it's certainly less elegant and less parsimonious, and probably a violation of some rule of normal forms to support them.
>>>>
>>>> Best regards,
>>>> Jim
>>
>> _______________________________________________
>> Cmusphinx-devel mailing list
>> Cmu...@li...
>> https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel
|
From: Kevin L. <kev...@gm...> - 2018-06-30 20:29:35
|
Hi Alex, I’m sure you’re aware that the CMU Pronouncing Dictionary is used for more than ASR. For example, Festival and FestVox both use it, and many voices available on Android today come from that. If we are considering changes to the dictionary itself, it is worth considering the linguistic value, as well as the multiple clients of this data. To make such a change would have unnecessary impact on these downstream uses of the data, favoring one perhaps temporary result of a particular client’s application. Do whatever you want with the data — that’s the whole point of the liberal licensing scheme; but consider carefully the impact on the overall use and importance of the fundamental resource. oak > On Jun 30, 2018, at 10:16 AM, Alexander Rudnicky <ai...@cs...> wrote: > > Folks, > > Cmudict is intended for use in ASR. The set of phonetic units evolved to fit that need. The original phone set had stuff like DX, the dental flap, that eventually disappeared. An important reason was the realization that context is highly predictive of phonetic realization and that it's not necessary to explicitly code allophones (assuming you can even to it right). > > The diphthong issue is a bit different of course. I would hazard that the diphthong provides a better context scope for reco. So diphthongs might still be useful as a unit. To me this seems enough of a reason. But acoustic modeling has evolved. So who knows? > > I agree with James, though. It's an empirical question. Try it both ways and see what happens. There is also TTS and other applications that might benefit; ideally there's a deterministic mapping from the base ASR representation to the others. > > Alex > > -----Original Message----- > From: James Salsman <jsa...@gm...> > Sent: Saturday, June 30, 2018 9:54 AM > To: cmusphinx-devel <cmu...@li...>; Kevin Lenzo <kev...@gm...> > Subject: Re: [Cmusphinx-devel] diphthong removal > > Kevin, > > Thank you for your reply. 
Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise. > >> In IPA, the diphthongs are indicated by a tie marker. [Replacing them >> with consecutive phonemes] loses information > > I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate. > > There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. Therefore at best, no information could be added. Such coding mistakes are common in natural languages. > > However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. > I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. > People on this list do that kind of thing all the time, right? > > Best regards, > Jim > >> On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote: >> James, >> >> I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary. 
>> >> Kevin >> >>> On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote: >>> >>> Kevin, >>> >>> I need to use diphones for learner analytics to complete >>> http://j.mp/irslides >>> >>> How do you feel about eliminating diphthongs from CMUDICT? >>> >>> AW -> AA UH >>> AY -> AA IY >>> ER -> UH R (substitutes /ʊ/ for /ɜ/) EY -> EH IY (substitutes /ɛ/ >>> for /e/) OW -> AO UH (substitutes /ɔ/ for /o/) OY -> AO IY >>> >>> I'm going to do it, locally in something called Cloud Firestore >>> anyway, but I wonder whether there are any good reasons to support >>> diphthongs at all. Like ligatures in typography, a lot of simple >>> algorithms don't stay simple if they need to process the composites. >>> And it's certainly less elegant and less parsimonious, and probably a >>> violation of some rule of normal forms to support them. >>> >>> Best regards, >>> Jim > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ > Cmusphinx-devel mailing list > Cmu...@li... > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel |
From: Alexander R. <ai...@cs...> - 2018-06-30 14:30:40
|
Folks, Cmudict is intended for use in ASR. The set of phonetic units evolved to fit that need. The original phone set had stuff like DX, the dental flap, that eventually disappeared. An important reason was the realization that context is highly predictive of phonetic realization and that it's not necessary to explicitly code allophones (assuming you can even do it right). The diphthong issue is a bit different of course. I would hazard that the diphthong provides a better context scope for reco. So diphthongs might still be useful as a unit. To me this seems enough of a reason. But acoustic modeling has evolved. So who knows? I agree with James, though. It's an empirical question. Try it both ways and see what happens. There is also TTS and other applications that might benefit; ideally there's a deterministic mapping from the base ASR representation to the others. Alex -----Original Message----- From: James Salsman <jsa...@gm...> Sent: Saturday, June 30, 2018 9:54 AM To: cmusphinx-devel <cmu...@li...>; Kevin Lenzo <kev...@gm...> Subject: Re: [Cmusphinx-devel] diphthong removal Kevin, Thank you for your reply. Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise. > In IPA, the diphthongs are indicated by a tie marker. [Replacing them > with consecutive phonemes] loses information I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate. There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. 
Therefore at best, no information could be added. Such coding mistakes are common in natural languages. However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. People on this list do that kind of thing all the time, right? Best regards, Jim On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote: > James, > > I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary. > > Kevin > >> On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote: >> >> Kevin, >> >> I need to use diphones for learner analytics to complete >> http://j.mp/irslides >> >> How do you feel about eliminating diphthongs from CMUDICT? >> >> AW -> AA UH >> AY -> AA IY >> ER -> UH R (substitutes /ʊ/ for /ɜ/) EY -> EH IY (substitutes /ɛ/ >> for /e/) OW -> AO UH (substitutes /ɔ/ for /o/) OY -> AO IY >> >> I'm going to do it, locally in something called Cloud Firestore >> anyway, but I wonder whether there are any good reasons to support >> diphthongs at all. Like ligatures in typography, a lot of simple >> algorithms don't stay simple if they need to process the composites. >> And it's certainly less elegant and less parsimonious, and probably a >> violation of some rule of normal forms to support them. >> >> Best regards, >> Jim ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot _______________________________________________ Cmusphinx-devel mailing list Cmu...@li... https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel |
From: James S. <jsa...@gm...> - 2018-06-30 13:54:24
|
Kevin, Thank you for your reply. Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise. > In IPA, the diphthongs are indicated by a tie marker. [Replacing > them with consecutive phonemes] loses information I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate. There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. Therefore at best, no information could be added. Such coding mistakes are common in natural languages. However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. People on this list do that kind of thing all the time, right? Best regards, Jim On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote: > James, > > I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary. 
> > Kevin > >> On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote: >> >> Kevin, >> >> I need to use diphones for learner analytics to complete http://j.mp/irslides >> >> How do you feel about eliminating diphthongs from CMUDICT? >> >> AW -> AA UH >> AY -> AA IY >> ER -> UH R (substitutes /ʊ/ for /ɜ/) >> EY -> EH IY (substitutes /ɛ/ for /e/) >> OW -> AO UH (substitutes /ɔ/ for /o/) >> OY -> AO IY >> >> I'm going to do it, locally in something called Cloud Firestore >> anyway, but I wonder whether there are any good reasons to support >> diphthongs at all. Like ligatures in typography, a lot of simple >> algorithms don't stay simple if they need to process the composites. >> And it's certainly less elegant and less parsimonious, and probably a >> violation of some rule of normal forms to support them. >> >> Best regards, >> Jim |
From: Evandro G. <eg...@gm...> - 2018-06-30 10:40:54
|
Hey, Kevin, Your email address wasn't in the subscriber's list, so the message bounced. Forwarding it to the list... --Evandro ---------- Forwarded message ---------- From: Kevin Lenzo <kev...@gm...> To: James Salsman <jsa...@gm...> Cc: cmusphinx-devel <cmu...@li...> Bcc: Date: Fri, 29 Jun 2018 18:26:12 -0700 Subject: Re: diphthong removal James, I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary. Kevin > On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote: > > Kevin, > > I need to use diphones for learner analytics to complete http://j.mp/irslides > > How do you feel about eliminating diphthongs from CMUDICT? > > AW -> AA UH > AY -> AA IY > ER -> UH R (substitutes /ʊ/ for /ɜ/) > EY -> EH IY (substitutes /ɛ/ for /e/) > OW -> AO UH (substitutes /ɔ/ for /o/) > OY -> AO IY > > I'm going to do it, locally in something called Cloud Firestore > anyway, but I wonder whether there are any good reasons to support > diphthongs at all. Like ligatures in typography, a lot of simple > algorithms don't stay simple if they need to process the composites. > And it's certainly less elegant and less parsimonious, and probably a > violation of some rule of normal forms to support them. > > Best regards, > Jim |
From: James S. <jsa...@gm...> - 2018-06-29 23:14:05
|
Kevin, I need to use diphones for learner analytics to complete http://j.mp/irslides How do you feel about eliminating diphthongs from CMUDICT? AW -> AA UH AY -> AA IY ER -> UH R (substitutes /ʊ/ for /ɜ/) EY -> EH IY (substitutes /ɛ/ for /e/) OW -> AO UH (substitutes /ɔ/ for /o/) OY -> AO IY I'm going to do it, locally in something called Cloud Firestore anyway, but I wonder whether there are any good reasons to support diphthongs at all. Like ligatures in typography, a lot of simple algorithms don't stay simple if they need to process the composites. And it's certainly less elegant and less parsimonious, and probably a violation of some rule of normal forms to support them. Best regards, Jim |
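For concreteness, the substitution table James proposes above can be applied mechanically to cmudict-style phone lists. The sketch below is illustrative only (the helper name is hypothetical, and the handling of stress digits is my assumption: the diphthong's stress marker moves to the first replacement vowel):

```python
# Sketch of James's proposed diphthong expansion over cmudict-style
# pronunciations. Stress handling is an assumption: the stress digit
# on the diphthong is carried onto the first replacement phone.
DIPHTHONG_MAP = {
    "AW": ("AA", "UH"),
    "AY": ("AA", "IY"),
    "ER": ("UH", "R"),    # substitutes /ʊ/ for /ɜ/
    "EY": ("EH", "IY"),   # substitutes /ɛ/ for /e/
    "OW": ("AO", "UH"),   # substitutes /ɔ/ for /o/
    "OY": ("AO", "IY"),
}

def expand_diphthongs(phones):
    """Replace each diphthong phone with two consecutive phones."""
    out = []
    for phone in phones:
        base = phone.rstrip("012")        # strip the stress digit, if any
        stress = phone[len(base):]
        if base in DIPHTHONG_MAP:
            first, second = DIPHTHONG_MAP[base]
            out.extend([first + stress, second])
        else:
            out.append(phone)            # non-diphthong phones pass through
    return out

# e.g. BOY ("B OY1") becomes "B AO1 IY"
print(expand_diphthongs(["B", "OY1"]))
```

Running this over two copies of the dictionary is all that is needed to produce the "two -dict's" comparison discussed in the thread.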
From: James S. <jsa...@gm...> - 2018-06-05 09:25:39
|
Adnan, GMMs are Gaussian mixture models, but they are called "mixture Gaussians" or "mixture Gaussian distributions" in that 1997 paper. Here are later papers building on that one which use the term you're expecting, and which I should have included: http://www.cs.cmu.edu/afs/.cs.cmu.edu/Web/People/jsherwan/pubs/icslp2004.pdf http://www.cs.cmu.edu/~archan/papers/eurospeech2005.pdf Best regards, Jim On Tue, Jun 5, 2018 at 2:32 AM, adnan ali <adn...@uo...> wrote: > Thanks James for reply. > I really appreciate the response. but this is not relevant to my research. I > need to study GMM and HMM role in CMUSphinx. that how it works and what are > the differences etc. > thanks in anticipation. > > On Tue, Jun 5, 2018 at 1:16 PM, James Salsman <jsa...@gm...> wrote: >> >> Hi Adnan, >> >> Try this one: >> >> https://www.cs.cmu.edu/~rkm/eurosp97.cbvq/cbvq.ps >> >> Best regards, >> Jim >> >> On Mon, Jun 4, 2018 at 11:21 PM, adnan ali <adn...@uo...> >> wrote: >> > Hello All, >> > I hope you will find this email in good health. >> > I am writing research paper and I have to cite CMUSphinx in it because I >> > used this tool for experiments and results. >> > I was going thorough some paper and it says it is " CMU Sphinx uses >> > GMM-HMM >> > model " I wasn't aware before that it also use GMM. >> > Now I am confuse. >> > I also want to explain its working in term of HMM and GMM. Some help may >> > be >> > handy. >> > >> > It would be great if I get some base paper (link/name/file) of CMUSphinx >> > which explains its internal working so I can read and quote it in my >> > paper. >> > thanks in anticipation. >> > >> > -- >> > Regards >> > >> > Adnan Ali >> > Associate Lecturer >> > University of Gujrat (Sialkot Sub-Campus-PPP) >> > Daska Road, Sialkot, Pakistan >> > >> > >> > ------------------------------------------------------------------------------ >> > Check out the vibrant tech community on one of the world's most >> > engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot >> > _______________________________________________ >> > Cmusphinx-devel mailing list >> > Cmu...@li... >> > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel >> > > > > > > -- > Regards > > Adnan Ali > Associate Lecturer > University of Gujrat (Sialkot Sub-Campus-PPP) > Daska Road, Sialkot, Pakistan |
From: adnan a. <adn...@uo...> - 2018-06-05 08:33:32
|
Thanks James for reply. I really appreciate the response. but this is not relevant to my research. I need to study GMM and HMM role in CMUSphinx. that how it works and what are the differences etc. thanks in anticipation. On Tue, Jun 5, 2018 at 1:16 PM, James Salsman <jsa...@gm...> wrote: > Hi Adnan, > > Try this one: > > https://www.cs.cmu.edu/~rkm/eurosp97.cbvq/cbvq.ps > > Best regards, > Jim > > On Mon, Jun 4, 2018 at 11:21 PM, adnan ali <adn...@uo...> > wrote: > > Hello All, > > I hope you will find this email in good health. > > I am writing research paper and I have to cite CMUSphinx in it because I > > used this tool for experiments and results. > > I was going thorough some paper and it says it is " CMU Sphinx uses > GMM-HMM > > model " I wasn't aware before that it also use GMM. > > Now I am confuse. > > I also want to explain its working in term of HMM and GMM. Some help may > be > > handy. > > > > It would be great if I get some base paper (link/name/file) of CMUSphinx > > which explains its internal working so I can read and quote it in my > paper. > > thanks in anticipation. > > > > -- > > Regards > > > > Adnan Ali > > Associate Lecturer > > University of Gujrat (Sialkot Sub-Campus-PPP) > > Daska Road, Sialkot, Pakistan > > > > ------------------------------------------------------------ > ------------------ > > Check out the vibrant tech community on one of the world's most > > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > > _______________________________________________ > > Cmusphinx-devel mailing list > > Cmu...@li... > > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel > > > -- *Regards* *Adnan Ali * <https://www.facebook.com/MhAdnanAli> Associate Lecturer *University of Gujrat (Sialkot Sub-Campus-PPP**)* Daska Road, Sialkot, Pakistan |
From: James S. <jsa...@gm...> - 2018-06-05 08:16:57
|
Hi Adnan, Try this one: https://www.cs.cmu.edu/~rkm/eurosp97.cbvq/cbvq.ps Best regards, Jim On Mon, Jun 4, 2018 at 11:21 PM, adnan ali <adn...@uo...> wrote: > Hello All, > I hope you will find this email in good health. > I am writing research paper and I have to cite CMUSphinx in it because I > used this tool for experiments and results. > I was going thorough some paper and it says it is " CMU Sphinx uses GMM-HMM > model " I wasn't aware before that it also use GMM. > Now I am confuse. > I also want to explain its working in term of HMM and GMM. Some help may be > handy. > > It would be great if I get some base paper (link/name/file) of CMUSphinx > which explains its internal working so I can read and quote it in my paper. > thanks in anticipation. > > -- > Regards > > Adnan Ali > Associate Lecturer > University of Gujrat (Sialkot Sub-Campus-PPP) > Daska Road, Sialkot, Pakistan > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Cmusphinx-devel mailing list > Cmu...@li... > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel > |
From: adnan a. <adn...@uo...> - 2018-06-05 05:53:23
|
Hello All, I hope this email finds you in good health. I am writing a research paper and have to cite CMUSphinx, because I used this tool for my experiments and results. I was going through a paper which says "CMU Sphinx uses a GMM-HMM model"; I wasn't aware before that it also uses GMMs, so now I am confused. I also want to explain how it works in terms of HMMs and GMMs, so some help would be handy. It would be great to get a base paper (link/name/file) for CMUSphinx which explains its internal workings, so I can read and quote it in my paper. Thanks in anticipation. -- *Regards* *Adnan Ali * <https://www.facebook.com/MhAdnanAli> Associate Lecturer *University of Gujrat (Sialkot Sub-Campus-PPP**)* Daska Road, Sialkot, Pakistan
From: James S. <jsa...@gm...> - 2018-05-15 23:09:41
|
Hi Lawrence, Which models are you using? Please see below. ---------- Forwarded message ---------- From: Daniel Povey <dp...@gm...> Date: Mon, Apr 9, 2018 at 1:45 PM Subject: Re: best public data set for pronunciation assessment? To: James Salsman <jsa...@gm...> I'd probably recommend that you start with the mini_librispeech example scripts (it trains a model from scratch on free data), and once you figure out the workflow, maybe upgrade to the regular librispeech example scripts which uses more data but will take longer to train. We generally recommend people train their own models. But the learning curve for Kaldi is much steeper than for pocketsphinx. So be prepared to learn some stuff, there is no "turnkey" solution. Dan On Mon, Apr 9, 2018 at 10:15 AM, James Salsman <jsa...@gm...> wrote: > > Hi Dan, > > I would like to replicate http://j.mp/irslides with Kaldi. > > Is there an open and free turn-key model available for such work? > > Best regards, > Jim On Tue, May 15, 2018 at 1:31 PM, Evandro Gouvea <eg...@gm...> wrote: > Message from non-subscriber bounced. Please Cc sender. > > --Evandro > > > > ---------- Forwarded message ---------- > From: Lawrence Wu <law...@so...> > To: cmu...@li... > Cc: Zili Li <zi...@so...>, Majid Emami <ma...@so...> > Bcc: > Date: Tue, 15 May 2018 11:49:23 -0700 > Subject: Kaldi model in sphinx > Hi, > > Does anyone have an example of using a Kaldi GMM model in Sphinx 4 (e.g. a > test program)? I have specified KaldiLoader in my XML configuration but > this doesn’t seem to be sufficient. > > Thank you, > Lawrence Wu > > > > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Cmusphinx-devel mailing list > Cmu...@li... > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel > > |
From: Evandro G. <eg...@gm...> - 2018-05-15 19:31:14
|
Message from non-subscriber bounced. Please Cc sender. --Evandro ---------- Forwarded message ---------- From: Lawrence Wu <law...@so...> To: cmu...@li... Cc: Zili Li <zi...@so...>, Majid Emami <ma...@so...> Bcc: Date: Tue, 15 May 2018 11:49:23 -0700 Subject: Kaldi model in sphinx Hi, Does anyone have an example of using a Kaldi GMM model in Sphinx 4 (e.g. a test program)? I have specified KaldiLoader in my XML configuration but this doesn’t seem to be sufficient. Thank you, Lawrence Wu |
From: James S. <jsa...@gm...> - 2018-05-02 18:01:06
|
The human cost of bad speech recognition engineering for pronunciation assessment (now with a speaker identification twist) is back in the news again today: https://www.ft.com/video/a3b002e3-856f-4062-a8ec-f9b9bfafe34f https://twitter.com/jsalsman/status/991699158699421697 As many of you might not know, a senior Google employee with a direct financial ownership interest in Google Ventures' BBN spinoff EnglishCentral.com has forbidden me from further participation in GSoC because I objected to the exclusion of online dot-voting for the mentor summit unconference agenda process, and to his employee's attempt to preclude asking the other mentors for their opinions on the question. I have asked that another org admin from last year reach out to him about this conflict of interest, but so far there has been no response to my questions on that matter. I would agree to serve as a strictly voluntary, uncredited co-mentor if Brij agrees to be (one of the) org admin(s) next year and someone else signs up just to give me the t-shirt. We need to finish getting Brij's enhancements to Wiktionary: https://brijmohan.github.io/iremedy/single_line.html I expect to have very decent ten feature/phoneme code (the 9 feature/phoneme code on http://j.mp/irslides plus the nasal flap) and maybe even one or two more obscure vocal tract articulations, the elimination of diphthongs (which is something we all should have done back in the 1980s...) and plenty of data for 7,000 words by summer 2019. (Anyone else who hasn't requested their mentor stipend from last year, I spent it on data collection, but can sure scrounge them up if you let me take as long as you took.) Brij, are you willing to be CMU Sphinx org admin in Summer 2019, which starts around December getting project descriptions ready? Any objections to Brij as lead org admin in GSoC next year? Best regards, Jim |
From: Evandro G. <eg...@gm...> - 2018-03-09 13:34:48
|
Folks, Message from non-subscriber bounced. Please Cc sender when replying. Bill: When I tried the link below, the browser got redirected to https://cmusphinx.github.io/wiki/ --Evandro ---------- Forwarded message ---------- From: <cmu...@li...> Date: 9 March 2018 at 12:36 Subject: Auto-discard notification To: cmu...@li... The attached message has been automatically discarded. ---------- Forwarded message ---------- From: Bill Cawley <bi...@az...> To: cmu...@li... Cc: Bcc: Date: Fri, 9 Mar 2018 11:08:03 +0000 Subject: Wiki link not working Hello, I'm keen to try using Sphinx4 - we're working on robots, but the link to your Wiki at http://cmusphinx.sourceforge.net/wiki is giving me an error. Could you help me please? Thanks, Bill, -- Bill Cawley Telephone: 02034 245023 www.azquo.com |
From: Nickolay S. <nsh...@gm...> - 2017-12-15 22:25:08
|
The manual page is here: http://www1.icsi.berkeley.edu/Speech/docs/sctk-1.2/sclite.htm The command should be something like sclite -i rm -r reference -h hypothesis -o sum,pralign,dtl You can also change the variable $DEC_CFG_ALIGN = "builtin"; to $DEC_CFG_ALIGN = "sclite"; And it will automatically use sclite during decoding. > On 14 Dec 2017, at 13:08, adnan ali <adn...@uo...> wrote: > > Thanks for reply. > Can you guide me in this a little. > I am trying it but it dont have much helping material online. > > I have installed it now I am running it with this below command. > $ ./sclite -i '/export/home3/seecs/adnan.ali/Sphinx/Urdu_training/etc/Urdu_training_train.transcription' -r '/export/home3/seecs/adnan.ali/Sphinx/Urdu_training/result/Urdu_training.match' > > while after -i is my input file which in transcription file. > > after -r is output file after training. > > it gives error > > sclite: Error, Input not specified, use transcription input or piped input > > > Is it right syntax ? > > > > > On Mon, Dec 11, 2017 at 4:03 AM, Nickolay Shmyrev <nsh...@gm...> wrote: > > > On 30 Nov 2017, at 13:41, adnan ali <adn...@uo...> wrote: > > > > Hey All. > > > > I have done training and testing of data using sphinxtrain. I am getting WER around 35%. Is it good or bad ? > > It is ok. > > > Now I have to do analysis of data, I want to see what are the words that caused error rate. > > Is there any tool or code which can "Compare log files and testing files". > > Thanks for anticipation. > > You can use sclite from sctk https://www.nist.gov/itl/iad/mig/tools > > > > > -- > Regards > > Adnan Ali > Associate Lecturer > University of Gujrat (Sialkot Sub-Campus-PPP) > Daska Road, Sialkot, Pakistan
From: adnan a. <adn...@uo...> - 2017-12-14 10:09:35
|
Thanks for the reply. Can you guide me a little with this? I am trying it, but there is not much help material for it online. I have installed it, and now I am running it with the command below: $ ./sclite -i '/export/home3/seecs/adnan.ali/Sphinx/Urdu_training/etc/Urdu_training_train.transcription' -r '/export/home3/seecs/adnan.ali/Sphinx/Urdu_training/result/Urdu_training.match' where the file after -i is my input file, which is the transcription file, and the file after -r is the output file after training. It gives the error *sclite: Error, Input not specified, use transcription input or piped input* Is that the right syntax? On Mon, Dec 11, 2017 at 4:03 AM, Nickolay Shmyrev <nsh...@gm...> wrote: > > > On 30 Nov 2017, at 13:41, adnan ali <adn...@uo...> > wrote: > > > > Hey All. > > > > I have done training and testing of data using sphinxtrain. I am > getting WER around 35%. Is it good or bad ? > > It is ok. > > > Now I have to do analysis of data, I want to see what are the words > that caused error rate. > > Is there any tool or code which can "Compare log files and testing > files". > > Thanks for anticipation. > > You can use sclite from sctk https://www.nist.gov/itl/iad/mig/tools > > -- *Regards* *Adnan Ali * <https://www.facebook.com/MhAdnanAli> Associate Lecturer *University of Gujrat (Sialkot Sub-Campus-PPP**)* Daska Road, Sialkot, Pakistan
From: Nickolay S. <nsh...@gm...> - 2017-12-10 23:03:48
> On 30 Nov 2017, at 13:41, adnan ali <adn...@uo...> wrote:
>
> Hey All.
>
> I have done training and testing of data using sphinxtrain. I am getting a WER of around 35%. Is that good or bad?

It is OK.

> Now I have to analyze the data: I want to see which words caused the error rate.
> Is there any tool or code which can compare the log files and testing files?
> Thanks in anticipation.

You can use sclite from sctk: https://www.nist.gov/itl/iad/mig/tools
From: adnan a. <adn...@uo...> - 2017-11-30 11:10:59
Hey All.

I have done training and testing of data using sphinxtrain. I am getting a WER of around 35%. Is that good or bad?

Now I have to analyze the data: I want to see which words caused the error rate. Is there any tool or code which can compare the log files and testing files?

Thanks in anticipation.

--
Regards
Adnan Ali
Associate Lecturer
University of Gujrat (Sialkot Sub-Campus-PPP)
Daska Road, Sialkot, Pakistan
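For a quick look at where a 35% WER comes from without any external tool, word error rate is just the word-level Levenshtein distance between reference and hypothesis, divided by the reference length. A minimal sketch (my illustration, not an official CMUSphinx utility; sclite additionally reports per-word alignments, which is what you want for error analysis):

```python
# Minimal word error rate (WER) by Levenshtein distance over words.
# Counts substitutions, insertions, and deletions with equal cost 1.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

Running this per utterance over the reference transcription and the decoder output, and sorting by the per-utterance score, is a crude but fast way to find the recordings and words that dominate the error rate.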
From: Evandro G. <eg...@gm...> - 2017-11-27 16:54:48
Folks,

Sender not in mailing list, please Cc sender.

--Evandro

---------- Forwarded message ----------
From: <cmu...@li...>
Date: 27 November 2017 at 17:25
Subject: Auto-discard notification
To: cmu...@li...

The attached message has been automatically discarded.

---------- Forwarded message ----------
From: Benjamin Maurice <ma...@li...>
To: cmu...@li...
Date: Mon, 27 Nov 2017 17:05:37 +0100
Subject: Installation

Good morning,

I am trying to install sphinx4-core from https://github.com/cmusphinx/sphinx4, but I get some errors. When I launch the project, I get a NullPointerException for HMM. Is it possible to have a .jar that works?

Thank you very much.
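For context, sphinx4 at that time was distributed as prebuilt Maven snapshot artifacts rather than standalone jars. A build configuration along these lines (the coordinates are my recollection of the CMUSphinx tutorial, not something stated in this thread; verify them against the current documentation) pulls both the core library and the default acoustic-model data, whose absence is a common cause of NullPointerExceptions during HMM setup:

```xml
<!-- Assumed coordinates from the CMUSphinx sphinx4 tutorial of that era. -->
<repositories>
  <repository>
    <id>snapshots-repo</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
  </repository>
</repositories>
<dependencies>
  <dependency>
    <groupId>edu.cmu.sphinx</groupId>
    <artifactId>sphinx4-core</artifactId>
    <version>5prealpha-SNAPSHOT</version>
  </dependency>
  <!-- Bundled default acoustic and language models. -->
  <dependency>
    <groupId>edu.cmu.sphinx</groupId>
    <artifactId>sphinx4-data</artifactId>
    <version>5prealpha-SNAPSHOT</version>
  </dependency>
</dependencies>
```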
From: PANKAJ B. <pan...@du...> - 2017-10-24 13:34:41
Hey all!

Last two hours before applications close for Google Code-in! We request you to kindly look at the attached file and either verify that the tasks in it are worth completing, or add more tasks which you think could help CMUSphinx provide better support to the community:

https://docs.google.com/spreadsheets/d/1yK_isyTqbinz4tGryx86tsW9yP4ojutUqsiZJG88TmI/edit?usp=sharing

For sample tasks, you can visit:

https://developers.google.com/open-source/gci/resources/example-tasks

Some small but long-overdue tasks could be completed this way. And you could get to mentor these students on behalf of CMUSphinx if we get selected!

The last date is 24 October, so we need to hurry! Your contribution could help CMUSphinx get selected.

Thanks
From: PANKAJ B. <pan...@du...> - 2017-10-17 18:30:06
People! The clock is ticking! Please contribute.

On Sun, Oct 15, 2017 at 11:42 PM, PANKAJ BARANWAL <pan...@du...> wrote:

> Hey everyone!
> This year, CMUSphinx is participating in Google Code-in! It is much like Google Summer of Code, but for younger students and with much less burden on the mentor. You can learn more about it here:
> https://codein.withgoogle.com
>
> So, if you are interested in helping CMUSphinx get through the selection procedure, you just need to follow one simple instruction:
>
> Go to https://developers.google.com/open-source/gci/resources/example-tasks and check out the example tasks.
>
> Then help us create similar tasks by adding a new row to this document:
> https://docs.google.com/spreadsheets/d/1yK_isyTqbinz4tGryx86tsW9yP4ojutUqsiZJG88TmI/edit?usp=sharing
>
> Each row will be treated as a subtask. Just complete the columns mentioned, and that's it!
>
> Some small but long-overdue tasks could be completed this way. And you could get to mentor these students on behalf of CMUSphinx if we get selected!
> The last date is 24 October, so we need to hurry! Your contribution could help CMUSphinx get selected.
> Thanks

--
Cheers
Pankaj