This list is closed, nobody may subscribe to it.
From: Kevin L. <kev...@gm...> - 2018-06-30 23:47:20
|
You have always been one of the good ones.

Kevin

> On Jun 30, 2018, at 6:23 PM, James Salsman <jsa...@gm...> wrote:
> [...]
|
From: Alexander R. <ai...@cs...> - 2018-06-30 22:40:37
|
Uh, I have been maintaining it, more or less. It's now at (I think) version 0.7b. The versions get bumped whenever I feel I've put in some big chunk of fixing. There's minor maintenance ongoing.

Apart from adding words (harvested from words that show up in lmtool uploads that are backed off to LtoS, making sure that the 10k most-frequent words are in there, cleaning out some of the crap, etc.) and fixing stuff people send in, I occasionally put time into regularizations of one kind or another. Also, trying to figure out how to deal with Nickolay's bootleg version. And so on.

I've had to fix your script at http://www.speech.cs.cmu.edu/cgi-bin/cmudict, since perl had evolved too much.

There are also ongoing requests for the same resource in other languages (e.g. Spanish). Someday.

It's been more or less on a hobby level, though at some point I did put together an NSF proposal to get more (people) resources for it. But then got busy with some other stuff. I should get back to that...

Alex

-----Original Message-----
From: Kevin Lenzo <kev...@gm...>
Sent: Saturday, June 30, 2018 5:15 PM
To: Alexander Rudnicky <ai...@cs...>
Cc: James Salsman <jsa...@gm...>; cmusphinx-devel <cmu...@li...>
Subject: Re: [Cmusphinx-devel] diphthong removal

[...]
|
From: James S. <jsa...@gm...> - 2018-06-30 22:23:33
|
Dr. Rudnicky,

If I had a command-and-control and/or dictation test suite I would measure the empirical question myself, but I don't, so I hope you, or anyone here who does have such test suites, will take a closer look at the question. I've never been interested in command-and-control or dictation. There are (more labor-intensive?) ways to measure the change for synthesis, too, which would be absolutely necessary, along with the other aspects Kevin mentioned, before changing it for everyone.

Dr. Horacio Franco at SRI, after having set the formula for "Goodness of Pronunciation" (often "GoP") around 1996 with Drs. Jared Bernstein et al., which was taken up in turn by Prof. Steve Young's group at Cambridge, e.g. in Dr. Silke Witt's 1999/2000 thesis, is now contemplating extraction of articulatory features early in the signal processing path:
https://www.researchgate.net/publication/325570699_Articulatory_Features_for_ASR_of_Pathological_Speech

We really need to help everyone who ever used that formula, which I suspect has a physical units' type mismatch error, because of larger societal effects such as these:

https://www.theguardian.com/australia-news/2017/aug/08/computer-says-no-irish-vet-fails-oral-english-test-needed-to-stay-in-australia

https://www.ft.com/content/dc7faee2-51e5-11e8-b3ee-41e0209208ec

These are very serious manifestations of a software bug, affecting the immigration status of tens of thousands of people.

Best regards,
Jim

On Sat, Jun 30, 2018 at 4:13 PM, Alexander Rudnicky <ai...@cs...> wrote:
> [...]
|
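The "Goodness of Pronunciation" formula referenced above is usually cited in the form given in Silke Witt's Cambridge thesis; a sketch of that commonly quoted definition (my reconstruction for context, not text from this thread) is:

```latex
\mathrm{GOP}(p) \;=\; \frac{1}{NF(p)}
\left|\, \log \frac{p\!\left(O^{(p)} \mid p\right)}
                   {\max_{q \in Q} p\!\left(O^{(p)} \mid q\right)} \,\right|
```

where \(O^{(p)}\) is the acoustic segment aligned to phone \(p\), \(Q\) is the phone inventory, and \(NF(p)\) is the number of frames in the segment; the per-frame normalization of a log-likelihood ratio is the step most relevant to the units concern raised here.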
From: Kevin L. <kev...@gm...> - 2018-06-30 21:29:52
|
Hi James —

> On Jun 30, 2018, at 9:54 AM, James Salsman <jsa...@gm...> wrote:
>
> Kevin,
>
> Thank you for your reply. Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise.

Once you’re gone, you can’t go back and edit the pages, so you can’t remove yourself :) I get a lot of contact on this and some other things even now, though I lost the ability to edit them a long time ago. Thanks for asking, though — my opinions are perhaps the least significant these days.

>> In IPA, the diphthongs are indicated by a tie marker. [Replacing them with consecutive phonemes] loses information
>
> I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate.

There is perhaps a place between what is measurable in this experiment and what is necessary to represent linguistic information for realization. While recognition, and even perhaps understanding as a consequence, may depend on hearing a variety of realizations and figuring out that they are likely to be one in particular, the generation of spoken output must be consistent and correct for a specific realization. If your application does better with this split, do the split — easy peasy.

> There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. Therefore at best, no information could be added. Such coding mistakes are common in natural languages.

Valid points for conference papers.

> However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. People on this list do that kind of thing all the time, right?

Aye.

> Best regards,
> Jim

Best,

oak

> On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote:
> [...]
|
From: Kevin L. <kev...@gm...> - 2018-06-30 21:15:11
|
Just following up here, I wish this really important resource would be updated. CMU did something really important here, and it is still used A LOT, but hasn’t seen a serious consideration and version update in almost 20 years. It would be nice if CMU recognized its leadership here and really pushed ahead with publicly available lexica.

oak

> On Jun 30, 2018, at 4:29 PM, Kevin Lenzo <kev...@gm...> wrote:
>
> Hi Alex,
>
> I’m sure you’re aware that the CMU Pronouncing Dictionary is used for more than ASR. For example, Festival and FestVox both use it, and many voices available on Android today come from that.
>
> If we are considering changes to the dictionary itself, it is worth considering the linguistic value, as well as the multiple clients of this data. To make such a change would have unnecessary impact on these downstream uses of the data, favoring one perhaps temporary result of a particular client’s application.
>
> Do whatever you want with the data — that’s the whole point of the liberal licensing scheme; but consider carefully the impact on the overall use and importance of the fundamental resource.
>
> oak
>
>> On Jun 30, 2018, at 10:16 AM, Alexander Rudnicky <ai...@cs...> wrote:
>>
>> Folks,
>>
>> Cmudict is intended for use in ASR. The set of phonetic units evolved to fit that need. The original phone set had stuff like DX, the dental flap, that eventually disappeared. An important reason was the realization that context is highly predictive of phonetic realization and that it's not necessary to explicitly code allophones (assuming you can even do it right).
>>
>> The diphthong issue is a bit different of course. I would hazard that the diphthong provides a better context scope for reco. So diphthongs might still be useful as a unit. To me this seems enough of a reason. But acoustic modeling has evolved. So who knows?
>>
>> I agree with James, though. It's an empirical question. Try it both ways and see what happens. There is also TTS and other applications that might benefit; ideally there's a deterministic mapping from the base ASR representation to the others.
>>
>> Alex
>>
>> -----Original Message-----
>> From: James Salsman <jsa...@gm...>
>> Sent: Saturday, June 30, 2018 9:54 AM
>> To: cmusphinx-devel <cmu...@li...>; Kevin Lenzo <kev...@gm...>
>> Subject: Re: [Cmusphinx-devel] diphthong removal
>>
>> Kevin,
>>
>> Thank you for your reply. Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise.
>>
>>> In IPA, the diphthongs are indicated by a tie marker. [Replacing them with consecutive phonemes] loses information
>>
>> I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate.
>>
>> There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. Therefore at best, no information could be added. Such coding mistakes are common in natural languages.
>>
>> However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. People on this list do that kind of thing all the time, right?
>>
>> Best regards,
>> Jim
>>
>>> On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote:
>>>
>>> James,
>>>
>>> I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary.
>>>
>>> Kevin
>>>
>>>> On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote:
>>>>
>>>> Kevin,
>>>>
>>>> I need to use diphones for learner analytics to complete http://j.mp/irslides
>>>>
>>>> How do you feel about eliminating diphthongs from CMUDICT?
>>>>
>>>> AW -> AA UH
>>>> AY -> AA IY
>>>> ER -> UH R (substitutes /ʊ/ for /ɜ/)
>>>> EY -> EH IY (substitutes /ɛ/ for /e/)
>>>> OW -> AO UH (substitutes /ɔ/ for /o/)
>>>> OY -> AO IY
>>>>
>>>> I'm going to do it, locally in something called Cloud Firestore anyway, but I wonder whether there are any good reasons to support diphthongs at all. Like ligatures in typography, a lot of simple algorithms don't stay simple if they need to process the composites. And it's certainly less elegant and less parsimonious, and probably a violation of some rule of normal forms to support them.
>>>>
>>>> Best regards,
>>>> Jim
>>
>> _______________________________________________
>> Cmusphinx-devel mailing list
>> Cmu...@li...
>> https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel
|
From: Kevin L. <kev...@gm...> - 2018-06-30 20:29:35
|
Hi Alex, I’m sure you’re aware that the CMU Pronouncing Dictionary is used for more than ASR. For example, Festival and FestVox both use it, and many voices available on Android today come from that. If we are considering changes to the dictionary itself, it is worth considering the linguistic value, as well as the multiple clients of this data. To make such a change would have unnecessary impact on these downstream uses of the data, favoring one perhaps temporary result of a particular client’s application. Do whatever you want with the data — that’s the whole point of the liberal licensing scheme; but consider carefully the impact on the overall use and importance of the fundamental resource. oak > On Jun 30, 2018, at 10:16 AM, Alexander Rudnicky <ai...@cs...> wrote: > > Folks, > > Cmudict is intended for use in ASR. The set of phonetic units evolved to fit that need. The original phone set had stuff like DX, the dental flap, that eventually disappeared. An important reason was the realization that context is highly predictive of phonetic realization and that it's not necessary to explicitly code allophones (assuming you can even to it right). > > The diphthong issue is a bit different of course. I would hazard that the diphthong provides a better context scope for reco. So diphthongs might still be useful as a unit. To me this seems enough of a reason. But acoustic modeling has evolved. So who knows? > > I agree with James, though. It's an empirical question. Try it both ways and see what happens. There is also TTS and other applications that might benefit; ideally there's a deterministic mapping from the base ASR representation to the others. > > Alex > > -----Original Message----- > From: James Salsman <jsa...@gm...> > Sent: Saturday, June 30, 2018 9:54 AM > To: cmusphinx-devel <cmu...@li...>; Kevin Lenzo <kev...@gm...> > Subject: Re: [Cmusphinx-devel] diphthong removal > > Kevin, > > Thank you for your reply. 
Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise. > >> In IPA, the diphthongs are indicated by a tie marker. [Replacing them >> with consecutive phonemes] loses information > > I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate. > > There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. Therefore at best, no information could be added. Such coding mistakes are common in natural languages. > > However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. > I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. > People on this list do that kind of thing all the time, right? > > Best regards, > Jim > >> On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote: >> James, >> >> I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary. 
>> >> Kevin >> >>> On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote: >>> >>> Kevin, >>> >>> I need to use diphones for learner analytics to complete >>> http://j.mp/irslides >>> >>> How do you feel about eliminating diphthongs from CMUDICT? >>> >>> AW -> AA UH >>> AY -> AA IY >>> ER -> UH R (substitutes /ʊ/ for /ɜ/) EY -> EH IY (substitutes /ɛ/ >>> for /e/) OW -> AO UH (substitutes /ɔ/ for /o/) OY -> AO IY >>> >>> I'm going to do it, locally in something called Cloud Firestore >>> anyway, but I wonder whether there are any good reasons to support >>> diphthongs at all. Like ligatures in typography, a lot of simple >>> algorithms don't stay simple if they need to process the composites. >>> And it's certainly less elegant and less parsimonious, and probably a >>> violation of some rule of normal forms to support them. >>> >>> Best regards, >>> Jim > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ > Cmusphinx-devel mailing list > Cmu...@li... > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel |
From: Alexander R. <ai...@cs...> - 2018-06-30 14:30:40
|
Folks, Cmudict is intended for use in ASR. The set of phonetic units evolved to fit that need. The original phone set had stuff like DX, the dental flap, that eventually disappeared. An important reason was the realization that context is highly predictive of phonetic realization and that it's not necessary to explicitly code allophones (assuming you can even do it right). The diphthong issue is a bit different of course. I would hazard that the diphthong provides a better context scope for reco. So diphthongs might still be useful as a unit. To me this seems enough of a reason. But acoustic modeling has evolved. So who knows? I agree with James, though. It's an empirical question. Try it both ways and see what happens. There is also TTS and other applications that might benefit; ideally there's a deterministic mapping from the base ASR representation to the others. Alex -----Original Message----- From: James Salsman <jsa...@gm...> Sent: Saturday, June 30, 2018 9:54 AM To: cmusphinx-devel <cmu...@li...>; Kevin Lenzo <kev...@gm...> Subject: Re: [Cmusphinx-devel] diphthong removal Kevin, Thank you for your reply. Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise. > In IPA, the diphthongs are indicated by a tie marker. [Replacing them > with consecutive phonemes] loses information I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate. There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. 
Therefore at best, no information could be added. Such coding mistakes are common in natural languages. However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. People on this list do that kind of thing all the time, right? Best regards, Jim On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote: > James, > > I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary. > > Kevin > >> On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote: >> >> Kevin, >> >> I need to use diphones for learner analytics to complete >> http://j.mp/irslides >> >> How do you feel about eliminating diphthongs from CMUDICT? >> >> AW -> AA UH >> AY -> AA IY >> ER -> UH R (substitutes /ʊ/ for /ɜ/) EY -> EH IY (substitutes /ɛ/ >> for /e/) OW -> AO UH (substitutes /ɔ/ for /o/) OY -> AO IY >> >> I'm going to do it, locally in something called Cloud Firestore >> anyway, but I wonder whether there are any good reasons to support >> diphthongs at all. Like ligatures in typography, a lot of simple >> algorithms don't stay simple if they need to process the composites. >> And it's certainly less elegant and less parsimonious, and probably a >> violation of some rule of normal forms to support them. >> >> Best regards, >> Jim ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot _______________________________________________ Cmusphinx-devel mailing list Cmu...@li... https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel |
From: James S. <jsa...@gm...> - 2018-06-30 13:54:24
|
Kevin, Thank you for your reply. Your name is still at the bottom of the CMUDICT page, so I figure it's worth as much trying to convince you of this, whether I end up using it locally or otherwise. > In IPA, the diphthongs are indicated by a tie marker. [Replacing > them with consecutive phonemes] loses information I would like to measure how much information is involved there. I'm not convinced it's positive. It's certainly much smaller than the amount of information available from nondiphthong phonemes about which diphones a learner is able to articulate. There is a physiological argument that diphthongs, as the only nonstationary articulations, are transitions instead of positions in articulation space. Whether this actually provides information is very questionable, because such data compression is either lossy or lossless, not enhancing. Therefore at best, no information could be added. Such coding mistakes are common in natural languages. However, the /ʊ/ vs. /ɜ/, /ɛ/ vs. /e/, /ɔ/ and /o/ substitutions available in the phoneme set would unquestionably represent a loss of some information, and I'd like to figure out how to measure how much. I think it's pretty easy -- just make two -dict's and measure the relative accuracy, speed, etc. on some sufficiently broad test suite. People on this list do that kind of thing all the time, right? Best regards, Jim On Fri, Jun 29, 2018 at 7:26 PM, Kevin Lenzo <kev...@gm...> wrote: > James, > > I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary. 
> > Kevin > >> On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote: >> >> Kevin, >> >> I need to use diphones for learner analytics to complete http://j.mp/irslides >> >> How do you feel about eliminating diphthongs from CMUDICT? >> >> AW -> AA UH >> AY -> AA IY >> ER -> UH R (substitutes /ʊ/ for /ɜ/) >> EY -> EH IY (substitutes /ɛ/ for /e/) >> OW -> AO UH (substitutes /ɔ/ for /o/) >> OY -> AO IY >> >> I'm going to do it, locally in something called Cloud Firestore >> anyway, but I wonder whether there are any good reasons to support >> diphthongs at all. Like ligatures in typography, a lot of simple >> algorithms don't stay simple if they need to process the composites. >> And it's certainly less elegant and less parsimonious, and probably a >> violation of some rule of normal forms to support them. >> >> Best regards, >> Jim |
From: Evandro G. <eg...@gm...> - 2018-06-30 10:40:54
|
Hey, Kevin, Your email address wasn't in the subscriber's list, so the message bounced. Forwarding it to the list... --Evandro ---------- Forwarded message ---------- From: Kevin Lenzo <kev...@gm...> To: James Salsman <jsa...@gm...> Cc: cmusphinx-devel <cmu...@li...> Bcc: Date: Fri, 29 Jun 2018 18:26:12 -0700 Subject: Re: diphthong removal James, I don’t control this or update it these days, but there are phonological reasons for not doing this in the dictionary. There is a significant phonetic distinction between a diphthong and two consecutive phones. In IPA, the diphthongs are indicated by a tie marker. Your suggestion loses information and so is not a good idea for the general dictionary. Kevin > On Jun 29, 2018, at 4:13 PM, James Salsman <jsa...@gm...> wrote: > > Kevin, > > I need to use diphones for learner analytics to complete http://j.mp/irslides > > How do you feel about eliminating diphthongs from CMUDICT? > > AW -> AA UH > AY -> AA IY > ER -> UH R (substitutes /ʊ/ for /ɜ/) > EY -> EH IY (substitutes /ɛ/ for /e/) > OW -> AO UH (substitutes /ɔ/ for /o/) > OY -> AO IY > > I'm going to do it, locally in something called Cloud Firestore > anyway, but I wonder whether there are any good reasons to support > diphthongs at all. Like ligatures in typography, a lot of simple > algorithms don't stay simple if they need to process the composites. > And it's certainly less elegant and less parsimonious, and probably a > violation of some rule of normal forms to support them. > > Best regards, > Jim |
From: James S. <jsa...@gm...> - 2018-06-29 23:14:05
|
Kevin, I need to use diphones for learner analytics to complete http://j.mp/irslides How do you feel about eliminating diphthongs from CMUDICT? AW -> AA UH AY -> AA IY ER -> UH R (substitutes /ʊ/ for /ɜ/) EY -> EH IY (substitutes /ɛ/ for /e/) OW -> AO UH (substitutes /ɔ/ for /o/) OY -> AO IY I'm going to do it, locally in something called Cloud Firestore anyway, but I wonder whether there are any good reasons to support diphthongs at all. Like ligatures in typography, a lot of simple algorithms don't stay simple if they need to process the composites. And it's certainly less elegant and less parsimonious, and probably a violation of some rule of normal forms to support them. Best regards, Jim |
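For concreteness, the substitution table James proposes above can be applied mechanically to cmudict-style phone lists. The sketch below is illustrative only (the helper name is hypothetical, and the handling of stress digits is my assumption: the diphthong's stress marker moves to the first replacement vowel):

```python
# Sketch of James's proposed diphthong expansion over cmudict-style
# pronunciations. Stress handling is an assumption: the stress digit
# on the diphthong is carried onto the first replacement phone.
DIPHTHONG_MAP = {
    "AW": ("AA", "UH"),
    "AY": ("AA", "IY"),
    "ER": ("UH", "R"),    # substitutes /ʊ/ for /ɜ/
    "EY": ("EH", "IY"),   # substitutes /ɛ/ for /e/
    "OW": ("AO", "UH"),   # substitutes /ɔ/ for /o/
    "OY": ("AO", "IY"),
}

def expand_diphthongs(phones):
    """Replace each diphthong phone with two consecutive phones."""
    out = []
    for phone in phones:
        base = phone.rstrip("012")        # strip the stress digit, if any
        stress = phone[len(base):]
        if base in DIPHTHONG_MAP:
            first, second = DIPHTHONG_MAP[base]
            out.extend([first + stress, second])
        else:
            out.append(phone)            # non-diphthong phones pass through
    return out

# e.g. BOY ("B OY1") becomes "B AO1 IY"
print(expand_diphthongs(["B", "OY1"]))
```

Running this over two copies of the dictionary is all that is needed to produce the "two -dict's" comparison discussed in the thread.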
From: James S. <jsa...@gm...> - 2018-06-05 09:25:39
|
Adnan, GMMs are Gaussian mixture models, but they are called "mixture Gaussians" or "mixture Gaussian distributions" in that 1997 paper. Here are later papers building on that one which use the term you're expecting, and which I should have included: http://www.cs.cmu.edu/afs/.cs.cmu.edu/Web/People/jsherwan/pubs/icslp2004.pdf http://www.cs.cmu.edu/~archan/papers/eurospeech2005.pdf Best regards, Jim On Tue, Jun 5, 2018 at 2:32 AM, adnan ali <adn...@uo...> wrote: > Thanks James for reply. > I really appreciate the response. but this is not relevant to my research. I > need to study GMM and HMM role in CMUSphinx. that how it works and what are > the differences etc. > thanks in anticipation. > > On Tue, Jun 5, 2018 at 1:16 PM, James Salsman <jsa...@gm...> wrote: >> >> Hi Adnan, >> >> Try this one: >> >> https://www.cs.cmu.edu/~rkm/eurosp97.cbvq/cbvq.ps >> >> Best regards, >> Jim >> >> On Mon, Jun 4, 2018 at 11:21 PM, adnan ali <adn...@uo...> >> wrote: >> > Hello All, >> > I hope you will find this email in good health. >> > I am writing research paper and I have to cite CMUSphinx in it because I >> > used this tool for experiments and results. >> > I was going thorough some paper and it says it is " CMU Sphinx uses >> > GMM-HMM >> > model " I wasn't aware before that it also use GMM. >> > Now I am confuse. >> > I also want to explain its working in term of HMM and GMM. Some help may >> > be >> > handy. >> > >> > It would be great if I get some base paper (link/name/file) of CMUSphinx >> > which explains its internal working so I can read and quote it in my >> > paper. >> > thanks in anticipation. >> > >> > -- >> > Regards >> > >> > Adnan Ali >> > Associate Lecturer >> > University of Gujrat (Sialkot Sub-Campus-PPP) >> > Daska Road, Sialkot, Pakistan >> > >> > >> > ------------------------------------------------------------------------------ >> > Check out the vibrant tech community on one of the world's most >> > engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot >> > _______________________________________________ >> > Cmusphinx-devel mailing list >> > Cmu...@li... >> > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel >> > > > > > > -- > Regards > > Adnan Ali > Associate Lecturer > University of Gujrat (Sialkot Sub-Campus-PPP) > Daska Road, Sialkot, Pakistan |
From: adnan a. <adn...@uo...> - 2018-06-05 08:33:32
|
Thanks James for reply. I really appreciate the response. but this is not relevant to my research. I need to study GMM and HMM role in CMUSphinx. that how it works and what are the differences etc. thanks in anticipation. On Tue, Jun 5, 2018 at 1:16 PM, James Salsman <jsa...@gm...> wrote: > Hi Adnan, > > Try this one: > > https://www.cs.cmu.edu/~rkm/eurosp97.cbvq/cbvq.ps > > Best regards, > Jim > > On Mon, Jun 4, 2018 at 11:21 PM, adnan ali <adn...@uo...> > wrote: > > Hello All, > > I hope you will find this email in good health. > > I am writing research paper and I have to cite CMUSphinx in it because I > > used this tool for experiments and results. > > I was going thorough some paper and it says it is " CMU Sphinx uses > GMM-HMM > > model " I wasn't aware before that it also use GMM. > > Now I am confuse. > > I also want to explain its working in term of HMM and GMM. Some help may > be > > handy. > > > > It would be great if I get some base paper (link/name/file) of CMUSphinx > > which explains its internal working so I can read and quote it in my > paper. > > thanks in anticipation. > > > > -- > > Regards > > > > Adnan Ali > > Associate Lecturer > > University of Gujrat (Sialkot Sub-Campus-PPP) > > Daska Road, Sialkot, Pakistan > > > > ------------------------------------------------------------ > ------------------ > > Check out the vibrant tech community on one of the world's most > > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > > _______________________________________________ > > Cmusphinx-devel mailing list > > Cmu...@li... > > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel > > > -- *Regards* *Adnan Ali * <https://www.facebook.com/MhAdnanAli> Associate Lecturer *University of Gujrat (Sialkot Sub-Campus-PPP**)* Daska Road, Sialkot, Pakistan |
From: James S. <jsa...@gm...> - 2018-06-05 08:16:57
|
Hi Adnan, Try this one: https://www.cs.cmu.edu/~rkm/eurosp97.cbvq/cbvq.ps Best regards, Jim On Mon, Jun 4, 2018 at 11:21 PM, adnan ali <adn...@uo...> wrote: > Hello All, > I hope you will find this email in good health. > I am writing research paper and I have to cite CMUSphinx in it because I > used this tool for experiments and results. > I was going thorough some paper and it says it is " CMU Sphinx uses GMM-HMM > model " I wasn't aware before that it also use GMM. > Now I am confuse. > I also want to explain its working in term of HMM and GMM. Some help may be > handy. > > It would be great if I get some base paper (link/name/file) of CMUSphinx > which explains its internal working so I can read and quote it in my paper. > thanks in anticipation. > > -- > Regards > > Adnan Ali > Associate Lecturer > University of Gujrat (Sialkot Sub-Campus-PPP) > Daska Road, Sialkot, Pakistan > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Cmusphinx-devel mailing list > Cmu...@li... > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel > |
From: adnan a. <adn...@uo...> - 2018-06-05 05:53:23
|
Hello All, I hope this email finds you in good health. I am writing a research paper and have to cite CMUSphinx, because I used this tool for my experiments and results. I was going through a paper which says "CMU Sphinx uses a GMM-HMM model"; I wasn't aware before that it also uses GMMs, so now I am confused. I also want to explain how it works in terms of HMMs and GMMs, so some help would be handy. It would be great to get a base paper (link/name/file) for CMUSphinx which explains its internal workings, so I can read and quote it in my paper. Thanks in anticipation. -- *Regards* *Adnan Ali * <https://www.facebook.com/MhAdnanAli> Associate Lecturer *University of Gujrat (Sialkot Sub-Campus-PPP**)* Daska Road, Sialkot, Pakistan
From: James S. <jsa...@gm...> - 2018-05-15 23:09:41
|
Hi Lawrence, Which models are you using? Please see below. ---------- Forwarded message ---------- From: Daniel Povey <dp...@gm...> Date: Mon, Apr 9, 2018 at 1:45 PM Subject: Re: best public data set for pronunciation assessment? To: James Salsman <jsa...@gm...> I'd probably recommend that you start with the mini_librispeech example scripts (it trains a model from scratch on free data), and once you figure out the workflow, maybe upgrade to the regular librispeech example scripts which uses more data but will take longer to train. We generally recommend people train their own models. But the learning curve for Kaldi is much steeper than for pocketsphinx. So be prepared to learn some stuff, there is no "turnkey" solution. Dan On Mon, Apr 9, 2018 at 10:15 AM, James Salsman <jsa...@gm...> wrote: > > Hi Dan, > > I would like to replicate http://j.mp/irslides with Kaldi. > > Is there an open and free turn-key model available for such work? > > Best regards, > Jim On Tue, May 15, 2018 at 1:31 PM, Evandro Gouvea <eg...@gm...> wrote: > Message from non-subscriber bounced. Please Cc sender. > > --Evandro > > > > ---------- Forwarded message ---------- > From: Lawrence Wu <law...@so...> > To: cmu...@li... > Cc: Zili Li <zi...@so...>, Majid Emami <ma...@so...> > Bcc: > Date: Tue, 15 May 2018 11:49:23 -0700 > Subject: Kaldi model in sphinx > Hi, > > Does anyone have an example of using a Kaldi GMM model in Sphinx 4 (e.g. a > test program)? I have specified KaldiLoader in my XML configuration but > this doesn’t seem to be sufficient. > > Thank you, > Lawrence Wu > > > > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Cmusphinx-devel mailing list > Cmu...@li... > https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel > > |
From: Evandro G. <eg...@gm...> - 2018-05-15 19:31:14
|
Message from non-subscriber bounced. Please Cc sender. --Evandro ---------- Forwarded message ---------- From: Lawrence Wu <law...@so...> To: cmu...@li... Cc: Zili Li <zi...@so...>, Majid Emami <ma...@so...> Bcc: Date: Tue, 15 May 2018 11:49:23 -0700 Subject: Kaldi model in sphinx Hi, Does anyone have an example of using a Kaldi GMM model in Sphinx 4 (e.g. a test program)? I have specified KaldiLoader in my XML configuration but this doesn’t seem to be sufficient. Thank you, Lawrence Wu |
From: James S. <jsa...@gm...> - 2018-05-02 18:01:06
|
The human cost of bad speech recognition engineering for pronunciation assessment (now with a speaker identification twist) is back in the news again today: https://www.ft.com/video/a3b002e3-856f-4062-a8ec-f9b9bfafe34f https://twitter.com/jsalsman/status/991699158699421697 As many of you might not know, a senior Google employee with a direct financial ownership interest in Google Ventures' BBN spinoff EnglishCentral.com has forbidden me from further participation in GSoC because I objected to the exclusion of online dot-voting for the mentor summit unconference agenda process, and to his employee's attempt to preclude asking the other mentors for their opinions on the question. I have asked that another org admin from last year reach out to him about this conflict of interest, but so far there has been no response to my questions on that matter. I would agree to serve as a strictly voluntary, uncredited co-mentor if Brij agrees to be (one of the) org admin(s) next year and someone else signs up just to give me the t-shirt. We need to finish getting Brij's enhancements to Wiktionary: https://brijmohan.github.io/iremedy/single_line.html I expect to have very decent ten feature/phoneme code (the 9 feature/phoneme code on http://j.mp/irslides plus the nasal flap) and maybe even one or two more obscure vocal tract articulations, the elimination of diphthongs (which is something we all should have done back in the 1980s...) and plenty of data for 7,000 words by summer 2019. (Anyone else who hasn't requested their mentor stipend from last year, I spent it on data collection, but can sure scrounge them up if you let me take as long as you took.) Brij, are you willing to be CMU Sphinx org admin in Summer 2019, which starts around December getting project descriptions ready? Any objections to Brij as lead org admin in GSoC next year? Best regards, Jim |
From: Evandro G. <eg...@gm...> - 2018-03-09 13:34:48
|
Folks, Message from non-subscriber bounced. Please Cc sender when replying. Bill: When I tried the link below, the browser got redirected to https://cmusphinx.github.io/wiki/ --Evandro ---------- Forwarded message ---------- From: <cmu...@li...> Date: 9 March 2018 at 12:36 Subject: Auto-discard notification To: cmu...@li... The attached message has been automatically discarded. ---------- Forwarded message ---------- From: Bill Cawley <bi...@az...> To: cmu...@li... Cc: Bcc: Date: Fri, 9 Mar 2018 11:08:03 +0000 Subject: Wiki link not working Hello, I'm keen to try using Sphinx4 - we're working on robots, but the link to your Wiki at http://cmusphinx.sourceforge.net/wiki is giving me an error. Could you help me please? Thanks, Bill, -- Bill Cawley Telephone: 02034 245023 www.azquo.com |
From: Nickolay S. <nsh...@gm...> - 2017-12-15 22:25:08
|
The manual page is here: http://www1.icsi.berkeley.edu/Speech/docs/sctk-1.2/sclite.htm The command should be something like sclite -i rm -r reference -h hypothesis -o sum,pralign,dtl You can also change the variable $DEC_CFG_ALIGN = "builtin"; to $DEC_CFG_ALIGN = "sclite"; And it will automatically use sclite during decoding. > On 14 Dec 2017, at 13:08, adnan ali <adn...@uo...> wrote: > > Thanks for reply. > Can you guide me in this a little. > I am trying it but it dont have much helping material online. > > I have installed it now I am running it with this below command. > $ ./sclite -i '/export/home3/seecs/adnan.ali/Sphinx/Urdu_training/etc/Urdu_training_train.transcription' -r '/export/home3/seecs/adnan.ali/Sphinx/Urdu_training/result/Urdu_training.match' > > while after -i is my input file which in transcription file. > > after -r is output file after training. > > it gives error > > sclite: Error, Input not specified, use transcription input or piped input > > > Is it right syntax ? > > > > > On Mon, Dec 11, 2017 at 4:03 AM, Nickolay Shmyrev <nsh...@gm...> wrote: > > > On 30 Nov 2017, at 13:41, adnan ali <adn...@uo...> wrote: > > > > Hey All. > > > > I have done training and testing of data using sphinxtrain. I am getting WER around 35%. Is it good or bad ? > > It is ok. > > > Now I have to do analysis of data, I want to see what are the words that caused error rate. > > Is there any tool or code which can "Compare log files and testing files". > > Thanks for anticipation. > > You can use sclite from sctk https://www.nist.gov/itl/iad/mig/tools > > > > > -- > Regards > > Adnan Ali > Associate Lecturer > University of Gujrat (Sialkot Sub-Campus-PPP) > Daska Road, Sialkot, Pakistan
From: adnan a. <adn...@uo...> - 2017-12-14 10:09:35
|
Thanks for the reply. Can you guide me a little with this? I am trying it, but there is not much help material for it online. I have installed it, and now I am running it with the command below: $ ./sclite -i '/export/home3/seecs/adnan.ali/Sphinx/Urdu_training/etc/Urdu_training_train.transcription' -r '/export/home3/seecs/adnan.ali/Sphinx/Urdu_training/result/Urdu_training.match' where the file after -i is my input file, which is the transcription file, and the file after -r is the output file after training. It gives the error *sclite: Error, Input not specified, use transcription input or piped input* Is that the right syntax? On Mon, Dec 11, 2017 at 4:03 AM, Nickolay Shmyrev <nsh...@gm...> wrote: > > > On 30 Nov 2017, at 13:41, adnan ali <adn...@uo...> > wrote: > > > > Hey All. > > > > I have done training and testing of data using sphinxtrain. I am > getting WER around 35%. Is it good or bad ? > > It is ok. > > > Now I have to do analysis of data, I want to see what are the words > that caused error rate. > > Is there any tool or code which can "Compare log files and testing > files". > > Thanks for anticipation. > > You can use sclite from sctk https://www.nist.gov/itl/iad/mig/tools > > -- *Regards* *Adnan Ali * <https://www.facebook.com/MhAdnanAli> Associate Lecturer *University of Gujrat (Sialkot Sub-Campus-PPP**)* Daska Road, Sialkot, Pakistan
From: Nickolay S. <nsh...@gm...> - 2017-12-10 23:03:48
> On 30 Nov 2017, at 13:41, adnan ali <adn...@uo...> wrote:
>
> Hey All.
>
> I have done training and testing of data using sphinxtrain. I am getting a WER of around 35%. Is that good or bad?

It is OK.

> Now I have to analyze the data: I want to see which words caused the error rate.
> Is there any tool or code which can compare the log files and testing files?
> Thanks in anticipation.

You can use sclite from sctk: https://www.nist.gov/itl/iad/mig/tools
From: adnan a. <adn...@uo...> - 2017-11-30 11:10:59
Hey All.

I have done training and testing of data using sphinxtrain. I am getting a WER of around 35%. Is that good or bad?

Now I have to analyze the data: I want to see which words caused the error rate. Is there any tool or code which can compare the log files and testing files?

Thanks in anticipation.

--
Regards
Adnan Ali
Associate Lecturer
University of Gujrat (Sialkot Sub-Campus-PPP)
Daska Road, Sialkot, Pakistan
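For a quick look at where a 35% WER comes from without any external tool, word error rate is just the word-level Levenshtein distance between reference and hypothesis, divided by the reference length. A minimal sketch (my illustration, not an official CMUSphinx utility; sclite additionally reports per-word alignments, which is what you want for error analysis):

```python
# Minimal word error rate (WER) by Levenshtein distance over words.
# Counts substitutions, insertions, and deletions with equal cost 1.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

Running this per utterance over the reference transcription and the decoder output, and sorting by the per-utterance score, is a crude but fast way to find the recordings and words that dominate the error rate.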
From: Evandro G. <eg...@gm...> - 2017-11-27 16:54:48
Folks,

Sender not in mailing list, please Cc sender.

--Evandro

---------- Forwarded message ----------
From: <cmu...@li...>
Date: 27 November 2017 at 17:25
Subject: Auto-discard notification
To: cmu...@li...

The attached message has been automatically discarded.

---------- Forwarded message ----------
From: Benjamin Maurice <ma...@li...>
To: cmu...@li...
Date: Mon, 27 Nov 2017 17:05:37 +0100
Subject: Installation

Good morning,

I am trying to install sphinx4-core from https://github.com/cmusphinx/sphinx4, but I get some errors. When I launch the project, I get a NullPointerException for HMM. Is it possible to have a .jar that works?

Thank you very much.
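For context, sphinx4 at that time was distributed as prebuilt Maven snapshot artifacts rather than standalone jars. A build configuration along these lines (the coordinates are my recollection of the CMUSphinx tutorial, not something stated in this thread; verify them against the current documentation) pulls both the core library and the default acoustic-model data, whose absence is a common cause of NullPointerExceptions during HMM setup:

```xml
<!-- Assumed coordinates from the CMUSphinx sphinx4 tutorial of that era. -->
<repositories>
  <repository>
    <id>snapshots-repo</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
  </repository>
</repositories>
<dependencies>
  <dependency>
    <groupId>edu.cmu.sphinx</groupId>
    <artifactId>sphinx4-core</artifactId>
    <version>5prealpha-SNAPSHOT</version>
  </dependency>
  <!-- Bundled default acoustic and language models. -->
  <dependency>
    <groupId>edu.cmu.sphinx</groupId>
    <artifactId>sphinx4-data</artifactId>
    <version>5prealpha-SNAPSHOT</version>
  </dependency>
</dependencies>
```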
From: PANKAJ B. <pan...@du...> - 2017-10-24 13:34:41
Hey all!

Last two hours before applications close for Google Code-in! We request you to kindly look at the attached file and either verify that the tasks in it are worth completing, or add more tasks which you think could help CMUSphinx provide better support to the community:

https://docs.google.com/spreadsheets/d/1yK_isyTqbinz4tGryx86tsW9yP4ojutUqsiZJG88TmI/edit?usp=sharing

For sample tasks, you can visit:

https://developers.google.com/open-source/gci/resources/example-tasks

Some small but long-overdue tasks could be completed this way. And you could get to mentor these students on behalf of CMUSphinx if we get selected!

The last date is 24 October, so we need to hurry! Your contribution could help CMUSphinx get selected.

Thanks
From: PANKAJ B. <pan...@du...> - 2017-10-17 18:30:06
People! The clock is ticking! Please contribute.

On Sun, Oct 15, 2017 at 11:42 PM, PANKAJ BARANWAL <pan...@du...> wrote:

> Hey everyone!
> This year, CMUSphinx is participating in Google Code-in! It is much like Google Summer of Code, but for younger students and with much less burden on the mentor. You can learn more about it here:
> https://codein.withgoogle.com
>
> So, if you are interested in helping CMUSphinx get through the selection procedure, you just need to follow one simple instruction:
>
> Go to https://developers.google.com/open-source/gci/resources/example-tasks and check out the example tasks.
>
> Then help us create similar tasks by adding a new row to this document:
> https://docs.google.com/spreadsheets/d/1yK_isyTqbinz4tGryx86tsW9yP4ojutUqsiZJG88TmI/edit?usp=sharing
>
> Each row will be treated as a subtask. Just complete the columns mentioned, and that's it!
>
> Some small but long-overdue tasks could be completed this way. And you could get to mentor these students on behalf of CMUSphinx if we get selected!
> The last date is 24 October, so we need to hurry! Your contribution could help CMUSphinx get selected.
> Thanks

--
Cheers
Pankaj