wordlist-devel Mailing List for SCOWL (and friends)
Brought to you by:
kevina
You can subscribe to this list here.
Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
---|---|---|---|---|---|---|---|---|---|---|---|---
2007 | | | | | | | | | | | (1) |
2009 | | (7) | | | | | | | | | (2) |
2010 | | | | (1) | | | | | | | | (1)
2011 | (1) | | | | | | | | | | | (4)
From: Kevin A. <ke...@at...> - 2011-12-05 02:41:35
|
On Sun, 4 Dec 2011, Kevin Atkinson wrote:

> On Sun, 4 Dec 2011, Elephant Eight wrote:
>
>> I searched through all of the word lists in the SCOWL 7.1 package for
>> various *less words and although I found some, such as "bookless" I did not
>> find others such as "phoneless" or "telephoneless." Although uncommon, it
>> is perfectly good English to add "less" to any singular common noun which
>> makes it into an adjective meaning "without something." As such, you can
>> say "We were left phoneless after the storm." or "Many homes before 1900
>> were still telephoneless."
>
> Like you said, it is uncommon; bookless is included in level 80 of scowl.
> Level 80 and above are likely to have lots of inconsistencies; this level
> also include lots of uncommon words.
>
> Also I can not add less to every noun, as that will give way to many words
> which are hardly ever used, some of which make no sense. I already have that
> problem with the possessive form being added to too many words.
>
>> Continuing on this topic, it is possible to add "ness" after "less" to make
>> an adjective into a noun in a sentence such as "Booklessness is common
>> among uneducated households." A simple rule that "ness can be placed after
>> less" does not work since "blessness" is not a word.
>
> You see, I can't even use a simple rule. Thus I make no attempt, as "less"
> and "ness" are not common enough to worry about.
>
>> Taking the noun-to-adjective and adjective-to-noun suffix rules to an
>> extreme however, seems to fall apart since "booklessnessless" seems fairly
>> useless. (and uselessness should definitely be a criteria for not adding
>> to the dictionary; that is to say that the dictionary selection criteria
>> should not be uselessnessless.)
>
> Again, you proved my point about adding suffixes indiscreetly based on
                                                   ^^ indiscriminately

Spell checkers are not perfect. ;)

> something such as part-of-speech.
>
>> On the flip side, I found at least one seemingly none-sense word with the
>> english-words.95 list - "windowless's." Adjectives should generally not
>> have possessives. After thinking about it, I decided that it in special
>> circumstances, an adjective can be turned into a noun and take a
>> possessive, such as "You have a choice between a window or a windowless
>> office, but the windowless's desk is much nicer." Should this be taken
>> further? Following up on the storm example above, "We were fortunate, so we
>> visited the phoneless's (or should it be "phonelesss' " since there are
>> several) homes to make sure that none of our neighbors had been hurt."
>
> Level 95 is known to contain garbage. I offer absolutely no guarantee that
> even 50% of the level 95 only words are valid English words. You are best to
> stick with level 80 and below.
>
> The most effort goes into level 60 as that is the level I use to create the
> English Dictionary for the purpose of spell checking.
>
>> In any case, I hope that this message is not entirely pointless. |
From: Geoff K. <ge...@cs...> - 2011-12-05 02:36:12
|
I'm with Kevin. Spelling word lists aren't ever going to be comprehensive, and attempting to make them so is an exercise in both frustration and error. ...not to mention the fact that essentially every spell-checker offers the option of a local word list, so if you have a habit of using "phoneless" it's easy to add it to your private list. Meanwhile, those of us who shudder at such infelicitous coinages can remain happily phoneful. -- Geoff Kuenning ge...@cs... http://www.cs.hmc.edu/~geoff/ A programmer who can't write readable prose is as incompetent as one who can't produce working code. |
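The local word list Geoff mentions can be sketched in a few lines of Python. This is only an illustration of the idea, not how Aspell, Hunspell, or any other real spell checker is implemented; the function names `load_words` and `misspelled` and the tiny sample lists are hypothetical.

```python
def load_words(lines):
    """Build a set of lowercase words from an iterable of lines, skipping blanks."""
    return {line.strip().lower() for line in lines if line.strip()}


def misspelled(text, main_words, personal_words=frozenset()):
    """Return the words in text found in neither the main nor the personal list."""
    known = main_words | personal_words
    # Very rough tokenizer (an assumption): split on spaces, trim punctuation.
    tokens = (t.strip('.,;:!?"()') for t in text.split())
    return [t for t in tokens if t and t.lower() not in known]


# Hypothetical miniature word lists, for illustration only.
main = load_words(["we", "were", "left", "after", "the", "storm"])
personal = load_words(["phoneless"])

sentence = "We were left phoneless after the storm."
print(misspelled(sentence, main))            # ['phoneless']
print(misspelled(sentence, main, personal))  # []
```

The main list stays small and conservative, while each user decides locally which rarer coinages to accept.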
From: Kevin A. <ke...@at...> - 2011-12-04 23:09:43
|
On Sun, 4 Dec 2011, Elephant Eight wrote: > I searched through all of the word lists in the SCOWL 7.1 package for > various *less words and although I found some, such as "bookless" I did > not find others such as "phoneless" or "telephoneless." Although > uncommon, it is perfectly good English to add "less" to any singular > common noun which makes it into an adjective meaning "without > something." As such, you can say "We were left phoneless after the > storm." or "Many homes before 1900 were still telephoneless." Like you said, it is uncommon; bookless is included in level 80 of scowl. Level 80 and above are likely to have lots of inconsistencies; this level also include lots of uncommon words. Also I can not add less to every noun, as that will give way to many words which are hardly ever used, some of which make no sense. I already have that problem with the possessive form being added to too many words. > Continuing on this topic, it is possible to add "ness" after "less" to > make an adjective into a noun in a sentence such as "Booklessness is > common among uneducated households." A simple rule that "ness can be > placed after less" does not work since "blessness" is not a word. You see, I can't even use a simple rule. Thus I make no attempt, as "less" and "ness" are not common enough to worry about. > Taking the noun-to-adjective and adjective-to-noun suffix rules to an > extreme however, seems to fall apart since "booklessnessless" seems > fairly useless. (and uselessness should definitely be a criteria for not > adding to the dictionary; that is to say that the dictionary selection > criteria should not be uselessnessless.) Again, you proved my point about adding suffixes indiscreetly based on something such as part-of-speech. > On the flip side, I found at least one seemingly none-sense word with > the english-words.95 list - "windowless's." Adjectives should generally > not have possessives. 
After thinking about it, I decided that it in > special circumstances, an adjective can be turned into a noun and take a > possessive, such as "You have a choice between a window or a windowless > office, but the windowless's desk is much nicer." Should this be taken > further? Following up on the storm example above, "We were fortunate, so > we visited the phoneless's (or should it be "phonelesss' " since there > are several) homes to make sure that none of our neighbors had > been hurt." Level 95 is known to contain garbage. I offer absolutely no guarantee that even 50% of the level 95 only words are valid English words. You are best to stick with level 80 and below. The most effort goes into level 60 as that is the level I use to create the English Dictionary for the purpose of spell checking. > In any case, I hope that this message is not entirely pointless. |
From: Elephant E. <ele...@ya...> - 2011-12-04 20:50:28
|
I was looking at the SCOWL word list available on wordlist.sourceforge.net, and I immediately found some inconsistencies especially when it comes to adding the "less" suffix for common English nouns. I searched through all of the word lists in the SCOWL 7.1 package for various *less words and although I found some, such as "bookless" I did not find others such as "phoneless" or "telephoneless." Although uncommon, it is perfectly good English to add "less" to any singular common noun which makes it into an adjective meaning "without something." As such, you can say "We were left phoneless after the storm." or "Many homes before 1900 were still telephoneless." Continuing on this topic, it is possible to add "ness" after "less" to make an adjective into a noun in a sentence such as "Booklessness is common among uneducated households." A simple rule that "ness can be placed after less" does not work since "blessness" is not a word. Taking the noun-to-adjective and adjective-to-noun suffix rules to an extreme however, seems to fall apart since "booklessnessless" seems fairly useless. (and uselessness should definitely be a criteria for not adding to the dictionary; that is to say that the dictionary selection criteria should not be uselessnessless.) On the flip side, I found at least one seemingly none-sense word with the english-words.95 list - "windowless's." Adjectives should generally not have possessives. After thinking about it, I decided that it in special circumstances, an adjective can be turned into a noun and take a possessive, such as "You have a choice between a window or a windowless office, but the windowless's desk is much nicer." Should this be taken further? Following up on the storm example above, "We were fortunate, so we visited the phoneless's (or should it be "phonelesss' " since there are several) homes to make sure that none of our neighbors had been hurt." In any case, I hope that this message is not entirely pointless. |
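The over-generation problem raised in this message can be made concrete with a small Python sketch. It is an illustration only, not anything SCOWL does; the function names and the sample word and noun sets are hypothetical.

```python
def naive_ness(words):
    """Naive rule: append "ness" to every word that ends in "less"."""
    return sorted(w + "ness" for w in words if w.endswith("less"))


def ness_from_known_nouns(words, nouns):
    """Stricter rule: only derive from words whose stem before "less" is a known noun."""
    return sorted(
        w + "ness" for w in words if w.endswith("less") and w[:-4] in nouns
    )


def chain(noun, steps):
    """Alternate the "less" and "ness" suffixes, showing how the rules compound."""
    out, word = [], noun
    for i in range(steps):
        word += "less" if i % 2 == 0 else "ness"
        out.append(word)
    return out


words = ["bookless", "bless", "useless"]
print(naive_ness(words))                              # ['blessness', 'booklessness', 'uselessness']
print(ness_from_known_nouns(words, {"book", "use"}))  # ['booklessness', 'uselessness']
print(chain("book", 3))                               # ['bookless', 'booklessness', 'booklessnessless']
```

Checking the stem removes "blessness", but nothing in the rule itself stops the chain at a sensible depth, so a purely rule-based expansion still needs a usage-based cutoff.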
From: Kevin A. <ke...@gn...> - 2011-01-06 18:08:10
|
I am pleased to announce that new versions of SCOWL, the corresponding Aspell and Hunspell dictionaries, and VarCon are now available. These are bug-fix releases.

Get SCOWL, Hunspell dictionaries, and VarCon at: http://wordlist.sourceforge.net/
Get the updated Aspell dictionary at: http://ftp.gnu.org/gnu/aspell/dict/en/

Spell Checking Oriented Word Lists (SCOWL)
Revision 7.1 (SVN Revision 162)
January 6, 2011
by Kevin Atkinson (ke...@gn...)

SCOWL (Spell Checker Oriented Word Lists) is a collection of word lists split up in various sizes, and other categories, intended to be suitable for use in spell checkers. However, I am sure it will have numerous other uses as well.

CHANGES:

From Revision 7 to 7.1 (January 6, 2011)

- Updated to revision 5.1 of VarCon, which corrected several errors.
- Fixed various problems with the variant processing, which corrected a few more errors.
- Added several now-common proper names and some other words now in common use.
- Included the misc/ and speller/ directories, which were in SVN but left out of the release tarball.
- Other minor fixes, including some fixes to the taboo word lists.

From Revision 6 to 7 (December 27, 2010)

- Updated to revision 5.0 of VarCon, which corrected many errors, especially in the British and Canadian spelling categories. Also added new spelling categories for the British and Canadian spelling variants and separated them out from the main variant_* categories.
- Moved the Moby names lists (3897male.nam 4946fema.len 21986na.mes) to level 95, since they contain too many errors and rare names.
- Moved frequency class 0 from Brian Kelk's Wordlist from level 60 to 70, and also filtered it with level 80, due to too many misspellings.
- Many other minor fixes.

Variant Conversion Info (VARCON)
Revision 5.1 (SVN Revision 161)
January 6, 2011
by Kevin Atkinson (ke...@gn...)

This package contains information to convert between American, British, and Canadian spellings and vocabulary, as well as other variant information.

CHANGES:

From Revision 5.0 to Revision 5.1 (January 6, 2011)

- Corrected numerous errors after running various forms of verification on varcon.txt.
- Reordered the clusters in varcon.txt so that they are mostly in alphabetic order based on the headword.

From Revision 4.1 to Revision 5.0 (December 27, 2010)

- Completely new format for the main table which, in addition to providing the preferred spelling of a word for various forms of English, also records variant and other information. To reflect this change, the file was renamed from abbc.tab to varcon.txt.
- Massive effort to verify the variant information against authoritative sources (mainly Oxford dictionaries). Most entries for common words (SCOWL level 35 and below) have been checked against at least a British and Canadian dictionary.
- Added variant information for numerous other words, even when there is no difference between the various forms of English.
- Other changes corresponding to the new format. |
From: Kevin A. <ke...@gn...> - 2010-12-28 18:05:02
|
I am pleased to announce that new versions of SCOWL, the corresponding Aspell and Hunspell dictionaries, and VarCon are now available. The biggest changes are greatly improved Canadian and British wordlists due to a massive effort to verify the variant information in VarCon against authoritative sources. The American wordlist has also been improved.

Get SCOWL, Hunspell dictionaries, and VarCon at: http://wordlist.sourceforge.net/
Get the updated Aspell dictionary at: http://ftp.gnu.org/gnu/aspell/dict/en/

Spell Checking Oriented Word Lists (SCOWL)
Revision 7.0 (SVN Revision 127)
December 27, 2010
by Kevin Atkinson (ke...@gn...)

SCOWL (Spell Checker Oriented Word Lists) is a collection of word lists split up in various sizes, and other categories, intended to be suitable for use in spell checkers. However, I am sure it will have numerous other uses as well.

CHANGES:

From Revision 6 to 7 (December 27, 2010)

- Updated to revision 5.0 of VarCon, which corrected many errors, especially in the British and Canadian spelling categories. Also added new spelling categories for the British and Canadian spelling variants and separated them out from the main variant_* categories.
- Moved the Moby names lists (3897male.nam 4946fema.len 21986na.mes) to level 95, since they contain too many errors and rare names.
- Moved frequency class 0 from Brian Kelk's Wordlist from level 60 to 70, and also filtered it with level 80, due to too many misspellings.
- Many other minor fixes.

Variant Conversion Info (VARCON)
Revision 5.0 (SVN Revision 124)
December 27, 2010
by Kevin Atkinson (ke...@gn...)

This package contains information to convert between American, British, and Canadian spellings and vocabulary, as well as other variant information.

CHANGES:

From Revision 4.1 to Revision 5.0 (December 27, 2010)

- Completely new format for the main table which, in addition to providing the preferred spelling of a word for various forms of English, also records variant and other information. To reflect this change, the file was renamed from abbc.tab to varcon.txt.
- Massive effort to verify the variant information against authoritative sources (mainly Oxford dictionaries). Most entries for common words (SCOWL level 35 and below) have been checked against at least a British and Canadian dictionary.
- Added variant information for numerous other words, even when there is no difference between the various forms of English.
- Other changes corresponding to the new format. |
From: wael m. <zez...@gm...> - 2010-04-09 22:52:47
|
-- zezo |
From: Kevin A. <ke...@at...> - 2009-11-28 04:51:52
|
On Fri, 27 Nov 2009, Kelly Jones wrote: > Is there any way I can get definitions for the words in the scowl word list? > > I want to add these words to wiktionary, but can't do this without definitions. What makes you think I have the definitions? |
From: Kelly J. <kel...@gm...> - 2009-11-28 03:57:43
|
Is there any way I can get definitions for the words in the scowl word list? I want to add these words to wiktionary, but can't do this without definitions. -- We're just a Bunch Of Regular Guys, a collective group that's trying to understand and assimilate technology. We feel that resistance to new ideas and technology is unwise and ultimately futile. |
From: Kevin A. <ke...@gn...> - 2009-02-25 06:06:51
|
First off, a disclaimer: I am not a professional writer, and I tend to make many mistakes; I wrote a spell checker (not Hunspell) because of that. So please excuse any mistakes I may make in any of my emails.

On Tue, 24 Feb 2009, D Dibble wrote:

> At this time I do not think any one really maintains the American
> English dictionary for OO. The subject was brought up a while ago. I
> created a new official dictionary from the SVN version of SCOWL which
> you can find at wordlist.sourceforge.net. The Hunspell maintainer made
> some enhancements and later posted his version. I think something was
> uploaded for the next version of OO, but I lost track.
> ________________________________________
>
> A copy of your e-mail was sent to me by wordlist-devel, and I wanted to clarify.

By "no one" I mean that no one, as far as I know, has taken responsibility for keeping the official Open Office en_US dictionary up to date. I currently maintain the upstream version of the en_US dictionary which is used by Open Office and Mozilla.

> I submitted a replacement U.S. English dictionary on September 8, 2008,
> and later was told to join wordlist-devel so that the word list would be
> available to AbiWord, Mozilla, etc. I have been working on the
> dictionary since 1993. It is about the same size as the existing OOo
> dictionary, but checking with Hunspell shows there is more than a 20,000
> word difference between mine and the existing dictionary. Please see
> Open Office issue 92383 for a discussion of the word errors (and how
> many of them stem from Microsoft Word), my aims in compiling an accurate
> and current word list, and something about my background and
> qualifications. All the words I submitted were checked against
> dictionaries, usually against online sources such as
> http://dictionary.reference.com and http://www.merriam-webster.com.
> Doing a check with copy and paste insures that no typos were introduced
> during the proofing process.

URL to the issue (for the benefit of others): http://qa.openoffice.org/issues/show_bug.cgi?id=92383

As the original author of the English wordlist, I can tell you that the wordlist has nothing to do with Microsoft Word. It was created from an old version of SCOWL, most likely version 4. I have made many improvements to the dictionary since then and corrected many of the errors you noticed, but perhaps not all. More below.

> I removed the offensive racial
> epithets in the MS Word dictionary, which were also in the OOo
> dictionary, and hoped that this decision would be approved, but some
> people may feel differently.

I for one strongly disagree. I reluctantly agreed to mark the most offensive of the terms as NOSUGGEST.

> I believe something of the dictionary will be introduced in Open Office 3.1.

I really can't speak for that, but it is unlikely that your dictionary will be used over mine. That doesn't mean that your contribution has to go to waste: I could really use your expertise in improving SCOWL (and thus the en_US word list). So rather than simply presenting a new wordlist, it would be infinitely more helpful to provide a list of corrections. However, before you do that, here are some comments on the corrections/enhancements you mention in the bug report.

"alright" - has to stay for a general-purpose wordlist; many consider its usage acceptable.

"airbag" (and other compound words) - these are tricky; some may consider "airbag" correct. In fact, it is in some dictionaries. Still, I would like a list of compound words you think need to be two words or separated with a hyphen.

"sanserif" - that is an error; it comes from Brian Kelk's "UK English Wordlist with Frequency Classification". I should perhaps reconsider using it.

"antisemitic" - again tricky, see the compound note above.

"Lichtenstein" - I would probably agree with you, but I need a complete list of them.

Variants - I mostly agree with you. For the most part I try to include only one spelling for any word, but I make exceptions; for example, I include both "OK" and "okay". However, my source of variant information (VarCon) is incomplete, so some variants slip by. I am working to improve this now, and will gladly make use of any information you can provide me.

Major metropolitan centers in the rest of the world - I will gladly include them, provided the list is not too large. However, keep in mind that this is an American English dictionary, so it focuses on English as spoken in the United States; there are separate word lists for other variants of English.

Accents - I have accent information on many words, but I remove the accents as most people don't use the accented forms of words in common writing (email, web, etc.). But I agree with you that in professional writing the correct accented form is generally preferable.

"corespondent" - it is a valid word that is in most dictionaries, but I do not have a strong opinion on it. I would really like a complete list of words you feel fall into this category.

Possessive forms - I agree that too many words have possessive forms. The problem is that there is no good source of words for which a possessive form is appropriate, so I add possessive forms to most nouns. I would gladly make use of more precise information.

Finally, before you send me a list of corrections, you should compare your list against the new en_US dictionary available at wordlist.sourceforge.net (http://downloads.sourceforge.net/wordlist/hunspell-en_US-20081205.zip) or, even better, the SVN version of SCOWL (words up to level 60 are included in en_US). You can either send the corrections to this list or, preferably, file bug reports at https://sourceforge.net/tracker/?group_id=10079&atid=1014602

Thanks in advance for any help you can give me for improving SCOWL. |
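The comparison suggested at the end of this message could start with something like the Python sketch below. It assumes the plain Hunspell `.dic` layout (an approximate entry count on the first line, then one `stem/FLAGS` entry per line) and compares stems only; it does not expand affix rules from the `.aff` file, so it cannot reproduce a figure such as the 20,000-word difference mentioned above. The function names and the sample entries are hypothetical.

```python
def read_dic_stems(lines):
    """Extract stems from Hunspell .dic lines.

    Skips the leading count line and drops the '/FLAGS' suffix.
    Assumption: no morphological fields and no escaped slashes in stems.
    """
    stems = set()
    for i, raw in enumerate(lines):
        line = raw.strip()
        if not line or (i == 0 and line.isdigit()):
            continue
        stems.add(line.split("/", 1)[0])
    return stems


def compare(mine, theirs):
    """Summarize which stems appear in only one of the two dictionaries."""
    return {
        "only_mine": sorted(mine - theirs),
        "only_theirs": sorted(theirs - mine),
        "shared": len(mine & theirs),
    }


# Hypothetical miniature dictionaries, for illustration only.
mine = read_dic_stems(["3", "airbag/S", "alright", "sanserif"])
theirs = read_dic_stems(["2", "airbag/MS", "okay"])
print(compare(mine, theirs))
# {'only_mine': ['alright', 'sanserif'], 'only_theirs': ['okay'], 'shared': 1}
```

With real files, the lines would come from opening each `.dic` file with the encoding declared in the matching `.aff` file.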
From: D D. <dib...@sb...> - 2009-02-24 17:05:21
|
--- On Tue, 2/24/09, Kevin Atkinson <ke...@gn...> wrote: At this time I do not think any one really maintains the American English dictionary for OO. The subject was brought up a while ago. I created a new official dictionary from the SVN version of SCOWL which you can find at wordlist.sourceforge.net. The Hunspell maintainer made some enhancements and later posted his version. I think something was uploaded for the next version of OO, but I lost track. ________________________________________ A copy of your e-mail was sent to me by wordlist-devel, and I wanted to clarify. I submitted a replacement U.S. English dictionary on September 8, 2008, and later was told to join wordlist-devel so that the word list would be available to AbiWord, Mozilla, etc. I have been working on the dictionary since 1993. It is about the same size as the existing OOo dictionary, but checking with Hunspell shows there is more than a 20,000 word difference between mine and the existing dictionary. Please see Open Office issue 92383 for a discussion of the word errors (and how many of them stem from Microsoft Word), my aims in compiling an accurate and current word list, and something about my background and qualifications. All the words I submitted were checked against dictionaries, usually against online sources such as http://dictionary.reference.com and http://www.merriam-webster.com. Doing a check with copy and paste insures that no typos were introduced during the proofing process. I haven't really had any feedback. I removed the offensive racial epithets in the MS Word dictionary, which were also in the OOo dictionary, and hoped that this decision would be approved, but some people may feel differently. I believe something of the dictionary will be introduced in Open Office 3.1. David M. Dibble |
From: Kevin A. <ke...@gn...> - 2009-02-24 02:34:20
|
On Mon, 23 Feb 2009, Kevin Atkinson wrote: > On Tue, 24 Feb 2009, Michael van den Berg wrote: > >> Thank you for your reply! And thanks also for the explanation in which cases >> you will and in which cases not add any words from the list. I think those >> reasons are only very fair! >> >> Before I burden you with my list, I would probably do the best to update my >> current list. Do you know whether OOo2.4 has the most current list? > > Most likely not. > >> Or should >> I ask this in the OOo forum? I believe that this worldlist-devel are in fact >> a separate project from whom any application can borrow. Who maintains the >> spell checker and content for OOo? > > At this time I do not think any one really maintains the American English > dictionary for OO. The subject was brought up a while ago. I created a > new official dictionary from the SVN version of SCOWL which you can find at > wordlist.sourceforge.net. The Hunspell maintainer made some enhancements > and later posted his version. I think something was uploaded for the next > version of OO, but I lost track. > > Using the one at wordlist.sourceforge.net should be sufficient. > >> For my information: is the most up-to-date list from SCOWL from 2004? This is >> what I seemed to see in the README. I like the examples in the README; I >> believe that I sometimes use inflection that only would appear in the 70 or >> 80 level of the list, like holism (from holistic). > > Yes that was the last released version. However, the SVN version contains > many changes since then. If you have a GNU/Linux box you should be able > to build SCOWL from SVN. Not so sure about non GNU/Linux systems. BTW: "holism" is in level 55 of the current SVN version. > >> Michael van den Berg >> Canada > > I just noticed you are from Canada. The Canadian dictionary is in serious > need of help. The version at wordlist.sourceforge.net should be a major > improvement, but it still needs help. 
If you are interested in a better > Canadian dictionary (as appose to American) let me know and I can explain > further. > > >> Cite Kevin Atkinson <ke...@gn...> >> Sent ma 23 feb 2009 20:06:48 CET >> SubjectRe: [Wordlist-devel] contribute with wordlist >> --------------- >> >>> On Mon, 23 Feb 2009, Michael van den Berg wrote: >>> >>>> Writing quite some papers for university, I have collected a list of >>>> words not in the word list of OOo2.4, EN-US. >>>> >>>> How can I contribute? >>>> Is there a way that everyone can contribute? >>>> I have asked already asked on several forums (like Ubuntu forums and >>>> OOo forum) >>> >>> Your best bet is to submit it on the Issue Tracker at >>> https://sourceforge.net/tracker/?group_id=10079&atid=1014602 >>> >>> Please note, however, that I will not simply add everyword. I will cross >>> reference the list with the SVN version of SCOWL. The en_US wordlist is >>> created from SCOWL using word up to size (or level) 60. Since the >>> OpenOffice dictionary you are using is likely very old and contains many >>> errors, so it could be some of the words are already included. For other >>> words it could be that the word is in SCOWL but at a higher size. I will >>> likely reject those also since I want to keep the dictionary from getting >>> too large. Finally it could be that the word is an variant spelling which >>> I chose not to include. I might consider adding the remaining words if I >>> recognize them and think they are common enough to warrant inclusion. >>> >>> So please submit your wordlist, but do not be offended if I reject the >>> majority of them. I will still carefully look over the words (assuming it >>> not too large), even if I said I would likely reject them, in order to find >>> possible bugs in the creation process. 
|
From: Kevin A. <ke...@gn...> - 2009-02-24 02:32:21
|
On Tue, 24 Feb 2009, Michael van den Berg wrote:

> Thank you for your reply! And thanks also for the explanation in which cases
> you will and in which cases not add any words from the list. I think those
> reasons are only very fair!
>
> Before I burden you with my list, I would probably do the best to update my
> current list. Do you know whether OOo2.4 has the most current list?

Most likely not.

> Or should
> I ask this in the OOo forum? I believe that this worldlist-devel are in fact
> a separate project from whom any application can borrow. Who maintains the
> spell checker and content for OOo?

At this time I do not think anyone really maintains the American English dictionary for OO. The subject was brought up a while ago. I created a new official dictionary from the SVN version of SCOWL, which you can find at wordlist.sourceforge.net. The Hunspell maintainer made some enhancements and later posted his version. I think something was uploaded for the next version of OO, but I lost track.

Using the one at wordlist.sourceforge.net should be sufficient.

> For my information: is the most up-to-date list from SCOWL from 2004? This is
> what I seemed to see in the README. I like the examples in the README; I
> believe that I sometimes use inflection that only would appear in the 70 or
> 80 level of the list, like holism (from holistic).

Yes, that was the last released version. However, the SVN version contains many changes since then. If you have a GNU/Linux box you should be able to build SCOWL from SVN. I am not so sure about non-GNU/Linux systems.

> Michael van den Berg
> Canada

I just noticed you are from Canada. The Canadian dictionary is in serious need of help. The version at wordlist.sourceforge.net should be a major improvement, but it still needs work. If you are interested in a better Canadian dictionary (as opposed to American), let me know and I can explain further.

> Cite Kevin Atkinson <ke...@gn...>
> Sent ma 23 feb 2009 20:06:48 CET
> Subject Re: [Wordlist-devel] contribute with wordlist
> ---------------
>
>> On Mon, 23 Feb 2009, Michael van den Berg wrote:
>>
>>> Writing quite some papers for university, I have collected a list of
>>> words not in the word list of OOo2.4, EN-US.
>>>
>>> How can I contribute?
>>> Is there a way that everyone can contribute?
>>> I have asked already asked on several forums (like Ubuntu forums and
>>> OOo forum)
>>
>> Your best bet is to submit it on the Issue Tracker at
>> https://sourceforge.net/tracker/?group_id=10079&atid=1014602
>>
>> Please note, however, that I will not simply add everyword. I will cross
>> reference the list with the SVN version of SCOWL. The en_US wordlist is
>> created from SCOWL using word up to size (or level) 60. Since the
>> OpenOffice dictionary you are using is likely very old and contains many
>> errors, so it could be some of the words are already included. For other
>> words it could be that the word is in SCOWL but at a higher size. I will
>> likely reject those also since I want to keep the dictionary from getting
>> too large. Finally it could be that the word is an variant spelling which
>> I chose not to include. I might consider adding the remaining words if I
>> recognize them and think they are common enough to warrant inclusion.
>>
>> So please submit your wordlist, but do not be offended if I reject the
>> majority of them. I will still carefully look over the words (assuming it
>> not too large), even if I said I would likely reject them, in order to find
>> possible bugs in the creation process.
>
> |
From: Michael v. d. B. <sp...@zo...> - 2009-02-24 00:45:40
|
Thank you for your reply! And thanks also for the explanation in which cases you will and in which cases not add any words from the list. I think those reasons are only very fair! Before I burden you with my list, I would probably do the best to update my current list. Do you know whether OOo2.4 has the most current list? Or should I ask this in the OOo forum? I believe that this worldlist-devel are in fact a separate project from whom any application can borrow. Who maintains the spell checker and content for OOo? For my information: is the most up-to-date list from SCOWL from 2004? This is what I seemed to see in the README. I like the examples in the README; I believe that I sometimes use inflection that only would appear in the 70 or 80 level of the list, like holism (from holistic). Thanks! Michael van den Berg Canada -------------- Cite Kevin Atkinson <ke...@gn...> Sent ma 23 feb 2009 20:06:48 CET SubjectRe: [Wordlist-devel] contribute with wordlist --------------- > On Mon, 23 Feb 2009, Michael van den Berg wrote: > >> Writing quite some papers for university, I have collected a list of >> words not in the word list of OOo2.4, EN-US. >> >> How can I contribute? >> Is there a way that everyone can contribute? >> I have asked already asked on several forums (like Ubuntu forums and >> OOo forum) > > Your best bet is to submit it on the Issue Tracker at > https://sourceforge.net/tracker/?group_id=10079&atid=1014602 > > Please note, however, that I will not simply add everyword. I will > cross reference the list with the SVN version of SCOWL. The en_US > wordlist is created from SCOWL using word up to size (or level) 60. > Since the OpenOffice dictionary you are using is likely very old and > contains many errors, so it could be some of the words are already > included. For other words it could be that the word is in SCOWL but > at a higher size. I will likely reject those also since I want to > keep the dictionary from getting too large. 
Finally it could be that > the word is an variant spelling which I chose not to include. I > might consider adding the remaining words if I recognize them and > think they are common enough to warrant inclusion. > > So please submit your wordlist, but do not be offended if I reject > the majority of them. I will still carefully look over the words > (assuming it not too large), even if I said I would likely reject > them, in order to find possible bugs in the creation process. |
From: Kevin A. <ke...@gn...> - 2009-02-23 19:31:08
|
On Mon, 23 Feb 2009, Michael van den Berg wrote:

> Writing quite some papers for university, I have collected a list of
> words not in the word list of OOo2.4, EN-US.
>
> How can I contribute?
> Is there a way that everyone can contribute?
> I have asked already asked on several forums (like Ubuntu forums and
> OOo forum)

Your best bet is to submit it on the Issue Tracker at https://sourceforge.net/tracker/?group_id=10079&atid=1014602

Please note, however, that I will not simply add every word. I will cross-reference the list with the SVN version of SCOWL. The en_US wordlist is created from SCOWL using words up to size (or level) 60. Since the OpenOffice dictionary you are using is likely very old and contains many errors, it could be that some of the words are already included. For other words it could be that the word is in SCOWL but at a higher size. I will likely reject those also, since I want to keep the dictionary from getting too large. Finally, it could be that the word is a variant spelling which I chose not to include. I might consider adding the remaining words if I recognize them and think they are common enough to warrant inclusion.

So please submit your wordlist, but do not be offended if I reject the majority of them. I will still carefully look over the words (assuming the list is not too large), even those I said I would likely reject, in order to find possible bugs in the creation process. |
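The triage described in this reply can be sketched in Python. This is a hypothetical illustration, not the actual SCOWL tooling: the `triage` function, the `levels` mapping (word to the lowest SCOWL level that contains it), and the `variants` set are all assumed names. The level values for "holism" (55) and "bookless" (80) are the ones mentioned elsewhere in this archive; treating "colour" as an excluded variant is an assumption made for the example.

```python
def triage(submitted, levels, variants=frozenset(), cutoff=60):
    """Sort submitted words into the four outcomes described in the reply.

    levels   -- dict mapping a word to the lowest SCOWL level containing it
    variants -- words known to be variant spellings that were left out
    cutoff   -- highest level that goes into the en_US word list
    """
    result = {
        "already_included": [],
        "higher_level": [],
        "variant": [],
        "needs_review": [],
    }
    for word in submitted:
        level = levels.get(word)
        if level is not None and level <= cutoff:
            result["already_included"].append(word)
        elif level is not None:
            result["higher_level"].append(word)
        elif word in variants:
            result["variant"].append(word)
        else:
            result["needs_review"].append(word)
    return result


# Hypothetical data, for illustration only.
levels = {"holism": 55, "bookless": 80}
variants = {"colour"}
report = triage(["holism", "bookless", "colour", "phoneless"], levels, variants)
print(report)
```

Only the words that land in `needs_review` would need a human judgment about whether they are common enough to add.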
From: Michael v. d. B. <sp...@zo...> - 2009-02-23 16:54:43
|
Writing quite some papers for university, I have collected a list of words not in the word list of OOo2.4, EN-US. How can I contribute? Is there a way that everyone can contribute? I have asked already asked on several forums (like Ubuntu forums and OOo forum) Michael van den Berg Canada |
From: Kevin A. <ke...@gn...> - 2007-11-23 09:17:49
|
... |