Hi, I'll first introduce myself.
I am RJ (Ruud) Baars, part of the OpenTaal initiative (www.opentaal.org),
which is 'opening Dutch'.
We managed to collect lots of correct Dutch words, create a word list,
get official approval for it, and produce a hunspell dictionary
which works rather well.
Now we want to introduce compounding, since Dutch is very flexible in
Since the correctness of a compound is fairly regular given the word
type/classification, the most useful way to do it seems to be
using COMPOUNDRULE. But then, there is a bit of a limitation with
affixes and suffixes.
For a word like 'man' (UK: man) there are the forms: mannen (UK: men),
mannetje (little man), mannetjes (little men).
mannenauto (UK: men's car) is correct, mannetjeauto is NOT correct,
mannetjesauto IS correct.
Is there a way to do this using suffixes, and will the word compounded
with also support suffixes then?
Another issue is the abbreviation-words, like AA (pronounced as 2
separate letters). In Dutch:
AA-club is OK
AAclub is not
shit-AA is OK
shit-aa is NOT.
shitaa is wrong too
In short: case should be kept both in compounds and as a separate word, and
the dash is required in any compound.
(To make it more complex, there are words like pc, which should be kept
in lowercase in compounds, except when at the start of a sentence, where it
will change into Pc-.)
I have not been able to produce a dic and aff file that realises this.
Is it possible?
Maybe you can give me a hint on how to do it?
just an answer to your first question:
On Mon, Aug 27, 2007 at 05:16:15PM +0200, r.j.baars wrote:
> For a word like 'man' (UK: man) there are the forms: mannen (UK: men),
> mannetje (little man), mannetjes (little men).
> mannenauto (UK: men's car) is correct, mannetjeauto is NOT correct,
> mannetjesauto IS correct.
> Is there a way to do this using suffixes, and will the word compounded
> with also support suffixes then?
Try adding the COMPOUNDPERMITFLAG to the suffix rule, like:
SFX x 0 netjesauto/c n
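Spelled out a bit more, a minimal sketch of an aff file could look like the
fragment below. The flag letters (P, D, c, X), the use of a plain COMPOUNDFLAG,
and the split into a plural and a diminutive suffix class are assumptions made
for illustration, not taken from the OpenTaal files, and the fragment is
untested.

```
# Sketch only; all flag letters here are illustrative assumptions.
COMPOUNDFLAG X
COMPOUNDPERMITFLAG c

# Plural: man -> mannen, permitted inside compounds via /c.
SFX P Y 1
SFX P 0 nen/c n

# Diminutive: only the -s form carries the permit flag.
SFX D Y 2
SFX D 0 netje n
SFX D 0 netjes/c n
```

With dic entries such as man/PDX and auto/X, the intent is that mannenauto and
mannetjesauto are accepted while mannetjeauto is rejected, because a suffix
inside a compound is only allowed when it carries the COMPOUNDPERMITFLAG. A
plain COMPOUNDFLAG would also accept the bare stem in compounds (manauto), so
the real word classes will need something stricter than this.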
With rules similar to this, man + netjesauto would also be allowed to be
suffix-expanded in the middle of compound words. You would have to test
if this also works with the COMPOUNDRULE mechanism. I use the
COMPOUNDPERMITFLAG with COMPOUNDBEGIN/MIDDLE/END and there it works just
fine. If it does not work with the COMPOUNDRULE mechanism, please report