I sent a few other mails earlier, when I started to work on a project with
Now I am at the point where I could release the first data. However, the big
problem is: which encoding do I use?
Here I use Unicode, or to be more precise UTF-8, which encodes a 16-bit Unicode
character in 1 to 3 bytes depending on the character code (ASCII 0 .. 127 stays
single-byte ASCII; only characters beyond the 16-bit range need more bytes), as
it should be supported by the dict server protocol. I chose this format so that
all my wordlists have the same encoding, no matter whether they are for European
languages (English, French, German, ...) or Greek, Russian, Chinese.
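
To illustrate the variable byte length, here is a minimal Python sketch. The
sample characters are only ones I picked for illustration, not entries from my
actual wordlists, and utf8_length is just a helper name I made up.

```python
# Hypothetical sample characters: Latin, Latin with umlaut, Greek,
# Cyrillic and Chinese. They are not taken from the real wordlists.
samples = ["a", "ä", "λ", "я", "中"]


def utf8_length(ch):
    """Return how many bytes UTF-8 needs to encode the string ch."""
    return len(ch.encode("utf-8"))


for ch in samples:
    # Print the code point in hex next to its encoded length.
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
```

Plain ASCII stays at one byte, the accented, Greek and Cyrillic letters take
two, and the Chinese character takes three.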
As far as I understand it, there should be no problems using this protocol.
Is this right? Are there any problems with regular expressions or the like
that I should warn users about?
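
The kind of regular-expression problem I have in mind can be sketched in Python
as follows. This is only an illustration with a made-up entry; I do not know
whether any dict server or client actually matches on raw bytes like this.

```python
import re

word = "Käse"               # hypothetical wordlist entry
raw = word.encode("utf-8")  # the same entry as UTF-8 bytes

# On decoded text, "." matches exactly one character, so this matches.
text_match = re.fullmatch(r"K.se", word)

# On raw bytes, "." matches exactly one byte. "ä" is two bytes in UTF-8,
# so the same pattern no longer matches the entry.
byte_match = re.fullmatch(rb"K.se", raw)

# Two dots are needed to cover the two bytes of "ä".
byte_match_two = re.fullmatch(rb"K..se", raw)

print(text_match is not None, byte_match is not None,
      byte_match_two is not None)
```

So a tool that treats the data as plain bytes would need a different pattern
than one that decodes UTF-8 first.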