Yeah, sorry guys, I've been a bit slack with that.
I should have some free time this weekend. I've got an implementation
that has most of the work done, and supports Unicode 5.0.
Done:
* category
* combining
* decimal
* decomposition
* digit
* mirrored
* name
* numeric
* unidata_version
Outstanding:
* east_asian_width
* lookup
* normalize
* ucd_4_1_0 ?
* ucd_3_2_0 ?
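For reference, here is a rough sketch of what the functions above return in CPython's own unicodedata module, which is the behaviour a Jython port would presumably need to match. It assumes a recent CPython 3 interpreter (not the 2.x releases discussed in this thread), and the sample characters are just ones I picked for illustration.

```python
import unicodedata

E_ACUTE = "\u00e9"          # LATIN SMALL LETTER E WITH ACUTE
COMBINING_ACUTE = "\u0301"  # COMBINING ACUTE ACCENT
ONE_HALF = "\u00bd"         # VULGAR FRACTION ONE HALF

# One sample call per function name in the lists above.
checks = {
    "category": unicodedata.category("A"),
    "combining": unicodedata.combining(COMBINING_ACUTE),
    "decimal": unicodedata.decimal("9"),
    "decomposition": unicodedata.decomposition(E_ACUTE),
    "digit": unicodedata.digit("9"),
    "mirrored": unicodedata.mirrored("("),
    "name": unicodedata.name("A"),
    "numeric": unicodedata.numeric(ONE_HALF),
    "east_asian_width": unicodedata.east_asian_width("A"),
    "lookup": unicodedata.lookup("LATIN SMALL LETTER A"),
    "normalize": unicodedata.normalize("NFC", "e" + COMBINING_ACUTE),
    # Depends on the interpreter's bundled Unicode database.
    "unidata_version": unicodedata.unidata_version,
}

for func_name, value in checks.items():
    print(func_name, repr(value))
```

The values other than unidata_version should be stable across Unicode versions for these particular characters, so they could serve as a small smoke test for a port.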
I can make the code available - I've got some space and should be able
to expose it via hg or bzr (I think), or just send you a .tgz if
you're interested. But first I need to pull in the updates from main
trunk - I've not got the latest build changes in, so I'll have a small
merge to do first.
There are some open questions, such as whether we want to provide
implementations that support Unicode 4.1 (as per CPython 2.5) and
Unicode 3.2 (as per CPython 2.3). I think that the worth of such
implementations is questionable.
Performance can probably be improved - for my first draft I went for a
simple brute force approach, but a smaller table lookup would be
possible, a la Xerces XMLChar or XOM Verifier.
Cheers,
James
On 22/11/2007, Philip Jenvey <pjenvey@...> wrote:
>
> On Nov 22, 2007, at 2:57 AM, Mehendran T wrote:
>
> > Hi,
> >
> > I have started to implement the unicodedata module, and I have just
> > done some investigation on that.
> >
> > In Python, it is written in C. It has a unicodename_db.h which
> > contains the entire Unicode info, which was taken from the
> > UnicodeData.txt at
> > ftp://ftp.unicode.org/Public/3.0-Update/UnicodeData-3.0.0.txt.
> > unicodename_db.h is a file generated by Tools/makeunicodedata.py.
> >
> > In Java, I have seen two classes, Character and a static class
> > Character.UnicodeBlock, where we can get the Unicode values. But I
> > think those may not be enough to write the module properly.
> >
> > Please tell me how to proceed with implementing the unicodedata
> > module.
> >
> > My question is:
> > Shall I follow the same design as CPython does, or
> > does Java itself support mapping Unicode data values? If so, how
> > can I make use of that?
> >
>
> James Abley has actually already started working on a unicodedata
> module -- I know this because he's blogged about his progress a few
> times.
>
> Though I haven't seen a blog entry about it in a little while. You
> should shoot him an email and maybe you two can collaborate on
> finishing it up.
>
> The jython section of his blog is here:
>
> http://eternusuk.blogspot.com/search/label/jython
>
> --
> Philip Jenvey
>
> _______________________________________________
> Jython-dev mailing list
> Jython-dev@...
> https://lists.sourceforge.net/lists/listinfo/jython-dev
>