From: Alex R. <sh...@al...> - 2004-08-11 03:52:39
Greg,

On Tue, Aug 10, 2004 at 07:02:39PM -0700, Greg Kuperberg wrote:
> On Tue, Aug 10, 2004 at 08:12:39PM -0500, Alex Roitman wrote:
> > Please do. I'd be curious to see how this is done. Can it also take
> > care of cyrillic, chinese, and other non-latin charsets?
>
> I will send it in the next message. You could easily extend the filter
> to Cyrillic, although it is intended more strictly for de-accenting
> than for transliteration. Transliterating Hebrew is pretty much
> hopeless because the vowels are missing. Japanese is even worse, and I
> suspect that Chinese and Arabic would also be hard.
>
> [attachment]

Thanks for the code. I guess I was not careful in reading your previous
message :-) I can see that there are dictionaries mapping the unicode
characters to either de-accented characters or the TeX commands for
those characters.

Now, trying to think where exactly we can use these tools (please point
me in the right direction if I'm missing the obvious):

1. We don't really want to de-accent letters anywhere, do we? It seems
   innocuous for a mostly latin text to have an occasional "apres"
   instead of "après", but it would be totally wrong for French text.
   Even worse, in some languages the presence/absence of an umlaut can
   completely change pronunciation and/or meaning. It seems that if the
   user entered the non-ascii data then it should be preserved as such,
   in both screen output and reports.

   This IMHO goes for all report formats, including plain text and PDF.
   Now, there's a problem with the reportlab-generated PDFs (the PDF
   format option) when the text is not in iso-8859-1. We even
   contemplated removing this format in favor of the gnomeprint-based
   one, as it also has some other drawbacks. But the iso-8859-1 users
   (who make heavy use of accents, cedillas, and umlauts, btw) wanted to
   keep an option for lean PDFs. These PDFs use standard PS fonts, so no
   font information has to be embedded in the file, but this only
   supports the iso-8859-1 charset.
2. As for the TeX commands, they would be suitable for the LaTeX output
   format -- except that we are using the utf8 package shipped with
   teTeX, which can do that for us :-)

3. Back to the first point. The gnome-print plugin is really the proper
   way for the majority of the users. The more advanced people can live
   just fine with LaTeX. The persistent ones can live with OOo and
   export into PDF from within OOo. But gnome-print is integrated with
   the rest of the desktop, can generate a nice preview, and supports
   unicode without any effort on our part. All we need to do is to use
   fonts which cover the characters found in the text. The freefont
   package covers most of the UCS, is freely available, and is easy to
   install. I don't really see a problem in telling the users "install
   package X to have feature Y" if X is available. On Debian,
   ttf-freefont is in the Recommends field (or should be, anyway :-).
   We might provide a better message -- e.g. a dialog instead of
   console output.

Is there any good use case for de-accenting non-ascii letters? I have
to confess that I myself do not use non-ascii and am likely unaware of
some subtleties. I'd be happy to learn :-)

> I don't know if you feel loyal to conventional graph theory terms;
> if you do, "connected components" is a better term than "partition".
> Yes it is a partition, but that is a general term that denotes an
> arbitrary grouping.

Connected components does sound better to me.

Alex

--
Alexander Roitman   http://ebner.neuroscience.umn.edu/people/alex.html
Dept. of Neuroscience, Lions Research Building
2001 6th Street SE, Minneapolis, MN 55455
Tel (612) 625-7566   FAX (612) 626-9201
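[Editor's note: for readers without the attachment, here is a minimal sketch of the kind of de-accenting filter discussed above. It is an assumption on my part that a decomposition-based approach is comparable to Greg's filter, which per the message uses explicit dictionaries mapping unicode characters to de-accented ones. The sketch relies only on Python's standard unicodedata module.]

```python
# De-accenting via NFD decomposition: accented letters decompose
# into a base letter plus combining marks, which we then drop.
# Limitation: characters that are not "base + combining mark"
# (e.g. the ligature "æ" or German "ß") are left unchanged and
# would still need an explicit mapping table.
import unicodedata

def deaccent(text):
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if not unicodedata.combining(ch))

print(deaccent("après"))  # -> apres
print(deaccent("Noël"))   # -> Noel
```

As the message argues, such a filter would be wrong to apply to user-entered text in reports; it is only appropriate as a fallback for output targets that cannot represent the characters at all.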