#804 Serialize UTF8 using UTF8 escape strings

Next full release
open
nobody
None
5
2013-01-30
2013-01-04
Oliver Kopp
No

If a user writes a comment in the "Review" tab in UTF-8, for example in their own language, the non-ASCII characters can halt compilation of a document even when the document itself is pure English. This happens, for example, with the default latex in TeX Live 2012. Although JabRef allows many encodings, as far as UTF-8 is concerned, converting the text to Java-style Unicode escape strings may resolve the problem. I'm not sure this is the best solution, though, given that JabRef accepts other encodings.
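To make the proposal concrete, here is a minimal sketch of what "Java-style Unicode escape strings" could look like. The class name UnicodeEscapeSketch and the method escapeNonAscii are hypothetical and are not part of JabRef. The sketch assumes the field value is already available as a Java String and that every UTF-16 code unit outside ASCII should be escaped.

```java
// Hypothetical illustration only; not JabRef code.
final class UnicodeEscapeSketch {

    // Replaces every non-ASCII UTF-16 code unit with a Java-style
    // backslash-u escape of four hex digits; ASCII passes through unchanged.
    // Characters outside the Basic Multilingual Plane are stored as
    // surrogate pairs and therefore produce two escapes.
    public static String escapeNonAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below is the German word "Gruesse" written with
        // u-umlaut (U+00FC) and sharp s (U+00DF).
        System.out.println(escapeNonAscii("Gr\u00fc\u00dfe"));
    }
}
```

Running main prints the ASCII-only text Gr\u00fc\u00dfe. LaTeX does not interpret such escapes itself, so this only keeps the .bib file ASCII-clean; it does not make the characters typesettable.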

Discussion

  • I think that is a very bad idea. Fields should be stored in the same encoding as the normal text. Otherwise you break many other applications including LaTeX typesetting.

    Escaped or not, LaTeX will have problems anyway, and Unicode escape strings cause more of them. If compilation stops, TeX is processing the tokenized strings somehow (e.g. trying to typeset them). In that case you should ensure that you are using the correct input encoding. There are several ways to deal with this problem: use a Unicode-aware TeX program (Omega, XeTeX, LuaTeX), use the ucs package, or use \usepackage[utf8]{inputenc} together with the font encodings needed to typeset the corresponding characters.