#150 xetex: broken pdf contents & to-UTF16 conversion warnings


# Consider the following document
$ cat x.txt
Привет Мир!

Миру мир

.. contents::

Hello шпаргалка


Альфа бета гамма

А ну-ка!


# It translates to .tex and .pdf nicely, and the resulting PDF looks good, with Cyrillic in the PDF contents and in pdfinfo
$ rst2xetex x.txt x.tex && xelatex x.tex

# But if I run
$ rst2xetex --language=ru x.txt x.tex && xelatex x.tex

xelatex emits several warnings:

** WARNING ** Failed to convert input string to UTF16...

and both pdfcontents and pdfinfo display random glyphs.

The problem is that with --language=ru, rst2xetex adds 'russian' to the document options,
and this seems to clash with xetex.

Suggested patch, unfortunately without tests, attached.



  • Kirill Smelkov

    Kirill Smelkov - 2010-11-01
  • Günter Milde

    Günter Milde - 2010-11-02

    I can reproduce neither the warning nor the "random glyphs" in the Contents/Soderzhanie here
    (TeXLive 2009 from the Debian packages).

    In the PDF bookmarks, I see
    a) only Latin chars (without the "russian" documentoption), or
    b) Latin chars + "random glyphs" (with the "russian" documentoption).

    Hence, leaving out the language document option does not solve the problem.
    I'll have to ask at comp.text.tex about Cyrillic bookmarks with XeTeX.

    BTW, there is a Cyrillic test case in docutils/test/functional.

  • Günter Milde

    Günter Milde - 2010-11-02
  • Kirill Smelkov

    Kirill Smelkov - 2010-11-02

    Strange. I'm also on Debian using TeXLive 2009 (both on testing and on another machine with stable and tex from testing), and removing 'russian' from the documentoptions helps: the warnings go away, and the PDF bookmarks look good (i.e. Latin + Cyrillic).

    Sorry for the confusion. I was wrongly referring to the PDF bookmarks as pdfcontents. In fact, the Contents/Soderzhanie always looks OK, regardless of whether 'russian' is in the documentoptions or not.

    Attaching the tex source and the resulting good PDF, just in case.


    P.S. Thanks for the pointer to the Cyrillic test.

  • Kirill Smelkov

    Kirill Smelkov - 2010-11-02
  • Günter Milde

    Günter Milde - 2010-11-04

    Still no warnings, but

    * there is a PDF-viewer issue: with evince, acroread, and okular I get Cyrillic characters in the bookmarks,
      while with xpdf they are missing.

    * the "strange" glyphs also show up without the "russian" documentoption when using the "unicode" hyperref option.

    * hyperref.sty activates "unicode" when given the "russian" option (directly or as global one)::


    Now I only have to find out whether this is due to my old version, 2009/10/09 v6.79a ...

    The simplest fix would be not to add the language to the documentoptions --
    it's only helpful with additional packages.

  • Günter Milde

    Günter Milde - 2010-11-05

    The global document-language option is used by several packages, including hyperref, to
    adapt processing or generate localized text.

    I'd rather keep it in the rst2xetex output as well.

    The bug with hyperref's "russian" option under XeTeX can be worked around with
    --language=ru --hyperref-options="unicode=false" on the command line,
    or by specifying ``hyperref-options: unicode=false`` for the xetex writer in the config file.

    See also the updated test-case "" and "xetex-cyrillic.tex"
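
For scripted builds, the command-line workaround described above can be sketched as a small helper that assembles the two commands. Only the `rst2xetex` and `xelatex` commands and the `--language` and `--hyperref-options` flags come from this thread; the function name `build_commands`, its parameters, and the derived `.tex` file name are assumptions made for illustration.

```python
from pathlib import Path


def build_commands(source, language="ru", unicode_bookmarks=False):
    """Return the rst2xetex and xelatex command lines as argv lists.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    # Assumption: the .tex file sits next to the source, e.g. x.txt -> x.tex.
    tex_file = str(Path(source).with_suffix(".tex"))
    convert = ["rst2xetex", f"--language={language}"]
    if not unicode_bookmarks:
        # Workaround from this thread: switch off hyperref's "unicode" option.
        convert.append("--hyperref-options=unicode=false")
    convert += [source, tex_file]
    return convert, ["xelatex", tex_file]
```

Each returned list could then be handed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with the `unicode=false` value.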

  • Günter Milde

    Günter Milde - 2010-11-18

    The issue is solved in hyperref version v6.79g (2009/11/20).

    If updating the hyperref package is not an option, the workaround is
    to set (on the command line)::

        --hyperref-options="unicode=false"

    or (in the config file)::

        [xetex writer]
        hyperref-options: unicode=false
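
To decide whether the workaround is needed at all, the installed hyperref release can be compared with the fixed one. The following is a minimal sketch, assuming the version banner has the `YYYY/MM/DD vN.NNx` shape quoted in this thread (for example in the `Package: hyperref ...` line of a LaTeX `.log` file); the name `needs_workaround` is hypothetical.

```python
import re
from datetime import date

# Release date of hyperref v6.79g, the fixed version named in this thread.
FIXED_RELEASE = date(2009, 11, 20)

# Matches banners such as "2009/10/09 v6.79a" (format assumed from this thread).
_BANNER = re.compile(r"(\d{4})/(\d{2})/(\d{2})\s+v\d+\.\d+[a-z]?")


def needs_workaround(banner):
    """Return True if the hyperref release named in `banner` predates the fix."""
    match = _BANNER.search(banner)
    if match is None:
        raise ValueError(f"no hyperref version found in {banner!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day) < FIXED_RELEASE
```

The comparison uses the release date rather than the version string, since letter suffixes such as `a` and `g` are awkward to order reliably.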

  • Günter Milde

    Günter Milde - 2010-11-18
