From: Gert-Ludwig I. <ger...@ph...> - 2017-09-17 12:26:14
Hello,

according to the docs, texenc fixes the encoding used in the communication with (La)TeX. Unfortunately, in general, the log messages of (La)TeX are not returned in a well-defined encoding. Specifically, messages containing input from the (La)TeX file are returned in the fontenc, e.g. T1 (cf. e.g. https://tex.stackexchange.com/questions/131238/what-controls-the-encoding-of-the-latex-log-file-and-how-to-change-it).

In the following example, this will lead to a UnicodeDecodeError:

from pyx import canvas, text
text.set(text.LatexRunner, texenc="utf-8")
text.preamble(r"""\usepackage[T1]{fontenc}
\usepackage[utf8x]{inputenc}""")
c = canvas.canvas()
c.text(0, 0, r'foo föo föo f\"oo f\"oo', [text.parbox(2)])
c.writePDFfile()

The parbox is sufficiently narrow to produce an overfull hbox, and the resulting log message will contain the input string. Independently of whether the umlaut ö is given as a Unicode character or as TeX code, it will be translated into the byte 0xf6, which cannot be interpreted as a UTF-8 encoded character.

As a temporary solution, I set the encoding of MonitorOutput in my copy of text.py to latin1, because I want to be able to handle input in utf8. Most likely, however, this is not the best solution...

Best regards,
Gert

-- 
Gert-Ludwig Ingold                    email: Ger...@Ph...
Institut für Physik                   Phone: +49-821-598-3234
Universität Augsburg                  Fax  : +49-821-598-3222
D-86135 Augsburg                      WWW  : www.physik.uni-augsburg.de/theo1/ingold
Germany                               PGP  : 86FF5A93, key available from homepage
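
For illustration, the decoding problem can be reproduced in plain Python without PyX or TeX. The following is only a minimal sketch: the log fragment is made up (it merely contains the byte 0xf6 in place of ö, as described above), and decode_log is a hypothetical helper, not part of PyX. It shows that a strict UTF-8 decode fails on such a byte, that latin1 always succeeds because every byte value maps to a character, and that errors="replace" would be another way to avoid the exception at the cost of losing the character.

```python
# Made-up log fragment: the "ö" appears as the single byte 0xf6,
# as it would when TeX writes the text in the T1 font encoding.
log_bytes = b"Overfull \\hbox in paragraph: f\xf6o"


def decode_log(data, encoding="utf-8", fallback="latin-1"):
    """Hypothetical helper (not part of PyX): try `encoding` first and
    fall back to `fallback` if the bytes are not valid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return data.decode(fallback)


# A strict UTF-8 decode raises, because 0xf6 is never valid here.
try:
    log_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The fallback recovers the umlaut, since 0xf6 is "ö" in latin1.
decoded = decode_log(log_bytes)

# Alternative: stay with UTF-8 but replace undecodable bytes by U+FFFD.
lossy = log_bytes.decode("utf-8", errors="replace")

print(utf8_ok)
print(decoded)
print(lossy)
```

Note that a latin1 fallback only happens to give the right character for ö; T1 and latin1 do not agree on every byte, so other characters in the log may still be shown incorrectly even though no exception is raised.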