By default, PyYAML emits a pure ASCII character stream and escapes every non-ASCII Unicode character:

'Über' -> '\xDCber'
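
For reference, here is a minimal sketch that reproduces the behaviour. It assumes a recent PyYAML on Python 3. The variable names and the sample mapping are only illustrative; allow_unicode is the existing keyword argument of yaml.dump that switches the escaping off.

```python
import yaml

data = {'name': 'Über'}

# Default settings: non-ASCII characters are escaped, which forces
# the scalar into double-quoted style.
escaped = yaml.dump(data)
print(escaped, end='')    # name: "\xDCber"

# Opting out of the escaping keeps the text readable.
readable = yaml.dump(data, allow_unicode=True)
print(readable, end='')   # name: Über

# Both documents load back to the same value.
assert yaml.safe_load(escaped) == yaml.safe_load(readable) == data
```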

The YAML 1.1 specification (Chapter 5, Characters) says:

YAML streams use the printable subset of the Unicode character set.

If a character stream does not begin with a byte order mark (#FEFF), the character encoding shall be UTF-8.


In this case PyYAML does not appear to follow the specification.

Issue 11 (http://pyyaml.org/ticket/11) gives an explanation:

The default is to escape non-ASCII characters because they will produce garbage in non-utf8 terminals.


Having an option to escape non-ASCII characters is fine, but escaping should not be the default because it makes the output far less readable.



As a maintainer of SnakeYAML, I try to stay as close as possible to PyYAML, so that developers can reuse their knowledge of the API and save some brain cycles when they work with YAML in both Python and Java.

I would like to keep the list of deviations as short as possible...