unicode characters

2009-05-16
2013-05-14
  • Joseph Reagle
    2009-05-16

    I want to define a word (a BibTeX key) that might include u'ć'. The following doesn't seem to work:

    ident_chars = "-_'" + alphanums + alphas8bit + u'ć'

    I think the corresponding UTF-8 bytes for that character are \xc4\x87, and pyparsing matches only the first byte. In any case, I'm confused: how do I refer to accented/Unicode characters beyond alphas8bit? Or to that set, less some other characters?

    ident_chars =
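
To make the byte-versus-character question above concrete, here is a minimal sketch using only the standard library. It assumes Python 3, where every string is a sequence of code points; in the Python 2 of this post's era, a literal without the u prefix would be handled byte by byte. The helper names `unicode_letters` and `is_valid_key`, the code point range, and the stand-in for alphanums are hypothetical and are not part of pyparsing.

```python
import string

# The character from the post: one code point, but two bytes in UTF-8.
c_acute = "\u0107"                    # 'ć'
utf8_bytes = c_acute.encode("utf-8")  # the two bytes 0xC4 0x87

def unicode_letters(start=0xC0, stop=0x250, exclude=""):
    """Alphabetic code points in [start, stop), minus any in `exclude`.

    The default range (an assumption) covers the Latin-1 letters plus
    Latin Extended-A and Latin Extended-B.
    """
    return "".join(
        ch for ch in map(chr, range(start, stop))
        if ch.isalpha() and ch not in exclude
    )

# Stand-in for pyparsing's alphanums, so the sketch needs no third-party import.
ascii_alphanums = string.ascii_letters + string.digits

# Identifier alphabet in the spirit of the post's ident_chars.
ident_chars = "-_'" + ascii_alphanums + unicode_letters()

def is_valid_key(key, allowed=ident_chars):
    """True if `key` is non-empty and every character is in `allowed`."""
    return bool(key) and all(ch in allowed for ch in key)
```

A string built this way could presumably be passed to pyparsing's `Word(...)` in place of the original `ident_chars`; whether that resolves the behaviour described above is not something the post confirms. The `exclude` argument is one way to express "that set, less other characters".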