PyChecker / Feature Requests / #20 7-bit source str literals

#20 7-bit source str literals

Status: closed-out-of-date

Owner: nobody

Labels: None

Priority: 5

Updated: 2006-10-10

Created: 2004-08-06

Creator: Anders J. Munch

Private: No

I'd like to see a warning for non-unicode non-ascii
string literals.

Example:

# -*- coding: latin-1 -*-
x = "blåbærgrød"

This is a likely error; the encoding declaration
doesn't actually do anything for the non-ascii string,
and consequently, when you pass it to some
unicode-capable presentation device, unicode(x) will
bomb with a UnicodeDecodeError exception.
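
A minimal sketch of that failure, written for Python 3 so it can still be run today. It assumes the bytes object `raw` stands in for the Python 2 `str` from the example, and that the explicit `decode("ascii")` call stands in for what `unicode(x)` did under the default ASCII codec. The names `raw`, `outcome` and `text` are illustrative only.

```python
# Python 3 stand-in for the Python 2 str: the Latin-1 bytes of the literal.
raw = b"bl\xe5b\xe6rgr\xf8d"

try:
    # Roughly what Python 2's unicode(x) attempted with its default codec.
    raw.decode("ascii")
    outcome = "decoded"
except UnicodeDecodeError:
    outcome = "UnicodeDecodeError"

# Decoding succeeds only when the intended encoding is named explicitly.
text = raw.decode("latin-1")
```

The encoding declaration at the top of the file plays no part here: the bytes carry no record of the encoding they were typed in.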

Hence I'd like a warning for any non-unicode string
literal that contains non-ascii characters typed
literally (i.e. not entered as escape sequences). The
fix to avoid such a warning would, depending on your
intent, be either:

x = 'bl\xe5b\xe6rgr\xf8d' # intent: raw byte string with bit-8-set values

# -*- coding: latin-1 -*-
x = u"blåbærgrød" # intent: latin-1 character string
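
The requested check could be sketched roughly as follows. This is not PyChecker code; it is a hypothetical helper (`find_suspect_literals`) built on the standard `tokenize` module, and it assumes the source is already available as decoded text. It flags string literals that have no `u` prefix yet contain non-ASCII characters typed literally, and leaves escape sequences alone because they are plain ASCII in the source.

```python
import io
import tokenize


def find_suspect_literals(source):
    """Return (line_number, literal) pairs for string literals that lack a
    u prefix but contain non-ASCII characters typed literally."""
    suspects = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        text = tok.string
        # Everything before the first quote character is the literal's prefix.
        quote_pos = min(i for i in (text.find("'"), text.find('"')) if i != -1)
        prefix = text[:quote_pos].lower()
        has_non_ascii = any(ord(ch) > 127 for ch in text)
        if "u" not in prefix and has_non_ascii:
            suspects.append((tok.start[0], text))
    return suspects


sample = (
    'x = "bl\u00e5b\u00e6rgr\u00f8d"\n'      # flagged: literal non-ASCII, no u prefix
    'y = u"bl\u00e5b\u00e6rgr\u00f8d"\n'     # accepted: unicode literal
    'z = "bl\\xe5b\\xe6rgr\\xf8d"\n'         # accepted: escapes are ASCII in the source
)
suspects = find_suspect_literals(sample)
```

Only the first assignment in `sample` is reported, which matches the two suggested fixes: either escape the bytes or add the `u` prefix.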

Discussion

Anders J. Munch - 2006-10-10

status: open --> open-out-of-date

Anders J. Munch - 2006-10-10

Close this: Python 2.5 makes this a syntax error, so
there is no longer any need for pychecker to do anything.

(BTW, those question marks are not what I wrote, but then
that's a SourceForge bug.)

Anders J. Munch - 2006-10-10

status: open-out-of-date --> closed-out-of-date