#47 Error opening file with Umlauts

Marek Kubica

When I try to open the attached file, I get an error:

Error Encoding utf-8
Traceback (most recent call last):
  File "c:\Programme\drpython\drUTF8.py", line 76, in
    sText=unicode(text, 'utf-8')
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 62-64: invalid data

Using Dr 3.9.5.
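
For readers following along, here is a minimal sketch of how this kind of error can arise. It assumes the file was saved in Latin-1 (the encoding suggested later in this thread); the sample text is hypothetical, and `bytes.decode` is used as the Python 3 counterpart of the `unicode(text, 'utf-8')` call in the traceback.

```python
# Hypothetical sample text; the file attached to this report is not reproduced here.
original = "Grüße aus München"
raw = original.encode("latin-1")  # each umlaut is stored as a single byte

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as err:
    # A lone Latin-1 umlaut byte is not a valid UTF-8 sequence.
    utf8_ok = False
    print("utf-8 failed at byte offset", err.start)

# Latin-1 maps every byte to exactly one character, so this decode cannot fail.
recovered = raw.decode("latin-1")
print(recovered == original)
```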


    • priority: 5 --> 7
  • Logged In: YES

    Can you upload an example file?

    Does manually changing to utf-16 do any good?

    How about turning encoding *off* in preferences.
    Does that help?

  • Logged In: YES

    (OT) I also once uploaded a file and it didn't appear.
    I forgot to activate the checkbox:
    "Check to Upload and Attach a File:"

  • Marek Kubica

    Logged In: YES

    I *wanted* to upload a file, but probably I forgot to check
    the box.

    After I turned to encoding detection the file gets opened...
    problem: it is displayed empty.

  • Marek Kubica

    The file. Another try.

  • Logged In: YES

    Well. This is fun.

    Manually entering utf-16 will cause a lovely (I assume
    Chinese) file to be displayed. Only the file is clearly not
    in Chinese.

    If you use the ansi version of wxPython, there will be no
    problem (the file will display and save correctly).

    Unicode is where odd things happen.

    You need to set the encoding to 'latin-1' to get it to
    display correctly.

    So what I am going to do is see if I can figure out
    autodetection of encoding type (beyond simply unicode).

    If I can, I will add encoding options for all types wxSTC

    If not, I will add the option for a custom encoding to be
    used by default (less than ideal).
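
The autodetection idea above could be sketched roughly as follows. This is not DrPython's actual code: the function name `detect_and_decode` and the candidate list are assumptions made for illustration. It simply tries strict decoders in order and keeps the first that succeeds, with Latin-1 placed last because it accepts any byte sequence. The final lines show why forcing utf-16 onto plain single-byte text yields Chinese-looking output: each pair of bytes is read as one code unit, which often lands in the CJK range.

```python
def detect_and_decode(raw, candidates=("utf-8", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper: the candidate order is an assumption, not a standard.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding matched")


# Hypothetical sample line, saved as Latin-1.
raw = "# Größe prüfen\n".encode("latin-1")
text, used = detect_and_decode(raw)
print(used)

# Pure ASCII bytes are valid UTF-8, so the first candidate wins here.
ascii_text, ascii_used = detect_and_decode(b"import os\n")
print(ascii_used)

# Reading single-byte text as UTF-16 pairs the bytes up into unrelated characters.
garbled = b"import".decode("utf-16-le")
print([hex(ord(ch)) for ch in garbled])
```

A fallback chain like this only guesses: many byte sequences are valid in several encodings, which is why a manual override remains useful.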

  • Marek Kubica

    Logged In: YES

    Well, so that's in fact a problem with wxPy and its
    "support". I used it a bit, but then changed to PyGTK, as it
    fits my needs better.

    Strange, since this file is not very exotic; it's one of
    many on my system, so this has to be fixed somehow.

    I'm looking forward to a fixed version, as DrPython seems to
    be a good editor and I'd really like to try it out using a
    more productive environment, with real programs.

  • Logged In: YES

    This is not a problem with wxPython (although for your
    purposes, switching to the ansi version might be best).

    It is a general problem: how can you tell what character
    encoding a file is using?

    For the fix, I am going to simply add an option to manually
    specify a default encoding to use when opening a file.

    (So you can leave it ansi, unicode, or custom).

    I will also add an option to select encodings.
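
A user-specified default encoding, as proposed above, might look something like this in outline. It is only a sketch under stated assumptions: `read_text` and its `default_encoding` parameter are hypothetical names, not DrPython's preference API. The standard `codecs.lookup` call is used to reject a mistyped encoding name before the file is opened.

```python
import codecs
import os
import tempfile


def read_text(path, default_encoding="utf-8"):
    """Read a file using a user-chosen encoding (hypothetical helper)."""
    # codecs.lookup raises LookupError if the name is not a known codec.
    codec_name = codecs.lookup(default_encoding).name
    with open(path, "r", encoding=codec_name, newline="") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as folder:
    # Hypothetical sample file containing an umlaut, written as Latin-1.
    sample_path = os.path.join(folder, "umlauts.py")
    with open(sample_path, "wb") as handle:
        handle.write("print('Käse')\n".encode("latin-1"))

    loaded = read_text(sample_path, default_encoding="latin-1")
    print(loaded)

    try:
        read_text(sample_path, default_encoding="no-such-encoding")
        rejected = False
    except LookupError:
        rejected = True
```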

  • Marek Kubica

    Logged In: YES

    I'd be glad to see it working in the next release :)

    It's great there is so much development in DrPython!

    • status: open --> closed-fixed
  • Logged In: YES

    Finally got it (hopefully). This will be in 3.9.6.