3.9.6 Progress, Encoding Questions

  • Daniel Pozmanter

    A few problems:

    The solution to the encoding bug is an ugly one, but I do not see a good way around it:

    For saving there is no problem.  A user will be able to select an encoding to save the file in, and further, can select a default encoding to use.

    For opening, who knows what is on the other end?
    And this is a problem.  Not just because figuring out what is on the other end is difficult (possibly impossible), but because it would be a performance drain.

    So what I am thinking is I may release my work thus far as 3.9.6, and then try to figure out what a file's encoding is.  (This may also be an opportunity to redo part of the interface.)

    In any case, this requires more thought, and I certainly want to hear what people think
    (especially if you know of a good way to detect a file's character encoding).
    I will make some sort of release tomorrow.  Everything else (save the file dialog correction) is done; I just need to decide whether or not to try something out with encoding.

    Perhaps leaving it so that the user must select a default encoding, with no autodetection, is better.
    I just don't like limiting the user that way.  Perhaps I could take up an old feature request from limodou (integrating encoding into the file dialog?).

    • Daniel Pozmanter

      I did some research (and read some interesting stuff by the Mozilla folks).


      Auto-detection for unicode, manual for the rest.
      There just does not seem to be a viable way to detect other charsets.

      Thus you can set it to 'latin-1', and that will be what documents are opened/saved as.

      I think I may try to add it to the file dialog as well.
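
A minimal sketch of what "auto-detection for unicode" could look like, in Python. This is an illustration, not DrPython's actual code: the function name `detect_unicode` is hypothetical, and the approach (check for a byte order mark, then test whether the bytes are valid UTF-8) is an assumption about one reasonable way to do it. Anything that fails both checks is left for the manually chosen encoding.

```python
import codecs

# Byte order marks, checked longest-first so that the UTF-32 LE mark
# (FF FE 00 00) is not mistaken for the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_unicode(data):
    """Return a codec name if the bytes look like Unicode, else None.

    Hypothetical helper: only Unicode encodings are detected; other
    charsets (latin-1 and friends) must be selected manually.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        # Bytes that are not valid UTF-8 raise UnicodeDecodeError here.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return None
```

Note that plain ASCII is valid UTF-8, so it is reported as "utf-8"; a latin-1 file with accented characters will usually fail the UTF-8 check and return None.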

    • Daniel Pozmanter

      Here is what I have so far:
      (I still need to let users set a default encoding):

      You can select an encoding
      (<Default Encoding>, ascii, latin-1, utf-8, utf-16)
      or type one in manually in the file dialog.

      You can turn unicode autodetection on or off.

      You will be able to specify a default encoding (same list).

      DrPython will use the default encoding in the prompt
      (or just leave it ascii).

      For opening and saving files,
      DrPython will first see if the user manually specified an encoding in the dialog, and try this.

      If that fails, and autodetection is on, utf-8 will be used.  (If not, skip to the next step)

      If that fails, and there is a default encoding,
      that will be tried.

      Finally, it will be tried with no encoding.

      On an error here, the file will not be opened,
      and an error dialog will be displayed.
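
The order of attempts described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and "no encoding" is approximated here by plain ASCII, since modern Python always needs some codec to turn bytes into text.

```python
def decode_with_fallbacks(data, dialog_encoding=None, autodetect=True,
                          default_encoding=None):
    """Try each candidate encoding in turn; return (text, encoding used).

    Hypothetical helper mirroring the steps in the post:
    dialog choice -> utf-8 (if autodetection is on) -> default -> none.
    """
    candidates = []
    if dialog_encoding:
        candidates.append(dialog_encoding)
    if autodetect:
        candidates.append("utf-8")
    if default_encoding:
        candidates.append(default_encoding)
    # Assumption: "no encoding" is modelled as plain ASCII.
    candidates.append("ascii")

    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # LookupError covers a mistyped codec name from the dialog.
            continue
    # The caller would show the error dialog and leave the file unopened.
    raise ValueError("could not decode data with any of: %s"
                     % ", ".join(candidates))
```

A caller would catch the `ValueError` and display the error dialog instead of opening the file. Saving could follow the same order with `text.encode(...)` in place of `data.decode(...)`.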

      I can always add additional encodings to the list.

      For the future, I can add in autodetect using the special comment line:

      # -*- coding: ENCODINGSTRING -*-
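
For illustration, reading that comment line could look something like the sketch below. The function name `declared_encoding` is hypothetical; the pattern and the "first two lines only" rule follow PEP 263, which defines this comment for Python source files.

```python
import re

# Pattern from PEP 263, applied to raw bytes so no decoding is needed first.
_CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(data):
    """Return the encoding named in a coding comment, or None.

    Hypothetical helper: only the first two lines are examined,
    as PEP 263 specifies.
    """
    for line in data.splitlines()[:2]:
        match = _CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None
```

The standard library's `tokenize.detect_encoding` performs a similar check (and also looks for a UTF-8 byte order mark), so it may be a better fit than a hand-rolled pattern.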

      I just have to test it a bit more, and handle the prefs bit, and 3.9.6 will be ready (and the encoding bugs
      hopefully squashed).

