Unicode problems

jec
2006-07-20
2013-03-15
  • jec
    2006-07-20

    Hi,

    I installed the newest version of Eclipse (3.2.0) and Pydev (1.2.1) earlier this week, to work on an existing project (currently stored in CVS, in case this matters).

    Here's my problem:  whenever I edit the file in Eclipse (using the Pydev editor), the tool is automatically updating the file-level preferences encoding property to explicitly be UTF-8.

    Here's why I think it's happening: Our CVS server uses Latin-1 encoding, so I need to edit Python source files so that they're saved in Latin-1 format.  I've set the project-level encoding preference to ISO-8859-1.

    BUT... the automated test harness requires that they be executed using UTF-8 encoding, so the file contains the "coding: UTF-8" directive at the top of the file.

    If I edit while the file's properties are set to UTF-8, the synch viewer is showing character deltas between the repository version and my local version.  If I force the format of the file explicitly to ISO-8859-1, the deltas go away, but it seems to bother the interpreter and my test cases fail.

    Ideally, I'd like to edit in one mode, and run in another.  Is this possible?

    If not... what (do you think) would happen if I submitted the file to the Latin-1 repository from the UTF-8 encoding client?
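
A minimal sketch of why the synch viewer would report character deltas here, assuming Python 3 and an illustrative string ("café") that is not taken from the original project: the same accented character is stored as different bytes in ISO-8859-1 and in UTF-8, so a byte-level comparison sees a change even though the text looks the same in the editor.

```python
# Illustrative text only; the real project files are not shown in the thread.
text = "caf\u00e9"  # "café"

# Encode the same text under the two encodings discussed in the post.
latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

print(latin1_bytes)                # b'caf\xe9'
print(utf8_bytes)                  # b'caf\xc3\xa9'
print(latin1_bytes == utf8_bytes)  # False: the on-disk bytes differ
```

Pure ASCII lines encode identically under both, so only lines with accented characters would be expected to differ.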

     
    • Fabio Zadrozny
      2006-07-21

      Well, I don't think you can edit in one mode and run in another... if you specify the coding:xxx, eclipse will interpret it as that.

      As for having the encoding in one format and the CVS in the other, I've never experienced problems regarding that (but I don't think I've ever tested it, so I think you should test it)...

      Cheers,

      Fabio

       
    • jec
      2006-07-21

      Thanks Fabio,

      And rats!

      I am afraid that we might already be having problems, and that the "test" has already failed...

      I'm the only person on the team trying out the PyDev plug-in.  Everyone else is using Eclipse without a (Python) plug-in (thus, they edit in Eclipse, but run tests at the command line).

      Very definitely, if I explicitly set the file encoding to Latin-1, the accented characters in my source code file show up OK (and the same) in both my local version and the remote version.  But I can't run the test in the Eclipse environment.

      If I allow Eclipse to update the file encoding to UTF-8, the accented characters in my local copy show up OK, but the remote version shows garbage, and the compare shows me the characters have changed between the two versions.  But I can run the test in the Eclipse environment (since my local copy is OK).

      Since the Eclipse/CVS code delta tool is definitely seeing a difference between the files when my settings say it is UTF-8, I guess that there really is a difference.  So, I guess that if I submit my update, these new differences will also be submitted, and everyone else on the team (not using the PyDev plug-in, and editing the file as though it is Latin-1) will start to see differences.

      I hate to lose the interactive debugging feature for Python scripts (this is what I really wanted!), but I'm not sure I can submit these deltas.

      Any suggestions?

      jec
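
The "garbage" described above can be reproduced in a small sketch, assuming Python 3 and an illustrative string rather than the real source file: bytes written as UTF-8 but read back as Latin-1 turn each accented character into two unrelated characters, while Latin-1 bytes read back as UTF-8 typically fail to decode at all.

```python
# Illustrative text only; not taken from the project in the thread.
text = "caf\u00e9"  # "café"

# Saved as UTF-8, then viewed by a tool that assumes Latin-1.
utf8_bytes = text.encode("utf-8")
garbled = utf8_bytes.decode("iso-8859-1")
print(garbled)  # cafÃ©

# Saved as Latin-1, then read by a tool that assumes UTF-8.
latin1_bytes = text.encode("iso-8859-1")
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)  # True
```

This would be consistent with both symptoms in the post: teammates reading the file as Latin-1 see garbled accents, and an interpreter told to expect UTF-8 rejects Latin-1 bytes.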

       
      • Fabio Zadrozny
        2006-07-21

        Ok, I see your point, but I have some doubts on how you're doing things:

        1 - which test do you have and why (how) does it fail?
        2 - I believe you agree that you have a big inconsistency in your environment if you have a file marked as being utf-8 and edited as latin-1, so, if all are marking things as utf-8, they should be interpreted as utf-8 (you can configure this in the Eclipse preferences... you don't need pydev for that).

        What pydev is doing is just making sure that your files are consistent with the encoding you're declaring...

        Cheers,

        Fabio
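
The inconsistency in point 2 can be checked mechanically. The sketch below assumes Python 3; `decodes_as` is a hypothetical helper name, and the sample source lines are made up for illustration. It tests whether a file's bytes are actually valid in the encoding the file declares.

```python
def decodes_as(data: bytes, declared: str) -> bool:
    """Return True if the bytes are valid in the declared encoding."""
    try:
        data.decode(declared)
    except UnicodeDecodeError:
        return False
    return True


# Made-up source text that declares UTF-8 and contains one accented character.
source_text = "# -*- coding: UTF-8 -*-\nname = 'Ren\u00e9'\n"

saved_as_latin1 = source_text.encode("iso-8859-1")  # declared UTF-8, saved as Latin-1
saved_as_utf8 = source_text.encode("utf-8")         # declared UTF-8, saved as UTF-8

print(decodes_as(saved_as_latin1, "utf-8"))  # False: declaration and bytes disagree
print(decodes_as(saved_as_utf8, "utf-8"))    # True
```

The check only works in this direction: every byte value is valid Latin-1, so `decodes_as(data, "iso-8859-1")` returns True for any input and cannot detect a mismatch.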

         
        • jec
          2006-07-21

          Hi Fabio,

          I really appreciate your help!  (and your interest!)

          But something strange is going on now.  Yesterday, I spent 4 hours doing nothing except looking at the various things I could do with this file, and seeing what the effect was.  It all seemed very predictable.

          Today, the whole problem seems less predictable.  I am running a test with the exact same settings I noted yesterday, and getting different results.  (Now, editing my file has updated its preferences to UTF-8, which is still causing the delta from the repository, but my testcase is also failing because of the character set mismatch... which does not make sense).  I need to investigate further before I continue asking for help with this.

          But... do you think that if I explicitly set the file-level encoding to Latin-1, Eclipse would stop updating it to UTF-8?  I ask because I'm not sure what is causing Eclipse/PyDev to automatically update the encoding of the file in the first place, so I'm not sure how to make it stop.  This is only happening in my environment, where the big difference is the addition of PyDev, so I am guessing that it is related to the presence of PyDev... but I'm not sure.

          jec

           
          • Fabio Zadrozny
            2006-07-21

            Hi,

            Well, this is probably related to Pydev... whenever it finds the encoding in a Python file as defined in PEP 263 (http://www.python.org/dev/peps/pep-0263/), it will mark that file as having that encoding in Eclipse.

            If you don't put that encoding declaration, it makes no changes to the default encoding.

            Cheers,

            Fabio
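
Pydev's own detection code is not shown in this thread, but the PEP 263 lookup it relies on can be sketched with Python 3's standard library, which reads the declaration from the first two lines of a file. `declared_encoding` is a hypothetical helper name, and the sample sources are made up.

```python
import io
import tokenize


def declared_encoding(source: bytes) -> str:
    """Return the PEP 263 encoding of the source, or the interpreter default."""
    # detect_encoding reads at most two lines via the readline callable.
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


with_cookie = b"# -*- coding: iso-8859-1 -*-\nx = 1\n"
without_cookie = b"x = 1\n"

print(declared_encoding(with_cookie))     # iso-8859-1
print(declared_encoding(without_cookie))  # utf-8 (Python 3 default)
```

Note that the fallback differs by version: Python 3 assumes UTF-8 when no declaration is present, whereas the Python 2 interpreters current when this thread was written assumed ASCII.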