
#550 Migrate from Python 2 to Python 3

None
closed-fixed
None
5
2018-03-20
2017-12-27
No

SDCC uses some Python 2 code. This should be migrated to Python 3, as Python 2 reaches end of life in 2020. Not an urgent task, but probably something to be done before 3.8.0.

Philipp

Discussion

  • Philipp Klaus Krause

    This turns out to be harder than expected. Apparently the code is actually Python 1.x-style code that still runs in Python 2, but is not supported by the automatic conversion tools.

    In [r10269]/[r10270] 3 files out of 7 have been converted to be compatible with both Python 2 and Python 3.

    Philipp

     
  • Philipp Klaus Krause

    There was further progress. However, there are remaining issues when using Python 3. In particular for test gcc-torture-execute-20000227-1.c:

    Traceback (most recent call last):
      File "./generate-cases.py", line 191, in <module>
        main()
      File "./generate-cases.py", line 188, in main
        s.generate()
      File "./generate-cases.py", line 165, in generate
        self.readfile()
      File "./generate-cases.py", line 125, in readfile
        self.lines = fin.readlines()
      File "/usr/lib/python3.6/codecs.py", line 321, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 211: invalid start byte
    

    This happens for various ports (at least mcs51-small and hc08).

    Philipp

     
  • Philipp Klaus Krause

    Thanks to Erik's commit today, the issue above (and the same problem for test iso-8859-1.c) is now the only remaining issue. Once these two tests have been sorted out, we can accept Python 3 in the AC_CHECK_PROGS in configure.ac.

    Philipp

     
  • Philipp Klaus Krause

    generate-cases.py reads the C source files. If a source file contains bytes that are not valid UTF-8, reading it fails in Python 3, but works in Python 2.
    The tests iso-8859-1.c and gcc-torture-execute-20000227-1.c are the only two that use non-UTF-8 source.

    Philipp
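
A minimal sketch of the failure described above, not the actual generate-cases.py code. It assumes the file is opened in text mode with UTF-8 decoding (as the traceback suggests for that system); the helper names `read_lines` and `utf8_read_fails` and the sample line are hypothetical.

```python
import io
import os
import tempfile


def read_lines(path, encoding):
    """Read all lines of a text file, decoding with the given encoding."""
    with io.open(path, "r", encoding=encoding) as fin:
        return fin.readlines()


def utf8_read_fails(data):
    """Return True if reading the bytes `data` as UTF-8 text raises UnicodeDecodeError."""
    fd, path = tempfile.mkstemp(suffix=".c")
    try:
        # Write the raw bytes unchanged.
        with os.fdopen(fd, "wb") as fout:
            fout.write(data)
        try:
            read_lines(path, "utf-8")
        except UnicodeDecodeError:
            return True
        return False
    finally:
        os.remove(path)


# Hypothetical C source line containing the byte 0xff, which can never
# start a valid UTF-8 sequence.
SAMPLE = b"char c = '\xff';\n"

if __name__ == "__main__":
    print(utf8_read_fails(SAMPLE))        # decoding as UTF-8 is expected to fail
    print(utf8_read_fails(b"int x;\n"))   # pure ASCII decodes without error
```

Python 2's built-in `open` returns byte strings and never decodes, which is why the same files were read without complaint there.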

     
  • Erik Petrich

    Erik Petrich - 2018-03-19

    I have updated generate-cases.py and HTMLgen.py to read/write the files using the Latin-1 encoding when Python 3 or later is detected. This maps the values 0x00 - 0xff in the files to Unicode code points U+0000 - U+00FF. This has the drawback that any UTF-8 sequences in the source files will not translate to the proper Unicode code points, but as long as the names and values in the templates stick to ASCII only, this should not be a problem and any UTF-8 sequences will simply pass through unchanged. Likewise, any extended ASCII characters (that may or may not be UTF-8 compatible) will also pass through unchanged.
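
The pass-through property described above can be illustrated with a small sketch. This is not the code committed to generate-cases.py or HTMLgen.py; the function name `latin1_roundtrip` is hypothetical, and `newline=""` is an added assumption so that line endings are also left untouched.

```python
import io
import os
import tempfile


def latin1_roundtrip(data):
    """Write `data` (bytes) to a file, read it as Latin-1 text, write the text
    back as Latin-1, and return (decoded_text, bytes_written_back)."""
    fd, src = tempfile.mkstemp()
    os.close(fd)
    fd, dst = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(src, "wb") as fout:
            fout.write(data)
        # Latin-1 maps each byte 0x00-0xff to the code point U+0000-U+00FF,
        # so decoding can never fail.
        with io.open(src, "r", encoding="latin-1", newline="") as fin:
            text = fin.read()
        # Encoding the same text as Latin-1 restores the original bytes.
        with io.open(dst, "w", encoding="latin-1", newline="") as fout:
            fout.write(text)
        with open(dst, "rb") as fin:
            return text, fin.read()
    finally:
        os.remove(src)
        os.remove(dst)


if __name__ == "__main__":
    # One lone 0xff byte plus the two-byte UTF-8 sequence for U+00E9.
    original = b"\xff caf\xc3\xa9\n"
    text, written = latin1_roundtrip(original)
    print(written == original)
    print(len(text) == len(original))
```

In the decoded text the UTF-8 sequence appears as two separate code points (U+00C3 and U+00A9) rather than U+00E9, which is the drawback noted above; the bytes written back are nevertheless identical to the input.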

     
  • Philipp Klaus Krause

    • status: open --> closed-fixed
    • assigned_to: Philipp Klaus Krause
    • Group: -->
     
  • Philipp Klaus Krause

    The Python migration is complete as of [r10294]. All Python code used in SDCC should now be fully compatible with both Python 2.7 and Python 3.6 (the new default).

    Philipp

     
