SDCC uses some Python 2 code. It should be migrated to Python 3, since Python 2 reaches end of life in 2020. This is not urgent, but should probably be done before 3.8.0.
This turns out to be harder than expected. Apparently the code is really Python 1.x-style code that still happens to run under Python 2, but is not handled by the automatic conversion tools.
In [r10269]/[r10270], 3 of the 7 files have been converted to be compatible with both Python 2 and Python 3.
Philipp
There was further progress. However, there are remaining issues when using Python 3. In particular for test gcc-torture-execute-20000227-1.c:
Traceback (most recent call last):
  File "./generate-cases.py", line 191, in <module>
    main()
  File "./generate-cases.py", line 188, in main
    s.generate()
  File "./generate-cases.py", line 165, in generate
    self.readfile()
  File "./generate-cases.py", line 125, in readfile
    self.lines = fin.readlines()
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 211: invalid start byte
This happens for various ports (at least mcs51-small and hc08).
Philipp
Thanks to Erik's commit today, the issue above and the same problem for test iso-8859-1.c are now the only remaining issues. Once those two have been sorted out, we can accept Python 3 in the AC_CHECK_PROGS in configure.ac.
Philipp
generate-cases.py reads the C source files. If a source file contains bytes that are not valid UTF-8, reading it fails in Python 3, but works in Python 2.
The tests iso-8859-1.c and gcc-torture-execute-20000227-1.c are the only two that use non-UTF-8 source.
Philipp
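For illustration, the failure can be reproduced without SDCC at all. This is only a sketch: the helper name try_utf8 and the sample byte strings are made up for the example and are not taken from generate-cases.py. A Python 3 text-mode read performs this kind of decode implicitly (with UTF-8 as the default on the system in the traceback), whereas a Python 2 read just hands back the raw bytes and never decodes.

```python
def try_utf8(raw):
    """Decode raw bytes as UTF-8; return the text, or a short error description."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.reason holds the human-readable cause reported by the codec
        return "error: %s" % exc.reason


# Hypothetical samples: plain ASCII source, and source containing a raw 0xff byte
ascii_only = b"int main(void) { return 0; }\n"
with_ff = b"char c = '\xff';\n"

print(try_utf8(ascii_only))
print(try_utf8(with_ff))
```

The second call reports an error because 0xff can never start a UTF-8 sequence, which is the same condition the traceback above runs into.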
I have updated generate-cases.py and HTMLgen.py to read/write the files using the Latin-1 encoding when Python 3 or later is detected. This maps the values 0x00 - 0xff in the files to Unicode code points U+0000 - U+00FF. This has the drawback that any UTF-8 sequences in the source files will not translate to the proper Unicode code points, but as long as the names and values in the templates stick to ASCII only, this should not be a problem and any UTF-8 sequences will simply pass through unchanged. Likewise, any extended ASCII characters (that may or may not be UTF-8 compatible) will also pass through unchanged.
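A rough sketch of why the Latin-1 mapping is lossless, assuming nothing about the actual code in generate-cases.py or HTMLgen.py (the function name latin1_roundtrip and the sample line are hypothetical): every byte decodes to exactly one code point, and encoding again gives back the original bytes, including any UTF-8 sequences.

```python
def latin1_roundtrip(raw):
    """Decode bytes as Latin-1 and encode them again; return (text, bytes_out)."""
    text = raw.decode("latin-1")       # never fails: byte N becomes code point U+00NN
    return text, text.encode("latin-1")


# Hypothetical source line: a raw 0xff byte plus the UTF-8 encoding of U+00E9 (0xc3 0xa9)
raw = b"char c = '\xff'; /* caf\xc3\xa9 */\n"
text, out = latin1_roundtrip(raw)

print(len(raw) == len(text))   # one character per byte
print(out == raw)              # bytes pass through unchanged
```

Inside the program the UTF-8 sequence 0xc3 0xa9 appears as the two characters U+00C3 U+00A9 rather than as U+00E9, which is the drawback described above. It only matters if the script needs to interpret such characters instead of copying them through.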
The Python migration is complete in [r10294]. All Python code used in SDCC should now be fully compatible with both Python 2.7 and Python 3.6 (the new default).
Philipp