Any plans on adding support for Python 3? Or is it somehow possible to run a local Python 3 install instead of the bundled 2.7?
And as you know, there are some syntax changes from 2.x to 3.x, e.g. "class C(MyBaseClass, metaclass=MyMetaClass)".
I've written a pep8/pyflakes linter that parses Python code and displays indicators and annotations for the resulting warnings. As long as the parsed source code stays backwards compatible with 2.7, PythonScript running 2.7 will suffice, but any Python 3-only syntax won't work.
Ideally I would like to have the option to switch Python versions.
Unfortunately, you can't switch between 2 and 3 without C++ code changes in PythonScript. There are currently no plans to switch to Python 3 for one simple reason: strings.
Strings in Python 2 are effectively byte arrays, and can hold text encoded in ASCII, any single-byte encoding, or UTF-8. Scintilla uses UTF-8 to store text, and all text manipulations are performed with UTF-8 and byte offsets. Python 3 stores all strings as UTF-16. What this means is that all the lengths in Python 3 are character lengths, not byte lengths, but Scintilla needs byte lengths. This means the (PythonScript) user has to be extra careful when using offsets and lengths in Python 3.
There may be things we could do to help with this, but ultimately it would be down to the author of the script to ensure that they don't run into problems when characters outside of ASCII come up. I don't think the win for Python 3 is great enough to justify the extra problems that are likely to come up.
Cheers,
Dave.
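To make the character-length versus byte-length point concrete, here is a minimal Python 3 sketch. The helper name char_to_byte_offset is hypothetical and not part of PythonScript; it only assumes the text is stored as UTF-8, as described above.

```python
def char_to_byte_offset(text, char_offset):
    """Return the UTF-8 byte offset that corresponds to a character offset."""
    # Encode only the prefix before the offset and count its bytes.
    return len(text[:char_offset].encode("utf-8"))


word = "D\u00e4nemark"  # 'Dänemark'; the 'ä' takes two bytes in UTF-8

char_len = len(word)                  # number of characters (code points)
byte_len = len(word.encode("utf-8"))  # number of UTF-8 bytes
offset_of_n = char_to_byte_offset(word, word.index("n"))

print(char_len, byte_len, offset_of_n)
```

Any offset that comes from len() or str.index() in Python 3 would need a conversion like this before being used as a byte position.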
Well, at some point the time comes to move to Python 3. 2.7 maintenance ends in 2020.
The str/bytes changes from Python 2 to 3 must be manageable in C++.
One must already be aware of this in Python, but it's not hard to keep code compatible with both 2.7 and 3. So script authors could easily handle the needed changes, whether they choose to use 2, 3, or both.
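As a rough illustration of code that behaves the same on 2.7 and 3, the sketch below returns the UTF-8 byte length of a value whether it arrives as text or as bytes. The names PY2, text_type and utf8_len are made up for this example and are not PythonScript APIs.

```python
import sys

PY2 = sys.version_info[0] == 2
# On Python 2 the text type is `unicode`; on Python 3 it is `str`.
# The conditional expression only evaluates `unicode` when PY2 is true.
text_type = unicode if PY2 else str  # noqa: F821


def utf8_len(value):
    """Return the length in UTF-8 bytes of a text or byte string."""
    if isinstance(value, text_type):
        value = value.encode("utf-8")
    return len(value)


text_result = utf8_len(u"D\xe4nemark")      # text input
bytes_result = utf8_len(b"D\xc3\xa4nemark")  # already UTF-8 encoded bytes
print(text_result, bytes_result)
```

The design choice here is to normalise everything to bytes at the boundary, so the rest of a script can work in byte offsets on either version.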
Does (C)Python 3 really use UTF-16? Wouldn't it use at least two bytes per ASCII character?
(Really really late, I know. Development is on GitHub. It's also not really relevant, since Python 3 would be using bytes instead of str if you wanted byte offsets, and the discussion would be about how to make that more convenient to use.)
Edit: I didn't realize there was a nested reply system.
Edit: Python str uses the smallest possible fixed encoding for each string, which means the internal representation's char width is the width of the biggest character in that string. See https://www.python.org/dev/peps/pep-0393/. It also says, "the specification chooses UTF-8 as the recommended way of exposing strings to C code."
Last edit: Franklin Lee 2016-03-26
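The flexible representation from PEP 393 can be observed, at least on CPython 3.3 or later, by measuring how much a string grows when one more character is added. This is only a sketch: sys.getsizeof reports implementation details, and the helper name bytes_per_char is hypothetical.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate per-character storage: size growth from one extra character."""
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)


ascii_width = bytes_per_char("a")            # ASCII-only string
bmp_width = bytes_per_char("\u20ac")         # euro sign, needs 2 bytes per char
astral_width = bytes_per_char("\U0001F600")  # emoji, needs 4 bytes per char

print(ascii_width, bmp_width, astral_width)

# len() is unaffected by the internal width: each of these is one character.
lengths = [len("a"), len("\u20ac"), len("\U0001F600")]
```

So an ASCII-only string does not pay two bytes per character, and in every case len() counts characters rather than bytes.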
Yes, PEP 393 changed the way things are stored internally as of 3.3. But really, it's not the internal structure that matters, it's what len(...) returns.
Python 3:
>>> len('Dänemark')
8
Python 2:
>>> len('Dänemark')
9
You can probably imagine any number of scenarios where a script does something like an editor.search, looks at the len() of the result, then uses that in a call to editor.setTarget(). This goes against the principle of least surprise.
Having said that, if somebody wanted to produce a Python 3 version, i.e. send a pull request, I'd be more than happy to release a Python3Script. But I don't currently have the bandwidth to make all the necessary changes (and do all the thinking about how to deal with all the string-based editor methods).
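The search-then-target scenario can be sketched without any editor API. The function find_byte_range below is hypothetical; it does the search in the UTF-8 byte domain, so the resulting offsets are already byte offsets of the kind Scintilla expects, and it is compared with the naive character-based end position.

```python
def find_byte_range(document, needle):
    """Return (start, end) UTF-8 byte offsets of the first match, or None."""
    data = document.encode("utf-8")
    pattern = needle.encode("utf-8")
    start = data.find(pattern)
    if start < 0:
        return None
    return start, start + len(pattern)


doc = "K\u00f8benhavn, D\u00e4nemark"  # 'København, Dänemark'
needle = "D\u00e4nemark"

byte_range = find_byte_range(doc, needle)
# Naive Python 3 approach: character index plus len() of the match.
naive_end = doc.index(needle) + len(needle)

print(byte_range, naive_end)
```

The naive end position falls short of the byte-based one as soon as a non-ASCII character appears before or inside the match, which is the surprise described above.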
2 points (off the top):
• I don’t have any interest in learning 2 very similar-but-different syntaxes for what purport to be different versions of the same language, and trying to keep them straight, especially when the one in use here has already been declared dead for years. It’s the one I’ll soon be forced to forget anyway.
• Looking at your own example:
Python 3:
len('Dänemark')
8
Python 2:
len('Dänemark')
9
Seriously, which result makes more sense in a text-editing context? From a programming end they are both explainable, but not so much to someone from an end user background trying to write macros — and on top of that you want to tell them that it’s going to change in a few years anyway?
As someone who’s done this for a long time, but mostly avoided Python until recently … well, it looked like it would be worth taking an interest in initially, but this sort of thing makes me think it should just be avoided for a half dozen more years to see if there’s anything left, since most people don’t seem willing to take the new version seriously.