From: Alan G I. <ai...@am...> - 2008-07-22 13:57:12
|
Last night I wanted rst2html on Python 3, and
(so far) it seems this was not too hard to achieve.
Today I realized that perhaps I should share this
experience.

No problem
----------

- Lots of replacing ``has_key`` with ``in``
- Replace a few ``print`` statements
- use builtin type names instead of the ``types`` module
  (``str`` instead of ``types.StringType`` etc)
- import ``configparser as ConfigParser`` if cannot import ``ConfigParser``
- copy ``UserString.py`` to docutils folder and use relative import

Some problems for earlier versions
----------------------------------

- Replace a few ``print >>``
- ``except`` and ``raise``
- change ``u'string'`` to ``'string'``
- encoding test (now 'utf-8' instead of 'unicode')
- use ``str(data)`` instead of ``unicode(data)``

Hmm, I think that was most of it.

Anyway, the changes in the first group could all be
made without affecting 2.2 compatibility.
Should I make some of them? If so, do you want patches, or
should I put the changed files somewhere (where?) on SVN?

Cheers,
Alan Isaac
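For readers unfamiliar with the first group of changes, here is a minimal sketch of two of them (``has_key`` and the ``types`` module). The names ``settings`` and ``describe`` are invented for illustration and are not taken from Docutils.

```python
# Illustrative only: `settings` and `describe` are hypothetical names,
# not Docutils code.
settings = {"tab_width": 8}

# Python 2 only:   if settings.has_key("tab_width"): ...
# The ``in`` operator gives the same answer and also works on Python 3.
has_tab_width = "tab_width" in settings


def describe(value):
    """Classify a value using builtin type names, not the ``types`` module."""
    # Python 2 only:   isinstance(value, types.StringType)
    if isinstance(value, str):
        return "string"
    if isinstance(value, (list, tuple)):
        return "sequence"
    return "other"
```

Both spellings are meant to behave the same on the older interpreter; only the second survives on Python 3 without conversion.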
From: David G. <go...@py...> - 2008-07-22 17:14:26
|
On Tue, Jul 22, 2008 at 10:01, Alan G Isaac <ai...@am...> wrote:
> Last night I wanted rst2html on Python 3, and
> (so far) it seems this was not too hard to achieve.
> Today I realized that perhaps I should share this
> experience.

Are you using the 2to3 tool? It should take care of most, perhaps
all, necessary changes.

I believe someone else may also be working on Python 3
compatibility. I don't recall details though.

> No problem
> ----------
>
> - Lots of replacing ``has_key`` with ``in``

Now that Python 2.1 compatibility has been dropped, this is fine.

> - Replace a few ``print`` statements
> - use builtin type names instead of the ``types`` module
>   (``str`` instead of ``types.StringType`` etc)
> - import ``configparser as ConfigParser`` if cannot import ``ConfigParser``

Because the module name was changed?

> - copy ``UserString.py`` to docutils folder and use relative import

We might be able to avoid UserString.py altogether by subclassing str
instead (or unicode in 2.x).

> Some problems for earlier versions
> ----------------------------------
>
> - Replace a few ``print >>``
> - ``except`` and ``raise``
> - change ``u'string'`` to ``'string'``
> - encoding test (now 'utf-8' instead of 'unicode')

Not sure what this is. Note that in at least one place, Docutils uses
"unicode" as shorthand for "text not encoded, already Unicode -- no
decoding necessary" (e.g. for programmatic use, when the source text
passed in is Unicode).

> - use ``str(data)`` instead of ``unicode(data)``
>
> Hmm, I think that was most of it.
>
> Anyway, the changes in the first group could all be
> made without affecting 2.2 compatibility.
> Should I make some of them? If so, do you want patches, or
> should I put the changed files somewhere (where?) on SVN?

The 2.2-compatible changes can be made directly in the codebase. Be
sure to run the test suite before checking in changes, and add tests
as necessary.

It is not generally possible to have code simultaneously compatible
with both 2.x and 3.0. If there are any cases that aren't handled by
2to3, we'll have to make a branch for compatibility with Python 3.

--
David Goodger <http://python.net/~goodger>
From: Alan G I. <ai...@am...> - 2008-07-22 18:54:07
|
On Tue, 22 Jul 2008, David Goodger apparently wrote:
> Are you using the 2to3 tool?

No, for two reasons.
I was interested in seeing the frequency of problems in real code,
and I was interested in looking for ways to resolve
version 2 v. version 3 incompatibilities (i.e., to understand to what
extent a single code base might be possible).

As an example, would you have any objection to replacing

    print >>sys.stderr, "this"

with

    sys.stderr.write("this\n")

(which both versions can use)?

Thanks,
Alan
From: David G. <go...@py...> - 2008-07-22 19:22:43
|
On Tue, Jul 22, 2008 at 14:58, Alan G Isaac <ai...@am...> wrote:
> On Tue, 22 Jul 2008, David Goodger apparently wrote:
>> Are you using the 2to3 tool?
>
> No, for two reasons.
> I was interested in seeing the frequency of problems in real code,
> and I was interested in looking for ways to resolve
> version 2 v. version 3 incompatibilities (i.e., to understand to what
> extent a single code base might be possible).

I don't think it's possible (or at least not easy) to maintain a
single code base compatible with both 2.x and 3.x. I'd rather not
maintain two codebases manually. The solution seems to be to maintain
only a 2.x codebase and use 2to3 to auto-convert to 3.x. That does
require some changes to the 2.x code. Eventually, we may abandon the
2.x line and then maintain only 3.x code.

I don't know all the details here though. I'm just a lurker on
python-dev etc.

> As an example, would you have any objection to replacing
>
>     print >>sys.stderr, "this"
>
> with
>
>     sys.stderr.write("this\n")
>
> (which both versions can use)?

Three objections:

1. sys.stderr.write is harder to read than the print statement, IMHO.
   Not a biggie though.

2. "print" does special processing (newlines, soft spaces), so care
   has to be taken when converting. I don't know if this applies to
   the uses in Docutils, which should be few.

3. If 2to3 can handle the differences, we shouldn't be messing with
   working code for no good reason.

The same kind of objections apply to any similar changes.

--
David Goodger <http://python.net/~goodger>
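To make the second objection concrete, here is a rough sketch of what a hand-written replacement for ``print >>stream`` has to do itself. The helper name ``write_line`` is hypothetical, ``io.StringIO`` merely stands in for ``sys.stderr``, and the sketch does not attempt to reproduce the soft-space behaviour of the Python 2 print statement.

```python
import io


def write_line(stream, *parts):
    """Write space-separated parts plus a newline, roughly like ``print >>stream``."""
    # ``print`` converts each argument with str() and appends "\n";
    # ``write`` does neither, so both steps are done by hand here.
    stream.write(" ".join([str(part) for part in parts]) + "\n")


buffer = io.StringIO()  # stand-in for sys.stderr
write_line(buffer, "this", 42)
```

Forgetting either the ``str()`` conversion or the trailing newline is the kind of slip the objection warns about.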
From: Alan G I. <ai...@am...> - 2008-07-22 19:41:54
|
On Tue, 22 Jul 2008, David Goodger apparently wrote:
> If 2to3 can handle the differences, we shouldn't be
> messing with working code for no good reason.

OK.
So that eliminates almost all the changes,
I believe.

Two exceptions:

1. use of the ``UserString`` module

   Using the rule above, it seems the simple, fast, and
   safe way is to make a local copy of ``UserString``
   and import it (relatively). (Note: simply inheriting
   from ``str`` instead does not work.)

   Is this the right answer for now?

2. Lines like::

       if self.encoding and self.encoding.lower() == 'unicode':

   should become (am I right?)::

       if self.encoding and self.encoding.lower() in ('unicode','utf-8'):

Alan
From: David G. <go...@py...> - 2008-07-23 01:31:06
|
On Tue, Jul 22, 2008 at 15:46, Alan G Isaac <ai...@am...> wrote:
> On Tue, 22 Jul 2008, David Goodger apparently wrote:
>> If 2to3 can handle the differences, we shouldn't be
>> messing with working code for no good reason.
>
> OK.
> So that eliminates almost all the changes,
> I believe.
>
> Two exceptions:
>
> 1. use of the ``UserString`` module
>
>    Using the rule above, it seems the simple, fast, and
>    safe way is to make a local copy of ``UserString``
>    and import it (relatively). (Note: simply inheriting
>    from ``str`` instead does not work.)
>
>    Is this the right answer for now?

I'd rather not copy modules. I'd prefer to subclass str (unicode in
2.x) instead of UserString. Why doesn't it work? Can it be made to
work?

> 2. Lines like::
>
>        if self.encoding and self.encoding.lower() == 'unicode':
>
>    should become (am I right?)::
>
>        if self.encoding and self.encoding.lower() in ('unicode','utf-8'):

No. "self.encoding == 'unicode'" means that the source data is
*already* Unicode, already decoded, no decoding necessary. If the
encoding is UTF-8, decoding *is* necessary.

No change is necessary in the above code. "self.encoding ==
'unicode'" is specific to Docutils. Changes in Python won't affect
it.

With Python 3, there will be a simple test for this: (type(source) ==
str) implies (self.encoding == 'unicode'), and no decoding necessary.
If (type(source) == bytes), decoding source is necessary.

--
David Goodger <http://python.net/~goodger>
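The Python 3 test described above can be sketched as follows. This is not Docutils code: the function name ``to_text`` and the ``utf-8`` default are assumptions made for the example, and ``isinstance`` is used where the message writes ``type(source) == str``.

```python
def to_text(source, encoding="utf-8"):
    """Return text, decoding only when the input is still bytes (Python 3)."""
    if isinstance(source, str):
        # Already decoded: the case Docutils labels encoding == 'unicode'.
        return source
    if isinstance(source, bytes):
        # Encoded input (UTF-8 or anything else) still has to be decoded.
        return source.decode(encoding)
    raise TypeError("expected str or bytes, got %s" % type(source).__name__)
```

The point of the sketch is that 'unicode' and 'utf-8' land in different branches, which is why treating them as equivalent in the ``self.encoding`` check would be wrong.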
From: Stefan R. <lis...@st...> - 2008-07-23 14:19:52
|
on 23.07.2008 03:31 David Goodger said the following: <snip> > I'd rather not copy modules. I'd prefer to subclass str (unicode in > 2.x) instead of UserString. Why doesn't it work? Can it be made to > work? I added a branch with a quick try to get it working. There are 4 failures and 2 errors left (with python2.5). The failures are harmless and could/should be removed by changing the tests, the errors are more complicated I think. The reason for all of them is that the UserString subclass was simply a container for its 'data' member which was a 'str' object most of the time but sometimes a 'unicode'. This affects what repr(textinstance) looked like: 'xyz' vs. u'xyz'. Apart from the 4 failures left, I dealt with this type of test failure by adding a unicode subclass that removes the u from its repr. The errors: it was possible to instantiate the UserString subclass with non-ascii strings, this is not possible with the unicode subclass anymore, but it is used twice in the test suite. I am not sure how to resolve that. cheers, stefan |
From: Alan G I. <ai...@am...> - 2008-07-23 14:58:07
|
On Wed, 23 Jul 2008, Stefan Rank apparently wrote:
> I dealt with this type of test failure by adding a unicode
> subclass that removes the u from its repr.

A problem with this approach is that eliminating the
need to import UserString is part of preparing docutils
for Python 3. But there is no ``unicode`` builtin in
Python 3. (It just uses ``str``.)

If you can proceed with your approach,
would the following be acceptable?

    try:
        class reprunicode(unicode):
            """
            A class that removes the initial u from unicode's repr.
            """
            def __repr__(self):
                return unicode.__repr__(self)[1:]
    except NameError:
        reprunicode = str

Cheers,
Alan Isaac
From: Stefan R. <lis...@st...> - 2008-07-23 15:28:36
|
Hi Alan, on 23.07.2008 17:02 Alan G Isaac said the following: > On Wed, 23 Jul 2008, Stefan Rank apparently wrote: >> I dealt with this type of test failure by adding a unicode >> subclass that removes the u from its repr. > > A problem with this approach is that eliminating the > need to import UserString is part of preparing docutils > for Python 3. But there is no ``unicode`` builtin in > Python 3. (It just uses ``str``.) This should be taken care of by the 2to3 tool. It will simply rewrite all uses of unicode to str and then the resulting codebase should - fingers crossed - just work. Ad your other mail: For running the test suite in a branch, simply check out the branch, and run 'python alltests.py' in the branch's test directory. The script takes care that other installed docutils version don't interfere. cheers, stefan |
From: David G. <go...@py...> - 2008-07-23 15:33:08
|
On Wed, Jul 23, 2008 at 11:02, Alan G Isaac <ai...@am...> wrote:
> On Wed, 23 Jul 2008, Stefan Rank apparently wrote:
>> I dealt with this type of test failure by adding a unicode
>> subclass that removes the u from its repr.
>
> A problem with this approach is that eliminating the
> need to import UserString is part of preparing docutils
> for Python 3. But there is no ``unicode`` builtin in
> Python 3. (It just uses ``str``.)
>
> If you can proceed with your approach,
> would the following be acceptable?
>
>     try:
>         class reprunicode(unicode):
>             """
>             A class that removes the initial u from unicode's repr.
>             """
>             def __repr__(self):
>                 return unicode.__repr__(self)[1:]
>     except NameError:
>         reprunicode = str

No, that's absolutely not acceptable. It's meaningless to Python 2.x.

Python 3 is a different language, and requires a separate codebase
(100% auto-converted if possible). It is a fool's errand to try to
have a single codebase for both Python 2 and 3.

It's fine to adapt the current Python 2.x compatible code to work
with 2to3, for easy auto-conversion -- as long as that doesn't have a
negative impact. It's not fine to try to shoehorn 3.x features into
the 2.x codebase.

See PEP 3000 for the recommended development model:
http://www.python.org/dev/peps/pep-3000/#compatibility-and-transition

--
David Goodger <http://python.net/~goodger>
From: Alan G I. <ai...@am...> - 2008-07-23 15:50:15
|
On Wed, 23 Jul 2008, David Goodger apparently wrote: > It is a fool's errand to try to > have a single codebase for both Python 2 and 3. But that is not the purpose. The purpose is to facilitate conversion. Perhaps you missed the indexing in Stefan's definition of ``__repr__``? If 2to3 simply substitutes ``str`` for ``unicode``, that will be wrong. Cheers, Alan |
From: Stefan R. <lis...@st...> - 2008-07-23 16:07:06
|
on 23.07.2008 17:54 Alan G Isaac said the following:
> On Wed, 23 Jul 2008, David Goodger apparently wrote:
>> It is a fool's errand to try to
>> have a single codebase for both Python 2 and 3.
>
> But that is not the purpose.
> The purpose is to facilitate conversion.
>
> Perhaps you missed the indexing in Stefan's
> definition of ``__repr__``?
> If 2to3 simply substitutes ``str`` for ``unicode``,
> that will be wrong.

Ah you are right, I missed that in your mail...

However the try-except won't help since the code needs to be fed
through 2to3 anyway, so the 'unicode' in the try will be 'str' in the
py3 version, so there will be no NameError.

The way to go is probably a conditional on the definition of repr::

    class reprunicode(unicode):
        if < some-way-to-check-for-python-<3 >:
            def __repr__(...
                ...

But someone should try running 2to3 on the codebase first.
(not installed here, sorry)

cheers,
stefan
From: Alan G I. <ai...@am...> - 2008-07-23 17:08:33
|
On Wed, 23 Jul 2008, Stefan Rank apparently wrote: > someone should try running 2to3 on the codebase first Well, I installed it from SVN (on Win XP) and it did not run. No time to figure out why at the moment. Cheers, Alan |
From: Alan G I. <ai...@am...> - 2008-07-23 17:18:49
|
On Wed, 23 Jul 2008, Stefan Rank apparently wrote:
> try-except won't help ... there will be no NameError.

Oh, right.

> The way to go is probably a conditional on the definition of repr::

Will conditioning on the first character fail in some way?
E.g., ::

    def __repr__(self):
        repr = unicode.__repr__(self)
        if repr[0] == 'u':
            return repr[1:]
        else:
            return repr

Cheers,
Alan
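As a sketch of how the first-character check might look after conversion, the class below is written in the form 2to3 would be expected to produce, with ``unicode`` already replaced by ``str``. It is an illustration under that assumption, not the code that was committed to the branch.

```python
class reprunicode(str):  # ``unicode`` in the 2.x source, before 2to3
    """A text type whose repr never carries a leading ``u``."""

    def __repr__(self):
        text = str.__repr__(self)
        # On Python 2 the repr of a unicode object starts with ``u``.
        # On Python 3 it starts with a quote character, so this branch
        # is simply never taken and the repr is returned unchanged.
        if text[0] == 'u':
            return text[1:]
        return text
```

Because a Python 3 ``str`` repr always begins with a quote, a value such as ``'unicode'`` is not truncated by the check.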
From: Alan G I. <ai...@am...> - 2008-07-23 15:21:15
|
On Wed, 23 Jul 2008, Stefan Rank apparently wrote: > I added a branch with a quick try to get it working. > There are 4 failures and 2 errors left (with python2.5). I'm new to this kind of thing. How did you run the tests on your branch? Thank you, Alan |
From: engelbert g. <gr...@us...> - 2008-07-23 11:28:41
|
On 7/22/08, Alan G Isaac <ai...@am...> wrote:
> On Tue, 22 Jul 2008, David Goodger apparently wrote:
>
> > If 2to3 can handle the differences, we shouldn't be
> > messing with working code for no good reason.

i would take david's previous line :

  The 2.2-compatible changes can be made directly in the codebase. Be
  sure to run the test suite before checking in changes, and add tests
  as necessary.

as an allowance to commit all the 2.2 compatible things.
why should we lose the work you did. and even if 2to3 handles
everything perfectly, fewer differences between the then two branches
might ease changes.

for the release i did install python2.2 to 3.0 on my linux so i could
do the test against all supported versions if you want

cheers
From: Stefan R. <lis...@st...> - 2008-07-23 20:05:12
|
on 23.07.2008 18:13 Alan G Isaac said the following:
> On Wed, 23 Jul 2008, Stefan Rank apparently wrote:
>> For running the test suite in a branch, simply check out
>> the branch, and run 'python alltests.py' in the branch's
>> test directory. The script takes care that other installed
>> docutils version don't interfere.
>
> OK, I did that, but I'm still uncertain.
> On your branch, I committed changes of `has_key`
> to the use of `in`.
>
> I hope you do not consider that impolite.

Not at all.
It would probably be a good idea to use separate branches for such
changes, but I think the 'has_key' -> 'in' change is very
uncontroversial and mixing it with the userstring change should be ok.
I just replaced all remaining occurrences of has_key.

Anyway, the wonders of revision control make it easy to separate these.

OTOH, I just found out that the wonders of TortoiseSVN make it possible
to branch in svn and have a different branch locally (trunk/docutils)
than the branch committed to the server (trunk/docutils/docutils)... 8-o
this was the cause for my most recent commits that were unintentionally
split across branches and trunk. Sorry for that.

stefan
From: Stefan R. <lis...@st...> - 2008-07-24 09:23:02
|
on 23.07.2008 13:28 engelbert gruber said the following:
> i would take david's previous line :
>
>   The 2.2-compatible changes can be made directly in the codebase. Be
>   sure to run the test suite before checking in changes, and add tests
>   as necessary.
>
> as an allowance to commit all the 2.2 compatible things.
> why should we lose the work you did. and even if 2to3 handles
> everything perfectly, fewer differences between the then two branches
> might ease changes.
>
> for the release i did install python2.2 to 3.0 on my linux so i could
> do the test against all supported versions if you want

The branch abolish-userstring-haskey now has all changes prompted by
py2.6 -3 deprecation warnings and by subclassing Text from unicode.

On py2.5 and py2.6 all tests pass.

py2.6 still gives two DeprecationWarnings (with or without the -3
flag), but I think these are overzealous false alarms:

  "BaseException.message has been deprecated as of Python 2.6"

This comes up twice for parsers.rst.DirectiveError. But the class does
not rely on the message property of BaseException, it sets a member
'message' itself.

Engelbert, could you have a look at the changes and run the tests on
the other python versions?

cheers,
stefan
From: engelbert g. <gr...@us...> - 2008-07-24 10:30:57
Attachments:
alltests.py-py2.2.out
alltests.py-py2.3.out
|
2.4 Ran 1055 tests in 61.919s

2.2 and 2.3 outputs are attached

some things with u' fail

test_parsers/test_rst/test_directives/test_raw.py: totest['raw'][9]; test_parser (DocutilsTestSupport.ParserTestCase)
input:
.. raw:: html
   :file: non-existent.file

-: expected
+: output
  <document source="test data">
      <system_message level="4" line="1" source="test data" type="SEVERE">
          <paragraph>
              Problems with "raw" directive path:
-             [Errno 2] No such file or directory: u'non-existent.file'.
?                                                  -
+             [Errno 2] No such file or directory: 'non-existent.file'.
          <literal_block xml:space="preserve">
              .. raw:: html
                 :file: non-existent.file
From: Stefan R. <lis...@st...> - 2008-07-24 17:34:04
|
on 24.07.2008 12:30 engelbert gruber said the following:
> 2.4 Ran 1055 tests in 61.919s
>
> 2.2 and 2.3 outputs are attached

Thanks a lot.
I hope that the latest commit does remove the errors with 2.2.

The failures however are a little puzzling:

> some things with u' fail
<snip>
> - [Errno 2] No such file or directory: u'non-existent.file'.
> + [Errno 2] No such file or directory: 'non-existent.file'.

This is simply str(IOError) and I had to add the u for py2.5 and
py2.6... I wonder why they disappear with py<2.3.
Maybe open(filename) converted filename to a str before py2.4.

Can I ask you to run the tests again and include later versions too?

cheers,
stefan
From: Alan G I. <ai...@am...> - 2008-07-24 18:21:26
|
On Thu, 24 Jul 2008, Stefan Rank apparently wrote: > Maybe open(filename) converted filename to a str before py2.4. http://www.python.org/dev/peps/pep-0277/ hth, Alan Isaac |
From: engelbert g. <gr...@us...> - 2008-07-24 22:15:48
|
On 7/24/08, Stefan Rank <lis...@st...> wrote:
> on 24.07.2008 12:30 engelbert gruber said the following:
>
> > 2.4 Ran 1055 tests in 61.919s
> >
> > 2.2 and 2.3 outputs are attached
>
> Thanks a lot.
> I hope that the latest commit does remove the errors with 2.2.
>
> The failures however are a little puzzling:
>
> > some things with u' fail
> <snip>
> > - [Errno 2] No such file or directory: u'non-existent.file'.
> > + [Errno 2] No such file or directory: 'non-existent.file'.
>
> This is simply str(IOError) and I had to add the u for py2.5 and py2.6... I
> wonder why they disappear with py<2.3.
> Maybe open(filename) converted filename to a str before py2.4.
>
> Can I ask you to run the tests again and include later versions too?

sure

2.2
---

parsers/test_rst/test_directives/test_raw.py: totest['raw'][8]; test_parser (DocutilsTestSupport.ParserTestCase)
input:
.. raw:: html
   :file: non-existent.file

-: expected
+: output
  <document source="test data">
      <system_message level="4" line="1" source="test data" type="SEVERE">
          <paragraph>
              Problems with "raw" directive path:
-             [Errno 2] No such file or directory: u'non-existent.file'.
?                                                  -
+             [Errno 2] No such file or directory: 'non-existent.file'.
          <literal_block xml:space="preserve">
              .. raw:: html
                 :file: non-existent.file

2.3 the one from 2.2 and
--------------------------------

test_parsers/test_rst/test_directives/test_tables.py: totest['csv-table'][12]; test_parser (DocutilsTestSupport.ParserTestCase)
input:
.. csv-table:: no such file
   :file: bogus.csv

-: expected
+: output
  <document source="test data">
      <system_message level="4" line="1" source="test data" type="SEVERE">
          <paragraph>
              Problems with "csv-table" directive path:
-             [Errno 2] No such file or directory: u'bogus.csv'.
?                                                  -
+             [Errno 2] No such file or directory: 'bogus.csv'.
          <literal_block xml:space="preserve">
              .. csv-table:: no such file
                 :file: bogus.csv

2.4, 2.5 and 2.6 succeed
From: Stefan R. <lis...@st...> - 2008-07-25 07:01:02
|
I had some time to install pythons: on 25.07.2008 00:15 engelbert gruber said the following: > > 2.2 > --- <IOError failure> > > 2.3 the one from 2.2 and > -------------------------------- <another IOError failure that is already skipped in 2.2> > > 2.4, 2.5 and 2.6 succeed Since the difference is only in the printed output of an IOError, I think we should not do anything about that in the docutils code but only in the tests, so I added special-case rewriting. It was needed for 2.2 and for 2.3 but only on non-windows platforms. The tests now pass here for - windows: 2.2.3, 2.3.5, 2.4.3, 2.5.1, 2.6b2 - linux: 2.3.6, 2.4.4, 2.5.2 (sorry gentoo does not have a 2.2 in the portage tree anymore) Green light for checking in to trunk? cheers, stefan |
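The message does not show what the special-case rewriting looks like. One way such a test-side normalisation could be done, purely as an assumption for illustration, is to strip the ``u`` prefix from the quoted filename in the expected output before comparing. The name ``normalize_ioerror`` and the regular expression are hypothetical.

```python
import re

# Matches the ``u`` prefix in front of the quoted filename of the
# "No such file or directory" message. Hypothetical helper, not the
# rewriting that was actually added to the Docutils test suite.
_U_PREFIX = re.compile(r"(No such file or directory: )u(')")


def normalize_ioerror(text):
    """Drop the u prefix so expected output matches interpreters that omit it."""
    return _U_PREFIX.sub(r"\1\2", text)
```

Applying it only on the interpreter versions that print the filename without the prefix keeps the expected output unchanged everywhere else.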
From: engelbert g. <gr...@us...> - 2008-07-25 08:40:18
|
On 7/25/08, Stefan Rank <lis...@st...> wrote: > I had some time to install pythons: > > on 25.07.2008 00:15 engelbert gruber said the following: > > > > 2.2 > > --- > <IOError failure> > > > > > 2.3 the one from 2.2 and > > -------------------------------- > > <another IOError failure that is already skipped in 2.2> > > > > > 2.4, 2.5 and 2.6 succeed > > > Since the difference is only in the printed output of an IOError, I > think we should not do anything about that in the docutils code but only > in the tests, so I added special-case rewriting. > It was needed for 2.2 and for 2.3 but only on non-windows platforms. > > The tests now pass here for > > - windows: 2.2.3, 2.3.5, 2.4.3, 2.5.1, 2.6b2 > - linux: 2.3.6, 2.4.4, 2.5.2 > (sorry gentoo does not have a 2.2 in the portage tree anymore) here, linux (ubuntu) 2.2.3, 2.3.7, 2.4.3, 2.5.2, 2.6b2+ all tests pass > Green light for checking in to trunk? i think yes all the best |
From: Beni C. <cb...@us...> - 2008-08-01 15:08:29
|
[Oops, replied privately again. Resending to list.]

On Tue, Jul 22, 2008 at 8:14 PM, David Goodger <go...@py...> wrote:
> On Tue, Jul 22, 2008 at 10:01, Alan G Isaac <ai...@am...> wrote:
>> - import ``configparser as ConfigParser`` if cannot import ``ConfigParser``
>
> Because the module name was changed?
>
This is moot because I assume 2to3 does (or else *should*) handle
renamed std modules. But I want to comment anyway because it's a good
example of the pythonic approach to language rot:

  Don't add workarounds for new versions -- rewrite for the newest
  version (and add workarounds for old versions)!

In this case, it'd be better to ``import ConfigParser as configparser``
if cannot ``import configparser`` (and change all code to refer to
``configparser``).

Reasons:

1. Early adoption of language changes gives your code more time until
   language rot breaks it again. Allowing this is why Python PEPs
   always allow new coding styles (through __future__ if needed)
   before they deprecate old styles!

2. Some day you'll have to rewrite anyway. If you don't do it now,
   your workarounds will begin to pile one upon the other. Like dirty
   socks.

3. Newer Python coding styles are always more pythonic than old ones
   [1]_, so this way your code looks better [2]_.

.. [1] by definition of Pythonic == current Python style
.. [2] by definition of Pythonic == looks good ;-)

Of course if you really need a lot of re-writing, as in the 2.x->3.0
transition, a 2to3 tool is better than manual re-writing. And given
an automatic tool that only works in the forward direction, it's
better to keep old code until you switch.

--
early-adopter-but-otherwise-procrastinating-ly y'rs,
Beni Cherniavsky
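The "rewrite for the newest version, work around the old one" import suggested above might look like the sketch below. The section and option names are made up for the example.

```python
try:
    import configparser  # the Python 3 module name
except ImportError:
    # Workaround for old versions only: bind the old module to the new name.
    import ConfigParser as configparser

# All remaining code refers to ``configparser``, whichever import succeeded.
parser = configparser.RawConfigParser()
parser.add_section("general")            # hypothetical section name
parser.set("general", "tab_width", "8")  # hypothetical option
```

The fallback lives in one place, so it can be deleted in a single step once the old interpreter is no longer supported.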