From: dman <ds...@vm...> - 2001-11-09 22:06:45
This bounced last time.

-D

----- Forwarded message from dman <ds...@ri...> -----

From: dman <ds...@ri...>
Date: Fri, 9 Nov 2001 14:07:51 -0500
To: jyt...@li...
User-Agent: Mutt/1.2.5i
Mail-Followup-To: jyt...@li...

On Fri, Nov 09, 2001 at 04:25:15PM +0000, Finn Bock wrote:
| [dman]
|
| >Could someone please explain to me, again, why jython munges streams
| >after the fashion of ms windows if binary mode isn't specified?
|
| First, was the problem the CR-NL munging or the non-ascii munging
| <wink>? I assume you are running unix so it must have been the
| non-ascii munging that bit you.

The IMAP RFC states that all lines end in CRLF. When I printed out the
last 2 characters of sockfile.readline() I got something (the last
piece of data on the line), then 0xa. All the data was fine, except
that the CR was missing.

| The basic issue is how to deal with characters (16-bits) vs. bytes
| (8-bit). Java has two ways: Stream and Reader, but python only has one
| open() method. I decided to override the 'b' flag for this behavior
| because many (windows) programmers would already know about the 'b' flag
| on the open() function. By re-using the 'b' flag the default text mode
| was obvious because that is what windows uses.

Is the input logic identical to text files on windows, or is there
more to it than that? How does java decide what the encoding of the
data is (i.e. Unicode 16-bit chars or ASCII 8-bit chars)? How does it
decide to remove the CR, but not harm any other data in the stream?

I don't really understand much of Java's java.io package, other than
that it takes some work to figure out which class has the method that
does what you want. IMO Python's read() and readline() methods are so
much simpler and get the job done just as well.

| >I just had a real (annoying) waste of time tracking down why imaplib
| >would throw an unexpected response exception
|
| Have you forgotten how JPython-1.1 did this?

I haven't forgotten because I never knew. I've only used Jython >= 2.0
(and CPython, but that is irrelevant here).

| Data was written through a Writer but reading data was through
| an InputStream, with no way of changing that behaviour. What we have
| now is better by far.

OK, I agree that being able to specify the 'b' flag to make it work
"right" is better than having no way to specify it. (Personally, I
think that all streams should just be streams with no magic munging
under the programmer's feet. That is, I think that there should only
be "binary mode" reading of files and sockets.)

| >(on all correct
| >responses, except UID responses), but worked beautifully with cpython.
| >I am now submitting the patch below to cpython on sourceforge (that's
| >where the module is maintained, right? I know that the debian package
| >uses cpython's modules).
|
| It seems like this change was submitted already:
|
| >http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=469910

Yeah, Martin von Loewis responded that the bug has already been fixed
in CVS and will be included in CPython 2.2.

| I'll apply the same patch to jython's version of imaplib.py in the next
| release.

Cool. BTW, the current version of the Debian package includes the
patch.

-D

----- End forwarded message -----
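
For anyone trying to picture the symptom described in the message (a CRLF-terminated line coming back from readline() with only 0xa at the end), here is a minimal sketch. It is an analogy only: it uses the io module of a current CPython 3 and an in-memory stream, not Jython 2.0's actual stream classes, and the response line and the two helper function names are made up for illustration.

```python
import io

# Hypothetical IMAP-style response line; the protocol terminates lines with CRLF.
RAW = b"* OK server ready\r\n"


def read_line_binary(data):
    """Read one line with no translation: bytes in, bytes out."""
    return io.BytesIO(data).readline()


def read_line_text(data):
    """Read one line through a text layer that translates newlines.

    newline=None asks TextIOWrapper for universal-newline handling,
    so "\r\n" is handed back to the caller as "\n".
    """
    text_stream = io.TextIOWrapper(
        io.BytesIO(data), encoding="ascii", newline=None
    )
    return text_stream.readline()


binary_line = read_line_binary(RAW)
text_line = read_line_text(RAW)

print(repr(binary_line[-2:]))  # b'\r\n' : the CR survives in binary mode
print(repr(text_line[-2:]))    # 'y\n'  : the CR is gone in text mode
```

A parser that expects every line to end in CRLF matches the first result and fails on the second, which is consistent with the unexpected-response exception reported above when the socket file is not opened in binary mode.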