From: Paul G. <pau...@so...> - 2002-05-13 07:20:48
> The problem I'm having if anyone has read this far is with the
> ByteBuffer->String conversion required by Jython. If it's just textual
> data then I have no problem, but I can't get it to work properly with
> binary data, say downloading an image. I'm no expert on this stuff so
> if there are any please contact me.

Here is what I know about ByteBuffer to String conversion. Check
http://www.jugs.ch/html/events/2002/NIO_Presentation/Slide_D02.html.

The intended way to go from a ByteBuffer to a String is via a CharsetDecoder, which assumes that the data in the ByteBuffer has been encoded in a particular character encoding. The CharsetDecoder converts the encoded bytes into Unicode. This will clearly mess up any binary data (e.g. images) that you want to store in a String. (Storing binary data in Strings is of course normal in Python, but very unorthodox in Java.)

In this case, it sounds like you want to use the method labeled "WRONG" in the slide: ByteBuffer.asCharBuffer(). This overlays the memory used by the ByteBuffer with a CharBuffer, without doing any conversion on the contents. In fact both objects, ByteBuffer and CharBuffer, can manipulate the same segment of memory at the same time. This is basically the equivalent of having a char pointer and a byte pointer to the same address in a C program. (Pretty radical for Java, IMHO)

If you have already gotten this far and it does not work, my apologies. Then I am out of ideas.

-Paul

--
Paul Giotta
Software Architect
Technoparkstrasse 1, CH-8005 Zurich.
Email: paul.giotta@Softwired-inc.com
Home Page WWW: http://www.softwired-inc.com
Office: +41 1 4452370 | Fax: +41 1 4452372 | Mobile: +41 76 389 1180
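Editor's sketch of the overlay described above, written in plain Java rather than Jython so it can be run standalone. The class and method names (OverlayDemo, overlayToString) are made up for illustration. It only shows what ByteBuffer.asCharBuffer() does to the bytes; it is not a recommendation. The sample bytes are the first four bytes of a typical JPEG file.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;

final class OverlayDemo {
    /** Views the remaining bytes as chars (two bytes per char), with no decoding. */
    static String overlayToString(ByteBuffer buf) {
        CharBuffer chars = buf.asCharBuffer(); // a view that shares buf's memory
        return chars.toString();               // copies the remaining chars out
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        ByteBuffer bytes = ByteBuffer.wrap(raw); // big-endian by default

        String s = overlayToString(bytes);
        System.out.println(s.length());                        // 2
        System.out.println(Integer.toHexString(s.charAt(0)));  // ffd8

        // Writing through the char view changes the underlying bytes too.
        bytes.asCharBuffer().put(0, 'A');
        System.out.println(raw[0] + " " + raw[1]);             // 0 65
    }
}
```

Four bytes come out as two chars, so the resulting String is half as long as the input and each char holds a 16-bit value. If the buffer holds an odd number of bytes, the last byte is not visible through the char view at all.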
From: brian z. <bz...@zi...> - 2002-05-13 15:47:59
The architecture of the new JDK 1.4 non-blocking sockets does not condone this use of streams. That's why I've been trying to get a generic byte->char conversion for the new Buffers. I did notice, though, that a couple of their demos do use the streams for reading and writing even though the docs say not to do so.

This is really my biggest stumbling block at the moment. I think I am going to make the reading/writing as extensible as possible so I can get something out and see if anyone has problems with it, rather than just sit on this for much longer.

brian

> -----Original Message-----
> From: jyt...@li...
> [mailto:jyt...@li...] On Behalf
> Of Kevin J. Butler
> Sent: Monday, May 13, 2002 9:39 AM
> To: jyt...@li...
> Subject: Re: [Jython-users] SSL connections in Jython
>
> Samuele Pedroni wrote:
> > I would use a bulk get to get from a ByteBuffer to a byte[] and then
> > apply the methods used in PyFile.FileWrapper.getString (yup they are
> > deprecated but as long as they ship they do the job). But then I don't
> > know what kind of access pattern you need ...
>
> How about just doing:
>
>     f = PyFile( socket.getInputStream(), socket.getOutputStream() ) # may need binary flag...
>     s = f.read()
>     f.close()
>
> Seems quite reasonable: "Convert Java streams to Python
> streams, then do what you want with them..."
>
> kb
From: Paul G. <pau...@so...> - 2002-05-14 07:06:00
> > This overlays the memory used by the
> > ByteBuffer with a CharBuffer, without doing any conversion on the
> > contents. In fact both objects, ByteBuffer and CharBuffer, can manipulate
> > the same segment of memory at the same time. This is basically the
> > equivalent of having a char pointer and a byte pointer to the same
> > address in a C program. (Pretty radical for Java, IMHO)
>
> but then if I understand things right, each two byte in the first buffer
> will be considered a single char by the second, while what we want is
> each byte -> a char.

Yes, it gets confusing quite quickly. As I understand it, the original question was something like this:

In [PJ]ython one would typically read the contents of a binary file into a String object, even though it will never be interpreted as character data:

    imageBytes = open("someImage.jpg","rb").read()

The question is how to obtain the same imageBytes String object in Jython using java.nio classes (presumably to take advantage of non-blocking IO and/or better performance). I am just using an image as an example; it could be any binary data.

Now, I presume that if we left out the "b" in the mode parameter of the open() call, then the file contents would be interpreted as text and some character encoding scheme would be applied (implicitly, not explicitly) to convert the bytes in the file to Unicode characters. With the "b" flag, the file contents should be passed through without any conversion, and I do not care whether each byte is interpreted as a character or every two bytes are interpreted as a character, since the data is not text anyway. The only important thing is that whatever code subsequently uses the String imageBytes can access the binary data without it being corrupted.

With java.nio all raw data from sockets and files is read into ByteBuffers. If you know that the data is text, and you know what the encoding is, you can convert it to Unicode explicitly (not implicitly) by instantiating a CharsetDecoder object for the encoding to use. The CharsetDecoder transfers data from a ByteBuffer to a (different) CharBuffer as it does the decoding.

If the ByteBuffer contains binary data and you want to store the binary data in a String object (the equivalent of using the "b" flag above), then presumably the correct thing to do is just overlay the ByteBuffer with a CharBuffer, which can then be converted to a String. (BTW, CharSequence is the new common interface implemented by String, StringBuffer and CharBuffer.)

Whether or not this works depends on exactly how Jython handles binary data in a String. This was certainly trivial when each character was just a byte, and is less trivial now that Strings are Unicode. (Perhaps Python is moving away from using Strings for general byte storage? To be honest, I am not up to date on the newest language developments.)

Anyway, that is my summary of things. Any further questions need to be answered by someone who knows Jython internals better.

-Paul
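Editor's sketch of the explicit decoding step described above, in plain Java. The names ExplicitDecodeDemo, decodeText and isValidText are hypothetical; the java.nio.charset calls are the standard ones. It assumes UTF-8 text purely as an example, and it also shows why the same step is unsuitable for binary data: a decoder obtained from newDecoder() reports malformed input rather than passing it through.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

final class ExplicitDecodeDemo {
    /** Decodes the remaining bytes as text in the named charset. */
    static String decodeText(ByteBuffer encoded, String charsetName) {
        CharsetDecoder decoder = Charset.forName(charsetName).newDecoder();
        try {
            // decode() writes into a new, separate CharBuffer; duplicate()
            // keeps the caller's buffer position untouched.
            return decoder.decode(encoded.duplicate()).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("not valid " + charsetName, e);
        }
    }

    /** Returns true if the remaining bytes are well-formed in the charset. */
    static boolean isValidText(ByteBuffer data, String charsetName) {
        CharsetDecoder decoder = Charset.forName(charsetName).newDecoder();
        try {
            decoder.decode(data.duplicate());
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 'h' followed by the two-byte UTF-8 encoding of U+00E9
        ByteBuffer text = ByteBuffer.wrap(new byte[] {0x68, (byte) 0xC3, (byte) 0xA9});
        System.out.println(decodeText(text, "UTF-8").length()); // 2

        // The first two bytes of a JPEG file are not valid UTF-8.
        ByteBuffer jpegStart = ByteBuffer.wrap(new byte[] {(byte) 0xFF, (byte) 0xD8});
        System.out.println(isValidText(jpegStart, "UTF-8"));    // false
    }
}
```

Three encoded bytes become two chars here, which is the conversion one wants for text and exactly the kind of change that would corrupt image data.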
From: Samuele P. <pe...@in...> - 2002-05-14 11:26:06
From: Paul Giotta <pau...@so...>
> > > This overlays the memory used by the
> > > ByteBuffer with a CharBuffer, without doing any conversion on the
> > > contents. In fact both objects, ByteBuffer and CharBuffer, can manipulate
> > > the same segment of memory at the same time. This is basically the
> > > equivalent of having a char pointer and a byte pointer to the same
> > > address in a C program. (Pretty radical for Java, IMHO)
> >
> > but then if I understand things right, each two byte in the first buffer
> > will be considered a single char by the second, while what we want is
> > each byte -> a char.
>
> Yes, it gets confusing quite quickly. As I understand it, the original
> question was something like this:
>
> In [PJ]ython one would typically read the contents of a binary file into a
> String object, even though it will never be interpreted as character data:
>
>     imageBytes = open("someImage.jpg","rb").read()
>
> The question is how to obtain the same imageBytes String object in Jython
> using java.nio classes (presumably to take advantage of non-blocking IO
> and/or better performance). I am just using an image as an example, it could
> be any binary data.
>
> Now, I presume that if we left out the "b" in the mode parameter of the open()
> call, then the file contents would be interpreted as text and some character
> encoding scheme would be applied (implicitly, not explicitly) to convert the
> bytes in the file to Unicode characters. With the "b" flag, the file contents
> should be passed through without any conversion, and I do not care whether
> each one byte is interpreted as a character, or every two bytes is
> interpreted as a character, since the data is not text anyway. The only
> important thing is that whatever code subsequently uses the String imageBytes
> can access the binary data without it being corrupted.

You don't care, but the Jython rule so far is that binary data is a char sequence with numeric values in the range 0-255 ;)

regards
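Editor's sketch of one way to satisfy the "one char per byte, values 0-255" rule from a ByteBuffer. This is an assumption about how to meet the rule, not something taken from Jython's source: it relies on ISO-8859-1 mapping every byte value to the Unicode code point with the same number. Latin1Demo and its method names are hypothetical. StandardCharsets needs Java 7 or later; on JDK 1.4 the same charset is available through Charset.forName("ISO-8859-1").

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class Latin1Demo {
    /** One char per byte, each char in the range 0-255. */
    static String bytesToBinaryString(ByteBuffer buf) {
        // duplicate() so the caller's position and limit are left alone
        return StandardCharsets.ISO_8859_1.decode(buf.duplicate()).toString();
    }

    /** Inverse mapping; only meaningful if every char is in the range 0-255. */
    static byte[] binaryStringToBytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        String s = bytesToBinaryString(ByteBuffer.wrap(all));
        System.out.println(s.length());                 // 256
        System.out.println((int) s.charAt(255));        // 255
        System.out.println(java.util.Arrays.equals(all, binaryStringToBytes(s))); // true
    }
}
```

Unlike the asCharBuffer() overlay, the String here has the same length as the byte count, and converting back recovers the original bytes.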
From: Samuele P. <pe...@in...> - 2002-05-13 10:24:00
From: Paul Giotta <pau...@so...>
>
> In this case, it sounds like you want to use the method labeled "WRONG" in the
> slide: ByteBuffer.asCharBuffer(). This overlays the memory used by the
> ByteBuffer with a CharBuffer, without doing any conversion on the contents.
> In fact both objects, ByteBuffer and CharBuffer, can manipulate the same
> segment of memory at the same time. This is basically the equivalent of
> having a char pointer and a byte pointer to the same address in a C program.
> (Pretty radical for Java, IMHO)
>

But then, if I understand things right, every two bytes in the first buffer will be treated as a single char by the second, while what we want is each byte -> a char.

I would use a bulk get to get from a ByteBuffer to a byte[] and then apply the methods used in PyFile.FileWrapper.getString (yup, they are deprecated, but as long as they ship they do the job). But then I don't know what kind of access pattern you need ...

regards.
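Editor's sketch of the bulk-get approach, in plain Java. The bulk get is the standard ByteBuffer.get(byte[]) call. The second step is a swapped-in stand-in: instead of the deprecated methods that PyFile.FileWrapper.getString is said to use (not reproduced here), each byte is widened to a char by hand. BulkGetDemo and drainToString are hypothetical names.

```java
import java.nio.ByteBuffer;

final class BulkGetDemo {
    /** Copies the remaining bytes out of buf and maps each byte to one char. */
    static String drainToString(ByteBuffer buf) {
        byte[] bytes = new byte[buf.remaining()];
        buf.get(bytes); // bulk get: copies the bytes and advances the position

        char[] chars = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            // Mask off the sign extension so 0x89 becomes 137, not 0xFF89.
            chars[i] = (char) (bytes[i] & 0xFF);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.put((byte) 0x89).put((byte) 0x50).put((byte) 0x00);
        buf.flip(); // switch from filling the buffer to reading from it

        String s = drainToString(buf);
        System.out.println(s.length());          // 3
        System.out.println((int) s.charAt(0));   // 137
        System.out.println(buf.remaining());     // 0
    }
}
```

Because the buffer is drained, this fits a read loop where the buffer is cleared and refilled after each call; whether that matches the access pattern in question is left open in the message above.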
From: Kevin J. B. <kev...@bi...> - 2002-05-13 14:39:08
Samuele Pedroni wrote:
> I would use a bulk get to get from a ByteBuffer to a byte[]
> and then apply the methods used in PyFile.FileWrapper.getString
> (yup they are deprecated but as long as they ship they do the
> job). But then I don't know what kind of access pattern
> you need ...

How about just doing:

    f = PyFile( socket.getInputStream(), socket.getOutputStream() ) # may need binary flag...
    s = f.read()
    f.close()

Seems quite reasonable: "Convert Java streams to Python streams, then do what you want with them..."

kb