From: Donal K. F. <don...@ma...> - 2008-11-27 14:35:48
|
Michael Kirkham wrote: > Clerical: > > * Jeff and Pascal both indicated they prefer Tcl_Zlib as a prefix > (with unnamed others preferring Zlib_). [I don't care, but Tcl_Zlib > makes more sense to me. Maybe you can put that to vote.] I'm similarly indifferent to this. Can run it as a subsidiary question. > * 'compresseddata' in the Arguments section is not used anywhere and > should be removed. 'data' should be changed to in/out, or (maybe > better) change Put() to 'srcPtr' and Get() to 'dstPtr', and replace > 'data' in Arguments with these. Good points. > * I think the last time I read the TIP I was confused about Put/Get > thinking one was for compression and the other decompression, because > one inflate() or deflate() call both feeds data in and gets data out, > when using zlib directly. If that's how Get/Put are intended to be > used, it should be clarified. I've been reading the TIP again and I think it is clear that isn't how they are used. Instead, a stream is created as either a compressing or decompressing stream and Get/Put are used to transfer data between the stream's buffers and the rest of Tcl. Will need careful documentation though. > * Jeff mentioned in the previous thread something about consistency > between "-limit" and "-count" at the Tcl level and about multiple > compression method pairs versus one pair with options that don't look > to have been addressed. But I only bring them up because I was > skimming the thread and looking at what was changed since, not > because I have an opinion. I'll just defer to Jeff et. al. regarding > this stuff. You can look to the tcl-core archives for Oct 23-25 '08 > for the thread in question. As far as I can see, the two features are independent and you could always get a short read. I also noticed that the various options at the Tcl level can't be passed to the stream creation function with the listed API... > * I'm puzzled by the stream handle being an integer. I'd think it'd > be an opaque pointer, but maybe this is just an implementation issue > and the int is actually a hash table key. In any case, may be better > as a typedef. (Only came up because I was wondering how to determine > bytes consumed by Put(), and checking if TIP234 exposed the real zlib > handle). Ought to be an opaque token. Good catch. > Underspecified: > > * inflate() and deflate() may both return an error result. It's > unspecified whether Zlib_StreamGet()/Put() are intended to return an > error result (in which case, they should take an interp argument and > handle error messages) or number of bytes read/written. Good question. > * Get() should indicate that data is appended to the Tcl_Obj, or else > take a starting offset. In the previous discussions, Pascal > indicated he was leaning toward the former, and I can probably work > with that (resetting Tcl_Obj length to 0 doesn't free memory, right? > If so, it'd be a big performance hit to TkPNG; if not, should be > fine). Appending would be reasonable. When working with a bytearray object, setting its length to zero doesn't deallocate the buffer. (I really hope you're talking about setting the length of a bytearray Tcl_Obj! Doing it for a "raw" one would be much scarier...) > * A single call to inflate() or deflate() may, but will not > necessarily, consume all provided input data or output buffer space. > For Get(), if it always appends, this is fine. For Put(), it's > unspecified what happens with the leftovers. I'm not certain you can > simply keep calling inflate() or deflate() when the output buffer > space is full and expect all input data to be consumed. This is Tcl. We can expand our buffers. But I'm not sure whether we need the streaming engine for PNG handling; can't we just use the "uncompress a whole chunk at once" API? (Yes, I'm very ignorant!) > So, does consumed input get trimmed from the Tcl_Obj so it can be > passed again directly to consume more input? Does TIP234 buffer the > remaining input itself, so the next Put() call adds the new data to > what it's buffered for input? Or does it do neither, in which case > Put() would need to provide the caller with info on how much was > consumed, as well as take an additional offset argument. I'd expect it to put the whole contents of the Tcl_Obj without changing that Tcl_Obj (assuming it's already a bytearray). But until we can get access to the sources I can't tell. I don't know what the result of the Put will be. > I can work with either of these options, but buffering would be more > caller-friendly while offset input+count return would be most > efficient. But it needs to be specified. I'm tempted to say that the Get/Put methods ought to work with pointers to arrays of unsigned chars. It still integrates with a bytearray object easily enough, but also can work well with lower-level code. > Nice to Have: > > * It may be useful (particularly if, as I've suggested a couple times > in the past, base64 decoding and images-as-strings are moved out of > the GIF/PNG code into the generic photo code, turning both into > channels and making it invisible to the format-specific handler) > there were versions taking Tcl_Channels rather than Tcl_Objs. But > it's not a requirement for me. Well, 8.6 now has native base64 handling (see [binary decode]) but apart from that I think dealing with that stuff is a little out of scope, at least of that TIP. (To be exact, it probably ought to be done but I doubt there's the effort to do it in time.) Donal. |