From: Nicolas C. <war...@fr...> - 2004-04-09 13:40:40
> > > o Wouldn't it be better to rename read_i32 and write_i32 to
> > >   read_i31 and write_i31? Then you could add real read_i32 and
> > >   write_i32 functions based on native ints.
> >
> > The values that can be read/written are caml integers limited to
> > 31 bits (on 32-bit platforms; 64-bit platforms have 63-bit integers).
> > But the size of the data read/written is exactly 32 bits.
>
> So the type being read and written is 31 bit and the encoding chosen is
> 32 bit. All other operations are labelled by the type and not the
> encoding. Therefore the names read_i31/write_i31 would be more
> consistent.

Actually no. Operations are labelled by the encoding:
read / write (u)i16 return ints
read / write (null terminated) string
read / write utf8 (not yet here, thanks for the code)

> > But having functions whose names claim to read/write 31 bits
> > looks highly suspicious to people who do not know about ocaml
> > implementation details :-)
>
> I would call this a good thing as it might prevent beginners from making
> mistakes.

There can't be a mistake, since there is a guard for the case where the
32-bit value read cannot be represented as a caml int.

> > > o A slight optimisation of write_byte would be to use
> > >   unsafe_char_of_int.
> >
> > I'll have a look at that.
>
> unsafe_char_of_int is just defined as "%identity". Since we already know
> that the argument is in the right range we can do without the bounds
> check.

Just saw that. Please note that we need to define it again since it's
not exported in pervasives.mli, but I'll make the change.

> > > o What about read_utf8 and write_utf8 ?
> >
> > I don't have any knowledge of internationalization; if you have some
> > ideas about this, please feel free to contribute!
>
> Attached. Please note that the implementation only supports 16 bit
> characters and assumes that each character uses the shortest encoding.

Thanks for the code, I'll put it into IO. Is that the default for UTF-8?
I don't know much about it.

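
To make the "16 bit characters, shortest encoding" remark concrete, here is a minimal sketch of such a writer. It is not the attached ant code and not Extlib's API: the names write_utf8_char and utf8_of_code are hypothetical, and the output goes to a standard Buffer rather than an IO output. It does not reject surrogate code points (0xD800 to 0xDFFF), which strict UTF-8 forbids.

```ocaml
(* Hypothetical helper, not part of Extlib: encode one character code
   in the range 0..0xFFFF as UTF-8, always using the shortest form.
   Assumption: surrogates are passed through unchecked. *)
let write_utf8_char (buf : Buffer.t) (c : int) : unit =
  if c < 0 || c > 0xffff then
    invalid_arg "write_utf8_char: only 16 bit characters are supported"
  else if c < 0x80 then
    (* 1 byte: 0xxxxxxx *)
    Buffer.add_char buf (Char.chr c)
  else if c < 0x800 then begin
    (* 2 bytes: 110xxxxx 10xxxxxx *)
    Buffer.add_char buf (Char.chr (0xc0 lor (c lsr 6)));
    Buffer.add_char buf (Char.chr (0x80 lor (c land 0x3f)))
  end else begin
    (* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx *)
    Buffer.add_char buf (Char.chr (0xe0 lor (c lsr 12)));
    Buffer.add_char buf (Char.chr (0x80 lor ((c lsr 6) land 0x3f)));
    Buffer.add_char buf (Char.chr (0x80 lor (c land 0x3f)))
  end

(* Convenience wrapper returning the encoded bytes as a string. *)
let utf8_of_code (c : int) : string =
  let buf = Buffer.create 3 in
  write_utf8_char buf c;
  Buffer.contents buf
```

With this sketch, utf8_of_code 0x41 yields the single byte "A", while 0xe9 needs two bytes and 0x20ac needs three.
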
> Also note that the code comes straight out of ant. So it's in revised
> syntax and needs to be slightly adapted to the Extlib IO module. (It
> assumes that read_byte returns -1 at end-of-file.)

Shouldn't this branch:
----
else if c < 0xc0 then c (* should never happen *)
----
raise an exception instead?

> > > o Have you made up your mind about supporting seekable streams?
> >
> > Not yet. This would require adding another closure to the IO prototype:
> > I'm not yet sure it's worth it.
>
> Is this a problem? Usually one does not create that many IO objects. So
> the memory consumption should be negligible. Also, when using IO objects
> that do not support seeking, the corresponding slot is initialised by
> some default value. So there is no overhead creating a new closure.

You have a point here. I might add "seek" soon.

Regards,
Nicolas Cannasse
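
The point about the default seek slot can be sketched as follows. This is only an illustration under stated assumptions: the record type, its field names, the Seek_not_supported exception and both constructors are hypothetical and do not reflect the actual Extlib IO prototype. The read_byte convention of returning -1 at end of input follows the assumption mentioned for the ant code.

```ocaml
(* Hypothetical record, loosely modelled on an input made of closures.
   Field and exception names are assumptions, not Extlib's actual API. *)
exception Seek_not_supported

type input = {
  read_byte : unit -> int;   (* returns -1 at end of input *)
  seek : int -> unit;        (* move to an absolute position *)
}

(* One top-level function shared by every non-seekable input. It
   captures no environment, so storing it in the slot allocates no
   extra closure per object. *)
let no_seek (_ : int) : unit = raise Seek_not_supported

(* A seekable input over an in-memory string. *)
let input_of_string (s : string) : input =
  let pos = ref 0 in
  { read_byte = (fun () ->
      if !pos >= String.length s then -1
      else begin
        let c = Char.code s.[!pos] in
        incr pos;
        c
      end);
    seek = (fun p ->
      if p < 0 || p > String.length s then invalid_arg "seek"
      else pos := p) }

(* A non-seekable input built from a generator function: the seek
   slot is simply filled with the shared default. *)
let input_of_fun (next : unit -> int) : input =
  { read_byte = next; seek = no_seek }
```

Only inputs that really support seeking pay for a dedicated closure; all others share no_seek.
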