From: William N. <wne...@cs...> - 2004-04-10 01:16:19
On Apr 9, 2004, at 4:24 PM, Brian Hurt wrote:

> I hadn't thought about it much. Last time I took a spin through the
> crypto libraries, I was more than a little surprised that not one used
> GMP for RSA. Which I found quite surprising, as I'm willing to bet that
> GMP would give you the best possible RSA performance with the least work
> from the implementor. IIRC, GMP already has an "exponentiate in a modular
> field" operation, implemented with tuned assembly code. This is the
> core operation of RSA.

I was a bit surprised by this as well, so I modified cryptokit to use GMP, and I also added in a number of additional features, like DSA, SHA-{256, 384, 512}, a number of random number and prime generation routines, hash chains, etc. I've been meaning to package it up for quite a while now, but I'm an idiot when it comes to things like makefiles, and I was barely able to cobble one together for my own needs... And oh yeah... my ocamldoc code is a bit farkled. Plus there's the whole issue of getting GMP and MLGMP up and running. Anyway, if you have a need for this, I'd be happy to send you what I have.

> For symmetric key and hashing, I was thinking of
> rounding up the hand-tuned assembly versions kicking around. I
> haven't decided what to do with Elliptic curve yet.

As always, the problem with hand tuned assembly is the portability issue, that's why I try to stay away from it where I can -- plus, I'm not sure you really get significant enough benefits from assembly coding the modern symmetric ciphers and hashes. And I've been meaning to add some EC stuff to cryptokit-gmp for a long time. Laziness, it'll get you every time.

> Thinking about it a second, there's a problem with providing Ocaml
> implementations of various symmetric key crypto systems- most of them
> assume you have access to efficient 32-bit integers. Which means you
> either use Int32 at a serious performance and memory hit.

Yep. And this is one of the biggest pains-in-the-ass for me when it comes to OCaml (I do crypto research for a living). There are times I would kill for an efficient word32/64 datatype... Thank goodness we at least have string_unsafe_get and set.

William D. Neumann

"Here come the bunnies with the sugar water,
Do a little dance with the farmer's daughter."
-- The Halo Benders
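To make the Int32 complaint concrete, here is a minimal sketch of a 32-bit rotate-left written two ways. It is an illustration only, not code from cryptokit: the names `mask32`, `rotl32` and `rotl32_boxed` are hypothetical, and the unboxed version assumes a 64-bit OCaml where the native `int` has 63 bits (on a 32-bit build the mask literal does not even fit).

```ocaml
(* Sketch only. Assumption: 64-bit OCaml, so a native int is 63 bits wide
   and can hold any unsigned 32-bit value. *)

(* Keeps the low 32 bits of a native int. *)
let mask32 = 0xFFFF_FFFF

(* Unboxed variant: plain int arithmetic followed by a mask.
   Expects 0 <= x <= mask32 and 0 < n < 32. *)
let rotl32 (x : int) (n : int) : int =
  ((x lsl n) lor (x lsr (32 - n))) land mask32

(* Boxed variant: the same rotation on the standard Int32 type.
   Each intermediate Int32 value may be heap-allocated, which is the
   performance and memory cost mentioned in the thread. Expects 0 < n < 32. *)
let rotl32_boxed (x : int32) (n : int) : int32 =
  Int32.logor
    (Int32.shift_left x n)
    (Int32.shift_right_logical x (32 - n))
```

The masked native-int form is the usual workaround on 64-bit machines; it gives no help on 32-bit ones, which is why a real unboxed word32 type was being wished for.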
From: Brian H. <bh...@sp...> - 2004-04-09 21:50:18
On Fri, 9 Apr 2004, Nicolas Cannasse wrote:

> > On Fri, 9 Apr 2004, Nicolas Cannasse wrote:
> >
> > A couple of questions:
> >
> > > external unsafe_char_of_int : int -> char = "%identity"
> >
> > Is this really any faster than using the pervasives char_of_int?
>
> %identity means it's removed at compile time, it's just a typed Obj.magic.
> on the other side you have :
>
> let char_of_int n =
>   if n < 0 || n > 255 then invalid_arg "char_of_int" else
>   unsafe_char_of_int n

I know what %identity is. I was wondering if it really was faster. Doesn't Ocamlopt have dead code removal?

If we have one issue we constantly butt heads over, it's optimization.

--
"Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it."
- Gene Spafford
Brian
From: Brian H. <bh...@sp...> - 2004-04-09 21:18:36
On Fri, 9 Apr 2004, Sylvain LE GALL wrote:

> > Actually, writing portable C isn't the problem. I routinely write C code
> > which is much more widely portable than Ocaml is.
>
> Well, most of the time, porting from BE to LEndian and trading with
> different int size is a great problem ( for me at least ). You need a
> lot of #define #ifdef... I don't like this... Test coverage for code is
> difficult enough. Adding compile time conditions gives you more code to
> test...

Worse yet, I've worked on systems which were *neither* BE nor LE (hint: DEC tried to switch horses in midstream). BE/LE only come into play when you're reading/writing binary data, or playing games with unions. In the first case, I tend to construct my ints by hand to control endianness. If you're doing I/O, the extra shift/ors aren't that big of a problem. As for playing games with unions, these are bad ideas anyway, don't do them.

As for int sizes, all this means is you need to think about what types things are. One of the worst habits C coders get into is not thinking about what types variables should be. I'd argue this is more important in C than it is in Ocaml. An example: what type should the variable i be in the following code?

    for (i = 0; i < sizeof(arr)/sizeof(arr[0]); ++i) {
        arr[i] = 0;
    }

My answer: i should have the type size_t. Thinking about what types variables should be solves most of the problems (and eliminates the vast majority of pointless typecasts).

I hate ifdefs, except for providing idempotence for header files (at which point I insist on them). But it's surprising how portable you can get without them.

> I agree...
>
> The idea of having zlib, encryption is good but i think it should be not
> part of extlib.

That's where I'm ending up as well. Or at least not C-based encryption and compression. Ocaml code only. Note that there might be a sister project spawned to use the C code for higher performance.

> ps : do you plan to use cryptokit and cryptgps for your library ?

I hadn't thought about it much. Last time I took a spin through the crypto libraries, I was more than a little surprised that not one used GMP for RSA. Which I found quite surprising, as I'm willing to bet that GMP would give you the best possible RSA performance with the least work from the implementor. IIRC, GMP already has an "exponentiate in a modular field" operation, implemented with tuned assembly code. This is the core operation of RSA.

For symmetric key and hashing, I was thinking of rounding up the hand-tuned assembly versions kicking around. I haven't decided what to do with Elliptic curve yet.

Thinking about it a second, there's a problem with providing Ocaml implementations of various symmetric key crypto systems- most of them assume you have access to efficient 32-bit integers. Which means you either use Int32 at a serious performance and memory hit.

Brian
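The "construct my ints by hand" approach carries over directly to OCaml. The following is a minimal sketch, not code from the thread: `read_u32_be` and `write_u32_be` are hypothetical names, and it assumes a 64-bit OCaml so that a native `int` can hold a full unsigned 32-bit value.

```ocaml
(* Sketch only. Assumption: 64-bit OCaml (63-bit native int). *)

(* Assemble a big-endian 32-bit value from four bytes with shifts and ors,
   so the result does not depend on the host's byte order. *)
let read_u32_be (s : string) (off : int) : int =
  (Char.code s.[off] lsl 24)
  lor (Char.code s.[off + 1] lsl 16)
  lor (Char.code s.[off + 2] lsl 8)
  lor Char.code s.[off + 3]

(* Emit the low 32 bits of [v] in big-endian order, one byte at a time. *)
let write_u32_be (b : Bytes.t) (off : int) (v : int) : unit =
  Bytes.set b off       (Char.chr ((v lsr 24) land 0xFF));
  Bytes.set b (off + 1) (Char.chr ((v lsr 16) land 0xFF));
  Bytes.set b (off + 2) (Char.chr ((v lsr 8) land 0xFF));
  Bytes.set b (off + 3) (Char.chr (v land 0xFF))
```

Because every byte is placed explicitly, no #ifdef on host endianness is needed; the cost is the handful of extra shifts and ors mentioned above.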
From: Sylvain LE G. <syl...@po...> - 2004-04-09 21:15:16
Hello,

On Fri, Apr 09, 2004 at 11:00:34PM +0200, Nicolas Cannasse wrote:
> > > Hello,
> > >
> > > Even if i am not an Extlib developer, i permit myself to raise my
> > > voice... Sorry if you don't agree.
> >
> > IMHO, If you're an Extlib user, you have the right to a voice here.
>
> Same opinion here.
>
> > > I think extlib is a great project by the 100% Pure Ocaml coding style. I
> > > think putting 1 byte of C in it is not a good idea. It will break a lot
> > > of things, need to be maintained over a lot of arch ( including Ms Win,
> > > Linux, Unix, Irix et al ). I think it is far too complicated to be
> > > interesting.
> >
> > Actually, writing portable C isn't the problem. I routinely write C code
> > which is much more widely portable than Ocaml is.
> >
> > But the more I think about it, the more I think linking to external
> > libraries has other problems. For example, zlib and gmp both are
> > standard installs on Linux systems. They both exist for Windows, but are
> > not part of the standard installs. So now, to use extlib on windows, you
> > now not only need to get ocaml and extlib, but also zlib and gmp. Plus,
> > the code I'm thinking of grabbing for symmetric key crypto isn't
> > standardly packaged in any library I'm aware of. How we connect to this
> > code is almost irrelevant, the code still needs to be there.
>
> As I told you : you don't need to have the libs installed to compile stubs.
> Now if you want to use the features that need the libs, you only need
> to have the dynamic library in your path so that EmuC can find it. All
> linking is done dynamically at runtime. As for the zlib - and many other
> libraries - there is always some binary available for windows, because we -
> windows programmers - are known to be not clever enough to rebuild from a
> Makefile :-)

Yep... But i think it could be achieved by doing an external module... It has three benefits :
- it doesn't imply C code in extlib
- it doesn't imply a dependency on an external library
- it is a good example to know if you can let people build their own external module for extlib ( ie by this way you can test the completeness of the interface given to external people to build things with extlib ).

> To answer to Sylvain, crypto was just an example of external library, I'm
> not sure it should be part of ExtLib (or maybe only the MD5 - useful enough
> to be standard : anyone contributing ? ).

INRIA has already contributed to this ;-) ( Hash of string are MD5 sums, cryptokit uses it directly, cryptokit already has an MD5 module )

Regards
Sylvain Le Gall
From: Nicolas C. <war...@fr...> - 2004-04-09 21:03:21
> > Hello,
> >
> > Even if i am not an Extlib developer, i permit myself to raise my
> > voice... Sorry if you don't agree.
>
> IMHO, If you're an Extlib user, you have the right to a voice here.

Same opinion here.

> > I think extlib is a great project by the 100% Pure Ocaml coding style. I
> > think putting 1 byte of C in it is not a good idea. It will break a lot
> > of things, need to be maintained over a lot of arch ( including Ms Win,
> > Linux, Unix, Irix et al ). I think it is far too complicated to be
> > interesting.
>
> Actually, writing portable C isn't the problem. I routinely write C code
> which is much more widely portable than Ocaml is.
>
> But the more I think about it, the more I think linking to external
> libraries has other problems. For example, zlib and gmp both are
> standard installs on Linux systems. They both exist for Windows, but are
> not part of the standard installs. So now, to use extlib on windows, you
> now not only need to get ocaml and extlib, but also zlib and gmp. Plus,
> the code I'm thinking of grabbing for symmetric key crypto isn't
> standardly packaged in any library I'm aware of. How we connect to this
> code is almost irrelevant, the code still needs to be there.

As I told you : you don't need to have the libs installed to compile stubs. Now if you want to use the features that need the libs, you only need to have the dynamic library in your path so that EmuC can find it. All linking is done dynamically at runtime. As for the zlib - and many other libraries - there is always some binary available for windows, because we - windows programmers - are known to be not clever enough to rebuild from a Makefile :-)

To answer to Sylvain, crypto was just an example of external library, I'm not sure it should be part of ExtLib (or maybe only the MD5 - useful enough to be standard : anyone contributing ? ).

Regards,
Nicolas Cannasse
From: Nicolas C. <war...@fr...> - 2004-04-09 20:57:03
> On Fri, 9 Apr 2004, Nicolas Cannasse wrote:
>
> A couple of questions:
>
> > external unsafe_char_of_int : int -> char = "%identity"
>
> Is this really any faster than using the pervasives char_of_int?

%identity means it's removed at compile time, it's just a typed Obj.magic. On the other side you have :

    let char_of_int n =
      if n < 0 || n > 255 then invalid_arg "char_of_int" else
      unsafe_char_of_int n

> > let inv_chars =
> >   let a = Array.make 256 (-1) in
> >   for i = 0 to 63 do
> >     Array.unsafe_set a (int_of_char (Array.unsafe_get chars i)) i;
>
> Likewise- especially since this code will be executed only once (on module
> load), is unsafe_set worthwhile to call?
>
> > let c = int_of_char (String.unsafe_get s i) in
>
> Ditto & etc.

It's always better to use unsafe when you're really sure that it will never break. IMHO, basic functions such as b64 encode/decode should be tuned to get the best performance. As long as using unsafe_get instead of get does not break code readability... it's worth doing.

Regards,
Nicolas Cannasse
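The two conversions being compared can be put side by side as a small self-contained sketch. The declarations are the ones quoted in the message; the only addition is that the checked version deliberately shadows the standard `char_of_int` so the snippet runs on its own.

```ocaml
(* The unchecked conversion: "%identity" is a compiler primitive, so no
   function call or range test is emitted. The caller must guarantee that
   the argument is in 0..255. *)
external unsafe_char_of_int : int -> char = "%identity"

(* The checked conversion, as quoted in the thread: one range test, then
   the same primitive. Shadows the standard char_of_int for this sketch. *)
let char_of_int (n : int) : char =
  if n < 0 || n > 255 then invalid_arg "char_of_int"
  else unsafe_char_of_int n
```

The difference under discussion is therefore exactly one comparison and branch per call; whether that matters depends on how hot the call site is, which is the point the two posters disagree on.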
From: Brian H. <bh...@sp...> - 2004-04-09 20:45:31
On Fri, 9 Apr 2004, Nicolas Cannasse wrote:

A couple of questions:

> external unsafe_char_of_int : int -> char = "%identity"

Is this really any faster than using the pervasives char_of_int?

> let inv_chars =
>   let a = Array.make 256 (-1) in
>   for i = 0 to 63 do
>     Array.unsafe_set a (int_of_char (Array.unsafe_get chars i)) i;

Likewise- especially since this code will be executed only once (on module load), is unsafe_set worthwhile to call?

> let c = int_of_char (String.unsafe_get s i) in

Ditto & etc.

Brian
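For reference, here is a sketch of the table initialisation being questioned, written with the ordinary bounds-checked accessors. Only the `inv_chars` loop shape comes from the quoted patch; the contents of `chars` are an assumption (the standard base64 alphabet), since the original definition is not shown in this message.

```ocaml
(* Assumption: [chars] holds the standard base64 alphabet; the quoted patch
   does not show its definition. *)
let chars : char array =
  let alphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/" in
  Array.init 64 (fun i -> alphabet.[i])

(* Inverse table: maps a byte value to its 6-bit base64 index, or -1 when
   the byte is not part of the alphabet. Built once, at module load, using
   the checked a.(i) accessors instead of unsafe_get / unsafe_set. *)
let inv_chars : int array =
  let a = Array.make 256 (-1) in
  for i = 0 to 63 do
    a.(int_of_char chars.(i)) <- i
  done;
  a
```

Since the loop runs only 64 times in the lifetime of the program, the bounds checks here cost next to nothing, which is the substance of the question above.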
From: Sylvain LE G. <syl...@po...> - 2004-04-09 20:18:23
Hello,

On Fri, Apr 09, 2004 at 04:00:37PM -0500, Brian Hurt wrote:
> On Fri, 9 Apr 2004 syl...@po... wrote:
>
> > Hello,
> >
> > Even if i am not an Extlib developer, i permit myself to raise my
> > voice... Sorry if you don't agree.
>
> IMHO, If you're an Extlib user, you have the right to a voice here.

Thanks ;-)

> > I think extlib is a great project by the 100% Pure Ocaml coding style. I
> > think putting 1 byte of C in it is not a good idea. It will break a lot
> > of things, need to be maintained over a lot of arch ( including Ms Win,
> > Linux, Unix, Irix et al ). I think it is far too complicated to be
> > interesting.
>
> Actually, writing portable C isn't the problem. I routinely write C code
> which is much more widely portable than Ocaml is.

Well, most of the time, porting from BE to LEndian and trading with different int size is a great problem ( for me at least ). You need a lot of #define #ifdef... I don't like this... Test coverage for code is difficult enough. Adding compile time conditions gives you more code to test...

> But the more I think about it, the more I think linking to external
> libraries has other problems. For example, zlib and gmp both are
> standard installs on Linux systems. They both exist for Windows, but are
> not part of the standard installs. So now, to use extlib on windows, you
> now not only need to get ocaml and extlib, but also zlib and gmp. Plus,
> the code I'm thinking of grabbing for symmetric key crypto isn't
> standardly packaged in any library I'm aware of. How we connect to this
> code is almost irrelevant, the code still needs to be there.

I agree...

The idea of having zlib, encryption is good but i think it should be not part of extlib.

Kind regards
Sylvain Le Gall

ps : do you plan to use cryptokit and cryptgps for your library ?
From: Brian H. <bh...@sp...> - 2004-04-09 19:54:37
On Fri, 9 Apr 2004 syl...@po... wrote:

> Hello,
>
> Even if i am not an Extlib developer, i permit myself to raise my
> voice... Sorry if you don't agree.

IMHO, If you're an Extlib user, you have the right to a voice here.

> I think extlib is a great project by the 100% Pure Ocaml coding style. I
> think putting 1 byte of C in it is not a good idea. It will break a lot
> of things, need to be maintained over a lot of arch ( including Ms Win,
> Linux, Unix, Irix et al ). I think it is far too complicated to be
> interesting.

Actually, writing portable C isn't the problem. I routinely write C code which is much more widely portable than Ocaml is.

But the more I think about it, the more I think linking to external libraries has other problems. For example, zlib and gmp both are standard installs on Linux systems. They both exist for Windows, but are not part of the standard installs. So now, to use extlib on windows, you now not only need to get ocaml and extlib, but also zlib and gmp. Plus, the code I'm thinking of grabbing for symmetric key crypto isn't standardly packaged in any library I'm aware of. How we connect to this code is almost irrelevant, the code still needs to be there.

Brian
From: <syl...@po...> - 2004-04-09 19:30:00
Hello,

On Fri, Apr 09, 2004 at 08:59:51PM +0200, Nicolas Cannasse wrote:
>
> The point about EmuC is :
> - do we write and maintain C stubs for each of the external libraries we
> might need
> or
> - do we use EmuC only and write stubs in OCaml using EmuC
> If we choose the EmuC way, then we have only one - quite small - C file to
> maintain. That might still be quite big, so maybe we don't have to put
> EmuC in the default install at the beginning, but after we have stubs for
> let's say 5 or more libraries, it will be worth it.
>
> Other choice is to keep away from C, but there are some times (zlib,
> cryptography, ipv6 support, and many others... ) that we can't or that the
> effort to write it again in ocaml is simply too big compared to writing stubs
> and maintaining EmuC. I don't want also to put C code into ExtLib, but maybe
> one day we'll have to :-)

Well, i have nothing against emuc... I think it can come along extlib, but i don't think it is good to have it in extlib ( just my point of view ).

However, have you considered some tools like camlidl ? I use it to wrap some C stuff, and it is very simple and powerful ( but it is not very flexible... ).

It allows you to :

    cp zlib.h camlzlib.idl
    vim camlzlib.idl
      -> explain that this structure should be this one in ocaml, this function returns, if the return is this, then it is an error etc
    camlidl camlzlib.idl
      -> camlzlib.c camlzlib.ml camlzlib.mli
    compile

It is very easy to write stubs with it... I think it is a good tool to easily write C stubs in Ocaml ( moreover it is written by a guy you should know : X. Leroy )

http://caml.inria.fr/camlidl/

I don't say the solution is perfect, but i have so many problems using C on different architectures compared to ocaml portability ( which is sometimes partial -- essentially when it depends on C code ).

Kind regards
Sylvain Le Gall
From: Nicolas C. <war...@fr...> - 2004-04-09 19:02:38
> > No, the problems I have with this idea are:
> >
> > 1) You'd still have C dependencies, breaking the Ocaml-only nature of
> > ExtLib.
> >
> > 2) You would still have to deal with impedance mismatches- C API's that
> > don't translate well into Ocaml. For example, routines which take
> > pointers to variables to return more than one value. Or APIs that depend
> > upon the "shape" of variables (for example, using unions). Or that want
> > to use Macros to inline code. I'm not sure what all the impedance
> > problems might be.
>
> Even if i am not an Extlib developer, i permit myself to raise my
> voice... Sorry if you don't agree.
>
> I think extlib is a great project by the 100% Pure Ocaml coding style. I
> think putting 1 byte of C in it is not a good idea. It will break a lot
> of things, need to be maintained over a lot of arch ( including Ms Win,
> Linux, Unix, Irix et al ). I think it is far too complicated to be
> interesting.
>
> My opinion, is that you can create a new repository especially for
> projects related to extlib... It will be far simpler, and will give
> a dependency on C only in this external module. Just to give you an
> example :
> CVSROOT/
>   extlib/
>   extlib-io-zip/
>   extlib-io-aes/
>   ...
>
> I think it is the least complicated thing you can do...
> Of course, the compilation will require loading other modules than
> extlib, but i think it is worth the effort.

The point about EmuC is :
- do we write and maintain C stubs for each of the external libraries we might need
or
- do we use EmuC only and write stubs in OCaml using EmuC

If we choose the EmuC way, then we have only one - quite small - C file to maintain. That might still be quite big, so maybe we don't have to put EmuC in the default install at the beginning, but after we have stubs for let's say 5 or more libraries, it will be worth it.

Other choice is to keep away from C, but there are some times (zlib, cryptography, ipv6 support, and many others... ) that we can't or that the effort to write it again in ocaml is simply too big compared to writing stubs and maintaining EmuC. I don't want also to put C code into ExtLib, but maybe one day we'll have to :-)

Regards,
Nicolas Cannasse
From: <syl...@po...> - 2004-04-09 18:01:43
Hello,

On Fri, Apr 09, 2004 at 12:33:12PM -0500, Brian Hurt wrote:
> On Fri, 9 Apr 2004, Nicolas Cannasse wrote:
>
> No, the problems I have with this idea are:
>
> 1) You'd still have C dependencies, breaking the Ocaml-only nature of
> ExtLib.
>
> 2) You would still have to deal with impedance mismatches- C API's that
> don't translate well into Ocaml. For example, routines which take
> pointers to variables to return more than one value. Or APIs that depend
> upon the "shape" of variables (for example, using unions). Or that want
> to use Macros to inline code. I'm not sure what all the impedance
> problems might be.

Even if i am not an Extlib developer, i permit myself to raise my voice... Sorry if you don't agree.

I think extlib is a great project by the 100% Pure Ocaml coding style. I think putting 1 byte of C in it is not a good idea. It will break a lot of things, need to be maintained over a lot of arch ( including Ms Win, Linux, Unix, Irix et al ). I think it is far too complicated to be interesting.

My opinion, is that you can create a new repository especially for projects related to extlib... It will be far simpler, and will give a dependency on C only in this external module. Just to give you an example :

    CVSROOT/
      extlib/
      extlib-io-zip/
      extlib-io-aes/
      ...

I think it is the least complicated thing you can do... Of course, the compilation will require loading other modules than extlib, but i think it is worth the effort.

Thank you for reading

Kind regards
Sylvain Le Gall
From: Nicolas C. <war...@fr...> - 2004-04-09 17:10:23
> > I don't like so much the idea of having nread returning a pair.
>
> Fine, I can see why you might want a simpler interface in a
> general-purpose library. I really did it that way because it was
> slightly more convenient with the code I'm using myself, but I guess
> that's not a very good design principle... ^^;
>
> > I have been looking a little and it looks ok to modify the behavior
> > of IO without actually modifying the interface. We just have to
> > replace really_input by input - followed by a String resize if the
> > number of chars read is lower than the requested one.
>
> The only reason I didn't use Pervasives.input is that it doesn't promise
> to return the number of items you requested, even if that many are
> available - for all the other inputs, that does seem to be guaranteed.
> I'm not sure if it really matters or not.
>
> > Please tell me if it's ok for you this way, I'll then make the
> > changes.
>
> Yes, the version you describe would work just as well for me.

Some quite minor modifications have been made to the following functions :
- input_string
- input_channel
- input_enum
- pipe

So now when asked for n elements to read when only n' are available ( and 0 < n' < n ), nread will return the n' available elements instead of raising No_more_input. IO should now work correctly when used on buffered streams.

Regards,
Nicolas Cannasse
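The new behaviour can be illustrated with a small sketch. This is not the ExtLib implementation: `nread` here is a hypothetical stand-alone function over any reader with the same shape as `input ch` (buffer, offset, length, returning the count actually read), and `reader_of_string` is a made-up helper that simulates a source that runs dry.

```ocaml
(* Hypothetical stand-ins; names mirror the thread but this is not ExtLib. *)
exception No_more_input

(* [nread read n] asks [read] for up to [n] bytes and returns what it got.
   A short result is trimmed rather than treated as an error; only an empty
   result (nothing left at all) raises No_more_input. *)
let nread (read : Bytes.t -> int -> int -> int) (n : int) : string =
  if n = 0 then ""
  else begin
    let buf = Bytes.create n in
    let got = read buf 0 n in
    if got = 0 then raise No_more_input
    else Bytes.sub_string buf 0 got   (* the "String resize" step *)
  end

(* A reader over an in-memory string, with the same shape as
   [input ch] partially applied to a channel. *)
let reader_of_string (s : string) : Bytes.t -> int -> int -> int =
  let pos = ref 0 in
  fun buf off len ->
    let k = min len (String.length s - !pos) in
    Bytes.blit_string s !pos buf off k;
    pos := !pos + k;
    k
```

With a five-character source, asking for 3 and then 10 characters yields "hel" and then "lo"; a third request raises No_more_input. The length of the returned string carries the count, which is why no pair is needed in the interface.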
From: Brian H. <bh...@sp...> - 2004-04-09 16:27:11
On Fri, 9 Apr 2004, Nicolas Cannasse wrote:

> The problem is also the cost of reimplementing.
> It was a little painful for me to have to use the zlib, since I know it will
> require all my sub-projects to compile and deploy some C code : the zlib
> itself - but also the CamlZip stubs (for which a Win32 Makefile was not
> given). But after looking at the ZLib format RFC, I somehow thought that it
> would take me quite a lot of time to reimplement it, and in the end maybe
> get a lower quality result (since zlib itself is quite a nice piece of code,
> highly supported and optimized).

Hmm. Poking at it a little bit, it doesn't look that bad. Actually, Ocaml might be a better language to implement them in, as several of the core data structures are already implemented.

> If I had enough time, this is what I will do : [ write EmuC ]

The performance I'm not worried about- I don't think it'd be that bad. Nor the writing of stubs. EmuC would just be a library that shouldn't be used except by knowledgeable people. Rather like the tricks we do in the optimized List module. We're already saying "do as we say, not as we do." This is one of the purposes of libraries, IMHO- the library gets the tricky stuff right, and everyone else uses the library.

No, the problems I have with this idea are:

1) You'd still have C dependencies, breaking the Ocaml-only nature of ExtLib.

2) You would still have to deal with impedance mismatches- C API's that don't translate well into Ocaml. For example, routines which take pointers to variables to return more than one value. Or APIs that depend upon the "shape" of variables (for example, using unions). Or that want to use Macros to inline code. I'm not sure what all the impedance problems might be.

Brian
From: Yamagata Y. <yor...@mb...> - 2004-04-09 16:22:20
From: Achim Blumensath <bl...@la...>
Subject: Re: [Ocaml-lib-devel] IO update (2)
Date: Fri, 9 Apr 2004 16:19:19 +0200

> I would say that it isn't 100% standard compliant but not very serious.
> The 16 bit restriction probably should be fixed. As far as longer
> encodings are concerned some people are of the opinion that they should
> always be rejected. I have no real opinion, I was just lazy.

As of version 4.0, the Unicode standard has changed: using the shortest encoding is now mandatory, for security reasons.

--
Yamagata Yoriyuki
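What "rejecting longer encodings" means in practice can be shown with a small sketch of a UTF-8 decoder. This is not the code under review in the thread: `decode_utf8` is a hypothetical function, it is deliberately limited to sequences of one to three bytes (code points up to U+FFFF), and it does not check for surrogates.

```ocaml
(* Sketch only: decodes the UTF-8 sequence starting at s.[i].
   Returns Some (code_point, byte_length), or None when the sequence is
   malformed, truncated, longer than three bytes, or not the shortest
   possible encoding of its code point. *)
let decode_utf8 (s : string) (i : int) : (int * int) option =
  let len = String.length s in
  (* Payload of a continuation byte (10xxxxxx), if there is one at k. *)
  let cont k =
    if k < len && Char.code s.[k] land 0xC0 = 0x80
    then Some (Char.code s.[k] land 0x3F)
    else None
  in
  if i < 0 || i >= len then None
  else begin
    let b0 = Char.code s.[i] in
    if b0 < 0x80 then Some (b0, 1)
    else if b0 land 0xE0 = 0xC0 then begin
      match cont (i + 1) with
      | Some c1 ->
        let cp = ((b0 land 0x1F) lsl 6) lor c1 in
        (* A two-byte form below U+0080 is overlong: reject it. *)
        if cp < 0x80 then None else Some (cp, 2)
      | None -> None
    end
    else if b0 land 0xF0 = 0xE0 then begin
      match cont (i + 1), cont (i + 2) with
      | Some c1, Some c2 ->
        let cp = ((b0 land 0x0F) lsl 12) lor (c1 lsl 6) lor c2 in
        (* A three-byte form below U+0800 is overlong: reject it. *)
        if cp < 0x800 then None else Some (cp, 3)
      | _ -> None
    end
    else None
  end
```

The security concern is that an overlong form such as the two bytes C0 80 would otherwise decode to U+0000, slipping a character past any check that only looked for its shortest encoding.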
From: Peter J. <pe...@jo...> - 2004-04-09 16:11:23
> I don't like so much the idea of having nread returning a pair.

Fine, I can see why you might want a simpler interface in a general-purpose library. I really did it that way because it was slightly more convenient with the code I'm using myself, but I guess that's not a very good design principle... ^^;

> I have been looking a little and it looks ok to modify the behavior
> of IO without actually modifying the interface. We just have to
> replace really_input by input - followed by a String resize if the
> number of chars read is lower than the requested one.

The only reason I didn't use Pervasives.input is that it doesn't promise to return the number of items you requested, even if that many are available - for all the other inputs, that does seem to be guaranteed. I'm not sure if it really matters or not.

> Please tell me if it's ok for you this way, I'll then make the
> changes.

Yes, the version you describe would work just as well for me.
From: Nicolas C. <war...@fr...> - 2004-04-09 16:05:52
> Actually, this is a deep problem, and not limited to just compression. I
> could see reimplementing zlib in pure Ocaml. But the #2 thing to do is
> encryption/decryption. For which you should not only call out to C, but
> have the C call hand tuned assembly.
[...]
> So, there are three possibilities:
>
> - Stick with implementing everything in Ocaml. This gives us
> portability and library independence (you don't need zlib nor gmp
> installed) at the cost of performance
>
> - Allow C/ASM callouts for performance
>
> - Provide both, and some mechanism for choosing between them (with
> the attendant problems)

The problem is also the cost of reimplementing. It was a little painful for me to have to use the zlib, since I know it will require all my sub-projects to compile and deploy some C code : the zlib itself - but also the CamlZip stubs (for which a Win32 Makefile was not given). But after looking at the ZLib format RFC, I somehow thought that it would take me quite a lot of time to reimplement it, and in the end maybe get a lower quality result (since zlib itself is quite a nice piece of code, highly supported and optimized).

If I had enough time, this is what I would do :
- write a small C library that can alloc/access C memory, load dynlinked libraries, retrieve and call functions from them
- add an Ocaml interface to this C library
- port this C library to all architectures where Ocaml is working (including the right compilation parameters)
- add this C library to ExtLib and install it by default

and then we can write all the stubs to external C libraries in pure OCaml using this small C library ( let's call it EmuC ). We can then provide a lot of stubs for a lot of different C libraries provided that :
- these libraries are dynlinkable
- they can be accessed using EmuC

Good points :
- these stubs will be pure ocaml and depend only on EmuC, so we are minimizing the dependencies
- the user can compile the stubs without actually having the library compiled/installed.
- we only have to maintain and port one file of C code : EmuC

Bad points :
- we have to correctly write the stubs in order to support several versions of the dynlinked library (while only a recompilation would have been needed for C stubs)
- a little speed tradeoff since we're adding a C layer over every memory access we're doing

Regards,
Nicolas Cannasse
From: Brian H. <bh...@sp...> - 2004-04-09 15:35:25
|
On Fri, 9 Apr 2004, Nicolas Cannasse wrote:
> > 1) sinks and sources.
> > 2) filters.
> These two are now reality in the IO module.

This is good: these two were the "core" of the idea, the rest was decoration.

> For example, I managed to wrap CamlZip module on an IO and - without
> changing a line of my code - I can now read/write over normal or compressed
> files. Here's the code that's converting a standard IO into a
> compressed/uncompressed one. I think it's worth including into the ExtLib
> but the problem is it relies on another module (CamlZip, by Xavier Leroy)
> which itself contains C stubs using the ZLib. I think we need an additional
> folder into the ExtLib called "tools" that will be useful parts of code -
> yet not installed because of external dependencies.
> What do you think about the idea ?

Actually, this is a deep problem, and not limited to just compression. I could see reimplementing zlib in pure OCaml. But the #2 thing to do is encryption/decryption, for which you should not only call out to C, but have the C call hand-tuned assembly. 20 cycles/byte encryption speed (approximately the speed of a good symmetric-key cryptosystem like AES or Twofish) sounds real good, until you realize that this means about 10 milliseconds to encrypt a megabyte of data on a 2GHz machine. Similarly, I'm willing to bet that RSA encryption using GMP will be significantly faster than using the OCaml bignum library. Every cycle counts in this special case; even C is too slow.

Compression we might want to reimplement in OCaml. Reimplementing the crypto libraries in pure OCaml, while theoretically possible, would impose severe performance limits.

So, there are three possibilities:
- Stick with implementing everything in OCaml. This gives us portability and library independence (you don't need zlib or gmp installed) at the cost of performance
- Allow C/ASM callouts for performance
- Provide both, and some mechanism for choosing between them (with the attendant problems)

--
"Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford
Brian |
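The 10 millisecond figure quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the 20 cycles/byte and 2GHz values come from the message, while the binary megabyte (1024 * 1024 bytes) and the name `encrypt_ms` are assumptions made for the illustration.

```ocaml
(* Back-of-the-envelope check of the encryption cost quoted in the message. *)
let cycles_per_byte = 20.0       (* figure quoted in the message *)
let clock_hz = 2.0e9             (* 2 GHz machine, as in the message *)
let megabyte = 1024.0 *. 1024.0  (* assumption: binary megabyte *)

(* Milliseconds needed to process [bytes] bytes at the rates above. *)
let encrypt_ms (bytes : float) : float =
  bytes *. cycles_per_byte /. clock_hz *. 1000.0

let () = Printf.printf "%.2f ms per megabyte\n" (encrypt_ms megabyte)
```

Under these assumptions the cost comes out at roughly 10.5 ms per megabyte, i.e. a ceiling of a little under 100 megabytes per second for the cipher alone.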
From: Nicolas C. <war...@fr...> - 2004-04-09 15:30:23
|
> > Actually IO does not work maybe well with buffered streams : the
> > specification is that when you're reading n "elements" (characters or any)
> > from an input, the result will be exactly the n elements you needed. Not
> > more, and not less. This might actually be questionable : if you have spare
> > time, could you suggest what parts of IO module should be modified in order
> > to support buffered streams?
>
> I attach a patch that (IMO) improves things, but it does involve
> changing the interface somewhat - maybe that's still okay at this stage?
>
> It changes the semantics of nread to return anything up to the requested
> number of items. Unlike Pervasives.input, the only situation in which
> fewer are returned is if the end of the stream is reached first. The
> return value is an (int, 'b) pair where the int is the number actually
> returned. A new read_exactly function is added that behaves like nread
> did before, i.e. it calls nread and raises No_more_input if the returned
> item count does not match that requested.
>
> A side-effect of the implementation is that a failed call to
> read_exactly is guaranteed to consume the remainder of the stream. This
> may or may not be desirable, but it was already the case for streams
> wrapping enums, so this merely defines a previously undefined behaviour.
>
> There's one problem, which is input_bits/output_bits, where the old
> behaviour makes more sense. For now all I've done is changed the return
> value of nread for input_bits so it compiles.
>
> Of course, I won't be offended if you don't think it's worth making
> these changes.

I'm sorry, but that's not exactly what I was thinking of. I don't much like the idea of having nread return a pair. For example, if we have a (char,string) input, the int returned is already available as the String.length of the result. Same for an ('a, 'a list) input.

I have been looking into it a little, and it looks OK to modify the behavior of IO without actually modifying the interface. We just have to replace really_input by input, followed by a string resize if the number of chars read is lower than the requested one. Same for the Enum's IOs: currently we're throwing an exception if we're requesting more than is available; we will now just return the partial enum (not an empty one).

Please tell me if it's OK for you this way, I'll then make the changes.

Regards,
Nicolas Cannasse |
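The "input followed by a resize" idea in this message can be sketched as follows. This is a minimal illustration over a plain `in_channel`, not the actual ExtLib change; the name `nread_partial` is hypothetical, and the code uses `Bytes` because strings are immutable in current OCaml.

```ocaml
(* Read up to [n] chars from [ic]. The result is shorter than [n] only when
   end of file is reached first. Hypothetical helper, not the ExtLib code. *)
let nread_partial (ic : in_channel) (n : int) : string =
  let buf = Bytes.create n in
  (* [input] may return fewer chars than asked, so loop until full or EOF. *)
  let rec fill pos =
    if pos >= n then pos
    else
      match input ic buf pos (n - pos) with
      | 0 -> pos                 (* end of file *)
      | k -> fill (pos + k)
  in
  let got = fill 0 in
  (* The "resize": keep only the chars actually read. *)
  Bytes.sub_string buf 0 got
```

The caller recovers the number of chars read from `String.length` of the result, which is the point made against returning a pair.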
From: Peter J. <pe...@jo...> - 2004-04-09 15:16:17
|
> Actually IO does not work maybe well with buffered streams : the
> specification is that when you're reading n "elements" (characters or any)
> from an input, the result will be exactly the n elements you needed. Not
> more, and not less. This might actually be questionable : if you have spare
> time, could you suggest what parts of IO module should be modified in order
> to support buffered streams?

I attach a patch that (IMO) improves things, but it does involve changing the interface somewhat - maybe that's still okay at this stage?

It changes the semantics of nread to return anything up to the requested number of items. Unlike Pervasives.input, the only situation in which fewer are returned is if the end of the stream is reached first. The return value is an (int, 'b) pair where the int is the number actually returned. A new read_exactly function is added that behaves like nread did before, i.e. it calls nread and raises No_more_input if the returned item count does not match that requested.

A side-effect of the implementation is that a failed call to read_exactly is guaranteed to consume the remainder of the stream. This may or may not be desirable, but it was already the case for streams wrapping enums, so this merely defines a previously undefined behaviour.

There's one problem, which is input_bits/output_bits, where the old behaviour makes more sense. For now all I've done is changed the return value of nread for input_bits so it compiles.

Of course, I won't be offended if you don't think it's worth making these changes. |
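The semantics described in this message can be sketched roughly as below. The attached patch is not part of the archive, so this is only an illustration of the described behaviour: the record type `input` and the constructor `of_string` are hypothetical simplifications, while `nread`, `read_exactly` and `No_more_input` are the names used in the message.

```ocaml
exception No_more_input

(* Hypothetical, simplified input: [nread n] returns how many items were
   actually read, together with the items themselves. *)
type 'b input = { nread : int -> int * 'b }

(* An input reading characters from a string. *)
let of_string (s : string) : string input =
  let pos = ref 0 in
  { nread = (fun n ->
      (* Fewer than [n] chars are returned only at the end of the data. *)
      let k = max 0 (min n (String.length s - !pos)) in
      let chunk = String.sub s !pos k in
      pos := !pos + k;
      (k, chunk)) }

(* Behaves like the old nread: all [n] items, or an exception. *)
let read_exactly (i : 'b input) (n : int) : 'b =
  let k, data = i.nread n in
  if k <> n then raise No_more_input;
  data
```

In this sketch a failed `read_exactly` has already consumed whatever was left, which matches the side effect described in the message.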
From: Nicolas C. <war...@fr...> - 2004-04-09 14:39:28
|
> > > Attached. Please note that the implementation only supports 16 bit
> > > characters and assumes that each character uses the shortest encoding.
> >
> > Is it default for UTF8 ? I don't know about it.
>
> I would say that it isn't 100% standard compliant but not very serious.
> The 16 bit restriction probably should be fixed. As far as longer
> encodings are concerned some people are of the opinion that they should
> always be rejected. I have no real opinion, I was just lazy.

I'm not sure, then, that it should be included in IO "as is".

Nicolas Cannasse |
From: Achim B. <bl...@la...> - 2004-04-09 14:16:42
|
Nicolas Cannasse wrote:
> > unsafe_char_of_int is just defined as "%identity". Since we already know
> > that the argument is in the right range we can do without the bounds
> > check.
>
> Just saw that. Please note that we need to define it again since it's not
> exported in pervasives.mli but I'll do the change.

You can also use Char.unsafe_chr instead.

> > Attached. Please note that the implementation only supports 16 bit
> > characters and assumes that each character uses the shortest encoding.
>
> Is it default for UTF8 ? I don't know about it.

I would say that it isn't 100% standard compliant but not very serious. The 16 bit restriction probably should be fixed. As far as longer encodings are concerned some people are of the opinion that they should always be rejected. I have no real opinion, I was just lazy.

> Should not :
> ----
> else if c < 0xc0 then
> c (* should never happen *)
> ---
> raise an exception instead ?

If you like.

Achim
--
Achim Blumensath
LaBRI / Bordeaux
www-mgi.informatik.rwth-aachen.de/~blume |
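The points discussed here (16 bit limit, shortest encoding, raising on a stray continuation byte) can be illustrated with a small sketch. It is not the code attached to the thread, which is not in the archive: `utf8_encode`, `utf8_decode` and `Malformed_utf8` are hypothetical names, the sketch works on strings rather than on IO objects, and it does not check for surrogate code points.

```ocaml
exception Malformed_utf8

(* Encode a code point below 0x10000 using the shortest form. *)
let utf8_encode (b : Buffer.t) (c : int) : unit =
  if c < 0 || c > 0xFFFF then invalid_arg "utf8_encode"
  else if c < 0x80 then Buffer.add_char b (Char.unsafe_chr c)
  else if c < 0x800 then begin
    Buffer.add_char b (Char.unsafe_chr (0xC0 lor (c lsr 6)));
    Buffer.add_char b (Char.unsafe_chr (0x80 lor (c land 0x3F)))
  end else begin
    Buffer.add_char b (Char.unsafe_chr (0xE0 lor (c lsr 12)));
    Buffer.add_char b (Char.unsafe_chr (0x80 lor ((c lsr 6) land 0x3F)));
    Buffer.add_char b (Char.unsafe_chr (0x80 lor (c land 0x3F)))
  end

(* Decode one code point starting at [pos]; returns (code point, next pos). *)
let utf8_decode (s : string) (pos : int) : int * int =
  let byte i =
    if i >= String.length s then raise Malformed_utf8
    else Char.code s.[i] in
  (* Payload of a continuation byte (10xxxxxx), or an exception. *)
  let cont i =
    let c = byte i in
    if c land 0xC0 <> 0x80 then raise Malformed_utf8 else c land 0x3F in
  let c = byte pos in
  if c < 0x80 then (c, pos + 1)
  else if c < 0xC0 then raise Malformed_utf8   (* stray continuation byte *)
  else if c < 0xE0 then begin
    let v = ((c land 0x1F) lsl 6) lor cont (pos + 1) in
    if v < 0x80 then raise Malformed_utf8;     (* overlong form rejected *)
    (v, pos + 2)
  end else if c < 0xF0 then begin
    let v =
      ((c land 0x0F) lsl 12) lor (cont (pos + 1) lsl 6) lor cont (pos + 2) in
    if v < 0x800 then raise Malformed_utf8;    (* overlong form rejected *)
    (v, pos + 3)
  end else raise Malformed_utf8                (* beyond 16 bits: not handled *)
```

Here the decoder takes the stricter of the two positions mentioned in the thread and rejects longer-than-necessary encodings instead of accepting them.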
From: Nicolas C. <war...@fr...> - 2004-04-09 13:40:40
|
> > > o Wouldn't it be better to rename read_i32 and write_i32 to
> > > read_i31 and write_i31 ? Then you could add real read_i32 and
> > > write_i32 functions based on native ints.
> >
> > The values that can be read/written are 31-bit limited caml
> > integers (on 32-bit platforms, since 64-bit ones have 63-bit integers).
> > But the size of the data read / written is exactly 32 bits.
>
> So the type being read and written is 31 bit and the encoding chosen is
> 32 bit. All other operations are labelled by the type and not the
> encoding. Therefore the names read_i31/write_i31 would be more
> consistent.

Actually no. Operations are labelled by the encoding:
- read / write (u)i16 are returning ints
- read / write (null terminated) string
- read / write utf8 (not yet here, thanks for the code)

> > But having functions whose names are claiming to read/write 31 bits
> > looks highly suspicious to people who do not know about ocaml
> > implementation details :-)
>
> I would call this a good thing as it might prevent beginners from making
> mistakes.

There can't be a mistake, since there is a guard for the case where the 32-bit value read cannot be represented as a caml int.

> > > o A slight optimisation of write_byte would be to use
> > > unsafe_char_of_int.
> >
> > I'll have a look at that.
>
> unsafe_char_of_int is just defined as "%identity". Since we already know
> that the argument is in the right range we can do without the bounds
> check.

Just saw that. Please note that we need to define it again since it's not exported in pervasives.mli but I'll do the change.

> > > o What about read_utf8 and write_utf8 ?
> >
> > I don't have knowledge in internationalization, if you have some ideas
> > about this, please feel free to contribute !
>
> Attached. Please note that the implementation only supports 16 bit
> characters and assumes that each character uses the shortest encoding.

Thanks for the code, I'll put it into IO. Is it default for UTF8 ? I don't know about it.

> Also note that the code comes straight out of ant. So it's in revised
> syntax and needs to be slightly adapted to the Extlib IO module. (It
> assumes that read_byte returns -1 at end-of-file.)

Should not :
----
else if c < 0xc0 then
c (* should never happen *)
---
raise an exception instead ?

> > > o Have you made up your mind about supporting seekable streams?
> >
> > Not yet. This would need to add another closure to the IO prototype :
> > I'm not yet sure it's worth it.
>
> Is this a problem? Usually one does not create that many IO objects. So
> the memory consumption should be ignorable. Also, when using IO objects
> that do not support seeking, the corresponding slot is initialised by
> some default value. So there is no overhead creating a new closure.

You have a point here. I might add "seek" soon.

Regards,
Nicolas Cannasse |
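The guard mentioned in this message (32 bits on the wire, 31-bit value in memory) might look something like the sketch below. It is not the ExtLib implementation: `read_i32_checked` and the `Overflow` exception are names chosen for the illustration, little-endian byte order is an assumption, and the input is a string rather than an IO object. The 31-bit range is checked explicitly so the behaviour is the same on 64-bit platforms.

```ocaml
exception Overflow of string

(* Decode 4 little-endian bytes at [pos] into an OCaml int, refusing values
   outside the 31-bit signed range that 32-bit platforms can represent. *)
let read_i32_checked (s : string) (pos : int) : int =
  let byte i = Int32.of_int (Char.code s.[pos + i]) in
  (* Assemble the value from the most significant byte (index 3) down. *)
  let v =
    List.fold_left
      (fun acc i -> Int32.logor (Int32.shift_left acc 8) (byte i))
      0l [3; 2; 1; 0] in
  (* 31-bit signed range: -2^30 .. 2^30 - 1 *)
  if Int32.compare v (-1073741824l) < 0 || Int32.compare v 1073741823l > 0
  then raise (Overflow "read_i32");
  Int32.to_int v
```

Whatever the function is called, a caller reading a full-range 32-bit value gets an exception rather than a silently truncated int.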
From: Nicolas C. <war...@fr...> - 2004-04-09 13:29:54
|
Hi list,

Here's an efficient base64 encode/decode algorithm. For now it only works with strings; Enum and IO support are to be added. Available for review. I'll commit a new Base64 module into the CVS if there are no negative comments about it.

Best regards,
Nicolas Cannasse

----

exception Invalid_char

external unsafe_char_of_int : int -> char = "%identity"

(* 6-bit value -> base64 character *)
let chars = [|
  'A';'B';'C';'D';'E';'F';'G';'H';'I';'J';'K';'L';'M';'N';'O';'P';
  'Q';'R';'S';'T';'U';'V';'W';'X';'Y';'Z';'a';'b';'c';'d';'e';'f';
  'g';'h';'i';'j';'k';'l';'m';'n';'o';'p';'q';'r';'s';'t';'u';'v';
  'w';'x';'y';'z';'0';'1';'2';'3';'4';'5';'6';'7';'8';'9';'+';'/'
|]

(* character code -> 6-bit value, or -1 if not a base64 character *)
let inv_chars =
  let a = Array.make 256 (-1) in
  for i = 0 to 63 do
    Array.unsafe_set a (int_of_char (Array.unsafe_get chars i)) i;
  done;
  a

(* note: no '=' padding is emitted *)
let encode s =
  let data = ref 0 in   (* bit accumulator *)
  let count = ref 0 in  (* number of pending bits in data *)
  let b = Buffer.create 0 in
  let l = String.length s - 1 in
  for i = 0 to l do
    let c = int_of_char (String.unsafe_get s i) in
    data := (!data lsl 8) lor c;
    count := !count + 8;
    while !count >= 6 do
      count := !count - 6;
      let d = (!data asr !count) land 63 in
      Buffer.add_char b (Array.unsafe_get chars d)
    done;
  done;
  (* flush the remaining bits, zero-padded on the right *)
  if !count > 0 then begin
    let d = (!data lsl (6 - !count)) land 63 in
    Buffer.add_char b (Array.unsafe_get chars d);
  end;
  Buffer.contents b

(* note: '=' is not accepted; leftover bits at the end are dropped *)
let decode s =
  let data = ref 0 in
  let count = ref 0 in
  let b = Buffer.create 0 in
  let l = String.length s - 1 in
  for i = 0 to l do
    let c = int_of_char (String.unsafe_get s i) in
    let c = Array.unsafe_get inv_chars c in
    if c = -1 then raise Invalid_char;
    data := (!data lsl 6) lor c;
    count := !count + 6;
    while !count >= 8 do
      count := !count - 8;
      let d = (!data asr !count) land 0xFF in
      Buffer.add_char b (unsafe_char_of_int d);
    done;
  done;
  Buffer.contents b

|
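The encoder posted above emits no '=' padding. For readers who need output that interoperates with padded base64, a padded variant could be sketched as below. This is a separate illustration, not part of the posted module: `alphabet` and `encode_padded` are hypothetical names, and the sketch processes three input bytes per step instead of using the bit accumulator of the original.

```ocaml
let alphabet =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

(* Encode [s], padding the output with '=' to a multiple of 4 characters. *)
let encode_padded (s : string) : string =
  let n = String.length s in
  let b = Buffer.create ((n + 2) / 3 * 4) in
  (* Bytes past the end of the input count as zero. *)
  let byte i = if i < n then Char.code s.[i] else 0 in
  let i = ref 0 in
  while !i < n do
    (* Pack up to three input bytes into one 24-bit group. *)
    let g = (byte !i lsl 16) lor (byte (!i + 1) lsl 8) lor byte (!i + 2) in
    Buffer.add_char b alphabet.[(g lsr 18) land 63];
    Buffer.add_char b alphabet.[(g lsr 12) land 63];
    (* The last two output chars become '=' when the input ran out. *)
    Buffer.add_char b
      (if !i + 1 < n then alphabet.[(g lsr 6) land 63] else '=');
    Buffer.add_char b
      (if !i + 2 < n then alphabet.[g land 63] else '=');
    i := !i + 3
  done;
  Buffer.contents b
```

For example, `encode_padded "Ma"` gives `"TWE="`, where the unpadded encoder in the message would stop at `"TWE"`.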
From: Achim B. <bl...@la...> - 2004-04-09 13:27:25
|
Nicolas Cannasse wrote:
> > o Wouldn't it be better to rename read_i32 and write_i32 to
> > read_i31 and write_i31 ? Then you could add real read_i32 and
> > write_i32 functions based on native ints.
>
> The values that can be read/written are 31-bit limited caml
> integers (on 32-bit platforms, since 64-bit ones have 63-bit integers).
> But the size of the data read / written is exactly 32 bits.

So the type being read and written is 31 bit and the encoding chosen is 32 bit. All other operations are labelled by the type and not the encoding. Therefore the names read_i31/write_i31 would be more consistent.

> But having functions whose names are claiming to read/write 31 bits
> looks highly suspicious to people who do not know about ocaml
> implementation details :-)

I would call this a good thing as it might prevent beginners from making mistakes.

> > o A slight optimisation of write_byte would be to use
> > unsafe_char_of_int.
>
> I'll have a look at that.

unsafe_char_of_int is just defined as "%identity". Since we already know that the argument is in the right range we can do without the bounds check.

> > o What about read_utf8 and write_utf8 ?
>
> I don't have knowledge in internationalization, if you have some ideas
> about this, please feel free to contribute !

Attached. Please note that the implementation only supports 16 bit characters and assumes that each character uses the shortest encoding. Also note that the code comes straight out of ant. So it's in revised syntax and needs to be slightly adapted to the Extlib IO module. (It assumes that read_byte returns -1 at end-of-file.)

> > o Have you made up your mind about supporting seekable streams?
>
> Not yet. This would need to add another closure to the IO prototype :
> I'm not yet sure it's worth it.

Is this a problem? Usually one does not create that many IO objects. So the memory consumption should be negligible. Also, when using IO objects that do not support seeking, the corresponding slot is initialised by some default value. So there is no overhead creating a new closure.

Achim
--
Achim Blumensath
LaBRI / Bordeaux
www-mgi.informatik.rwth-aachen.de/~blume |
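The "default value in the seek slot" argument can be made concrete with a small sketch. The record below is a hypothetical, much simplified stand-in for the IO prototype (the real ExtLib type is not shown in this thread); `no_seek`, `of_string`, `of_fun` and `Not_seekable` are names invented for the illustration.

```ocaml
exception Not_seekable

(* Hypothetical, simplified input record with an extra seek slot. *)
type input = {
  read : unit -> char option;
  seek : int -> unit;
}

(* Shared default: allocated once and reused by every non-seekable input,
   so adding the slot costs no extra closure per object. *)
let no_seek : int -> unit = fun _ -> raise Not_seekable

(* A seekable input over a string. *)
let of_string (s : string) : input =
  let pos = ref 0 in
  { read = (fun () ->
      if !pos < String.length s then begin
        let c = s.[!pos] in incr pos; Some c
      end else None);
    seek = (fun p ->
      if p < 0 || p > String.length s then invalid_arg "seek";
      pos := p) }

(* A non-seekable input built from a generator function. *)
let of_fun (f : unit -> char option) : input = { read = f; seek = no_seek }
```

Only inputs that really support seeking pay for a dedicated closure; the others share `no_seek` and cost one extra record field each.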