From: Richard J. <ri...@an...> - 2008-06-08 09:18:11
On Mon, Jun 02, 2008 at 08:58:14PM +0200, Florent Monnier wrote:
> > > Have you decided to make the extlib with only ocaml code,
> > > without any external C ?
> >
> > That's the policy - to make it easier to port extlib to any platform
> > which supports OCaml.
>
> Maybe write_double_opt could be joined along with the extlib only as a patch
> and a readme explaining the difference.

It's maybe possible to have optimized versions of extlib functions
which are only switched in if they can be compiled, with OCaml
versions as back ups.  I'll see what others think about that though.

> > Is it possible to match the speed-ups using pure OCaml code?  eg. by
> > carefully looking at the generated assembler (ocamlopt -S) and
> > studying why it might be slow?
>
> yes I have read the gas code of the extlib version compared to the
> one of the mixed ocaml/C, and even without this just reading the
> original code it is easy to understand what makes the difference:

This is the code we're discussing:

  CAMLprim value
  double_cast( value str, value d )
  {
      memcpy((char *)str, (double *)d, sizeof(double));
      return Val_unit;
  }

  external double_cast: buf_str:string -> float -> unit = "double_cast"

  let buf_str = "01234567"

  let write_double_opt_native ch d =
    double_cast ~buf_str d;
    nwrite ch buf_str

As was pointed out in another reply, I don't think this is thread
safe.
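A note on the thread-safety point: the single shared buf_str is
overwritten on every call, so two threads writing doubles at the same
time could clobber each other's bytes. One way to avoid that, sketched
below, is to build a fresh buffer per call from Int64.bits_of_float.
This is only an illustration, not extlib's code: the names
bytes_of_float_le and hex_of_bytes are made up here, it assumes a
current OCaml with the Bytes module (which did not exist in 2008), and
it always emits little-endian bytes regardless of the host. Its speed
relative to the C stub is untested.

```ocaml
(* Sketch only: bytes_of_float_le and hex_of_bytes are hypothetical names,
   not part of extlib. Assumes OCaml >= 4.02 (Bytes module). *)

(* Encode a float as its 8 IEEE 754 bytes, least significant byte first.
   A fresh buffer is allocated on every call, so no buffer is shared
   between threads. *)
let bytes_of_float_le (d : float) : Bytes.t =
  let bits = Int64.bits_of_float d in
  let buf = Bytes.create 8 in
  for i = 0 to 7 do
    let shifted = Int64.shift_right_logical bits (8 * i) in
    let byte = Int64.to_int (Int64.logand shifted 0xFFL) in
    Bytes.set buf i (Char.chr byte)
  done;
  buf

(* Render a buffer as space-separated hex bytes, for inspection. *)
let hex_of_bytes (b : Bytes.t) : string =
  let parts = ref [] in
  for i = Bytes.length b - 1 downto 0 do
    parts := Printf.sprintf "%02x" (Char.code (Bytes.get b i)) :: !parts
  done;
  String.concat " " !parts

let () =
  (* prints: 00 00 00 00 00 00 f0 3f *)
  print_endline (hex_of_bytes (bytes_of_float_le 1.0))
```

The per-call allocation is the price paid for not sharing state; whether
that cost matters would have to be measured.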
Anyhow you can get the same effect in pure OCaml with this hairy bit
of Obj magic:

  open Printf

  let hairy_string_of_float (d : float) =
    let r = Obj.repr d in
    Obj.set_tag r Obj.string_tag;
    (Obj.obj r : string)

  let print_bytes s n =
    for i = 0 to n-1 do
      printf "%02x " (Char.code (String.unsafe_get s i))
    done;
    printf "\n"

  let () =
    print_bytes (hairy_string_of_float 1.0) 8;
    print_bytes (hairy_string_of_float 3.0) 8;
    print_bytes (hairy_string_of_float 1e38) 8;

Output:

  00 00 00 00 00 00 f0 3f
  00 00 00 00 00 00 08 40
  b1 a1 16 2a d3 ce d2 47

Note that the string returned from hairy_string_of_float isn't a
well-formed OCaml string, so it's not safe to call anything except
String.unsafe_get on it.  eg. functions such as String.length will
definitely fail.

I haven't tested the performance, but I did look at the assembly code.
On my x86-64 it's unfortunate that the compiler didn't inline the call
to Obj.set_tag (instead it's a C call, even though the C function is a
two-liner).  You can probably replace it with a call to
String.unsafe_set with a negative offset to modify the tag directly,
and with luck the generated code should be faster than your C impl.

Rich.

-- 
Richard Jones
Red Hat