From: Ethan M. <merritt@u.washington.edu> - 2005-06-08 22:01:32
On Wednesday 08 June 2005 01:50 pm, Robert Hart wrote:
> I used callgrind (and kcachegrind) to profile gnuplot whilst loading
> this datafile. Turned out >95% of time was spent in the sscanf on
> datafile.c:759

Thank you for taking the time to pin this down.

> Questions: Where did this "NO_FORTRAN_NUMS" option come from, and why
> isn't it enabled? Is Fortran number support useful? Could it be provided
> in an alternative way (perhaps a run time option)?

Excellent questions. Here's a brief excerpt from a Fortran manual:

    A double precision constant has the same form as a scaled real
    constant except that the E is replaced by D. Examples:

       6.1D2    is equivalent to 610.0
      +2.3D3    is equivalent to 2300.0
      -3.5D-1   is equivalent to -0.35
      +4D4      is equivalent to 40000

I have no idea how common this might be in real life data files
fed to gnuplot.

If you are willing, could you run one more check? This entire section
of code, with or without NO_FORTRAN_NUMS, is inside a larger block
which starts with the comment:

#ifdef OSK
    /* apparently %n does not work. This implementation
     * is just as good as the non-OSK one, but close
     * to a release (at last) we make it os-9 specific
     */
    int count;
    char *p = strpbrk(s, "dqDQ");
    if (p != NULL)
        *p = 'e';
    count = sscanf(s, "%lf", &df_column[df_no_cols].datum);
#else
    [Previously analysed code is here in the #else]

The question is whether this comment, which dates back at least to
1999, is in fact correct. Could you please compare the previous
benchmarks to the case where the code is prefixed by:

#define OSK 1

I know this will break the checks for separators other than whitespace,
but we can sort that out afterwards. It would be nice to get a
speed-up of 10X by deleting 80+ lines of obfuscated code :-)

-- 
Ethan A Merritt       merritt@u.washington.edu
Biomolecular Structure Center
Mailstop 357742
University of Washington, Seattle, WA 98195
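
For anyone following the thread, the idea behind the OSK branch quoted above can be tried outside gnuplot with a small standalone sketch. The function name `parse_fortran_number` and the 64-byte buffer are assumptions made for illustration only; they do not come from datafile.c. Unlike the quoted code, this version edits a local copy rather than the caller's string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of gnuplot: parse one numeric field,
 * accepting Fortran-style D/Q exponents such as "6.1D2".
 * Returns 1 on success, 0 if no number could be read. */
int parse_fortran_number(const char *field, double *out)
{
    char buf[64];   /* assumed upper bound on field length; longer input is truncated */
    char *p;

    /* Work on a bounded local copy so the caller's string is untouched. */
    strncpy(buf, field, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    /* Replace the first d/q/D/Q exponent marker with 'e',
     * the same trick the OSK branch uses. */
    p = strpbrk(buf, "dqDQ");
    if (p != NULL)
        *p = 'e';

    /* A single plain sscanf call, with no %n bookkeeping. */
    return sscanf(buf, "%lf", out) == 1;
}
```

Calling it on the four examples from the Fortran manual excerpt ("6.1D2", "+2.3D3", "-3.5D-1", "+4D4") should give 610.0, 2300.0, -0.35 and 40000.0. Like the OSK branch, it does nothing about separators other than whitespace.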