From: Ethan M. <merritt@u.washington.edu> - 2004-07-16 16:35:31
|
I'm moving this to the mailing list, because I hate the format of
email sent via the SourceForge patch site.

On Friday 16 July 2004 02:22 am, Harald Harders wrote:
> Ethan Merritt wrote in the summary for patchset #992149
>
> internal.c internal.h
> =================
> 1) Define a new internal function f_sprintf(). This is
> visible to the user as a new built-in function
> sprintf("fmt",...), which is the first, and so far the
> only, function in gnuplot that accepts string variables
> as arguments.

> I think this patch looks like a good thing, but I really
> don't like the name `sprintf'. It really sounds like a c
> programmer had no good idea how to call it.

I was not clear enough. This *is* the C language sprintf routine [*].
The gnuplot code just collects the variables and the format and
passes them along to the C library. The documentation will refer
users to "man sprintf" or to any C language manual.

> set xlabel string(" %d %d %d", 1,2,3)
>
> The sprintf (or string) command should really understand all
> gnuplot formats (%t, %T, %l, %L, etc.).

For me, part of the rationale for this work is that the gnuplot
formats are limiting. If there is some particular format that you
cannot produce using a C language printf() variant, then we can
provide a separate routine for that. Or there could be a second
formatting function that specifically uses only the non-C gnuplot
format conversion specifiers.

    mystring = string("Format using %T %L etc", var1, var2)
    set title sprintf("C Format with embedded %s", mystring)

[*] Actually, it's snprintf() because otherwise there is little hope
of preventing all buffer overflows. As I recall, there is an issue
with some platforms not providing snprintf(). My inclination is to
say that string variables are not supported on such platforms.

--
Ethan A Merritt merritt@u.washington.edu
Biomolecular Structure Center Mailstop 357742
University of Washington, Seattle, WA 98195
|
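As a rough illustration of the snprintf() footnote above, here is a minimal sketch of bounded formatting with truncation detection. The helper name format_bounded() is hypothetical and is not taken from the patch; only the standard C99 snprintf() call is assumed.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, NOT gnuplot's f_sprintf(): format one double into a
 * caller-supplied buffer of `size` bytes.
 * Returns 0 on success, -1 if the output was truncated or snprintf failed. */
int format_bounded(char *dest, size_t size, const char *fmt, double value)
{
    /* snprintf never writes more than `size` bytes (including the NUL)
     * and returns the length the full result would have needed. */
    int needed = snprintf(dest, size, fmt, value);

    if (needed < 0 || (size_t)needed >= size)
        return -1;
    return 0;
}
```

A caller can then decide whether a truncated label is an error or merely a warning, which plain sprintf() gives no chance to do.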
From: Hans-Bernhard B. <br...@ph...> - 2004-07-16 22:21:16
|
On Fri, 16 Jul 2004, Harald Harders wrote:

> I would prefer one function that does everything.

I don't think that's an option. Too many of gnuplot's special
formatting specifiers already collide with C printf formats for that
to work: %s, %c, %p, %l.

--
Hans-Bernhard Broeker (br...@ph...)
Even if all the snow were burnt, ashes would remain.
|
From: Harald H. <h.h...@tu...> - 2004-07-17 13:43:34
|
On Sat, 17 Jul 2004, Hans-Bernhard Broeker wrote: > On Fri, 16 Jul 2004, Harald Harders wrote: > > > I would prefer one function that does everything. > > I don't think that's an option. Too many of gnuplot's special > formatting specifier already collide with C printf formats for that to > work: %s, %c, %p, %l. Mmh, you are right here. Nevertheless I think it is more important to provide the format specifiers the gnuplot users are used to than to provide C specifiers. If you are able to write "%l \267 10^{%L}" in tic formats, you also should be able to do this in string variable definitions. Yours Harald -- Harald Harders Langer Kamp 8 Technische Universitaet Braunschweig D-38106 Braunschweig Institut fuer Werkstoffe Germany E-Mail: h.h...@tu... Tel: +49 (5 31) 3 91-3062 WWW : http://www.ifw.tu-bs.de Fax: +49 (5 31) 3 91-3058 |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-17 17:01:08
|
On Sat, 17 Jul 2004, Harald Harders wrote:

> provide the format specifiers the gnuplot users are used to than to
> provide C specifiers. If you are able to write "%l \267 10^{%L}" in tic
> formats, you also should be able to do this in string variable
> definitions.

I fully agree there. Which means we'll need two functions, eventually.

The problem is that %l/%L and similar formats must always be used in
pairs, and the code needs to know which the pairs are: the rounding on
the %l part affects what the right result on %L is. The number 9.999
can come out as 9.999*10^0 or 1.0*10^1.

The two of them could be called gprintf and sprintf or similar.

--
Hans-Bernhard Broeker (br...@ph...)
Even if all the snow were burnt, ashes would remain.
|
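The %l/%L coupling described above can be sketched in a few lines of C. This is only an illustration of why the mantissa and exponent have to be computed together; split_rounded() and format_sci() are hypothetical names, the output layout is invented for the example, and none of this is gnuplot's actual gprintf() code.

```c
#include <stdio.h>

/* Hypothetical helper (not gnuplot's gprintf): split x into a mantissa
 * rounded to `digits` decimals and a power-of-ten exponent, keeping the
 * pair consistent when rounding pushes the mantissa up to 10.
 * Assumes a small `digits` value (the rounding goes through long long). */
void split_rounded(double x, int digits, double *mantissa, int *exponent)
{
    double m = x < 0.0 ? -x : x;
    double scale = 1.0;
    int e = 0;
    int i;

    /* Zero, NaN and infinity cannot be normalised; pass them through. */
    if (!(m > 0.0) || m * 0.5 == m) {
        *mantissa = x;
        *exponent = 0;
        return;
    }
    while (m >= 10.0) { m /= 10.0; e++; }   /* normalise to [1, 10) */
    while (m < 1.0)   { m *= 10.0; e--; }

    for (i = 0; i < digits; i++)
        scale *= 10.0;
    m = (double)(long long)(m * scale + 0.5) / scale;  /* round half up */

    if (m >= 10.0) {    /* rounding overflowed, e.g. 9.999 -> 10.0 */
        m /= 10.0;      /* renormalise the mantissa ...            */
        e++;            /* ... and bump the exponent with it       */
    }
    *mantissa = x < 0.0 ? -m : m;
    *exponent = e;
}

/* Format both halves from the same rounded pair. */
int format_sci(char *dest, size_t size, double x, int digits)
{
    double m;
    int e;

    split_rounded(x, digits, &m, &e);
    return snprintf(dest, size, "%.*f*10^%d", digits, m, e);
}
```

Formatting the two halves independently would risk printing a mantissa of 10.0 next to an exponent of 0; computing them as one pair avoids that.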
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-18 06:32:01
|
On Saturday 17 July 2004 10:00 am, Hans-Bernhard Broeker wrote:
>
> The problem is that %l/%L and similar formats must always be used in
> pairs, and the code needs to know which the pairs are: the rounding on the
> %l part affects what the right result on %L is. The number 9.999 can come
> out as 9.999*10^0 or 1.0*10^1.
>
> The two of them could be called gprintf and sprintf or similar.

I have posted a 2nd patch, stringvars-2, to SourceForge.
stringvars-1 and stringvars-2 are to be applied sequentially.

This one adds run-time evaluation of all quoted strings
beginning "sprintf... that are printed via write_multiline().
It turned out to be amazingly easy. I am very impressed with the
existing implementation of expression evaluation; slotting in
evaluation of string-valued functions "just worked".

Here's a neat example that demonstrates plot-time evaluation:

    set title 'sprintf("Plotted at %s",`date`)'
    plot <something>
    pause 3 "Should show new time"
    replot
    pause 3 "Should show new time"
    replot

NB: The specific placement of single and double quotes is critical
for this to work. See my recent bug report about gnuplot losing
the single-quoted-ness of strings after their initial evaluation.
This example currently works by accident, but I think we should
re-examine how quoted strings are stored in general.

Anyhow, adding additional string-valued functions would be very easy.
Let's discuss which ones might be desirable.

1) gprintf("format",mantissa,exponent)
   Is that the form it should take?

2) Some way to do arithmetic using numerical values stored in
   a string. E.g.
       a = "1.2"
       b = 3.4
       c = a+b
   It would be straightforward, although tedious, to modify every
   existing atomic evaluation routine in internal.c so that it
   recognizes string values during arithmetic. They would be
   converted to (double) using atof(). But is this at all necessary?
   Maybe it is sufficient to simply provide a built-in atof() function.
   Or maybe we don't even need that.

3) user-defined string-valued functions.
   I don't see at the moment how to implement these, although
   I think it would be possible. Are they needed?

4) Do we want any string operations besides concatenation?
   Substrings? String comparison?

--
Ethan A Merritt
Department of Biochemistry & Biomolecular Structure Center
University of Washington, Seattle
|
From: Hans-Bernhard B. <br...@ph...> - 2004-07-18 14:03:48
|
On Sat, 17 Jul 2004, Ethan A Merritt wrote:

> Anyhow, adding additional string-valued functions would be very easy.
> Let's discuss which ones might be desirable.
>
> 1) gprintf("format",mantissa,exponent)
> Is that the form it should take?

No. It should be gprintf("format", number). The crucial difference
is that C's sprintf() supports multiple % formats and uses up exactly
one argument per format specifier, whereas gprintf() only ever has one
argument, but may use more than one format specifier with it. The
syntax of my stop-gap extension to 'set label' looks the way it does
for a reason:

    set label 'a = %l * 10^{%L}', a, ', length = %s %cm', length \
        at 2,3

> 2) Some way to do arithmetic using numerical values stored in
> a string. E.g.
> a = "1.2"
> b = 3.4
> c = a+b
> It would be straightforward, although tedious, to modify every
> existing atomic evaluation routine in internal.c so that it
> recognizes string-values during arithmetic. They would be
> converted to (double) using atof(). But is this at all necessary?

I don't think so. I think we should follow a Java-like approach:
adding a number to a string converts the number to a string and
concatenates. The only way to get a number back from a string would
be by explicit function call.

> 3) user-defined string-valued functions.
> I don't see at the moment how to implement these, although
> I think it would be possible. Are they needed?

I think they are. Users will almost certainly want to be able to do
something like this:

    filename(i)=sprintf("foobar%d.ps", i)

    i=15

    # in a loaded file:
    set output filename(i)
    plot something
    i=i+1
    reread

> 4) Do we want any string operations besides concatenation?
> Substrings? String comparison?

Yes, yes, and possibly, in that order.

--
Hans-Bernhard Broeker (br...@ph...)
Even if all the snow were burnt, ashes would remain.
|
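The "Java-like" rule proposed above (number + string concatenates, and only an explicit call turns a string back into a number) might look roughly like this in C. The struct val type and the function names are assumptions made for this sketch; they are not gnuplot's struct value or any routine from the patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative value type; NOT gnuplot's struct value. */
enum kind { NUMBER, STRING };
struct val {
    enum kind kind;
    double num;          /* used when kind == NUMBER */
    const char *str;     /* used when kind == STRING */
};

/* Render either kind of value as text (long text is truncated to fit). */
static void to_text(const struct val *v, char *buf, size_t size)
{
    if (v->kind == STRING)
        snprintf(buf, size, "%s", v->str);
    else
        snprintf(buf, size, "%g", v->num);
}

/* "+" with a string operand: convert both sides to text and concatenate.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
char *concat_values(const struct val *a, const struct val *b)
{
    char left[256], right[256];
    size_t len;
    char *out;

    to_text(a, left, sizeof left);
    to_text(b, right, sizeof right);
    len = strlen(left) + strlen(right) + 1;
    out = malloc(len);
    if (out != NULL)
        snprintf(out, len, "%s%s", left, right);
    return out;
}

/* The explicit way back from a string to a number. */
double string_to_number(const char *s)
{
    return strtod(s, NULL);
}
```

Under this rule `"1.2" + 3.4` would yield the string "1.23.4", and a script that wants 4.6 has to call the conversion function itself.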
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-18 17:52:10
|
On Sunday 18 July 2004 07:03 am, Hans-Bernhard Broeker wrote:
> > 3) user-defined string-valued functions.
>
> Users will almost certainly want to be able to do something like this:
>
> filename(i)=sprintf("foobar%d.ps", i)
>
> i=15
>
> # in a loaded file:
> set output filename(i)
> plot something
> i=i+1
> reread

But that example does not require a user-defined function.
That is the behaviour you would get anyway, courtesy of
the automagic string evaluation code already written.

    filename = 'sprintf("foobar%d.ps",i)'
    set output filename
    i = 15
    replot
    i = 16
    replot

What it does require (and your user-defined function would also
require) is that term.c:term_set_output() and a few other places
in the actual drivers be modified similarly to what I did for
write_multiline(). They would need to check that the string is
not really a constant, but instead holds a sprintf() command.

'set output <foo>' is messy because <foo> is used for something
other than being printed. Most of the other 'set <foo> "string const"'
commands require no special modification, since the fancy stuff
happens inside the eventual print routine.

--
Ethan A Merritt
Department of Biochemistry & Biomolecular Structure Center
University of Washington, Seattle
|
From: Dave D. <dde...@es...> - 2004-07-21 10:38:27
|
Ethan A Merritt <merritt@u.washington.edu> writes: > On Sunday 18 July 2004 07:03 am, Hans-Bernhard Broeker wrote: >> > 3) user-defined string-valued functions. >> >> Users will almost certainly want to be able to do something like this: >> >> filename(i)=sprintf("foobar%d.ps", i) >> >> i=15 >> > But that example does not require a user-defined function. > That is the behaviour you would get anyway, courtesy of > the automagic string evaluation code already written. > > filename = 'sprintf("foobar%d.ps",i)' Sorry, I'm keeping up with this topic... This may be a stupid question, but does this syntax extend to arbitrary expressions, or just variable names ? filename = 'sprintf("foobar%d.ps", i+1)' or maptofilename(i)=i+1 filename = 'sprintf("foobar%d.ps", maptofilename(i))' dd -- Dave Denholm <dde...@es...> http://www.esmertec.com |
From: Ethan M. <merritt@u.washington.edu> - 2004-07-21 15:30:41
|
On Wednesday 21 July 2004 03:37 am, Dave Denholm wrote: > This may be a stupid question, but does this syntax extend to > arbitrary expressions, or just variable names ? > > filename = 'sprintf("foobar%d.ps", i+1)' > or > maptofilename(i)=i+1 > filename = 'sprintf("foobar%d.ps", maptofilename(i))' Both of those would work. This is a new capability added to the existing expression evaluation code. It still handles everything it already handled, but now it knows how to handle at least some operations on strings also. -- Ethan A Merritt |
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-18 18:42:52
|
On Sunday 18 July 2004 07:03 am, Hans-Bernhard Broeker wrote:
> >
> > 1) gprintf("format",mantissa,exponent)
> > Is that the form it should take?
>
> No. It should be gprintf("format", number). The crucial difference
> is that C's sprintf() supports multiple % formats and uses up exactly
> one argument per format specifier, whereas gprintf() only ever has one
> argument, but may use more than one format specifier with it.

Let me see if I understand this...

The user would type, for example

    set label gprintf("format",var)

and internally this would be converted into a call to the existing
function of the form

    gprintf( (char *)temp, sizeof(temp),
             (char *)format,
             (double)current_radix,
             (double)var);

followed by copying temp into the appropriate place, in this case
the label structure.

From the user's point of view (and the parser's), gprintf always has
exactly two parameters: (char *)format and (double)var.

Is that correct?

--
Ethan A Merritt
Department of Biochemistry & Biomolecular Structure Center
University of Washington, Seattle
|
From: Hans-Bernhard B. <br...@ph...> - 2004-07-18 21:35:53
|
On Sun, 18 Jul 2004, Ethan A Merritt wrote:

> But that example does not require a user-defined function.
> That is the behaviour you would get anyway, courtesy of
> the automagic string evaluation code already written.
>
> filename = 'sprintf("foobar%d.ps",i)'
> set output filename

If that indeed works, and recomputes the sprintf every time the 'set
output' command is reissued, then by comparison with existing gnuplot
machinery for user-defined objects, 'filename' is a function, not a
variable.

So far, the gnuplot paradigm always was that variables have a static
value (unless they're the dummy in a plot command, i.e. 'x', 'y', 't',
...). I.e. so far, variables stored values, not expressions to be
evaluated at some later time. If at all possible, I'd like to keep it
that way for strings, if only to minimize the amount of documentation
we and the users will need to fully explain all this.

> They would need to check that the string is
> not really a constant, but instead holds a sprintf() command.

I'm opposed to having sprintf() *inside* the quotes. It causes new
problems like the '' vs. "" saving issue that we don't really need,
for no real gain.

--
Hans-Bernhard Broeker (br...@ph...)
Even if all the snow were burnt, ashes would remain.
|
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-18 23:14:16
|
> So far, variables stored values, not expressions to be evaluated at
> some later time. If at all possible, I'ld like to keep it that way for
> strings, if only to minimize the amount of documentation we and the users
> will need to fully explain all this.

I disagree, and I think the record of requests for enhancement
supports me on this point. People really want a way of embedding
variables into strings, and having the current value of the variable
substituted in at the time the string is printed. It doesn't matter
whether you call that "re-evaluation of a string" or "storing a function
instead of a string", or "storing an expression to be evaluated later",
that is the desired capability.

> I'm opposed to having sprintf() *inside* the quotes. It causes new
> problems like the '' vs. "" saving issue that we don't really need, for no
> real gain.

On the contrary, it's a huge gain. It means that a very powerful
ability is introduced in a uniform way, yet requires minimal or no
change to the existing code or to the existing storage mechanisms
for strings.

Anything you do _outside_ the quotes means that every single place
that tests for a string constant has to be re-written to handle the
possible substitution of other syntactic entities instead.

But let me repeat again, I do not care what the exact quoting style is.
If it saves confusion, I'm perfectly happy to add a 3rd quote character
besides ' and ", and reserve this new quote character for the case of
re-evaluation at plot time. That would require changing is_quote(),
which is easy, and a lot of places that explicitly test for ' or ",
which is more annoying but certainly doable.

Assume for the moment we use % for this purpose. Then the
documentation would read:

    "<expression>"   evaluate immediately, with substitution
    '<expression>'   evaluate immediately, no substitution
    %<expression>%   evaluate later, with substitution at that time

--
Ethan A Merritt
Department of Biochemistry & Biomolecular Structure Center
University of Washington, Seattle
|
From: Dave D. <dde...@es...> - 2004-07-21 10:44:49
|
Ethan A Merritt <merritt@u.washington.edu> writes: >> So far, variables stored values, not expressions to be evaluated at >> some later time. If at all possible, I'ld like to keep it that way for >> strings, if only to minimize the amount of documentation we and the users >> will need to fully explain all this. > > I disagree, and I think the record of requests for enhancement > support me on this point. People really want a way of embedding > variables into strings, and having the current value of the variable > substituted in at the time the string is printed. It doesn't matter > whether you call that "re-evaluation of a string" or "storing a function > instead of a string", or "storing an expression to be evaluated later", > that is the desired capability. > There is a demand for some way of making a plot with labels showing the current value of a variable. I don't think this necessarily translates into a need for dynamic substitution at plot time. (But I think that point has already been made) >> I'm opposed to having sprintf() *inside* the quotes. It causes new >> problems like the '' vs. "" saving issue that we don't really need, for no >> real gain. > > On the contrary, it's a huge gain. It means that a very powerful > ability is introduced in a uniform way, yet requires minimal or no > change to the existing code or to the existing storage mechanisms > for strings. > > Anything you do _outside_ the quotes means that every single place > that tests for a string constant has to be re-written to handle the > possible substitution of other syntactic entities instead. > In any big program, there comes a time for refactoring. It seems to me that, at the start of a development phase, the fact that a feature can be inserted with minimal code impact does not *necessarily* mean it is the best way to do it. (And even at the end of a coding cycle, it may still be better to leave it out than to distort the syntax or limit the future possibilities. 
I'm speaking generally here, not about this change in particular. On the subject of requested features / large changes... is there a case for introducing array variables which can be read from a file ? Or is this starting to tread on octave's territory ? Loading up a file as a 2-d array of strings (which can be converted to numbers), and being able to plot from an array, may unblock a number of problems. dd -- Dave Denholm <dde...@es...> http://www.esmertec.com |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-18 21:46:09
|
On Sun, 18 Jul 2004, Ethan A Merritt wrote: > On Sunday 18 July 2004 07:03 am, Hans-Bernhard Broeker wrote: > > > > > > 1) gprintf("format",mantissa,exponent) > > > Is that the form it should take? > > > > No. It should be gprintf("format", number). The crucial difference > > is that C's sprintf() supports multiple % formats and uses up exactly > > one argument per format specifier, whereas gprintf() only ever has one > > argument, but may use more than one format specifier with it. > > Let me see if I understand this... > > The user would type, for example > set label gprintf("format",var) > and internally this would be converted into a call to the existing > function of the form > gprintf( (char *)temp, sizeof(temp), > (char *)format, > (double)current_radix, > (double)var); > followed by copying temp into the appropriate place, in this case > the label structure. > > From the user's point of view (and the parser's), gprintf always has > exactly two parameters: (char *)format and (double)var. Yes. It may be useful / necessary to allow for the logarithm base (current_radix), too, but we can worry about that later. -- Hans-Bernhard Broeker (br...@ph...) Even if all the snow were burnt, ashes would remain. |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-19 08:37:39
|
On Sun, 18 Jul 2004, Ethan A Merritt wrote: > Anything you do _outside_ the quotes means that every single place > that tests for a string constant has to be re-written to handle the > possible substitution of other syntactic entities instead. And everything you do inside them means you have to handle the substitution at every single point of the code that actually uses a string. That's not necessarily a much smaller set of places. It could actually be a good deal larger. It could almost certainly all be handled by extending m_quote_capture to proceed evaluating a string-valued expression. > Assume for the moment we use % for this purpose. Then the > documentation would read: > "<expression>" evaluate immediately, with substitution > '<expression>' evaluate immediately, no substitution > %<expression>% evaluate later, with substitution at that time I honestly don't see why we need the middle variant. What is the difference between evaluation and substitution that makes you want control over each of them independently? -- Hans-Bernhard Broeker (br...@ph...) Even if all the snow were burnt, ashes would remain. |
From: Ethan M. <merritt@u.washington.edu> - 2004-07-19 15:26:52
|
On Monday 19 July 2004 01:35 am, you wrote: > On Sun, 18 Jul 2004, Ethan A Merritt wrote: > > Anything you do _outside_ the quotes means that every single place > > that tests for a string constant has to be re-written to handle the > > possible substitution of other syntactic entities instead. > > And everything you do inside them means you have to handle the > substitution at every single point of the code that actually uses a > string. That's not necessarily a much smaller set of places. It could > actually be a good deal larger. Have you actually tried the patchset? Adding the test+evaluation in one place only, the text-printing routine write_multiline(), already catches a large majority of useful cases. Adding it to the file-open code (not sure exactly how many places that is) would catch most of the rest. The minority of text-printing that does not go through write_multiline(), with tic labels being the prime example, should be converted to do so. That would at the same time address several queries about why some of the current text options don't work for tic labels. I may be missing some major class of possible uses for strings, but it seems to me that about covers it right there. What else did you have in mind? > It could almost certainly all be handled by extending m_quote_capture > to proceed evaluating a string-valued expression. That is the second stage, yes, but the first stage is teaching all the parsing routines to accept something other than a quoted string in all the places they currently expect it. > > Assume for the moment we use % for this purpose. Then the > > documentation would read: > > "<expression>" evaluate immediately, with substitution > > '<expression>' evaluate immediately, no substitution > > %<expression>% evaluate later, with substitution at that time > > I honestly don't see why we need the middle variant. What is the > difference between evaluation and substitution that makes you want > control over each of them independently? 
??? That's what we have *now*. You want to remove it? It's the third variant I'm trying to add - evaluation deferred until plot time. [aside: About the only thing I use the existing single-quote mode for is to allow inclusion of double-quotes in the string without having to escape them with backslashes. But I don't think that was the original intent. What else is it useful for?] -- Ethan A Merritt |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-19 18:59:03
|
On Mon, 19 Jul 2004, Ethan Merritt wrote: > On Monday 19 July 2004 01:35 am, you wrote: > > On Sun, 18 Jul 2004, Ethan A Merritt wrote: > > > Anything you do _outside_ the quotes means that every single place > > > that tests for a string constant has to be re-written to handle the > > > possible substitution of other syntactic entities instead. > > > > And everything you do inside them means you have to handle the > > substitution at every single point of the code that actually uses a > > string. That's not necessarily a much smaller set of places. It could > > actually be a good deal larger. > > Have you actually tried the patchset? Not really. I've relied on your descriptions of it for now. > Adding the test+evaluation in one place only, the text-printing > routine write_multiline(), already catches a large majority of > useful cases. Adding it to the file-open code (not sure exactly > how many places that is) would catch most of the rest. Catching most isn't the issue. Catching all of them is. > I may be missing some major class of possible uses for strings, > but it seems to me that about covers it right there. What else did > you have in mind? Every usage of strings anywhere in gnuplot. Datafile names, output file names, 'print' strings, plot elements (axis labels, tick labels, labels, title, key title, key entries), fit 'update' and 'via' files, save/load/call file names, 'cd' names, loadpath names, terminal-wide or label-wise font names. In short, all 98 calls of isstring(), and all 19 calls of m_quote_capture and all 62 of quote_str() > > It could almost certainly all be handled by extending m_quote_capture > > to proceed evaluating a string-valued expression. > > That is the second stage, yes, but the first stage is teaching all the > parsing routines to accept something other than a quoted string in > all the places they currently expect it. Not really. 
The trick I have in mind is to teach m_quote_capture() and friends themselves to accept a quoted string followed by whatever else there is to the string-valued expression it started off. Just as we currently parse the function to be plotted by having the parser eat up as much of the command line as fits the syntax of an expression, a string would continue until the next piece of syntax doesn't match the syntax of a string expression any longer. > > > Assume for the moment we use % for this purpose. Then the > > > documentation would read: > > > "<expression>" evaluate immediately, with substitution > > > '<expression>' evaluate immediately, no substitution > > > %<expression>% evaluate later, with substitution at that time > > > > I honestly don't see why we need the middle variant. What is the > > difference between evaluation and substitution that makes you want > > control over each of them independently? > > ??? > That's what we have *now*. You want to remove it? No. But you had me confused over what your terminology means there. Now, looking at all this a bit closer, it seems like you're understanding "substitution" to mean the stuff we already have for the first two cases (\n, \0123, `backquotes`, ...), and "evaluation" for the new string stuff. Well, so far in gnuplot, late evaluation has been limited to the definition of user-defined functions --- all other expressions are evaluated to fixed results before the command they appear in is finished executing. That method has served us well so far, and before we stray from that path, we should have a solid reason for doing so. I honestly don't see the need for late evaluation from the user interface side of things. I don't see a compelling reason why set title {some expression involving variable i} i = 5 plot something i = 6 plot something else has to produce two different title strings, without even offering the option of getting the same title on both of them. 
Even closer to the point, why should x = 6 set label 'sprintf("%g", x)' at x, f(x) x = 1 show the label at position (6, f(6)), but print the string as '1.0'? From where I sit, that makes no sense whatsoever. So here's a new summary of the matter: late evaluation should be a subject kept separate from that of string variables. If we add late evaluation, it should be added it for both strings and numbers. > [aside: About the only thing I use the existing single-quote mode for > is to allow inclusion of double-quotes in the string without having to > escape them with backslashes. But I don't think that was the original > intent. What else is it useful for?] E.g. for input of strings in terminals like LaTeX or PostScript enhanced, where you really don't want to have to type every \ character twice, and for DOS/Windows filenames, where \n processing would produce rather surprising results. Although the latter can be circumvented by using / to separate directories, which works just as well, but most Windowsers don't know that. -- Hans-Bernhard Broeker (br...@ph...) Even if all the snow were burnt, ashes would remain. |
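The x = 6 / x = 1 label example in the message above comes down to when the format is applied. The following sketch contrasts the two behaviours under discussion; the struct and function names are invented for illustration and do not correspond to gnuplot's label code.

```c
#include <stdio.h>

/* Immediate evaluation: the text is fixed at the moment the label is set. */
struct eager_label {
    char text[64];
};

void eager_set(struct eager_label *label, const char *fmt, double x)
{
    snprintf(label->text, sizeof label->text, fmt, x);
}

/* Late evaluation: remember the format and where the variable lives,
 * and only build the text when the label is actually drawn. */
struct lazy_label {
    const char *fmt;
    const double *var;
};

void lazy_render(const struct lazy_label *label, char *dest, size_t size)
{
    snprintf(dest, size, label->fmt, *label->var);
}
```

With x set to 6 when both labels are defined and changed to 1 before drawing, the eager label still reads "6" while the lazy one reads "1", which is the mismatch between label position and label text that the message objects to.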
From: Ethan M. <merritt@u.washington.edu> - 2004-07-19 21:18:32
|
> > I may be missing some major class of possible uses for strings,
> > but it seems to me that about covers it right there. What else did
> > you have in mind?
>
> Every usage of strings anywhere in gnuplot. Datafile names, output file
> names, 'print' strings, plot elements (axis labels, tick labels, labels,
> title, key title, key entries),

Please try it out. It catches those already except for the tic labels and key entries, which ought to go through write_multiline() but don't currently. I'll add that to the patch, or just add it to cvs separately.

> In short, all 98 calls of isstring(), and all 19
> calls of m_quote_capture and all 62 of quote_str()

No. You are misunderstanding, or I am totally failing to describe things properly. The great thing about this approach, which you suggested yourself, is that none of those need to be changed.

> fit 'update' and 'via' files, save/load/call file names, 'cd' names,
> loadpath names, terminal-wide or label-wise font names.

I agree that filenames are not yet handled in the patchsets. I was not trying for 100% complete coverage in the first go-round.

I will describe the patchsets from a new angle; let's see if I can make it more understandable this time:

Patchset 1:
-----------------

This patchset does 3 things.

(1) It adds STRING as a legal "value" (gp_types.h data structure) and overloads concatenation onto the + operation applied to STRINGS.

(2) It adds a single string-valued function, sprintf(...). This works automagically anywhere that the gnuplot parser accepts a function name. So simple assignment statements, like
        LABEL = sprintf("whatever",var1,var2,var3)
    just work, with no new code needed.

(3) It adds a check for this string-valued function in exactly two places where a function name was not previously accepted by the parser. These are
        set label <something>
    and
        set [xyz...]label <something>
    The latter routine handles titles as well, so the coverage is more complete than you might think at first blush.

Patchset 2
---------------

Adds a string-function evaluator. I don't know what the right term is for this, but basically it just triggers the existing function evaluation code. It is directly modelled on do_line(), except that it requires the top level function being evaluated to return a string.

At a single call site in write_multiline(), it checks for a magic leading sequence of characters in every string that is printed. If the sequence is recognized, it filters the string through the string-function evaluator.

This is a beautifully simple change. At one shot it implements variable substitution into all strings printed via write_multiline(). It also has the effect that the *current* value of variables is substituted in, which is a major bonus in its own right. This is what nearly everyone expects gnuplot to do now, but it doesn't.

Possible next steps (hypothetical patchset 3)
-------------------------------------------------------

Indeed there are other places where the current syntax does not allow a function name including, as you point out, the specification of file names. This could be addressed in several ways.

(1) We can expand the legal syntax at these places one by one as they are needed or requested. Here is the code that went in for labels; other places would be more or less the same:

+#ifdef GP_STRING_VARS
+    /* Allow creation of label text using sprintf() */
+    if (equals(c_token,"sprintf")) {
+        struct value a = {STRING,{NULL}};
+        (void) const_express(&a);
+        this_label->text = a.v.string_val;
+    } else
+#endif
+
+    /* get text from string */
     if (!END_OF_COMMAND && isstring(c_token)) {

(2) We could replace fopen() with a newly-defined gp_fopen() and add the string evaluation code only in the new routine.

(3) We could instead use the approach of my older "userstrings" patch, which essentially implements command-line macro definitions. This would not require changing the individual parsing fragments, but would open up a new front in the argument^h^h^h^h err.. discussion. Since the macro-substitution is done on the input line as a text string, it is independent of the parsing routines that will later come into play.

> I honestly don't see the need for late evaluation from the user interface
> side of things. I don't see a compelling reason why
>
>   set title {some expression involving variable i}
>   i = 5
>   plot something
>   i = 6
>   plot something else
>
> has to produce two different title strings, without even offering the
> option of getting the same title on both of them.

Straw man argument. There are plenty of options for having the title come out the same. What we are talking about is how to get it to come out differently. If you want a more obvious, frequently requested, example:

    set title 'sprintf("Fit cycle %d finished at `date`, A = %7.4f B = %7.4f", \
               ncyc,A,B)'
    ncyc = ncyc+1; fit f(x) 'data' via A,B; plot f(x)
    ncyc = ncyc+1; fit f(x) 'data' via A,B; plot f(x)
    <wash, rinse, repeat as often as you like>

>   x = 6
>   set label 'sprintf("%g", x)' at x, f(x)
>   x = 1
>
> show the label at position (6, f(6)), but print the string as '1.0'?
> From where I sit, that makes no sense whatsoever.

Well, so don't do that. There is no requirement that you have to select plot-time evaluation of strings if it doesn't make sense.

> So here's a new summary of the matter: late evaluation should be a
> subject kept separate from that of string variables.

That's why I split things into two patchsets. But I hardly think they are independent.

> If we add late evaluation, it should be added for both strings and numbers.

Please. Just try the patchsets. It does work for both strings and numbers, at least within the realm it is trying to address. I'm sure you can come up with things it doesn't do at all, but that by itself is not a good argument against using it for the things it *does* do.

And on the tangential matter of quote styles:

> > [aside: About the only thing I use the existing single-quote mode for
> > is to allow inclusion of double-quotes in the string without having to
> > escape them with backslashes. But I don't think that was the original
> > intent. What else is it useful for?]
>
> E.g. for input of strings in terminals like LaTeX or PostScript enhanced,
> where you really don't want to have to type every \ character twice, and
> for DOS/Windows filenames, where \n processing would produce rather
> surprising results.

No comment on the DOS/Windows issue, but from my unix-centric perspective this is not at all what the single-quote convention is expected to do. Following the conventions of both sh- and csh-derived shells, enclosing a character string inside single quotes should mean that absolutely nothing at all is done to it. No fiddling with back-slashes, no substitution of variables, no execution of shell escapes.

IMHO gnuplot should do the same thing - save the single-quoted string as is with no fiddling. Right now gnuplot takes the more complicated and error-prone route of trying to figure out what would have been required to type the string as a double-quoted string instead, and saving that. For what gain? It should just save the string as entered, with a flag that it was given in single quotes.

--
Ethan A Merritt merritt@u.washington.edu
Biomolecular Structure Center Mailstop 357742
University of Washington, Seattle, WA 98195
|
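The Patchset 2 idea in the message above (check every printed string for a magic leading sequence and, if present, run it through a string-function evaluator) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: `MAGIC_PREFIX`, `filter_string()`, `string_evaluator` and `demo_eval()` are hypothetical names, and the toy evaluator ignores the expression and simply reports a counter so that the "current value at print time" effect is visible.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trigger sequence; the thread has not settled on one. */
#define MAGIC_PREFIX "sprintf("

/* Stand-in for the real expression evaluator (an assumption, not gnuplot API).
 * Writes the evaluated text to out and returns 0 on success, -1 on failure. */
typedef int (*string_evaluator)(const char *expr, char *out, size_t outlen);

/* Return 1 if the stored text starts with the magic prefix. */
static int needs_evaluation(const char *text)
{
    return strncmp(text, MAGIC_PREFIX, strlen(MAGIC_PREFIX)) == 0;
}

/* Produce the text that should actually be printed: plain strings are copied
 * unchanged, marked strings are passed through the evaluator first. */
static int filter_string(const char *text, string_evaluator eval,
                         char *out, size_t outlen)
{
    assert(text != NULL && out != NULL && outlen > 0);
    if (eval != NULL && needs_evaluation(text))
        return eval(text, out, outlen);
    int n = snprintf(out, outlen, "%s", text);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}

/* Toy evaluator for demonstration only: it does not parse expr at all,
 * it just formats the current value of a counter. */
static int demo_counter = 0;

static int demo_eval(const char *expr, char *out, size_t outlen)
{
    (void)expr;
    int n = snprintf(out, outlen, "cycle %d", demo_counter);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

Calling `filter_string()` twice on the same stored text, with `demo_counter` changed in between, yields two different printed strings, while a string without the prefix comes back untouched. The caller that stored the string never has to be modified, which is the property the message emphasizes.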
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-20 03:27:41
|
On Monday 19 July 2004 02:18 pm, Ethan Merritt wrote:
>
> Please try it out. It catches those already except for the tic labels and
> key entries, which ought to go through write_multiline() but don't
> currently. I'll add that to the patch, or just add it to cvs separately.

Heh. I was mis-remembering. Most of the tic labels do already use write_multiline(). It's only the colorbox tics that bypass it for some reason. So that and the key entries are the missing pieces I can find.

--
Ethan A Merritt
Department of Biochemistry & Biomolecular Structure Center
University of Washington, Seattle
|
From: Hans-Bernhard B. <br...@ph...> - 2004-07-20 09:01:44
|
On Mon, 19 Jul 2004, Ethan Merritt wrote:

> > In short, all 98 calls of isstring(), and all 19
> > calls of m_quote_capture and all 62 of quote_str()
>
> No. You are misunderstanding, or I am totally failing to describe
> things properly.

A combination of both, it appears. I had lost the overview somewhere along the line, and you didn't see it happen.

> Patchset 1:
> -----------------
>
> This patchset does 3 things.
[...]
> (3) It adds a check for this string-valued function in exactly two
>     places where a function name was not previously accepted by
>     the parser.

I think what I'm getting at is that this check had better go into isstring() instead --- the function that every command parser fragment is supposed to be using to check whether the upcoming command line token is a string or not.

> Patchset 2
> ---------------
>
> Adds a string-function evaluator. I don't know what the right term is
> for this, but basically it just triggers the existing function
> evaluation code. It is directly modelled on do_line(), except that it
> requires the top level function being evaluated to return a string.
>
> At a single call site in write_multiline(), it checks for a magic
> leading sequence of characters in every string that is printed. If the
> sequence is recognized, it filters the string through the
> string-function evaluator.

Ah, thanks, now this is a whole lot clearer than before. So this patch is what I've been referring to as late evaluation, i.e. the technique of storing an expression, rather than its result, to be evaluated at the latest possible time. I'm still not convinced that 'sprintf( is a suitable choice of trigger string, but now at least I understand what you're doing.

I.e. we're actually in perfect agreement as to what the necessary features are --- we just used such completely different terms that we confused each other completely. Thanks for staying with me long enough that we could finally sort this out.

> Possible next steps (hypothetical patchset 3)
> -------------------------------------------------------
>
> Indeed there are other places that the current syntax does not
> allow a function name including, as you point out, the specification
> of file names. This could be addressed in several ways.
>
> (1) We can expand the legal syntax at these places one by
>     one as they are needed or requested.

That might be an endless journey. Better do it in a more brutal fashion: either by documenting and recommending the ''+ trick to signal a string expression coming up, or by changing isstring() and friends, as I suggested earlier.

> (2) We could replace fopen() with a newly-defined gp_fopen() and
>     add the string evaluation code only in the new routine.

> No comment on the DOS/Windows issue, but from my unix-centric
> perspective this is not at all what the single-quote convention is
> expected to do. Following the conventions of both sh- and csh-
> derived shells, enclosing a character string inside single quotes
> should mean that absolutely nothing at all is done to it.

And so far, before your second patchset, nothing is --- that patchset actually breaks that rule. Single-quoted strings are used as-is, unmodified. Except that when they're written out to save files, which may not even stay on the same platform as the gnuplot executable they were built on, it's unreliable to store them as single-quoted ones. That's where conv_text() comes in and generates an ASCII-only, portable representation for us. The actual problem here was simply a bug in conv_text().

This save file issue is one on which we can get no guidance from Unix shells --- they don't save states of internal settings to files meant to be read back in.

--
Hans-Bernhard Broeker (br...@ph...)
Even if all the snow were burnt, ashes would remain.
|
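To make the save-file point concrete, here is a minimal sketch of what an "ASCII-only, portable representation" of a stored string could look like. It is not gnuplot's conv_text(); the function name `to_portable_ascii()` and the exact escaping rules (backslash-escape `"` and `\`, write every other non-printable or non-ASCII byte as a three-digit octal escape) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write src into dst as a double-quoted, ASCII-only string.
 * Hypothetical helper: returns the number of characters written
 * (excluding the terminating NUL), or -1 if dst is too small. */
static int to_portable_ascii(const char *src, char *dst, size_t dstlen)
{
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    if (dstlen < 3)                 /* room for two quotes and the NUL */
        return -1;
    dst[used++] = '"';

    for (const unsigned char *p = (const unsigned char *)src; *p; p++) {
        char piece[5];              /* longest piece is "\ooo" plus NUL */
        int n;

        if (*p == '"' || *p == '\\')
            n = snprintf(piece, sizeof piece, "\\%c", *p);
        else if (*p >= 0x20 && *p < 0x7f)
            n = snprintf(piece, sizeof piece, "%c", *p);
        else
            n = snprintf(piece, sizeof piece, "\\%03o", (unsigned)*p);

        /* keep room for the closing quote and the NUL */
        if (used + (size_t)n + 2 > dstlen)
            return -1;
        memcpy(dst + used, piece, (size_t)n);
        used += (size_t)n;
    }

    dst[used++] = '"';
    dst[used] = '\0';
    return (int)used;
}
```

With these rules a single-quoted Windows path such as C:\tmp\new would be saved with its backslashes doubled, so that reading the double-quoted form back in does not turn `\n` or `\t` into control characters.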
From: Ethan M. <merritt@u.washington.edu> - 2004-07-20 17:47:01
|
On Tuesday 20 July 2004 01:53 am, Hans-Bernhard Broeker wrote:
> > (3) It adds a check for this string-valued function in exactly two
> >     places where a function name was not previously accepted by
> >     the parser.
>
> I think what I'm getting at that this check had better go into isstring()
> instead --- the function that every command parser fragment is supposed to
> be using to check whether the upcoming command line token is a string or
> not.

That turns out not to help. Consider a typical call site, one of the 98 you counted:

set.c (set_fontpath) line 1801:

    if (isstring(c_token)) {
        int len;
        char *ss = gp_alloc(token_len(c_token), "tmp storage");
        len = (collect? strlen(collect) : 0);
        quote_str(ss,c_token,token_len(c_token));
        collect = gp_realloc(collect, len+1+strlen(ss)+1, "tmp fontpath");
        if (len != 0) {
            strcpy(collect+len+1,ss);
            *(collect+len) = PATHSEP;
        }
        else
            strcpy(collect,ss);
        free(ss);
        ++c_token;
    }

I am not clever enough to re-write isstring() and quote_str() such that the code at this call site works when the length of the string changes mid-stream. So trying to be clever in isstring() and quote_str() paradoxically makes *more* work, since every call site would have to be inspected and re-written for compatibility.

If the program were being re-written from scratch, then yes. But in the interest of sanity and not introducing 98 possible new bug sites, I would rather go for a solution that leaves the input code intact and focuses instead on identifying a small number of places where the string is actually used for something.

case 1: the string is printed. I dealt with this by adding code to write_multiline().

case 2: the string is used as a file name. I haven't handled this yet, but I propose to create a new routine gp_fopen() which contains the same check for magic characters at the start of a string that I inserted into write_multiline(). Yes, this requires changing call sites from
        fd = fopen("...")
    to
        fd = gp_fopen("...")
    but this would be a piece of cake compared to rethinking and rewriting 98 sites like the one above.

case 3: Are there any more end-uses for a string?

> Ah, thanks, now this is a whole lot clearer than before. So this patch is
> what I've been referring to as late evaluation, i.e. the technique of
> storing an expression, rather than its result, to be evaluated at the
> latest possible time. I'm still not convinced that 'sprintf( is a
> suitable choice of trigger string, but now at least I understand what
> you're doing.

I am not trying to convince anyone that "sprintf" is the best choice. The only virtue it has is that we can refer people to "man sprintf" for help with the format specifiers.

I would rather have either an alternate quote character (something different from ' or ") or a single magic character following a normal quote character.

        set title %<expression>%
or
        set title "!<expression>"

Unfortunately the '!' character is a unary operator already, as are '+' and '^' and many of the other obvious candidates. So sticking it in front of a general expression is at best a bit confusing, and at worst ambiguous to parse. I hesitate to suggest '$', but at least it is not a unary operator.

The other alternative is a pseudo-function such as

        set title %(<expression>)

but that leads us right back to the problem of those 98 call sites that are expecting a quote character.

Please note that this doesn't get away from having to type "sprintf" if that is in fact the function you want to evaluate. So even with a single-%-character quoting convention, the full form would be

        set title %sprintf("<format>",var1,var2)%

but if we allow other string-valued functions, then any one of them might replace "sprintf" in this example. Hypothetically it would look like

        set title %myfunc(par1,par2)%

--
Ethan A Merritt merritt@u.washington.edu
Biomolecular Structure Center Mailstop 357742
University of Washington, Seattle, WA 98195
|
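For "case 2" in the message above, a gp_fopen() wrapper would apply the same magic-prefix check on the file-opening path, so only the fopen() call sites change and the parsing code stays as it is. The sketch below shows one possible shape under assumptions: `gp_fopen_sketch()`, `resolve_filename()`, `name_evaluator` and the toy `demo_name_eval()` are all hypothetical, the prefix is just the one currently under discussion, and the evaluator is passed in explicitly only to keep the example self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trigger sequence, same idea as in write_multiline(). */
#define MAGIC_PREFIX "sprintf("

/* Stand-in for the string-function evaluator (an assumption). */
typedef int (*name_evaluator)(const char *expr, char *out, size_t outlen);

/* Turn the stored name into the name to open: evaluate it if it is
 * marked, otherwise copy it unchanged. Returns 0 on success, -1 on error. */
static int resolve_filename(const char *name, name_evaluator eval,
                            char *out, size_t outlen)
{
    assert(name != NULL && out != NULL && outlen > 0);
    if (eval != NULL && strncmp(name, MAGIC_PREFIX, strlen(MAGIC_PREFIX)) == 0)
        return eval(name, out, outlen);
    int n = snprintf(out, outlen, "%s", name);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}

/* Drop-in replacement for fopen() at the call sites that open user files. */
static FILE *gp_fopen_sketch(const char *name, const char *mode,
                             name_evaluator eval)
{
    char resolved[1024];

    if (resolve_filename(name, eval, resolved, sizeof resolved) != 0)
        return NULL;
    return fopen(resolved, mode);
}

/* Toy evaluator for demonstration only: ignores expr and builds a
 * log-file name from a counter. */
static int demo_cycle = 1;

static int demo_name_eval(const char *expr, char *out, size_t outlen)
{
    (void)expr;
    int n = snprintf(out, outlen, "fit_cycle_%d.log", demo_cycle);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

A call site would then change from `fd = fopen(name, "r")` to something like `fd = gp_fopen_sketch(name, "r", evaluator)`, which is a mechanical edit compared with rewriting each isstring()/quote_str() fragment.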
From: Volker D. <v.d...@we...> - 2004-07-21 07:34:04
|
Ethan Merritt wrote:
> Consider a typical call site, one of the 98 you counted:
>
> set.c (set_fontpath) line 1801:
>
>     if (isstring(c_token)) {
>         int len;
>         char *ss = gp_alloc(token_len(c_token), "tmp storage");
>         len = (collect? strlen(collect) : 0);
>         quote_str(ss,c_token,token_len(c_token));
>         collect = gp_realloc(collect, len+1+strlen(ss)+1, "tmp fontpath");
>
> I am not clever enough to re-write isstring() and quote_str() such
> that the code at this call site works when the length of the string
> changes mid-stream. So trying to be clever in isstring() and
> quote_str() paradoxically makes *more* work, since every call site
> would have to be inspected and re-written for compatibility.
>

Yes, but I think this is easy: I took a look at my old `value substitution` patch (the Python-like syntax of "%(foo).2f"). There are two functions: quote_str and m_quote_capture. Using quote_str you will have to do memory allocation yourself, whereas m_quote_capture will do allocation for you, if I remember correctly. My old patch replaced (hopefully) all the invocations of quote_str with m_quote_capture and did substitution there.

> If the program were being re-written from scratch, then yes.
> But in the interest of sanity and not introducing 98 possible
> new bug sites, I would rather go for a solution that leaves the
> input code intact ...
>

I think 98 sites which do memory allocation themselves are more dangerous than replacing 98 call sites with one tested and correct call to m_quote_capture.

Regarding the discussion about late evaluation: I would like to _object_ to late evaluation. Late evaluation will introduce many more problems in user space than it will solve. If I construct a string (regardless of whether it is done by the '' + trick or sprintf or anything else) I expect it to be the way I constructed it; at least that's the behavior of all programming languages I am familiar with.

I do not see any problem which could be solved more easily with late evaluation, at least not from the user's side: It is very easy to reconstruct the string if a variable change should be reflected in the string.

IMHO the substitution should be done right during parsing, no late evaluation, and m_quote_capture is a good place for it.

Volker

--
Volker Dobler
|
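The distinction drawn above is between a routine that fills a caller-sized buffer and one that allocates the result itself, which is what lets substitution change the string length safely. A minimal sketch of the allocating style follows. It is not gnuplot's m_quote_capture(): the name `capture_with_substitution()`, the `lookup_fn` hook and `demo_lookup()` are hypothetical, and only a bare `%(name)` placeholder is handled, without the format suffix of the "%(foo).2f" syntax mentioned in the message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook: returns the replacement text for a variable name,
 * or NULL if the name is unknown. */
typedef const char *(*lookup_fn)(const char *name);

/* Return a newly allocated copy of a quoted token with the surrounding
 * quotes stripped and every %(name) replaced by lookup(name). Unknown
 * names are copied literally. The caller frees the result. */
static char *capture_with_substitution(const char *token, lookup_fn lookup)
{
    size_t len = strlen(token);
    assert(len >= 2);                       /* opening and closing quote */

    const char *p = token + 1;
    const char *end = token + len - 1;
    size_t cap = len + 1, used = 0;
    char *out = malloc(cap);
    if (out == NULL)
        return NULL;

    while (p < end) {
        const char *piece = p;              /* default: one literal char */
        size_t piece_len = 1;
        const char *next = p + 1;

        if (p[0] == '%' && end - p >= 2 && p[1] == '(') {
            const char *close = memchr(p + 2, ')', (size_t)(end - (p + 2)));
            char name[64];
            if (close != NULL && (size_t)(close - (p + 2)) < sizeof name) {
                size_t nlen = (size_t)(close - (p + 2));
                memcpy(name, p + 2, nlen);
                name[nlen] = '\0';
                const char *value = lookup ? lookup(name) : NULL;
                if (value != NULL) {
                    piece = value;
                    piece_len = strlen(value);
                    next = close + 1;
                }
            }
        }

        /* grow the buffer as needed; the caller never sizes it */
        if (used + piece_len + 1 > cap) {
            cap = (used + piece_len + 1) * 2;
            char *bigger = realloc(out, cap);
            if (bigger == NULL) {
                free(out);
                return NULL;
            }
            out = bigger;
        }
        memcpy(out + used, piece, piece_len);
        used += piece_len;
        p = next;
    }
    out[used] = '\0';
    return out;
}

/* Toy lookup table for demonstration only. */
static const char *demo_lookup(const char *name)
{
    return strcmp(name, "dir") == 0 ? "/usr/share/fonts/truetype" : NULL;
}
```

Because the result is sized after substitution, a call site like the set_fontpath fragment would no longer need to guess the length from token_len() before the text is known.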
From: <mi...@ph...> - 2004-07-21 09:29:31
|
> Regarding the discussion about late evaluation: I would like
> to _object_ to late evaluation. Late evaluation will introduce
> much more problems in user space than it will solve.

Late evaluation could be just a flag to "set label". That's where we expect this feature to be used most frequently.

    set label 1 sprintf("fitted values a=%g b=%g", a, b) at 1,1 lateeval

Or, there could be a "sprintfLate", but authorized only for strings in "set label", "set title", "(s)plot"... i.e., where the string is used immediately.

---
PM
|
From: Ethan M. <merritt@u.washington.edu> - 2004-07-21 15:35:37
|
On Wednesday 21 July 2004 02:29 am, mi...@ph... wrote:
> > Regarding the discussion about late evaluation: I would like
> > to _object_ to late evaluation. Late evaluation will introduce
> > much more problems in user space than it will solve.

Like what, for example? So far I don't know of any bad side to this capability. Sure, most code won't use it - but that is the case for many features. And the beauty of this way of doing things is that it is actually simpler to handle the general case than it is to add a lot of special cases.

> The late evaluation can be just a flag to "set label". That's where we
> expect this feature to happen most frequently.
>     set label 1 sprintf("fitted values a=%g b=%g", a, b) at 1,1 lateeval

Yes, that would be possible. But why only labels? Why not plot titles, key titles, fit logfile names, and all the rest?

--
Ethan A Merritt
|
From: Ethan M. <merritt@u.washington.edu> - 2004-07-21 20:58:37
|
[Another long post. Please read it in the hope that all will become clear. Or at least become clearer than it is at present.]

Let me start with a plea to drop the term "late evaluation", which I think is side-tracking the current discussion. Whatever it may mean to different people, I don't think it is a useful way of thinking about my patchsets. The evaluation of expressions containing string functions is no different from the evaluation of all other gnuplot functions. The evaluation is never "late"; it happens when it happens.

Consider the existing case with no strings:

    # user function is defined
    f(x) = sin(x)
    # user function is evaluated and result goes in var1
    var1 = f(3.0)
    # user function is stored with plot for use in plotting,
    # evaluation happens at plot time
    plot 'data' using 1:(f($2))

In the first two cases the function is evaluated at the time the command line is parsed. In the third case the function is evaluated once per data point, but not until the file is read in during plotting. This is obvious, I hope.

So.... What I did was expand the variety of functions to include string-valued functions. But otherwise the evaluation works the same way:

    # (1) User function is defined
    f(x) = sprintf("%7.5f",x)
    # (2) Pre-defined function is evaluated and result stored
    var1 = sprintf("%2.1f",x)
    # (3) Same thing using user function instead
    var2 = f(x)
    # (4) User function is stored with plot for use in plotting
    plot 'data' using 1:2:(f($2))

(1) Needs a bit more code to make it work, but is straightforward.

(2) This is, I thought, what everyone was asking for, but now we seem to be arguing about it.

(3) Combination of (1) and (2); evaluation is done in two steps but still at the time the command line is parsed.

(4) Evaluation of f(x), which is really sprintf("fmt",x) in this case, does not happen until the data is read in. In fact the command I show here as an example is currently broken because of issues with assumed column contents (see other thread). But I was planning to fix this for the next version of the patchsets.

The point is that in each case, string or no string, the evaluation happens at the time it is needed. For 'set <foo>' commands this is usually at the time the command line is parsed, but for 'plot/splot/fit' commands it happens elsewhere.

That inevitably leads to the question "how does gnuplot know that evaluation is needed?", and that's where the fun (or argument) started.

In all the examples I gave above it is conceptually all the same whether the variables and functions take on int- or real- values or whether they take on string values. But there are a few operations where the two possibilities diverge. For example, real-valued variables are used (and changed) during fitting. I can't imagine that being a meaningful operation for a string variable. Conversely, one thing we do with strings is to print them, something that doesn't happen to numbers or at least only via an intermediate string.

Printing a string constant is easy; we do that already. But how do you store, or mark, a string-valued function so that gnuplot knows it must be evaluated and printed? That is the heart of this discussion.

I picked up on a suggestion from Hans to mark a string-valued function by inserting some "magic characters" inside the quotes. I thought that was a clever idea because it is transparent to the existing code. I am bemused by the controversy this seems to have generated. There are two issues: how it looks on the command line, and how it is stored internally.

Look, I really don't care how we do it. This seemed to me a clever representation. In addition to being compatible with the current code, it means that the internal representation is the same as the command-line representation (quote + magic chars). But these are mere conveniences, and are totally separate from the core functionality. Feel free to propose other representations, and explain why they are better.

And now, finally to Volker's specific questions....

On Wednesday 21 July 2004 12:33 am, Volker Dobler wrote:
> I think 98 sites which do memory allocation themself is more
> dangerous than replacing 98 call sites with one tested and
> correct call to m_quote_capture.

I have no comment at this time. I will revisit your earlier patch and see what you did there. Are you willing to help fix things if this mass substitution breaks them?

> Regarding the discussion about late evaluation: I would like
> to _object_ to late evaluation. Late evaluation will introduce
> much more problems in user space than it will solve.

Petr said this too. I am totally not understanding this concern. But then again, I don't even know what you think "late evaluation" means. Could you give an example?

> If I construct a string (regardless if done by the '' + trick or sprintf
> or anything else) I expect it to be the way I constructed it;
> at least that's the behavior of all programming languages I am
> familiar with.

Yes. That's fine. That's what the sprintf() function does - it constructs a string permanently and stores it in a variable. That variable always contains the string you constructed. It does not change. Ever.

The point which is confusing is that the string may in fact contain a command. But even this is not new - the hot-key bindings are strings that contain commands. The command they contain is executed "when needed", which in the case of key bindings is in response to an event generated externally. If you type 'bind' it will print a list of these stored commands.

It's the same with the string variables. If you type 'show VAR' it will show you what command is stored there. But there are other places in the code that can actually execute that command when needed.

Users are under no obligation to store commands in string variables any more than they are required to program new hot-key bindings. It's just something you *can* do if it proves useful.

--
Ethan A Merritt
|