Re: [ gnuplot-Patches-992149 ] String variables revisited


> > I may be missing some major class of possible uses for strings,
> > but it seems to me that about covers it right there.  What else did
> > you have in mind?
> 
> Every usage of strings anywhere in gnuplot.  Datafile names, output file
> names, 'print' strings, plot elements (axis labels, tick labels, labels,
> title, key title, key entries), 

Please try it out.  It catches those already except for the tic labels and
key entries, which ought to go through write_multiline() but don't currently.
I'll add that to the patch, or just add it to cvs separately.

> In short, all 98 calls of isstring(), and all 19
> calls of m_quote_capture and all 62 of quote_str()

No.  You are misunderstanding, or I am totally failing to describe
things properly.  The great thing about this approach, which you
suggested yourself, is that none of those need to be changed.

> fit 'update' and 'via' files, save/load/call file names, 'cd' names,
> loadpath names, terminal-wide or label-wise font names.

I agree that filenames are not yet handled in the patchsets.
I was not trying for 100% complete coverage in the first go-round.

I will describe the patchsets from a new angle; let's see if 
I can make it more understandable this time:

Patchset 1:
-----------------

This patchset does 3 things.

(1) It adds STRING as a legal "value"  (gp_types.h data structure)
and overloads concatenation onto the + operation applied to STRINGS.

(2) It adds a single string-valued function, sprintf(...).
This works automagically anywhere that the gnuplot parser accepts
a function name. So simple assignment statements, like
	LABEL = sprintf("whatever",var1,var2,var3)
just work, with no new code needed.

(3) It adds a check for this string-valued function in exactly two
places where a function name was not previously accepted by
the parser. These are
	set label <something>
and
	set [xyz...]label <something>

The latter routine handles titles as well, so the coverage is
more complete than you might think at first blush.
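
To make item (1) concrete, here is a minimal sketch of a tagged value that can hold a string, with "+" dispatching to concatenation when both operands are strings. The names sketch_value and sketch_add are made up for illustration; this is not the actual gp_types.h layout or the code in the patchset.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a gnuplot value; NOT the real gp_types.h layout. */
enum sketch_type { SK_INTEGER, SK_STRING };

struct sketch_value {
    enum sketch_type type;
    union {
        int int_val;
        char *string_val;   /* for results of sketch_add: heap-allocated */
    } v;
};

/* "+" applied to two values: integer addition, or concatenation when both
 * operands are strings.  Returns 0 on success, -1 on a type mismatch or
 * allocation failure.  The caller frees a STRING result. */
int sketch_add(const struct sketch_value *a, const struct sketch_value *b,
               struct sketch_value *result)
{
    assert(a && b && result);

    if (a->type == SK_INTEGER && b->type == SK_INTEGER) {
        result->type = SK_INTEGER;
        result->v.int_val = a->v.int_val + b->v.int_val;
        return 0;
    }
    if (a->type == SK_STRING && b->type == SK_STRING) {
        size_t la = strlen(a->v.string_val);
        size_t lb = strlen(b->v.string_val);
        char *joined = malloc(la + lb + 1);
        if (!joined)
            return -1;
        memcpy(joined, a->v.string_val, la);
        memcpy(joined + la, b->v.string_val, lb + 1);  /* copies the NUL too */
        result->type = SK_STRING;
        result->v.string_val = joined;
        return 0;
    }
    return -1;  /* mixed types are rejected in this sketch */
}
```

Rejecting mixed operands is a choice made for the sketch only; whether a number should be promoted to a string in that case is a separate question.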

Patchset 2
---------------

Adds a string-function evaluator. I don't know what the right term
is for this, but basically it just triggers the existing function evaluation
code. It is directly modelled on do_line(), except that it requires
the top level function being evaluated to return a string.

At a single call site in write_multiline(), it checks for a magic
leading sequence of characters in every string that is printed.
If the sequence is recognized, it filters the string through the
string-function evaluator.

This is a beautifully simple change.  At one shot it implements
variable substitution into all strings printed via write_multiline().
It also has the effect that the *current* value of variables is
substituted in, which is a major bonus in its own right.   This
is what nearly everyone expects gnuplot to do now, but it
doesn't.
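
The single-call-site idea can be sketched roughly as follows. The magic prefix, the callback type, and the toy evaluator are all assumptions made for illustration; the real patch hooks into gnuplot's own expression evaluator, which is not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed magic leading sequence; the one used by the patch may differ. */
#define SKETCH_MAGIC "sprintf("

/* Hypothetical evaluator callback: writes the evaluated string into out
 * and returns 0 on success, nonzero on failure. */
typedef int (*sketch_evaluator)(const char *expr, char *out, size_t outlen);

/* Called at the one place where text is about to be printed.  Strings
 * without the magic prefix pass through untouched; strings with it are
 * evaluated now, so they pick up the current values of variables. */
const char *sketch_filter(const char *text, sketch_evaluator eval,
                          char *buf, size_t buflen)
{
    assert(text && eval && buf);
    if (strncmp(text, SKETCH_MAGIC, strlen(SKETCH_MAGIC)) != 0)
        return text;
    if (eval(text, buf, buflen) != 0)
        return text;            /* fall back to the literal text */
    return buf;
}

/* Toy evaluator standing in for the real one: it ignores the expression
 * and formats the current value of one global "variable". */
int sketch_ncyc = 1;

int sketch_toy_eval(const char *expr, char *out, size_t outlen)
{
    int n;
    (void) expr;
    n = snprintf(out, outlen, "Fit cycle %d", sketch_ncyc);
    return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```

Because the check runs each time the string is printed, changing sketch_ncyc between two calls yields two different results from the same stored text, which is the late-evaluation behaviour described above.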

Possible next steps (hypothetical patchset 3)
-------------------------------------------------------

Indeed there are other places that the current syntax does not
allow a function name including, as you point out, the specification
of file names.  This could be addressed in several ways.

(1) We can expand the legal syntax at these places one by
one as they are needed or requested.  Here is the code that
went in for labels; other places would be more or less the same:

+#ifdef GP_STRING_VARS
+    /* Allow creation of label text using sprintf() */
+    if (equals(c_token,"sprintf")) {
+       struct value a = {STRING,{NULL}};
+       (void) const_express(&a);
+       this_label->text = a.v.string_val;
+    } else
+#endif
+
+    /* get text from string */
     if (!END_OF_COMMAND && isstring(c_token)) {

(2) We could replace fopen() with a newly defined gp_fopen() and
add the string evaluation code only in the new routine.

(3) We could instead use the approach of my older "userstrings" patch,
which essentially implements command-line macro definitions. This would
not require changing the individual parsing fragments, but would open
up a new front in the argument^h^h^h^h err.. discussion.
Since the macro-substitution is done on the input line as a text string,
it is independent of the parsing routines that will later come into play.
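
For option (3), the point is that substitution happens on the raw input line before any parsing routine sees it. A minimal sketch, assuming a hypothetical @NAME marker and a fixed macro table (the syntax of the older "userstrings" patch is not shown here and may well differ):

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical macro table entry: a name and its replacement text. */
struct sketch_macro {
    const char *name;
    const char *text;
};

/* Copy line into out, replacing each @NAME that matches a table entry with
 * its text.  Unknown names are copied through unchanged.  Returns 0 on
 * success, -1 if out is too small. */
int sketch_expand(const char *line, const struct sketch_macro *macros,
                  size_t nmacros, char *out, size_t outlen)
{
    size_t used = 0;

    assert(line && out && outlen > 0);
    while (*line) {
        const char *piece = line;   /* default: copy one character */
        size_t piece_len = 1;
        size_t consumed = 1;

        if (*line == '@') {
            size_t namelen = 0;
            size_t i;
            while (isalnum((unsigned char) line[1 + namelen])
                   || line[1 + namelen] == '_')
                namelen++;
            for (i = 0; i < nmacros; i++) {
                if (namelen > 0 && strlen(macros[i].name) == namelen
                    && strncmp(macros[i].name, line + 1, namelen) == 0) {
                    piece = macros[i].text;
                    piece_len = strlen(piece);
                    consumed = 1 + namelen;
                    break;
                }
            }
        }
        if (used + piece_len >= outlen)
            return -1;              /* no room for the piece plus the NUL */
        memcpy(out + used, piece, piece_len);
        used += piece_len;
        line += consumed;
    }
    out[used] = '\0';
    return 0;
}
```

With a made-up entry DATA defined as 'run42.dat', the line "plot @DATA using 1:2" becomes "plot 'run42.dat' using 1:2" before the plot parser ever runs, so no parsing fragment needs to know about strings at all.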

> I honestly don't see the need for late evaluation from the user interface
> side of things. I don't see a compelling reason why
> 
> 	set title {some expression involving variable i}
> 	i = 5
> 	plot something
> 	i = 6
> 	plot something else
> 
> has to produce two different title strings, without even offering the
> option of getting the same title on both of them.  

Straw man argument.  There are plenty of options for having the
title come out the same.  What we are talking about is how to get
it to come out differently.

If you want a more obvious, frequently requested, example:

set title 'sprintf("Fit cycle %d finished at `date`, A = %7.4f B = %7.4f", \
			ncyc,A,B)'
ncyc = ncyc+1; fit f(x) 'data' via A,B; plot f(x)
ncyc = ncyc+1; fit f(x) 'data' via A,B; plot f(x)
<wash, rinse, repeat as often as you like>

> 	x = 6
> 	set label 'sprintf("%g", x)' at x, f(x)
> 	x = 1
> 
> show the label at position (6, f(6)), but print the string as '1.0'?
> From where I sit, that makes no sense whatsoever.

Well, so don't do that.  There is no requirement that you have to
select plot-time evaluation of strings if it doesn't make sense.

> So here's a new summary of the matter: late evaluation should be a 
> subject kept separate from that of string variables.

That's why I split things into two patchsets. 
But I hardly think they are independent. 

> If we add late evaluation, it should be added for both strings and numbers.

Please. Just try the patchsets.   
It does work for both strings and numbers, at least within the realm
it is trying to address.  I'm sure you can come up with things it 
doesn't do at all, but that by itself is not a good argument against
using it for the things it *does* do.

And on the tangential matter of quote styles:

> > [aside:  About the only thing I use the existing single-quote mode for
> > is to allow inclusion of double-quotes in the string without having to
> > escape them with backslashes.  But I don't think that was the original
> > intent. What else is it useful for?]
> 
> E.g. for input of strings in terminals like LaTeX or PostScript enhanced,
> where you really don't want to have to type every \ character twice, and
> for DOS/Windows filenames, where \n processing would produce rather
> surprising results.

No comment on the DOS/Windows issue, but from my unix-centric
perspective this is not at all what the single-quote convention is 
expected to do.   Following the conventions of both sh- and csh-
derived shells,  enclosing a character string inside single quotes
should mean that absolutely nothing at all is done to it.  No fiddling
with back-slashes, no substitution of variables, no execution of
shell escapes.  IMHO gnuplot should do the same thing - save the
single-quoted string as is with no fiddling.   Right now gnuplot
takes the more complicated and error-prone route of trying to
figure out what would have been required to type the string as a
double-quoted string instead, and saving that.  For what gain?
It should just save the string as entered, with a flag that it was 
given in single quotes.
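
What I mean by "save the string as entered, with a flag" could look something like this. It is only a sketch: the struct, the enum, and the very reduced escape handling (just \n, with any other escaped character standing for itself) are assumptions for illustration, not gnuplot's current code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stored string: the text exactly as typed, plus a flag
 * recording which quote style was used. */
enum sketch_quote { SK_SINGLE, SK_DOUBLE };

struct sketch_string {
    enum sketch_quote quote;
    const char *raw;
};

/* Produce the text to be used.  Single-quoted strings are copied
 * byte for byte.  Double-quoted strings get backslash processing; in this
 * reduced sketch \n becomes a newline and any other escaped character
 * stands for itself.  Returns 0 on success, -1 if out is too small. */
int sketch_render(const struct sketch_string *s, char *out, size_t outlen)
{
    size_t used = 0;
    const char *p;

    assert(s && s->raw && out && outlen > 0);
    p = s->raw;
    while (*p) {
        char c = *p++;
        if (s->quote == SK_DOUBLE && c == '\\' && *p != '\0') {
            char e = *p++;
            c = (e == 'n') ? '\n' : e;
        }
        if (used + 1 >= outlen)
            return -1;          /* no room for the character plus the NUL */
        out[used++] = c;
    }
    out[used] = '\0';
    return 0;
}
```

Given the raw text C:\new\table, the single-quoted form comes back unchanged, while the double-quoted form turns \n into a newline and drops the other backslash, which is exactly the surprise mentioned for DOS/Windows filenames.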

-- 
Ethan A Merritt       merritt@u.washington.edu
Biomolecular Structure Center
Mailstop 357742
University of Washington, Seattle, WA 98195
