The keyword 'sample' is not documented (in the form of a formal Syntax: definition) in the current manual. While attempting to document it, I uncovered an anomaly.
In what follows I clarify the currently ambiguous terminology:
{ranges} ==> {<axis-range-list>}
(not directly relevant to this bug but is one source of ambiguity)
{sampling range} ==> {<sampling-range-list>}
It is tempting to presume that within the context of a plot/splot command, the syntax for a sampling-range-list always permits the optional keyword 'sample' to appear before the first sampling-range specifier within that list. The actual behavior is not so. Instead it is messy, viz.
<plot-element>:
{{<iterator>} ... {<iterator>}}
<definition> | {<sampling-range-list>} <function> | <data-source> | keyentry
{axes <axes>} {title <title-spec>} {with <style>}
<iterator>:
for [<iteration-specifier>]
<sampling-range-list>:
{sample} [<sampling-range>] {[<sampling-range>]}
The messiness is that the keyword 'sample' is:
1. required when the <axis-range-list> is incomplete
2. tolerated when the <axis-range-list> is complete
3. forbidden when the prior token is a 'for' iterator
The above documents the current parser behavior and here I am proposing that the special case when a 'for' iterator has been used immediately prior is a bug and should be eliminated.
Some examples crafted from lines found in the suite of demo scripts (the first one is from the attached script prepared by me) are included below. I've focused on those situations where the for iterator is followed by a sampling-range specifier but there are probably other examples wherein the correct number of range-specifiers in an axis-range-list is followed by a sampling-range specifier.
splot [-3:3][-2.5:2.5][-1:11]for[p=0:4]sample[r=1:10] '+' using (φ=pi*p/4., cos(r+φ)):(sin(r+φ)):(r) with lines lc p title sprintf("φ=%d.pi/4",p)
# $grep 'plot.*for' *.dem |grep '\]\s*\['
#BesselJ.dem:plot for [i=0:8] [t=0:20] besjn(i,t),t with lines lw 2 # title sprintf("J_%d(x)",i)
plot for [i=0:8] sample [t=0:20] besjn(i,t),t with lines lw 2 # title sprintf("J_%d(x)",i)
#fenceplot.dem:splot for [x=-4:4][y=-50:50:3] '+' using (x):($1/100.):(-1):(-1):(sinc($1/10., 1.+x)) with zerrorfill
splot for [x=-4:4]sample[y=-50:50:3] '+' using (x):($1/100.):(-1):(-1):(sinc($1/10., 1.+x)) with zerrorfill
#map_projection.dem:plot for [λ=-180:180:10] [φ=-90:90] '+' using (x_W3(λ,φ)):(y_W3(λ,φ)) with lines lc "cyan" lw .5 notitle, \
plot for [λ=-180:180:10] sample[φ=-90:90] '+' using (x_W3(λ,φ)):(y_W3(λ,φ)) with lines lc "cyan" lw .5 notitle, \
#map_projection.dem:plot for [λ=-180:180:10] [φ=-90:90] '+' using (x_Hammer(λ,φ)):(y_Hammer(λ,φ)) with lines lc "cyan" lw .5 notitle, \
plot for [λ=-180:180:10] sample[φ=-90:90] '+' using (x_Hammer(λ,φ)):(y_Hammer(λ,φ)) with lines lc "cyan" lw .5 notitle, \
#map_projection.dem:plot for [λ=-180:180:10] [φ=-60:90] '+' using (x_Albers(λ,φ)):(y_Albers(λ,φ)) with lines lc "cyan" lw .5 notitle, \
plot for [λ=-180:180:10] sample[φ=-60:90] '+' using (x_Albers(λ,φ)):(y_Albers(λ,φ)) with lines lc "cyan" lw .5 notitle, \
#zerror.dem:splot for [i=-5:4][y=-50:50:5] '+' using (i):($1/100.):(-1):(-1):(sinc($1/10., 1.+i)) with zerrorfill
splot for [i=-5:4]sample[y=-50:50:5] '+' using (i):($1/100.):(-1):(-1):(sinc($1/10., 1.+i)) with zerrorfill
It seems 'bug-like' that a keyword (that explicitly signals that one or more sampling-range definitions follows) should throw an error (or not) depending upon which optional token precedes it. Recall that this very same keyword is useful for terminating what I am calling an axis-range-list when the parser-expected number of placeholder ranges are not present in that list.
It is counter-intuitive that an optional keyword that is, in fact, required in one scenario becomes the cause of a parser error in another scenario.
Note: In the process of isolating this bug I observed that the comma that is expected to terminate a plot-element is not always required by the current parser. See examples 'e' and 'g' in the attached demo script. This note is tangential to the bug being reported here. I don't yet have enough experience with gnuplot to fully understand whether this behavior is by design or is, perhaps, another bug.
I think there may be a misunderstanding here. It may be that you do understand what the `sample` keyword means and I am misunderstanding your description of a problem, or it may be that you misunderstand what the keyword means.
1) There is only one x-axis, one x2-axis, etc. per plot no matter how many lines or boxes or whatever are drawn in that plot command. So the entire plot has only one xrange. You can let it default or you can set it beforehand with `set xrange`. For historical reasons you can also provide it as the very first thing in a `plot` or `splot` command. That historical option was IMHO a big mistake that has caused much grief over the past 20 years or so as it was preserved in the name of backwards compatibility. Be that as it may, it cannot appear more than once. The same applies to the x2, y, y2, and z axis ranges.
2) However all components of a plot command are free to generate samples for plotting. The problem is that defining the sample set uses a syntax that looks like a range. So if you put it at the beginning of a plot command the program cannot tell whether you are providing an axis range or a sample condition. This ambiguity cannot arise anywhere else in the plot command because an axis range is not possible anywhere else in a plot command.
With those points in mind, I don't think your messiness bullet points correctly express what is going on:
1. Complete or incomplete is not relevant. The `sample` keyword is required at the beginning of a plot command because otherwise the `[beg:end]` would be mistaken as an axis range when it is really something else.
2. No. There is never a case where `sample` is "tolerated". It is either required or it is incorrect. You may have a point that there are places it would be harmless to accept it even when not needed, and we can get back to that in a minute....
3. No. This is point (1) above. It is never possible to have more than one x-axis range in a plot, so an axis range can never appear inside a `for` iteration because that would attempt to define it multiple times. Therefore the syntax you show is not possible. However it is possible to define a new sampling rule in each iteration, so in a command like this
plot for [i=1:N] [q=10:20:2] '+' using ($1+i):($1) with points
the section beginning `q=` is unambiguously a sampling rule, since it cannot possibly be an axis range. Also the presence of a third item inside the square brackets is another giveaway.
If it were entirely up to me and if backwards-compatibility were not a requirement, I would solve all of this confusion by forbidding axis ranges inside a plot command. Then the `sample` keyword would not be needed because all the square-bracket thingies `[something:something:foo]` in a plot command would unambiguously refer to sampling anyhow. Alas that's not where we are ;-/
So anyhow, yes I suppose it would be possible to silently accept an unnecessary `sample` keyword inside a plot iteration. In that case I suppose it would logically also ignore it as an unnecessary keyword when it appears multiple times in a plot command. But does it really reduce user confusion? I imagine users would then be wondering "what is this keyword that seems to be accepted but has no visible effect on the plot?".
Last edit: Ethan Merritt 2023-10-28
Re commas:
The reason a comma may or may not appear after a definition is simply that the plot may consist of
<definition> <first plot component>, <second plot component>
or
<definition> <empty plot component>, <first plot component>, ...
I.e. it's not that the comma is optional, it's just that the plot component immediately following the definition may be `<null>`. Am I making sense?
So I had another thought....
Perhaps the keyword `sample` should instead have been something meaning `end_of_axis_ranges`. That's a terrible keyword, but maybe it is more obvious where it must go. If nothing else, maybe the documentation could take this approach to explaining it.
Oh yes! That is exactly what I've been struggling to articulate these past couple of days. `end_of_axis_ranges` is equivalent to my Strawman 2.
Read my reply below and I think you'll see that Strawman 1 is even better than Strawman 2.
Last edit: najevi 2023-10-29
To be honest I lost the plot (sic!) while writing this up! Preparing my reply to your responses helped bring clarity.
Please consider the difference between what a keyword descriptively MEANS and the effect that a keyword CAUSES. In the case of this keyword `sample` I think it is fair to say that it is a misnomer.
This particular keyword causes the gnuplot parser to stop waiting for `[ = : ]` lexical constructs/elements that are to be interpreted as axis-range specifiers. This `sample` keyword is not the only lexical element that causes the parser to stop waiting for axis-range specifiers. The script attached to this bug report demonstrates that both a `<definition>` and the `for` keyword of an iterator also cause the gnuplot parser to "stop waiting" for axis-range specifiers.
The current description for the axis-range-list provides for a so-called placeholder in the form of the empty axis-range specifier. viz.
So this `sample` keyword (that undeniably carries semantic baggage or implicit description) is causing an effect (on the parser) that 3, 4 or 5 empty ranges could equally well achieve.
In this type of situation I find it helpful to exhaustively enumerate all possible (regardless of how probable) lexical element sequences that may give rise to the kind of ambiguity that `sample` was apparently designed to resolve. viz.
Within each of the following four groupings of (s)plot command line fragments, each line is equivalent to the other lines in that one group:
I did not think it relevant to enumerate those cases where an intervening `for` or `<definition>` appears since those cases give rise to no ambiguity and so the `sample` keyword is not required in those cases.
As a human parsing Table 1, I cannot avoid arriving at the conclusion that `sample` is compensating for a "badly formed" (i.e. incomplete) axis-range-list.
It really has nothing to do with announcing a sampling-range-list.
If its "practical cause" or "primary role" were to announce a sampling-range-list then that same announcement should not throw an error when used in other scenarios (other sequences of lexical elements) that permit a sampling-range-list to follow. The script attached to this bug report identifies those scenarios using ## comments.
(Please don't misunderstand the primary thrust of my proposal. I do not recommend tolerating the `sample` keyword in places where it is redundant. I use that demonstrable fact to highlight the misnomer that is the `sample` keyword.)
This sentence of your reply was the most reassuring for me to read.
I understand your remark alludes to the following paragraph from the manual:
Noting that x is the independent variable when not in parametric mode, do you agree that the very same can be said of:
"The range specifier (singular) for sampling on either t or x can include an explicit sampling interval to control the number and spacing of samples?"
Based on the above common understanding, I think we can agree that the gnuplot parser cannot possibly mistake a `[dum=beg:end:int]` construct for an axis-range specifier.
If we do agree on the above then it should be easy to recognize that an equally effective resolution to the ambiguity arising from (what I persistently argue is) an incomplete axis-range-list situation is to ensure that all sampling-range specifiers adhere to a two-colon syntax: `[ {=} : : ]`.
Said differently and, I would suggest, far more succinctly than the existing "messiness" (see Strawman 0, later) :-
I observe the following of the gnuplot parser:
1. While waiting for axis-range specifiers, the gnuplot parser of today throws an error when that second colon is encountered. (Error: `']' expected`)
2. While waiting for a sample-range specifier, the gnuplot parser of today throws an error when that second colon is not followed by an expression. (Error: `invalid expression`)
So if the gnuplot parser can throw the first error then it ought just as easily use that same event to "stop waiting" for axis-range specifiers ... n'est-ce pas?
Assuming you buy into that proposition then we still need to consider the situation where the interval, `<int>`, is not specified (i.e. is blank) for a sampling-range specifier.
Is it any more complex to have the parser recognize a required second colon and treat the absence of an expression before the closing ] bracket in just the same way as it currently treats the absence of a second colon before that closing bracket?
`sample` keyword in a non-relevant lexical element.
In this "alternate syntax universe" the formerly ambiguous scenarios described in Table 1 now look as follows:
Isn't that a whole lot less confusing than before?
To reiterate the key point of my proposal:
At the same point when the current gnuplot parser throws an error at the second colon the alternate universe parser recognizes that second colon as the signal to "stop waiting" for more axis-range specifiers.
If at some future release the axis-range-list is (we can only hope!) deprecated then the (albeit one character longer) mandatory components of the sampling-range-list syntax need not be changed.
Next I should probably ask, "Which derivative of Backus-Naur Form does the gnuplot manual aspire to adhere to?"
I struggle to recognize the abbreviated form but I think I have the gist of using the limited syntactic toolbox described in Part I of the manual.
I ask you to consider how a derivative B-N form of syntax definition might best document the current situation with the `sample` keyword.
I've given this considerable thought and there is no way I can come up with a description that involves the syntax definition for the sampling-range-list.
The difficulty I keep running into is finding some precedent for what I can best describe as a "conditionally required" keyword.
Usually keywords are either mandatory or optional.
Conditionally mandatory keywords are not something I can remember encountering.
In this case the condition is:
(!expected_size_axis_range_list_parsed && !for_keyword_parsed && !definition_parsed)
or the logically equivalent condition:
!(expected_size_axis_range_list_parsed || for_keyword_parsed || definition_parsed)
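The two forms of the condition are equivalent by De Morgan's law. As a minimal sketch (illustrative Python only, not gnuplot's parser code; the function names are hypothetical and the three flags simply mirror the names used in the condition above), the equivalence can be checked exhaustively over all eight parser states:

```python
from itertools import product


def sample_required_v1(full_axis_list: bool, for_seen: bool, definition_seen: bool) -> bool:
    """First form: none of the three 'stop waiting' events has happened yet."""
    return (not full_axis_list) and (not for_seen) and (not definition_seen)


def sample_required_v2(full_axis_list: bool, for_seen: bool, definition_seen: bool) -> bool:
    """Second form: negation of 'at least one stop-waiting event has happened'."""
    return not (full_axis_list or for_seen or definition_seen)


# All 2**3 combinations of the three (hypothetical) parser-state flags.
ALL_STATES = list(product([False, True], repeat=3))

# The two forms agree in every state.
FORMS_AGREE = all(sample_required_v1(*s) == sample_required_v2(*s) for s in ALL_STATES)

# States in which the keyword would be required before a sampling range.
REQUIRED_STATES = [s for s in ALL_STATES if sample_required_v1(*s)]
```

Under this model the keyword is required in exactly one of the eight states: when the axis-range-list is incomplete and neither a `for` nor a definition has been parsed.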
I've seen some XML-based examples that handle this type of "stateful" situation but nothing that resembles what I see being used in the gnuplot manual.
The best I can come up with using just the syntactic toolbox described in Part I gnuplot - Syntax, involves what is currently called `{ranges}` and what I prefer to call `axis-range-list`.
It is truly messy ... I hesitate to even share it here ... however doing so helps me take a step toward proffering Strawman 2.
The "but not quite exactly" remark refers to that situation when an intervening `for` keyword or an intervening `<definition>` is discovered by the parser.
Please indulge me this minor point since this is really only a stepping stone to Strawman 2. (Eyes on the prize: Strawman 1 is what I am advocating.)
Such a token ought to be similar in form to an axis-range specifier yet different in form from a sampling-range specifier.
It could be a 'smiley face emoji' if you have a sense of humor or it could be a mundane PERIOD aka, full stop!
So, for the sake of argument, I'm proffering this alternate, Strawman 2, for comparison.
Let `[.]` be the terminating token at the end of an axis-range-list. (No, I am not serious but please, go with this idea just for now if only to appreciate the comparison.) Feel free to replace the period . character with your favorite punctuation mark!
With that choice the above Strawman 0 syntax can be greatly simplified to just:
Then the exhaustive list (Table 1 above) of previously ambiguous scenarios now looks like:
Personally, if given the choice between Table 3 and Table 2 I'd choose Table 2 in a heartbeat!
The "range-ified" `[.]` period looks like a clumsy appendage just as the `sample` keyword looks like a messy misnomer!!
So please consider the virtues of Strawman 1 :
1. It is brief/succinct to describe/learn and to script.
2. It shifts burden away from the script writer and toward the parser.
3. It does not ask the parser to keep track of state any more than the parser does today.
4. It reinforces the visible difference between an axis-range specifier and a sampling-range specifier.
5. It does not need to change if/when ever the problematic axis-range-list is deprecated.
What I've done in this reply is offer up a couple of strawmen.
This method of argument can sometimes carry the risk of the proffered strawman overlooking some important premise. As I am new to gnuplot that is a risk I am keenly aware of.
Please feel free to set me straight if I've overlooked an important premise.
I never mind wearing egg on my face ... when it's warranted! ;-)
Last edit: najevi 2023-10-29
I wish you had been around to contribute to this discussion ten years ago when the mechanism of giving sampling ranges for '+' '++' or autogeneration of samples inside the plot command was first introduced for version 5.
Anything we do or change now is constrained by the requirement not to have version 6 break existing scripts written for version 5. It's OK to mark a bit of pre-version 6 syntax "deprecated" if it has been replaced by a better alternative, but the old deprecated syntax still has to be accepted. That means we can't just require that all sampling specifiers contain three colon-separated values, even though in retrospect that would have avoided this mess.
I do like the idea of recognizing a 2-colon sampling specifier as such if it is encountered at the start of a plot command rather than issuing the error message `Error: ']' expected`. Unfortunately after a first look I don't see an easy way to do that without adding a lot of code. The issue is that parsing and storing the first two fields can have side-effects (changes axis ranges and autoscale settings). If a second colon is then encountered, those side-effects would have to be reverted before continuing. A dumb-but-simple look-ahead to see if there is another colon coming up would work most of the time, but would fail in corner cases like
plot [x=(foo?min1:min2) : (foo?max1:max2)] f(x)
I will take a closer look later; maybe the code can be refactored to make backing out not so painful.
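To make the corner case concrete, here is a small Python sketch (an illustration only, not gnuplot's C code; both function names are hypothetical). It contrasts a naive look-ahead that counts every colon with one that skips colons belonging to a ternary `?:` or sitting inside parentheses. It deliberately ignores string literals and substring ranges such as `s[1:3]`, which a real scanner would also have to handle.

```python
def naive_colon_count(spec: str) -> int:
    """Count every ':' in the bracketed text, ignoring expression structure."""
    return spec.count(":")


def separator_colon_count(spec: str) -> int:
    """Count only the ':' characters that separate range fields.

    A ':' is skipped when it closes a pending '?' of a ternary expression
    at the same nesting level, or when it sits inside parentheses.
    """
    pending = [0]      # number of unmatched '?' per parenthesis nesting level
    separators = 0
    for ch in spec:
        if ch == "(":
            pending.append(0)
        elif ch == ")":
            if len(pending) > 1:
                pending.pop()
        elif ch == "?":
            pending[-1] += 1
        elif ch == ":":
            if pending[-1] > 0:
                pending[-1] -= 1          # this ':' belongs to a ternary
            elif len(pending) == 1:
                separators += 1           # top-level field separator
    return separators


# The corner case quoted above: two fields, each containing a ternary.
CORNER = "x=(foo?min1:min2) : (foo?max1:max2)"
# A genuine three-field sampling specifier.
THREE_FIELD = "q=10:20:2"

NAIVE_CORNER = naive_colon_count(CORNER)
AWARE_CORNER = separator_colon_count(CORNER)
AWARE_THREE = separator_colon_count(THREE_FIELD)
```

The naive count sees three colons in the corner case and would wrongly conclude that a second field separator is present, while the structure-aware count finds only one. The price is that the look-ahead has to understand a fair amount of expression syntax, which is consistent with the concern about added code.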
And yes, probably an empty expression after the second colon could be treated the same as not having a colon at all (defaults to either 1 or to range/samples depending on the context). I'll need to test this carefully.
I understand the importance of backward compatibility in the context of a ubiquitous phenomenon like x86 assembly code. I am not as clear about the virtue of maintaining backward compatibility for a tool such as gnuplot. However, I do understand that "it is what it is" so ... I'll just move on!
`end_of_axis_ranges` token!
You wrote that a `<definition>` (especially one without a following comma) is a legitimate lexical element between the `<axis-range-list>` and the first `<plot-element>`. (That fact is now explicitly documented via my recent edits of the syntax definition for (s)plot.) So it seems to me that absolutely any dummy definition will serve as an effective `end_of_axis_ranges` token. Can you think of any reason why this is not so?
`a=b` is 3 chars shorter than `sample` and exactly the same length as the ugly appendage `[.]` that I proffered for Strawman 2. So in my head I am running with that idea for now. It is not as elegant as Strawman 1 but it comes at zero cost.
Now, this might be one of those lingering handy uses of string macro expansion. Consider: `r="xyzzy=1"` ... so `@r` is now a convenient `end_of_axis_ranges` token and is just two chars to type! (At first I did try a Greek letter like Ω but the parser balked at `@Ω`.)
Heh, I consider that a bug, if rather low priority. The @-as-macro code is old and almost certainly predates the utf8-ification of strings and variable names. Care to file it as a separate tracker item?
The manual descriptions are intended to be BNF-like, but they are certainly not very strict about it.
One of the oldest TODO suggestions in the code repository is this one "from way back when":
No one ever took that on as a project. The program has grown tremendously since then, so the project would be much bigger now.
I am nowhere near ready to "look under the hood" at code for gnuplot but, ... never say never ... maybe someday!
It would be foolhardy for anybody to pick up the lexical scanner task you mentioned without first having a comprehensive gnuplot syntax chart to work from.
An "Appendix" of BNF-style syntax definitions is something I would like to contribute (really for my own understanding but also for the above) however I won't make in-roads on that until after Valentine's Day. For now the best I can manage is to incorporate as much as I can glean from the various main paragraph text into the most relevant snippets of BNF-like Syntax definitions that are in place for each subsection. This will necessarily grow the number of lines in those "Syntax:" headed subsections so if that verbosity is undesirable then please advise and I'll start some Appendix-like repository for the longer-winded syntax definitions and leave the more casual "syntax-by-example" sections as they currently are.
It progresses in fits and starts because I cannot feel confident about my edits until I've hands-on tested the various commands and features that I'm writing about seven ways from Sunday! Fortunately the plethora of demo files makes for a great bootstrap in doing just that.
5 Dec is a hard cutoff for my availability, so for the time being I am focused on making as many (hopefully valued) edits as I can to the existing documentation.
Update:
The latest commit series in 6.1 takes a couple of steps in the direction you are advocating for. When the plot command is parsed, if a range specification is seen to contain three fields `[min:max:increment]` rather than only two, it is recognized as a sampling range rather than an axis range. This removes the need for the `sample` keyword in many cases, but it does require that you use the three-field form. To make this easier, an empty third field can be used to indicate that the default sampling increment should be used. This is either 1 or (max-min)/samples depending on context. I have modified the relevant documentation sections to describe the new behaviour.
Thus all three of the plot commands below are now equivalent:
I am not 100% certain there are no unintended side-effects of this change since it involves a try-once-and-revert-if-it-fails step in the parsing code. There may also be some odd cases where it makes a difference whether you do or do not include a variable name at the beginning of the range. And perhaps there is some way to further obviate the need for a separate `sample` keyword.
For these reasons I think it is not ready for inclusion in the initial 6.0 release. Let it cook in the development branch for a while.
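The three-field rule described in this update can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the 6.1 implementation: the function names are hypothetical, the fields are assumed to be already split, and "context" is reduced to whether a sample count is supplied.

```python
from typing import List, Optional


def classify_leading_range(fields: List[str]) -> str:
    """Classify a bracketed range seen at the start of a plot command.

    Three fields (the third may be empty) mark a sampling range;
    two fields are treated as an axis range.
    """
    if len(fields) == 3:
        return "sampling"
    if len(fields) == 2:
        return "axis"
    raise ValueError("this sketch only models two- and three-field ranges")


def sampling_increment(lo: float, hi: float, third: str,
                       samples: Optional[int] = None) -> float:
    """Return the sampling increment, using a default when the third field is empty.

    Assumption: the default is 1 when no sample count applies,
    otherwise (max - min) / samples.
    """
    if third.strip():
        return float(third)
    if samples is None:
        return 1.0
    return (hi - lo) / samples


# Hypothetical usage: "[0:10:]" split into its three fields.
kind = classify_leading_range(["0", "10", ""])
step = sampling_increment(0.0, 10.0, "", samples=100)
```

In this model an empty third field still counts as a field, which is what lets `[min:max:]` be recognized as a sampling range without the `sample` keyword.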
Last edit: Ethan Merritt 2023-12-07