From: Peter J. <pet...@gm...> - 2022-03-02 14:37:44
On Wed, Mar 2, 2022 at 5:22 AM Ethan A Merritt <me...@uw...> wrote:

> On Tuesday, 1 March 2022 10:48:41 PST Juhász Péter wrote:
> > Dear gnuplot list members,
> >
> > today I tried to modify a plot so that it applies a default value if
> > one of the columns was empty, and for the longest time I couldn't
> > understand why it didn't work. I managed to make it work in the end,
> > but I still don't understand it. There are strange differences in
> > behavior, depending on whether `column(x)` or `$x` is used, and it even
> > depends on the datafile separator. I've distilled the issues I've found
> > into the following minimal example:
>
> I will explain what I can, but I may not be noticing what it is
> that you are calling strange.
>
> 1) Difference between whitespace and csv files.
> There is only one instance of this, right? and it shows what happens
> when there is only one number on the line of data.
> This is a real difference and I think the reason is clear.
> In a csv file you can count field separators and know for certain
> that if there are N separators there are N+1 fields.
> If fields are separated only by whitespace you cannot distinguish
> between a field that is present, but empty, from a field that is
> missing altogether. The former is handled by whatever is in place
> to catch zero or missing data entries, the latter is considered an
> error that invalidates the entire line. You might think that marking
> the entire line invalid because the 2nd of two numbers is blank is
> excessive, but what if there were supposedly 5 fields and the 2nd
> was blank? - fields 3-5 would be misinterpreted as 2-4; bad.
>

OK, this is reasonable, thanks.

>
> 2) I see only one test case where there is a difference in output
> between $2 and column(2). Is there another one that I missed?
>
> > # Curve 0 of 1, 3 points
> > # Curve title: "$data_with_whitespace u 1:(column(2)>0?column(2):0)"
> > # x y type
> > 1 1 i
> > 2 NaN u
> > 3 3 i
>
> The 'u' in the table output tells you that this point is undefined
> (as opposed to "missing", in which case it would not appear at all).
> I am uncertain why $2 doesn't behave identically, but in any case
> if you set the property
>     set datafile missing NaN
> then both $2 and column(2) are treated as missing data.
> The difference is worth investigating. I'll see what I can find.
>

Actually, the presence of the NaN is explained by what you wrote in the
previous point.

>
> > Observations:
> > - it's as if the mere presence of a $X in the specification causes the
> > datum to be marked as invalid, and dropped entirely, if column X
> > doesn't contain data, no matter what the rest of the specification is.
>
> Which piece of output shows this?
>

Uh, all of them that have $2 in the command? But compare just these two:

# Curve 0 of 1, 2 points
# Curve title: "$data_with_semicolons u 1:(valid(2)?$2:0)"
# x y type
1 1 i
3 3 i

# Curve 0 of 1, 3 points
# Curve title: "$data_with_semicolons u 1:(valid(2)?column(2):0)"
# x y type
1 1 i
2 0 i
3 3 i

The only difference in the command is `column(2)` vs `$2`. From the $2
table, line 2 is missing.

I'd argue that this is a bug, but even if it isn't, it's not documented
(or if it is, documented poorly, because I couldn't find it), and even if
it were a documented feature, it would be a confusing, unjustifiable and
undiscoverable one at that.

>
> > Bonus:
> > `reset` doesn't reset `set table`. The documentation of `reset` doesn't
> > mention that as an exception.
>
> The help description is
>     The `reset` command causes all graph-related options that can be set
>     with the `set` command to return to their default values.
>
> It is only expected to affect the settings for graphics state.
> It doesn't affect file output (set loadpath, set print, set table,
> set output, ...).
> It also doesn't affect "set datafile columnheaders"
> although maybe it should, or maybe that should be added to the explicit
> list of exceptions.
>

OK, this is acceptable - but perhaps the documentation could be amended to
be clearer, by mentioning that it doesn't affect output options, or have a
complete list of exceptions.

>
> cheers,
> Ethan
>

In any case, thanks for looking into this.

best regards,
Peter
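
For anyone who wants to reproduce the `$2` vs `column(2)` comparison discussed above, here is a minimal sketch. The original test script is not included in this message, so the contents of the datablock are an assumption, reconstructed from the table output quoted above (three rows, with the second field of row 2 left empty). Only the two `using` specifications are taken from the thread.

```gnuplot
# Assumed test data: row 2 has an empty second field.
# (Reconstructed from the quoted table output, not the original script.)
$data_with_semicolons << EOD
1;1
2;
3;3
EOD

set datafile separator ";"

# Write the points as a text table instead of drawing a graph.
set table

# Variant reported in the thread to keep all three points (row 2 gets y = 0).
plot $data_with_semicolons using 1:(valid(2) ? column(2) : 0)

# Variant reported in the thread to drop row 2 entirely.
plot $data_with_semicolons using 1:(valid(2) ? $2 : 0)

# `reset` does not undo `set table`, so switch it off explicitly.
unset table
```

Whether the two variants differ on a given gnuplot version is exactly the open question in this thread, so the comments only restate what was reported here.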