From: Dima K. <gn...@di...> - 2024-01-28 01:11:19
Hi. I'm observing a data-parsing problem that I think is a bug... but
looking at the code, maybe it isn't? This has been an issue for a long
time; it's not something new in 6.0.
To reproduce, I have this script:
set terminal dumb 80,40
unset grid
plot '-' with linespoints
1 -
2 10
3 5
4 13
5 4
6 13
7 7
8 15
e
Note the bad-data line "1 -". This produces this plot:
8 +-----------------------------------------------------------------------+
| + + + + + + ** |
| '-' ***A*** |
| ** |
| ** |
7 |-+ *A +-|
| ** |
| ** |
| ** |
| ** |
6 |-+ *A* +-|
| ** |
| ** |
| ** |
| ** |
5 |-+ A +-|
| ** |
| ** |
| * |
| ** |
| ** |
4 |-+ *A +-|
| ** |
| ** |
| ** |
| ** |
3 |-+ *A +-|
| ** |
| ** |
| ** |
| ** |
2 |-+ *A* +-|
| ** |
| ** |
| ** |
|** + + + + + + |
1 +-----------------------------------------------------------------------+
0 1 2 3 4 5 6 7
Note that I would expect an up/down zigzag, which ignores the whole
bad-data line. Removing that line produces the expected output:
16 +----------------------------------------------------------------------+
| + + + + + |
| '-' ***A*** |
| |
| |
| *|
14 |-+ +*|
| * |
| * |
| A A * |
| * ** * |
| * * * * * |
12 |-+ * * * * * +-|
| * * * * * |
| * * * * * |
| * * * * * |
| * * * * * |
| * * * * * |
10 |-+ * * * * * +-|
|* * * * * * |
| * * * * * * |
| * * * * * * |
| * * * * * * |
| * * * * * * |
8 |-+ * * * * * * +-|
| * * * * * * |
| * * * * * * |
| * * * * A |
| * * * * |
| * * * * |
6 |-+ * * * * +-|
| * * * * |
| * * * * |
| A * * |
| * * |
| + + * + + |
4 +----------------------------------------------------------------------+
2 3 4 5 6 7 8
I can also "fix" it by adding to the script:
set datafile missing "-"
What should the default behavior be? Should it not be ignoring the
bad-data line entirely? The block of code that handles this is here:
https://github.com/gnuplot/gnuplot/blob/fbeb88eadedf927a4d778b41dd118e373f33eacb/src/datafile.c#L2380
With this data we have df_column[1].good == DF_BAD. The code checks for
DF_MISSING and DF_UNDEFINED, but not DF_BAD. We end up exiting the loop here:
https://github.com/gnuplot/gnuplot/blob/fbeb88eadedf927a4d778b41dd118e373f33eacb/src/datafile.c#L2425
which leaves output==1. And then we set df_no_use_specs = output = 1 a
bit later:
https://github.com/gnuplot/gnuplot/blob/fbeb88eadedf927a4d778b41dd118e373f33eacb/src/datafile.c#L2458
So the first row of data has one valid column, and we then expect
exactly one valid column from every subsequent row of data as well.
Checked here:
https://github.com/gnuplot/gnuplot/blob/fbeb88eadedf927a4d778b41dd118e373f33eacb/src/datafile.c#L742
Is this what we want?
Thanks much
From: Ethan A M. <me...@uw...> - 2024-01-28 05:59:40
On Saturday, 27 January 2024 16:26:09 PST Dima Kogan wrote:
>
> Hi. I'm observing a data-parsing problem that I think is a bug... but
> looking at the code, maybe it isn't? This has been an issue for a long
> time; it's not something new in 6.0.
>
> To reproduce, I have this script:
>
> set terminal dumb 80,40
> unset grid
> plot '-' with linespoints
> 1 -
> 2 10
> 3 5
> 4 13
> 5 4
> 6 13
> 7 7
> 8 15
> e
>
> Note the bad-data line "1 -". This produces this plot:
>
[snip]
>
> Note that I would expect an up/down zigzag, which ignores the whole
> bad-data line. Removing that line produces the expected output:
Not a bug.
It would be more obvious what is happening if you replace the
first column of values with some other range that doesn't start at 1.
For instance:
set terminal dumb 80,10
unset grid
plot '-' with linespoints
11 -
12 10
13 5
14 13
15 4
16 13
17 7
18 15
e
18 +----------------------------------------------------------------------+
17 |-+ + + + + + *****G**** +-|
16 |-+ *****G**** '-' ***G***-|
15 |-+ *****G**********G**** +-|
13 |-+ *****G**** +-|
12 |-+ *****G**** + + + + + +-|
11 +----------------------------------------------------------------------+
0 1 2 3 4 5 6 7
Note that the x range runs from 0 to 7 rather than from 11 to 18.
That is because the program found only a single number on the first line
of data, so it assumed you implicitly wanted "plot foo using 0:1 with lp".
I.e. if only a single column of data is given it is used as a y value and
the corresponding x value is taken from the row number, a.k.a. column 0.
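
The guess described above can be modeled roughly as follows. This is an illustrative Python sketch, not gnuplot's actual code: the function names (parse_number, guess_column_count, implied_points) are invented for the illustration, and it ignores plot-style column requirements, missing-data markers, and everything else datafile.c handles.

```python
def parse_number(token):
    """Return the token as a float, or None if it is not numeric."""
    try:
        return float(token)
    except ValueError:
        return None


def guess_column_count(first_row):
    """Count leading numeric fields on the first data row (stop at the first bad field)."""
    count = 0
    for token in first_row.split():
        if parse_number(token) is None:
            break
        count += 1
    return count


def implied_points(rows):
    """Simplified model of 'plot' without 'using': the first row decides how all rows are read."""
    ncols = guess_column_count(rows[0])
    points = []
    for row_number, row in enumerate(rows):
        values = [parse_number(t) for t in row.split()]
        if ncols == 1:
            # Implicit "using 0:1": x is the row number, y is column 1.
            if values and values[0] is not None:
                points.append((float(row_number), values[0]))
        elif ncols >= 2:
            # Implicit "using 1:2"; rows with a bad field are skipped in this model.
            if len(values) >= 2 and values[0] is not None and values[1] is not None:
                points.append((values[0], values[1]))
    return points


data = ["11 -", "12 10", "13 5", "14 13", "15 4", "16 13", "17 7", "18 15"]
print(implied_points(data))      # x runs 0..7, y is column 1 (11..18)
print(implied_points(data[1:]))  # bad first row removed: x is column 1, y is column 2
```

In this model the single bad field on the first row changes how every later row is interpreted, which is the effect seen in the plot above.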
The same data plotted with the command
plot '-' using 1:2 with linespoints
gives
16 +----------------------------------------------------------------------+
14 |-+ + *G* + *G* + ***-|
12 |-+ *** ** ** ****using 1:2 ***G***-|
10 |** *** ** ** **** *** +-|
8 |-+****** *** *** ** **G* +-|
6 |-+ ***G* + ** + ** + + +-|
4 +----------------------------------------------------------------------+
12 13 14 15 16 17 18
Both the x range and the shape of the plot are as expected.
>
> So the first row of data has one valid column, and we then expect
> exactly one valid column from every subsequent row of data as well.
> Checked here:
>
> https://github.com/gnuplot/gnuplot/blob/fbeb88eadedf927a4d778b41dd118e373f33eacb/src/datafile.c#L742
>
> Is this what we want?
It has always been that way.
If you know what columns you want to take the data from,
that's what "using" is for.
Otherwise the program has to guess, and it uses the first
row of data to make that guess.
Ethan
From: Dima K. <gn...@di...> - 2024-01-31 17:39:22
Right. OK. Thanks for pointing this out. It looked very strange before looking at the source, and I suppose gnuplot has to decide to do SOMETHING with funky data. Thanks much.