This is semi-related to [bugs:#2129] in that this bug appears to be what was tickling the bug that was causing the segfault.
Basically, loading a file of tab-separated multicolumn data and then plotting a subset of the columns into a datablock appears to preserve the original number of tab separators in the new datablock, which in turn appears to cause the stats command to report the number of columns inconsistently.
Steps to reproduce below. Notice that the reported matrix is [1 X 10] but the value of 'STATS_columns' is 5.
> printf "" > in.tsv; for i in {1..10}; do printf "%i\tfoo%i\ta\tb\tc\n" $i $i >> in.tsv; done
> head -n3 in.tsv
1 foo1 a b c
2 foo2 a b c
3 foo3 a b c
> gnuplot -d -e "set datafile separator tab; set table \$BLOCK; plot 'in.tsv' using (column(1)) with table; unset table; stats \$BLOCK matrix; show variables"
* FILE:
Records: 10
Out of range: 0
Invalid: 0
Column headers: 0
Blank: 9
Data Blocks: 1
* MATRIX: [1 X 10]
Mean: 5.5000
Std Dev: 2.8723
Sample StdDev: 3.0277
Skewness: 0.0000
Kurtosis: 1.7758
Avg Dev: 2.5000
Sum: 55.0000
Sum Sq.: 385.0000
Mean Err.: 0.9083
Std Dev Err.: 0.6423
Skewness Err.: 0.7746
Kurtosis Err.: 1.5492
Minimum: 1.0000 [ 0 0 ]
Maximum: 10.0000 [ 0 9 ]
COG: 0.0000 6.0000
User and default variables:
pi = 3.14159265358979
GNUTERM = "x11"
NaN = NaN
$BLOCK = <10 line data block>
STATS_records = 10
STATS_invalid = 0
STATS_headers = 0
STATS_blank = 9
STATS_blocks = 1
STATS_outofrange = 0
STATS_columns = 5
STATS_mean = 5.5
STATS_stddev = 2.87228132326901
STATS_ssd = 3.02765035409749
STATS_skewness = 0.0
STATS_kurtosis = 1.77575757575758
STATS_adev = 2.5
STATS_mean_err = 0.908295106229247
STATS_stddev_err = 0.642261628933256
STATS_skewness_err = 0.774596669241483
STATS_kurtosis_err = 1.54919333848297
STATS_sum = 55.0
STATS_sumsq = 385.0
STATS_min = 1.0
STATS_max = 10.0
STATS_index_min_x = 0
STATS_index_min_y = 0
STATS_index_max_x = 0
STATS_index_max_y = 9
STATS_size_x = 1
STATS_size_y = 10
Gnuplot built from branch-5-2-stable:
> git log -n1
commit 668bcbee7d760388eebc2d30611837b0ba76789b (HEAD -> branch-5-2-stable, origin/branch-5-2-stable)
Author: Bastian Maerkisch <bmaerkisch@web.de>
Date: Tue Feb 19 08:34:09 2019 +0100
Always update mouse variables on bound keys
Previously, this was only done if the "allwindows" option was used.
Bug #2133
> uname -a
Darwin pinion.local 18.2.0 Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 2018; root:xnu-4903.231.4~2/RELEASE_X86_64 x86_64
What you are continuing to discover is that the "stats" command was not designed with the keyword "matrix" in mind. The "matrix" keyword triggers a separate data input path that bypasses the normal read-one-line-of-ascii-data-at-a-time subroutine. Unfortunately that bypassed subroutine tracks a number of things that later get loaded into STATS_foo variables, and even more unfortunately the matrix input subroutine doesn't track these at all. So the value of STATS_columns you see comes from the most recent non-matrix plot command.
This example also illustrates a larger point: the STATS_foo variables are not guaranteed to be current. They should be wiped clean at the start of every "stats" command so that values not generated by the current command do not exist at all. For example, if a command
stats $FOO using 2:3
is followed by a command
stats $BAR using 4
then the no-longer-current values of STATS_correlation, STATS_sumxy, etc. are still present and may be misinterpreted as belonging to the most recent command.
Fixing this has been on my TODO list for a long time, but it turns out to be tricky because the "name" option to stats changes where the values are stored and we don't yet know that on entry.
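Until that is fixed, one possible script-level workaround for the stale two-column values is to clear the variables by hand before each "stats" call. The sketch below assumes the default STATS_ prefix (no "name" option) and uses two made-up datablocks, $FOO and $BAR, purely for illustration. It relies on the documented trailing wildcard of "undefine". It is not expected to help with STATS_columns after "stats ... matrix", since that value comes from state the matrix input path does not track.

```gnuplot
# Illustrative data only; $FOO and $BAR are hypothetical datablocks.
$FOO << EOD
1 2 3
2 4 5
3 6 8
EOD

$BAR << EOD
1 2 3 4
2 4 5 7
3 6 8 9
EOD

# Clear anything left over from earlier stats commands, then analyze
# two columns. This creates STATS_correlation, STATS_sumxy, and so on.
undefine STATS_*
stats $FOO using 2:3 nooutput
print "after two-column stats: ", exists("STATS_correlation")

# Clear again before the single-column analysis, so that two-column
# results from the previous command cannot be mistaken for current ones.
undefine STATS_*
stats $BAR using 4 nooutput
print "after one-column stats: ", exists("STATS_correlation")
```

If a script uses the "name" option, the wildcard would need to match that prefix instead, which is the same difficulty described above for an internal fix.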
Thanks again Ethan. Just now had the opportunity to test. Confirmed the fix. I appreciate all your time and dedication.