gnuplot / Bugs / #1968 substr() does not work with columnhead()

Ethan Merritt - 2017-09-13

I think that is not fixable, or at least not without reworking the sequence in which plot commands are parsed and executed.

The problem is that "title columnhead(col)" looks like a keyword "title" followed by a normal string-valued function columnhead(). But it isn't. Instead columnhead() is a special purpose function that loads a field in the per-plot data structure. Later on this field is used to construct a title. The point is that at the time the command is parsed, and the substring operator would have to execute, the file has not yet been read in so its content, including column headings, is unknown. I'm a bit surprised the substring operator returns anything at all!

- Karl Ratzsch - 2017-09-13
  
  I see this would be very tricky. gnuplot (because we can do multiple plots in one command) has to parse the whole command before reading in any (header) data.
  
  So the only thing that can be used together with the columnhead() function is the string concatenation operator "." ?
  
  It'd be good if the help would make the limitations clearer:
  
  columnhead(x) may only be used to set a plot title as part of a plot, splot, or stats command. It returns a string containing the content of column x in the first line of a data file. Due to internal limitations, it cannot be processed further, except by the string concatenation operator '.'
  
  - Ethan Merritt - 2017-09-14
    
    It's not quite that bad. When a data file is opened, the routine df_open() processes the command line options up to the comma separating it from the next plot element. It needs to do that so that it knows how to process the data as it is read from the file, but it doesn't have to process the entire rest of the command line.
    
    After thinking about this some more, I may see a way to defer execution of the string expression following the title keyword until after the file has been read in. However, it would introduce a bunch of extra bookkeeping so it may not be worth the effort. What do you think?
    
    A separate but easier change could be made to allow commands like
    
    plot 'foo' using 1:2:(func(columnhead($2)))
    
    That command is sufficiently strange that it may never be used, but it's an easier change than allowing the same thing for the plot title, because the column headers are known by the time we are processing data lines later in the file. Worth it?
    
    - Karl Ratzsch - 2017-09-14
      
      You mean the alternative would be to have an option to read the title from (the last value of) an extra column in the "using" statement, instead of processing it in as an option to "title"?
      
      But then, can't we "fake" this behaviour and transparently always treat plot titles like that? Whatever comes after the keyword "title" becomes an implicit, additional "using" column?
      
      - Ethan Merritt - 2017-09-14
        
        No, that's not what I meant. The title is a separate issue.
        
        Right now the internal routine df_columnhead(n) always returns a placeholder string "@COLUMNHEADXXXX@" where XXXX is the column number. That's pretty useless for any general purpose, including access inside a using spec.
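A minimal C sketch of why a substring of the placeholder cannot be repaired later while concatenation can, assuming the placeholder is eventually swapped for the real header by a plain string search. The helper names `make_placeholder` and `substitute_header` and the zero-padded four-digit format are assumptions for illustration, not gnuplot's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model: build the placeholder for column `col`.
 * The four-digit zero-padded layout is an assumption. */
static void make_placeholder(int col, char *out, size_t outsize)
{
    assert(col >= 0 && col <= 9999);
    snprintf(out, outsize, "@COLUMNHEAD%04d@", col);
}

/* Hypothetical model of the later fix-up step: replace the first intact
 * placeholder for `col` in `title` with the real header text.
 * Returns 1 if a replacement was made, 0 if no intact placeholder was found
 * (in which case `title` is copied to `out` unchanged). */
static int substitute_header(const char *title, int col, const char *header,
                             char *out, size_t outsize)
{
    char ph[32];
    const char *hit;

    make_placeholder(col, ph, sizeof ph);
    hit = strstr(title, ph);
    if (hit == NULL) {
        snprintf(out, outsize, "%s", title);
        return 0;
    }
    /* text before the placeholder + header + text after the placeholder */
    snprintf(out, outsize, "%.*s%s%s",
             (int)(hit - title), title, header, hit + strlen(ph));
    return 1;
}
```

In this model a title built by concatenation still contains the whole placeholder, so the substitution succeeds. A title built from the first three characters of the placeholder contains only a fragment, so the substitution finds nothing to replace.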
        
        I could change the code so that if a column header is already known, then df_columnhead() will return it. It would only return the placeholder string if no column headers are known yet, which would continue to be the case for title.
        
        The change would mean that a reference to columnhead(n) inside a using spec would return the actual column header. Right now it just returns the useless placeholder string.
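The proposed behaviour can be sketched in C roughly as follows. This is a toy model, not the contents of the patch: `struct header_table` and `columnhead_model` are hypothetical names, and the placeholder layout is assumed.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for per-file state; not gnuplot's real structure. */
struct header_table {
    const char **names;   /* NULL until the header line has been read */
    int count;            /* number of entries in names */
};

/* Return the real header for 1-based column `col` if it is already known.
 * Otherwise write the placeholder into `scratch` and return that.
 * Treating an out-of-range column like "not known yet" is an assumption. */
static const char *columnhead_model(const struct header_table *t, int col,
                                    char *scratch, size_t scratchsize)
{
    assert(t != NULL && scratch != NULL);
    if (t->names != NULL && col >= 1 && col <= t->count)
        return t->names[col - 1];
    snprintf(scratch, scratchsize, "@COLUMNHEAD%04d@", col);
    return scratch;
}
```

Called while a using spec is being evaluated, the table is already populated and the real header comes back. Called while the title is being parsed, the table is still empty and the caller gets the placeholder, as before.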
        
        In fact, here's a jiffy patch if you want to try it out. It [probably] works, the question is whether it's useful for any real case.
        
        To make this work for the plot title we'd have to change the evaluation order of the plot command. That's a totally different kind of change. I think it might be possible but much messier than this relatively simple change to a single routine.
        
        debug_columnhead_inside_using.patch
        
        Ethan Merritt - 2017-09-14
        
        And here's an example of using it. So I guess that's a "real case", although there are other ways you could make this plot.
        
        $DATA << EOD
        A B C D
        4 8 6 7
        8 5 7 3
        3 8 4 6
        1 2 9 10
        EOD
        set nokey
        set offset 1,1,1,1
        plot for [col=1:4] $DATA using 0:(column(col)):(columnhead(col)) with labels title columnhead
        
        Karl Ratzsch - 2017-09-14
        
        Ah, Ok. I was already one step further.
        
        That addition could certainly be useful; I remember once having some info in the header line to normalise data. At least the function would then match its current description. I'll give the patch a try.
        
        But having that, can't we always send a placeholder instead of an actual title string, postpone evaluating the title to after processing the data (assuming that's a simple change), and give the result to the actual drawing routine when it comes across the placeholder?
        
        Ethan Merritt - 2017-09-14
        
        The problem is not saving and later replacing a placeholder (obviously,
        since that's what we do now). The problem is dealing with the piece of the
        command line that starts with the keyword title. We're expecting a
        string-valued expression to come next, so there are two choices. (1)
        Parse it and evaluate as we go. That's what the code currently does by
        calling try_to_get_string(). (2) Parse it but instead of evaluating, create
        a new function (a.k.a. "action table") to hold it by calling the routine
        perm_at(). OK, now we've got a function that can be called later to
        generate a plot title. Where do we store that function? When is it safe
        to call it? For instance we would like to know how much space to reserve
        for the plot title. Currently that is usually a string constant, so
        lenstr(title) does the trick. If we replace that string constant with a
        function that may or may not return a valid string if called before the
        data is read in, does the space reservation still work? Also we have to
        make sure the resources used by the action table are freed after the plot
        either completes or exits early on an error. All of this is possible, but
        it means adding new bookkeeping code in a bunch of places.
        
        On the other hand, the store-placeholder-string-and-replace-it-later
        mechanism we've got now also involves bookkeeping code in a bunch of
        places. That code could presumably go away. So there might be a net
        improvement in messiness, at the cost of changing twice as many places in
        the code. Hey, I'm almost talking myself into it. Sounds like a project
        for 5.3 rather than 5.2.
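Option (2) can be illustrated with a small C sketch that stores a callback at parse time and runs it only after the headers are available. This is not gnuplot's action-table machinery: a plain function pointer stands in for whatever perm_at() would produce, and every name below is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical deferred title: a callback plus its context, created at
 * parse time and evaluated only after the data file has been read. */
typedef int (*title_fn)(const void *ctx, const char *const *headers,
                        int nheaders, char *out, size_t outsize);

struct deferred_title {
    title_fn eval;
    void *ctx;            /* owned; released by deferred_title_free() */
};

struct prefix_ctx { int col; int nchars; };

/* Example expression: the first `nchars` characters of header `col`.
 * Returns 0 on success, -1 if the header is not available yet. */
static int header_prefix(const void *ctx, const char *const *headers,
                         int nheaders, char *out, size_t outsize)
{
    const struct prefix_ctx *p = ctx;
    if (headers == NULL || p->col < 1 || p->col > nheaders)
        return -1;
    snprintf(out, outsize, "%.*s", p->nchars, headers[p->col - 1]);
    return 0;
}

/* Parse-time step: record what to compute, but do not compute it. */
static struct deferred_title *deferred_title_new(int col, int nchars)
{
    struct deferred_title *t;
    struct prefix_ctx *p;

    assert(nchars >= 0);
    t = malloc(sizeof *t);
    p = malloc(sizeof *p);
    if (t == NULL || p == NULL) { free(t); free(p); return NULL; }
    p->col = col;
    p->nchars = nchars;
    t->eval = header_prefix;
    t->ctx = p;
    return t;
}

/* Must run on both normal completion and early error exit. */
static void deferred_title_free(struct deferred_title *t)
{
    if (t != NULL) { free(t->ctx); free(t); }
}
```

The sketch shows the bookkeeping questions raised above in miniature. Something has to own the stored object and free it on every exit path, and any caller that runs the callback before the data is read (for example to reserve space for the title) has to cope with the failure return.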
        
Ethan Merritt - 2017-09-18

Fix in CVS for 5.3

The altered code affects only data plots, since columnhead() means nothing for a function plot.

A potential consequence of deferring evaluation of the title expression until after data loading is that, in principle, the title could make use of information learned while preparing the plot, e.g. number of points, min/max values, etc. Currently you would have to find these using a "stats" command before plotting.

Ethan Merritt - 2017-09-18

status: open --> pending-fixed


Ethan Merritt - 2017-11-17

Status: pending-fixed --> closed-fixed