#918 'smooth' does not interpolate z (colormap) coordinate

Status: closed-wont-fix
Owner: nobody
Labels: 2D plot (258)
Priority: 3
Updated: 2011-03-11
Created: 2010-09-10
Creator: eartoaster
Private: No

When setting line colors via the z coordinate (plot 'file.dat' w l palette z), smoothing the data with the 'smooth' option apparently does not interpolate the z/color coordinate.
The result is that if the datafile contains data for N points, the colors are assigned to the _first_ N interpolated points.

I would have expected the z coordinate to be interpolated in the same way as the y coordinate...

Discussion

  • eartoaster
    2010-09-10

    simple test example.

     
    Attachments
  • Péter Juhász
    2010-09-10

    I can reproduce the problem.

    However, I wouldn't consider this a bug - more like a missing feature.

    Implementing it would not be straightforward, because the spline interpolation code works with exactly one set of x and one set of y coordinates. It doesn't even know about the color coordinates. To have those interpolated too, we'd have to call the spline code one more time, with the same set of x coordinates, but with the color as y coordinates.
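
A minimal sketch of the "call it one more time" idea, not gnuplot's actual code. It uses a Bezier curve evaluated with de Casteljau's algorithm as a stand-in for the spline routines, because it is short and, like the real code, only ever sees one channel of values at a time. The names `bezier_channel` and `smooth_xy_and_color` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Evaluate the Bezier curve with control values v[0..n-1] at parameter
 * t in [0,1] using de Casteljau's algorithm. The routine handles one
 * channel and knows nothing about what the values mean (y, color, ...). */
double bezier_channel(const double *v, int n, double t)
{
    assert(v != NULL && n > 0);
    double *tmp = malloc((size_t)n * sizeof *tmp);
    assert(tmp != NULL);
    memcpy(tmp, v, (size_t)n * sizeof *tmp);

    /* Repeatedly blend neighbouring values until a single one remains. */
    for (int level = n - 1; level > 0; level--)
        for (int i = 0; i < level; i++)
            tmp[i] = (1.0 - t) * tmp[i] + t * tmp[i + 1];

    double result = tmp[0];
    free(tmp);
    return result;
}

/* Hypothetical driver: smooth x, y and color by calling the same
 * single-channel routine once per channel with a shared parameter t. */
void smooth_xy_and_color(const double *x, const double *y,
                         const double *color, int n,
                         double *xs, double *ys, double *cs, int samples)
{
    assert(samples >= 2);
    for (int k = 0; k < samples; k++) {
        double t = (double)k / (samples - 1);
        xs[k] = bezier_channel(x, n, t);
        ys[k] = bezier_channel(y, n, t);
        cs[k] = bezier_channel(color, n, t);   /* the extra call */
    }
}
```

The smoothing routine itself stays unaware of color; only the caller knows that a third channel exists.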

    I'll look into it. Until then, here is a workaround:
    Use gnuplot's "set table" feature to dump the interpolated x/y pairs into a file, then repeat this step, but specifying "using 1:3" this time to interpolate the colors.
    Then use the "paste" UNIX utility or some equivalent to merge the two files. Finally plot the resulting file, now with "lc palette z".

     
  • Péter Juhász
    2010-09-10

    example for workaround

     
    Attachments
  • Péter Juhász
    2010-09-10

    Patch to make interpolation automagically work on variable colors

     
  • Péter Juhász
    2010-09-10

    Further analysis of the issue:
    The behavior described by the OP is specific to version 4.5, because the implementation of variable color was changed recently in that version. What actually happens is that the varcolor array never gets extended, even though the number of points has changed because of the interpolation. So the garbage colors on the plot come from uninitialized memory (I wonder why it doesn't segfault). This can be considered a bug.
    4.4.0 and 4.2.6 behave differently: they display a solidly colored line.
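
To illustrate the failure mode described above, here is a simplified sketch of keeping a per-point color array the same length as the point array when the point count changes. The structure `mini_plot` and the function `mini_plot_resize` are hypothetical stand-ins; gnuplot's real plot structure and allocation helpers are different.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a plot. */
struct mini_plot {
    int     count;      /* number of points currently stored   */
    double *y;          /* one y value per point               */
    double *varcolor;   /* one color value per point, or NULL  */
};

/* Resize all per-point arrays together so that none of them is left
 * shorter than count. Returns 0 on success and -1 on allocation
 * failure, in which case count is left unchanged. */
int mini_plot_resize(struct mini_plot *p, int new_count)
{
    assert(p != NULL && new_count > 0);

    double *ny = realloc(p->y, (size_t)new_count * sizeof *ny);
    if (ny == NULL)
        return -1;
    p->y = ny;

    if (p->varcolor != NULL) {
        double *nc = realloc(p->varcolor, (size_t)new_count * sizeof *nc);
        if (nc == NULL)
            return -1;
        p->varcolor = nc;
    }

    /* Give newly added slots a defined value instead of garbage. */
    for (int i = p->count; i < new_count; i++) {
        p->y[i] = 0.0;
        if (p->varcolor != NULL)
            p->varcolor[i] = 0.0;
    }
    p->count = new_count;
    return 0;
}
```

If only `y` were grown, reading `varcolor[i]` for `i` beyond the old count would touch memory that was never allocated for it, which matches the garbage colors reported here.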

    The workaround offered below works with all versions.

    Nevertheless, here is an ugly patch for the 4.5 version that fixes the bug and implements the feature. I didn't test it very thoroughly but it appears to work with the test case.

     
  • Ethan Merritt
    2010-09-10

    Although smoothing does make some sense in the case of 'lc palette z', I think it does not make sense for any other variable color type. For these other cases I suspect the best thing is simply to turn off the variable color and document that smoothing does not apply to color in general.

     
  • Péter Juhász
    2010-09-11

    I agree - in fact it doesn't even make sense for all smoothing options.

    I can extend the patch to cover the sensible cases only ('lc palette z' with 'smooth {csplines|acsplines|bezier|sbezier}' - are there more?), but perhaps it is indeed simpler to turn off variable color altogether if this_plot->plot_smooth != SMOOTH_NONE, and set a solid line color instead.

    Anyway, there is no rush, because there is a relatively easy workaround.

     
  • Ethan Merritt
    2010-09-21

    I have temporarily disabled variable color in plots with smoothing, since failing to do so can currently trigger a segfault.

     
  • Ethan Merritt
    2010-09-21

    • priority: 5 --> 3
    • status: open --> open-postponed
     
  • Péter Juhász
    2010-09-21

    Have you actually managed to trigger a segfault? I see why it could happen, but it never did for me, even when I tried with large amounts of data.

    Also, do you have anything against checking in my patch below?
    (obviously, it won't apply now as it is, but I'd modify it to cover the sensible cases only)

     
  • Ethan Merritt
    2010-09-21

    Running under valgrind documents access to unallocated memory. At that point it's just a matter of luck whether you get a segfault - it depends on the chunk size and location of the previous allocation.

    As to the patch - I'm not happy about the idea of copying the entire plot structure just so that one quantity can be sorted in parallel with another.

    Furthermore, a quick look leads me to believe that the whole smoothing code needs some rethinking. I bet no one has looked at it in years! Why is cp_implode() averaging xlow/xhigh/ylow/yhigh? I understand why it's averaging y itself; that's the whole point of the exercise. But I'm scratching my head trying to think of when it is ever correct to average the others. Wouldn't it be more correct to keep the extreme values rather than the average? Are these quantities ever used in any flavor of smoothed plot? Conversely, why is it _not_ averaging z? Note that gen_interp() later overwrites z with (-1) for some reason. Why does it do that?

    So I'd like to understand better why cp_implode() and gen_interp() are written the way they are. I'm suspicious that the current code does not handle any of the fields in (struct coord) correctly other than x and y. I also suspect that the easiest way to add support for variable color is to copy varcolor[i] into z[i] before smoothing, and then copy it back out again afterwards. Since the current code destroys z, it's hard to think what would break by using that field for variable color instead.

    What do you think?

     
  • Péter Juhász
    2010-09-21

    > As to the patch - I'm not happy about the idea of copying the entire plot
    > structure just so that one quantity can be sorted in parallel with another.

    Yes, this is very wasteful. It seemed like the quickest and most painless way to hack it together.

    > Are these quantities ever used in any flavor of smoothed
    > plot?

    Given that the first thing eval_plots() does when it encounters a valid smoothing option is to set this_plot->plot_style to LINES, I don't see that any of (x|y)(low|high) has any sensible role in smoothed plots.

    > Conversely, why is it _not_ averaging z? Note that gen_interp()
    > later overwrites z with (-1) for some reason. Why does it do that?

    It seems that SMOOTH_ACSPLINES uses the z field to store the weights. But I have no idea why all the do_* functions fill it with -1. There must be some reason for it, or it wouldn't be there in all the functions. The value -1 is suspicious: it seems to be a flag, but I don't know what it signifies or where that information is used.

    > I also suspect that the easiest way to add support for variable color is to
    > copy varcolor[i] into z[i] before smoothing, and then copy it back out
    > again afterwards. Since the current code destroys z, it's hard to think
    > what would break by using that field for variable color instead.

    See above: ACSPLINES uses the z field, so we have to be careful with it.
    Copying the variable color data wouldn't do the trick anyway, because gen_interp() and the functions called by it operate on the y field only. That's why I wrote the patch like I did: I had to trick the interpolation code (which I didn't dare to touch) into thinking that it works with y data as usual.

    (Unrelated note: The gnuplot.info main page should be updated with 4.4.1 release info...)

     
  • Ethan Merritt
    2010-09-22

    > the first thing eval_plots() does when it encounters a valid
    > smoothing option is to set this_plot->plot_style to LINES

    Hmm. Then we have a minor mystery.
    I see that line of code, but I also see that the command
    plot 'foo.dat' using 1:2:3 smooth unique with errorbars
    does in fact result in a smoothed plot with error bars.
    So there must be some code path that avoids that line of code and does not reset plot style.
    The error bars are averaged, which is not a proper statistical operation, but that's a separate issue.

    > Copying the variable color data wouldn't do the trick anyway, because
    > gen_interp() and the functions called by it operate on the y field only.

    Sure. But my thought was that the cleaner solution was to add a flag to cp_tridiag() that tells it which quantity to interpolate. For most cases it would be called only once, for y, as it is now. But in the case of variable color it would be called once for y and a second time for [some slot where we stuffed varcolor]. This has the minor side benefit that we could also handle smoothing of "with filledcurves" by calling it twice: once for y and once for yhigh.
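
A rough sketch of what such a selector could look like. Everything here is hypothetical: `struct point` is a cut-down stand-in for gnuplot's per-point record, `enum field` and `interp_field` are invented names, and piecewise-linear interpolation stands in for the spline code, whose real signature is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a per-point record. */
struct point {
    double x, y, z, yhigh;
};

/* Hypothetical selector naming the quantity to interpolate. */
enum field { FIELD_Y, FIELD_Z, FIELD_YHIGH };

static double get_field(const struct point *p, enum field f)
{
    switch (f) {
    case FIELD_Z:     return p->z;
    case FIELD_YHIGH: return p->yhigh;
    case FIELD_Y:
    default:          return p->y;
    }
}

/* Piecewise-linear interpolation of the selected field at xq.
 * Points must be sorted by strictly increasing x; values of xq outside
 * the data range are extrapolated from the nearest segment. */
double interp_field(const struct point *pts, int n, enum field f, double xq)
{
    assert(pts != NULL && n >= 2);

    /* Find the segment [pts[i], pts[i+1]] that contains xq. */
    int i = 0;
    while (i < n - 2 && xq > pts[i + 1].x)
        i++;

    double t = (xq - pts[i].x) / (pts[i + 1].x - pts[i].x);
    return (1.0 - t) * get_field(&pts[i], f) + t * get_field(&pts[i + 1], f);
}
```

With this shape, the ordinary case calls `interp_field(..., FIELD_Y, ...)` once, and the variable-color or filledcurves cases make a second call with a different selector.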

    Alternatively we could leave everything the way it is now and document that color isn't smoothed. It doesn't really seem like such an important feature to have.

     
  • Péter Juhász
    2010-09-22

    > Hmm. Then we have a minor mystery.
    > I see that line of code, but I also see that the command
    > plot 'foo.dat' using 1:2:3 smooth unique with errorbars
    > does in fact result in a smoothed plot with error bars.

    I just realized that there is no mystery here, just a nasty dependence on order.
    That line is just there for providing a default style (lines) for smoothed plots, but a 'with' modifier can redefine the style. In fact, this is the correct way of things, since smooth.dem depends on it ("smooth freq with boxes").
    However, this works only if the 'smooth' keyword comes before the 'with'. In the reversed order all smoothed plots are plotted with lines, which is incorrect if the user gave an explicit 'with whatever'.
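
The order dependence can be reproduced with a toy parser. This is only an illustration of the logic described above, not gnuplot's parser: the token handling, the `enum style` values, and both function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum style { STYLE_POINTS, STYLE_LINES, STYLE_BOXES };

/* Map a style keyword to an enum; unknown names fall back to points. */
static enum style parse_style(const char *name)
{
    if (strcmp(name, "lines") == 0) return STYLE_LINES;
    if (strcmp(name, "boxes") == 0) return STYLE_BOXES;
    return STYLE_POINTS;
}

/* Order-dependent: 'smooth' sets LINES the moment it is seen, so it
 * overwrites a 'with' that appeared earlier on the command line. */
enum style parse_order_dependent(const char *const *tok, int n)
{
    assert(tok != NULL);
    enum style s = STYLE_POINTS;
    for (int i = 0; i < n; i++) {
        if (strcmp(tok[i], "smooth") == 0) {
            s = STYLE_LINES;
            i++;                        /* skip the smoothing option name */
        } else if (strcmp(tok[i], "with") == 0 && i + 1 < n) {
            s = parse_style(tok[++i]);
        }
    }
    return s;
}

/* Order-independent: remember whether 'with' was given explicitly and
 * apply the LINES default only after all tokens have been read. */
enum style parse_order_independent(const char *const *tok, int n)
{
    assert(tok != NULL);
    enum style s = STYLE_POINTS;
    int smoothing = 0, explicit_with = 0;
    for (int i = 0; i < n; i++) {
        if (strcmp(tok[i], "smooth") == 0) {
            smoothing = 1;
            i++;
        } else if (strcmp(tok[i], "with") == 0 && i + 1 < n) {
            s = parse_style(tok[++i]);
            explicit_with = 1;
        }
    }
    if (smoothing && !explicit_with)
        s = STYLE_LINES;
    return s;
}
```

For the tokens `with boxes smooth freq`, the first function ends up with lines and the second keeps boxes; for `smooth freq with boxes` both keep boxes.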

    > But my thought was that the cleaner solution was to add a
    > flag to cp_tridiag() that tells it which quantity to interpolate.

    Plus we'd have to rewrite do_cubic() and the rest to make them handle other fields than y (since those functions do the actual work).

    > Alternatively we could leave everything the way it is now and
    > document that color isn't smoothed. It doesn't really seem like
    > such an important feature to have.

    I admit defeat on this one - it just isn't worth the hassle.
    There is a workaround anyway.

     
  • Ethan Merritt
    2010-09-22

    Given the limited resolution of color reproduction, perhaps we really could do what the original reporter expected: smooth the color "in the same way as the y coordinate"? That is, apply the spline coefficients calculated for y to both y and z.

    Even if we don't add back color smoothing, this investigation has turned up other issues that should be fixed. I am not happy that the current code is averaging ylow and yhigh to draw "smoothed" error bars. One could make an argument for either keeping the extreme values or calculating the root mean square, but I don't see how taking the mean can ever be the right thing to do. So I think we should either change it and document the change, or patch the hole that lets plot style other than LINES be smoothed.
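
To make the difference concrete, here is a small sketch that collapses several error bars sharing the same x into one, either by averaging the limits (the behavior criticized above) or by keeping the extreme values. It covers only those two options and leaves out the root-mean-square variant. `struct bar`, `enum combine`, and `combine_bars` are hypothetical names, not gnuplot code.

```c
#include <assert.h>
#include <stddef.h>

struct bar { double y, ylow, yhigh; };

enum combine { COMBINE_MEAN, COMBINE_EXTREME };

/* Collapse n error bars into one. y is always averaged; the limits are
 * either averaged (COMBINE_MEAN) or taken as the lowest ylow and the
 * highest yhigh over all inputs (COMBINE_EXTREME). */
struct bar combine_bars(const struct bar *b, int n, enum combine how)
{
    assert(b != NULL && n > 0);

    struct bar out = b[0];          /* out.ylow/out.yhigh track extremes */
    double sum_y = b[0].y, sum_lo = b[0].ylow, sum_hi = b[0].yhigh;

    for (int i = 1; i < n; i++) {
        sum_y  += b[i].y;
        sum_lo += b[i].ylow;
        sum_hi += b[i].yhigh;
        if (b[i].ylow  < out.ylow)  out.ylow  = b[i].ylow;
        if (b[i].yhigh > out.yhigh) out.yhigh = b[i].yhigh;
    }

    out.y = sum_y / n;
    if (how == COMBINE_MEAN) {
        out.ylow  = sum_lo / n;
        out.yhigh = sum_hi / n;
    }
    return out;
}
```

For two bars (y=1, 0..2) and (y=3, 2..6), averaging yields y=2 with limits 1..4, while keeping the extremes yields y=2 with limits 0..6, so the averaged bar no longer covers either original range.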

    On the other hand, either of those breaks your example "smooth freq with boxes". But what is that command supposed to do, exactly? I suppose it plots the average height, but the width is lost because, as we already noted, z (which holds box width) is reset to -1.

     
  • Péter Juhász
    2010-09-22

    > That is, apply the spline coefficients
    > calculated for y to both y and z.

    While this is less wasteful than my solution, I still don't like it because it would further decrease the modularity of the code. In its present state the smoothing code doesn't have to know much about the data. If we were to apply this idea, we would have to look for variable color (in fact, the right combination of variable color and smoothing options) inside the functions that do the smoothing.

    > I am not happy that the current code
    > is averaging ylow and yhigh to draw "smoothed" error bars.

    Probably smoothing doesn't make any sense with error bars. But it may make sense with other styles!

    I think there is a conceptual problem here. When we talk about smoothing in general, we are actually talking about a set of data post-processing methods that have different goals and use cases in mind. For example, the purpose of 'smooth csplines' is to turn a discrete series of points into a (pseudo)-continuous curve - to use it with anything other than lines would be weird. On the other hand, it is perfectly reasonable to use 'smooth uniq' and 'smooth freq' with points, or even boxes (as in the smooth.dem demo).

    Therefore this:
    > patch the hole that lets plot style other than LINES be
    > smoothed.
    would be the wrong thing to do!
    In fact we have to do the opposite: make sure that an explicit 'with somestyle' statement is honored even if smoothing is on and the 'smooth' keyword comes after the 'with'. The silly cases (like error bars with csplines) should give a warning and revert to a sensible default.

     
  • Ethan Merritt
    2011-03-11

    • status: open-postponed --> closed-wont-fix