From: Hans-Bernhard B. <br...@ph...> - 2004-07-10 01:07:17
|
On Fri, 9 Jul 2004, Ethan Merritt wrote: > #ifdef EAM_HISTOGRAMS > /* EAM FIXME - There are places in df_readline where it would be really */ > /* nice to know what kind of plot we are making, so I think that */ > /* current_plot should be a parameter to df_readline. I strictly disagree with that reasoning. If df_readline needs to know what the the plot style is, then IMHO that's a sign that there's something seriously wrong with df_readline(). It's separations like this being broken down or "tunnelled", tieing every piece of the program to all others, that got gnuplot into the mess it's still in. Bluntly put, adding a global variable is essentially never the right solution for any problem other than stack size limitations. Instead of designing a correct structure of "who is responsible for what", creating a global means that we've given up. In the case at hand, df_readline is the core interface between the datafile itself and the next stage in data processing: the 'using' specifiers. It should never need to know anything about plot styles, which are at least two steps removed from its work. -- Hans-Bernhard Broeker (br...@ph...) Even if all the snow were burnt, ashes would remain. |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-11 00:20:43
|
On Sat, 10 Jul 2004, Ethan A Merritt wrote: > Hans-Bernhard Broeker <br...@ph...> wrote: > > Ethan A Merritt wrote: > > > The code in datafile.c and elsewhere (df_readline is not so much > > > the culprit here) makes an assumption that it can deduce the style of > > > plot based on the number of input columns requested. > > > > Could you show a concrete example of that? > > The worst offenders are one level up, in plot[23]d.c in the routines > get_data and get_3ddata. Each of these contains a huge switch > statement that controls the interpretation of input data columns > based on the total number of returned columns. Within each > 'case <ncols>:' of the switch statement, the code then tries to > run through all possible plot modes that could have generated > that number of input columns. I think this is an example of the tail > wagging the dog; the main switch statement should rather be > over the plot types. I agree wholeheartedly that that code needs to be revamped in a major way. But that has no relevance at all to the job to be done by datafile.c. As long as the 'using' syntax is based on an unstructured list of expressions rather than named columns like this plot 'data' using (x 1, y 2, yhigh 3, ylow 4) , the mess in get_{3d,}data() is going to stay as it is. It's not pretty, admitted, but it has nothing to do with the point at hand. Actually, the fact that this stuff is *not* done in datafile.c is what I'm talking about. It's not in there, and it shouldn't be. So, let's see: the general plan (before datastrings) is *) datafiles have colums *) 'using' processes columns into records of at most 7 numbers *) get_data() maps numbers from that record to data points Of these three actions, only the first two take place in datafile.c At some point in the process, datafile handling *will* become dependent on the plot style, but datafile.c is not where that point should be. I think we may have to turn all that on its head: let the caller tell the df_... routines in advance how many using specs it needs, and what values (axis assignment, time/date flag, datastring flag) each of those should represent. > The problematic code in df_readline() itself is the bit which decides > where to stuff the input values based on the number of requested input > columns. It allows for an implicit x coordinate value, but only in the > case of 'plot ... using <y> with <something that only takes 2 cols>' > > /* FIXME EAM - Trap special case of only a single 'using' column. */ > /* But really we need to handle general case of implicit column 0 */ > if (output == 1) > xpos = (axcol == 0) ? df_datum : v[axcol-1]; > else > xpos = v[axcol]; Hmmm... then how did that work before EAM_DATASTRINGS? I can't seem to find any similar snippet in 4.0 df_readline(). -- Hans-Bernhard Broeker (br...@ph...) Even if all the snow were burnt, ashes would remain. |
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-11 04:23:39
|
On Saturday 10 July 2004 05:20 pm, you wrote: This is going to be a long explanation, so please bear with me. > So, let's see: the general plan (before datastrings) is > > *) datafiles have colums > *) 'using' processes columns into records of at most 7 numbers > *) get_data() maps numbers from that record to data points Right. df_readline() fills in a vector v[] whose elements are in order of the requested data items. The datastrings patchset added a parallel vector df_tokens[] whose elements point instead to a string representing the particular requested data item. get_data() can use these to fill in the 'data points' in the '[s]plot ... with labels' style. So far, so good. Then it gets a bit messier because the new code also handles entities such as an in-line plot title or tic labels, and the lines below are part of that. > > /* FIXME EAM - Trap special case of only a single 'using' column. */ > > /* But really we need to handle general case of implicit column 0 */ > > if (output == 1) > > xpos = (axcol == 0) ? df_datum : v[axcol-1]; > > else > > xpos = v[axcol]; > > Hmmm... then how did that work before EAM_DATASTRINGS? I can't seem to > find any similar snippet in 4.0 df_readline(). For completeness sake, let me explain what this code is doing: At this point in the code we know that the string value we are looking at is supposed to be an axis tic label. We also know which axis it is a tic label for. We want to create a pair ["ticlabel", axis-value] just as in a 'set [xyz]tics { "label" pos, ...}' command. But which input column contains the value for that axis? That is, if we are reading a ytic label, how do we know which column contains y? Problem #1: Most of the plot styles are consistent in that the first column is x, the second column is y, and the third column is z. But the 3D plot modes can violate this order and in particular, as I mentioned before, it is problematic which column will be used to generate a color value. Given that there is not currently any way to tell what the plot style is (or even if it is 3D) all I can do is hope that the x:y:z order holds. But currently 'plot ... using ...cbticlabels(<n>)' doesn't work, because there is no way to know which column the color info is in without access to more information about the plot. Problem #2: The mapping of columns 1:2:3 to x:y:z coordinate value is broken by the use of implicit x coordinates. Consider the command 'plot ... using 2:yticlabel(1)'. In this case the first (and only) column contains a y coordinate rather than an x coordinate. Since no currently supported plot styles really use only a single input coordinate, it is easy to detect this simple case, and that is exactly what the code section extracted above is doing. The code is marked FIXME because it cannot detect, let alone deal with, any more complicated cases where the order of axes cannot be deduced from the number of columns. In practice this is not much of a limitation, though eventually someone is bound to trip over it and report it as a bug. If it becomes important to handle the oddball cases as well, then I think it will be necessary to have access to the current plot and style information. That is why I used it as an example of wanting a pointer to the current plot. -- Ethan A Merritt Department of Biochemistry & Biomolecular Structure Center University of Washington, Seattle |
From: Daniel J S. <dan...@ie...> - 2004-07-11 12:49:07
|
Hans-Bernhard Broeker wrote: >So, let's see: the general plan (before datastrings) is > >*) datafiles have colums >*) 'using' processes columns into records of at most 7 numbers >*) get_data() maps numbers from that record to data points > >Of these three actions, only the first two take place in datafile.c At >some point in the process, datafile handling *will* become dependent on >the plot style, but datafile.c is not where that point should be. > >I think we may have to turn all that on its head: let the caller tell the >df_... routines in advance how many using specs it needs, and what values >(axis assignment, time/date flag, datastring flag) each of those should >represent. > From what I remember, the issue in the image case had to do with how to route the columns in the instance where the user doesn't give the "using" details. The question for me is should the user be required *always* to enter specifically how columns are to be used via "using"? I mean, there are some logical default methods of treating the data if the user doesn't give the "using". However, that kind of "logical default" is plot_type dependent. Dan |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-11 00:24:17
|
On Sat, 10 Jul 2004, Daniel J Sebald wrote: > > > Ethan A Merritt wrote: > > >For what it's worth - > >After I apply the with-image patch df_readline contains 653 lines of > >code, of which only 303 are shared by the two modes. > > > >The largest block of shared code is the section (lines 3362-3465) > >checking for blank lines, skipped lines, and EOF. Is this even relevant > >to binary mode? > > > > That's correct. But my point is that they do pretty much similar > things, i.e., loading the variables to prepare the next stage of the > code that determines where to reroute them. Well, achieving the same goal is not the same thing as doing the same job, and thus doesn't call for having the two routines combined. I tend to agree with Ethan here: if they share so little actual code, they had better be in separate routines, and the common code broken out into a common subroutine. -- Hans-Bernhard Broeker (br...@ph...) Even if all the snow were burnt, ashes would remain. |
From: Daniel J S. <dan...@ie...> - 2004-07-11 01:58:14
|
Hans-Bernhard Broeker wrote: >On Sat, 10 Jul 2004, Daniel J Sebald wrote: > > > >>Ethan A Merritt wrote: >> >> >> >>>For what it's worth - >>>After I apply the with-image patch df_readline contains 653 lines of >>>code, of which only 303 are shared by the two modes. >>> >>>The largest block of shared code is the section (lines 3362-3465) >>>checking for blank lines, skipped lines, and EOF. Is this even relevant >>>to binary mode? >>> >>> >>> >>That's correct. But my point is that they do pretty much similar >>things, i.e., loading the variables to prepare the next stage of the >>code that determines where to reroute them. >> >> > >Well, achieving the same goal is not the same thing as doing the same job, >and thus doesn't call for having the two routines combined. I tend to >agree with Ethan here: if they share so little actual code, they had >better be in separate routines, and the common code broken out into >a common subroutine. > Right. They are not intermixed in a convoluted fashion. If I recall, there is an if statement "if (df_general_binary)" that pretty much breaks things into two chunks of code that easily could be put into separate routines. The point is I didn't want to upset the apple cart too much, by creating all kinds of new subroutines and the like... I still make the point that higher level plot{23}d.c routines should hot have to know whether the file is binary or ascii, so there should still just be "df_readlin()" from that perspective. A much cleaner and organized "df_readline()" is a different matter. Dan |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-13 00:29:55
|
Re-send of message that collided with a problem at SF.net on Sunday: On Sat, 10 Jul 2004, Daniel J Sebald wrote: > Hans-Bernhard Broeker wrote: > > >So, let's see: the general plan (before datastrings) is > > > >*) datafiles have colums > >*) 'using' processes columns into records of at most 7 numbers > >*) get_data() maps numbers from that record to data points > > > >Of these three actions, only the first two take place in datafile.c At > >some point in the process, datafile handling *will* become dependent on > >the plot style, but datafile.c is not where that point should be. > From what I remember, the issue in the image case had to do with how to > route the columns in the instance where the user doesn't give the > "using" details. The option of omitting the using specification is legacy behaviour of gnuplot. How about this proposal, then: require binary file always to have using specifiers, and be done with that complication. > The question for me is should the user be required > *always* to enter specifically how columns are to be used via "using"? I have no problems with such a requirement, as far as newly invented features are concerned --- we already require it for time/date parsing, e.g. |
From: Petr M. <mi...@ph...> - 2004-07-13 08:36:05
|
> The option of omitting the using specification is legacy behaviour of > gnuplot. How about this proposal, then: require binary file always to > have using specifiers, and be done with that complication. > > > The question for me is should the user be required > > *always* to enter specifically how columns are to be used via "using"? Requirement for using specifiers always for plot ... binary would be rather cumbersome. Current syntax is OK, e.g. plot 'x.edf' binary with image --- PM |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-13 12:45:28
|
On Tue, 13 Jul 2004, Petr Mikulik wrote: > > The option of omitting the using specification is legacy behaviour of > > gnuplot. How about this proposal, then: require binary file always to > > have using specifiers, and be done with that complication. > > > > > The question for me is should the user be required > > > *always* to enter specifically how columns are to be used via "using"? > > Requirement for using specifiers always for plot ... binary would be rather > cumbersome. Current syntax is OK, e.g. > plot 'x.edf' binary with image Well, it turns out that this requirement causes significant design problems in the code. We have to trade off between ease of the user interface and clear design of the code. At the current stage, I would really favour the latter. -- Hans-Bernhard Broeker (br...@ph...) Even if all the snow were burnt, ashes would remain. |
From: Petr M. <mi...@ph...> - 2004-07-13 13:59:43
|
> > Requirement for using specifiers always for plot ... binary would be rather > > cumbersome. Current syntax is OK, e.g. > > plot 'x.edf' binary with image > > Well, it turns out that this requirement causes significant design > problems in the code. We have to trade off between ease of the user > interface and clear design of the code. At the current stage, I would > really favour the latter. What's the problem with the above simple command? Why to do anything more? Gnuplot is command-line driven, thus we should keep command as simple as possible. Thus, I prefer the former (unless there are ambiguities, of course). --- PM |
From: Ethan M. <merritt@u.washington.edu> - 2004-07-13 16:30:27
|
On Tuesday 13 July 2004 06:59 am, Petr Mikulik wrote: > > > Requirement for using specifiers always for plot ... binary would be rather > > > cumbersome. Current syntax is OK, e.g. > > > plot 'x.edf' binary with image > > What's the problem with the above simple command? Why to do anything more? I thought the binary mode was intended to handle all plot types, not just the new "with image". Or have I got that wrong? If indeed it is to handle all plot styles, then it suffers from the same ambiguities as the non-binary form of plot commands. Simplest case: plot 'data' using 0:5 with lines # y value is found in col 2 plot 'data' using 5 with lines # same y value is now found in col 1 If you omit the using spec, it is ambiguous where to find the y value. So I don't see how plot 'data' binary with lines is supposed to work unambiguously. That said, I would be perfectly happy to restrict things so that binary mode does not permit an implicit x-val column. -- Ethan A Merritt merritt@u.washington.edu Biomolecular Structure Center Mailstop 357742 University of Washington, Seattle, WA 98195 |
From: Daniel J S. <dan...@ie...> - 2004-07-13 17:37:20
|
Ethan Merritt wrote: >On Tuesday 13 July 2004 06:59 am, Petr Mikulik wrote: > > >>>>Requirement for using specifiers always for plot ... binary would be rather >>>>cumbersome. Current syntax is OK, e.g. >>>> plot 'x.edf' binary with image >>>> >>>> >>What's the problem with the above simple command? Why to do anything more? >> >> > >I thought the binary mode was intended to handle all plot types, >not just the new "with image". Or have I got that wrong? > You're correct. >If indeed it is to handle all plot styles, then it suffers from the same >ambiguities as the non-binary form of plot commands. >Simplest case: > plot 'data' using 0:5 with lines # y value is found in col 2 > plot 'data' using 5 with lines # same y value is now found in col 1 > >If you omit the using spec, it is ambiguous where to find the y value. >So I don't see how > plot 'data' binary with lines >is supposed to work unambiguously. > >That said, I would be perfectly happy to restrict things so that >binary mode does not permit an implicit x-val column. > I can go either way on having or not having implicit columns. If things were organized well in the code, and explained well in the documentation, defaults might be no problem. If a table containing the default implicit values existed, it could probably be dumped to the screen in some fashion to help the user. But this is all too much of a big step right now to ponder a nice set up. This one example plot 'x.edf' binary with image is in fact a special case if the "auto" or "edf" file type is selected because column information will be had from the file itself. This is a different issue requiring discussion, but it shows the potential to do such things. Dan |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-13 22:43:56
|
On Tue, 13 Jul 2004, Ethan Merritt wrote: > On Tuesday 13 July 2004 06:59 am, Petr Mikulik wrote: > > > > Requirement for using specifiers always for plot ... binary would be rather > > > > cumbersome. Current syntax is OK, e.g. > > > > plot 'x.edf' binary with image > > > > What's the problem with the above simple command? Why to do anything more? > > I thought the binary mode was intended to handle all plot types, > not just the new "with image". Or have I got that wrong? > If indeed it is to handle all plot styles, then it suffers from the same > ambiguities as the non-binary form of plot commands. > Simplest case: > plot 'data' using 0:5 with lines # y value is found in col 2 > plot 'data' using 5 with lines # same y value is now found in col 1 That particular ambiguity could be fixed trivially in get_data(). All it would take would be to convert 'using 5' into 'using 0:5' before any of the datafile.c functions were even called. The truly tricky ones are to be found elsewhere, though: acsplines, errorbars dy vs. ymin:max, and explicit colour columns. In a nutshell, we may already have piled too many new features on top of the existing code base --- the foundation is starting to crumble. > So I don't see how > plot 'data' binary with lines > is supposed to work unambiguously. The same way it always did. Actually, I think 'binary' with no using spec should probably be reserved to mean the old gnuplot binary format, if only for backward compatibility. -- Hans-Bernhard Broeker (br...@ph...) Even if all the snow were burnt, ashes would remain. |
From: Daniel J S. <dan...@ie...> - 2004-07-14 04:06:04
|
Hans-Bernhard Broeker wrote: >> So I don't see how >> plot 'data' binary with lines >> is supposed to work unambiguously. >> > > > The same way it always did. Actually, I think 'binary' with no using > spec should probably be reserved to mean the old gnuplot binary format, > if only for backward compatibility. > That's the way it is currently programmed. Binary files for "plot" (as opposed to "splot") weren't implemented, so in that case the patch complains that more information in a "using" field is required. That could be changed so that gnuplot binary defaults. Dan |
From: Daniel J S. <dan...@ie...> - 2004-07-10 04:20:16
|
Hans-Bernhard Broeker wrote: >On Fri, 9 Jul 2004, Ethan Merritt wrote: > > > >>#ifdef EAM_HISTOGRAMS >> /* EAM FIXME - There are places in df_readline where it would be really */ >> /* nice to know what kind of plot we are making, so I think that */ >> /* current_plot should be a parameter to df_readline. >> >> > >I strictly disagree with that reasoning. If df_readline needs to know >what the the plot style is, then IMHO that's a sign that there's something >seriously wrong with df_readline(). > >It's separations like this being broken down or "tunnelled", tieing every >piece of the program to all others, that got gnuplot into the mess it's >still in. > >Bluntly put, adding a global variable is essentially never the right >solution for any problem other than stack size limitations. > >Instead of designing a correct structure of "who is responsible for what", >creating a global means that we've given up. In the case at hand, >df_readline is the core interface between the datafile itself and the next >stage in data processing: the 'using' specifiers. It should never need to >know anything about plot styles, which are at least two steps removed from >its work. > Well, global variables is certainly not good. The issue is that df_readline() has a number of "local" variables that must be configured by "df_open", e.g., what quantities to put into what variable. The question is, should the user have to specify every detail about that, *or* should there be some defaults, i.e., the logical extension to 3 space, for example. If you think the latter is good, then there is some information about the plotting style required. Dan |
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-10 06:16:01
|
On Friday 09 July 2004 06:07 pm, Hans-Bernhard Broeker <br...@ph... wrote: > I strictly disagree with that reasoning. If df_readline needs to know > what the the plot style is, then IMHO that's a sign that there's something > seriously wrong with df_readline(). Let me present two arguments for passing a pointer to the current plot. The code in datafile.c and elsewhere (df_readline is not so much the culprit here) makes an assumption that it can deduce the style of plot based on the number of input columns requested. Maybe that used to be true back when gnuplot supported fewer plot styles, but it no longer holds. I maintain that if a piece of code needs to know the current plot style it should look in the proper place, the plot structure, rather than trying to guess it from incomplete information. The second reason arose when I added the "datastrings" code that allows reading meta-information related to the plot from the datafile, including axis tic labels, plot key titles, and so on. If you want to argue that these are a bad idea, or propose a different way to accomplish this, please feel free. But if you accept that df_readline is now reading information that has to be stored in the plot structure eventually, then it seems cleaner to me that it receive a pointer to the structure so that it can be stored there directly upon input. I have not looked much at Daniel's "with image" code, except for the terminal API. I am a bit surprised that it needs to call df_readline() at all, since I would have expected that reading a binary file is a distinct operation from reading successive lines of formatted input. Daniel - don't you think it would be cleaner to provide an entirely separate routine, df_readbinary(), and call it instead of df_readline() when needed? You can never mix the two modes of input (is that right?), so it seems a bad idea to mix the code that handles them. -- Ethan A Merritt eme...@es... |
From: Daniel J S. <dan...@ie...> - 2004-07-10 07:57:28
|
Ethan A Merritt wrote: >I have not looked much at Daniel's "with image" code, except for >the terminal API. I am a bit surprised that it needs to call df_readline() >at all, since I would have expected that reading a binary file is a >distinct operation from reading successive lines of formatted input. > >Daniel - don't you think it would be cleaner to provide an entirely >separate routine, df_readbinary(), and call it instead of df_readline() >when needed? You can never mix the two modes of input (is that right?), >so it seems a bad idea to mix the code that handles them. > That is correct, and I sort of started out that way. However, the more I thought about it, it just seemed that the formatted and binary input really weren't too much different. One can still imagine data inside a binary file as being in columns, with implicit columns filled in by the line number, etc.; just like with the formatted I/O. And at the bottom of df_readline() is the important code which knows how and where to put data in the v[] variables, and there may even be some scripts there, as I don't really understand the "at" stuff and that. I thought about splitting it off into binary data input, say some derivative of "df_3dmatrix()" or whatever it is called. However, then the flexibility like in df_readline() wouldn't be there, unless I copied parts of df_readline(). Also, if there were a hypothetical "df_readbinary()" then at the higher levels like plot2d.c and plot3d.c one would have to know if it is reading a binary data file or a formated ascii file. You then get these kind of limiting conditional statements in strange places, and a somewhat big df_3dmatrix() routine. That is why I proposed earlier that it may be possible to modify what I've done inside df_readline() so that it reads gnuplot binary. I don't think it would take too much. The trickiest part is that along the x-dimension can be a *non-uniform* sampling as described by the very first line of the gnuplot binary file. So, that needs to be stored in memory. I've got some ideas, but it probably isn't worth discussing now... or risking breaking the current df_3dmatrix() way of doing things. Looking at df_3dmatrix() in datafile.c and all that supporting code in binary.c, I couldn't see where to put in new binary code. Judging from the comments with those routines, it looks like someone had in mind to create arbitrary binary files but maybe gave up. It seems the paradigm of storing coordinates in the point->x, y, z, dx, dy, etc. format will not change in the near future, if at all. The free_matrix(), extend_matrix() etc. seems like a powerful set of routines, but outside the paradigm of point->x, y, z, dx, dy. Dan |
From: Dave D. <dde...@es...> - 2004-07-10 15:20:28
|
Daniel J Sebald <dan...@ie...> writes: > the bottom of df_readline() is the important code which knows how and > where to put data in the v[] variables, and there may even be some > scripts there, as I don't really understand the "at" stuff and that. > The "at" stuff is just the way expressions are stored. All (?) expression evaluation in gnuplot is done by compiling the expression into an action table, then stepping through the actions to compute a result. There is (or used to be) an undocumented command show at to show the internal representation of an expression. gnuplot> f(x)=x+2 gnuplot> show at f(sin(x)-cos(y+z)) push x sin push y push z plus cos minus call f pushd1 f dummy pushc 2 plus action tables are clearly stack-based, and user-defined functions are (not surprisingly) stored as action tables. For plot ... using (expression), each expression gets compiled into an action table, and then the action table is evaluated for each datafile line to calculate the required values. In the version I'm looking at, parse.c does the compiling of expressions, and eval.c does the evaluation (by just running through the function pointers stored in the at) Looks like a global variable "undefined" gets set if something goes wrong during the evaluation. dd -- Dave Denholm <dde...@es...> http://www.esmertec.com |
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-10 18:33:11
|
On Saturday 10 July 2004 08:20 am, Dave Denholm wrote: > > There is (or used to be) an undocumented command show at > to show the internal representation of an expression. Thanks for that pointer! I had not noticed that feature at all. It may prove to be invaluable in debugging of string variable code. -- Ethan A Merritt Department of Biochemistry & Biomolecular Structure Center University of Washington, Seattle |
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-10 18:31:10
|
On Saturday 10 July 2004 01:21 am, Daniel J Sebald wrote: > Ethan A Merritt wrote: > > > >Daniel - don't you think it would be cleaner to provide an entirely > >separate routine, df_readbinary(), and call it instead of df_readline() > >when needed? > > That is correct, and I sort of started out that way. However, the more > I thought about it, it just seemed that the formatted and binary input > really weren't too much different. One can still imagine data inside a > binary file as being in columns, with implicit columns filled in by the > line number, etc.; just like with the formatted I/O. For what it's worth - After I apply the with-image patch df_readline contains 653 lines of code, of which only 303 are shared by the two modes. The largest block of shared code is the section (lines 3362-3465) checking for blank lines, skipped lines, and EOF. Is this even relevant to binary mode? > And at the bottom > of df_readline() is the important code which knows how and where to put > data in the v[] variables Yes, that is clearly important to both modes. But perhaps it should be split out into a common subroutine shared by two different input routines. At that point you might even find that 2 input stages + 1 shared output stage have fewer lines of code jointly than the current single routine with its tangle of #ifdefs. -- Ethan A Merritt Department of Biochemistry & Biomolecular Structure Center University of Washington, Seattle |
From: Daniel J S. <dan...@ie...> - 2004-07-10 20:17:38
|
Ethan A Merritt wrote: >For what it's worth - >After I apply the with-image patch df_readline contains 653 lines of >code, of which only 303 are shared by the two modes. > >The largest block of shared code is the section (lines 3362-3465) >checking for blank lines, skipped lines, and EOF. Is this even relevant >to binary mode? > That's correct. But my point is that they do pretty much similar things, i.e., loading the variables to prepare the next stage of the code that determines where to reroute them. >>And at the bottom >>of df_readline() is the important code which knows how and where to put >>data in the v[] variables >> >> > >Yes, that is clearly important to both modes. But perhaps it should be >split out into a common subroutine shared by two different input routines. > >At that point you might even find that 2 input stages + 1 shared output >stage have fewer lines of code jointly than the current single routine with >its tangle of #ifdefs. > I've no problem with a common subroutine, as that even makes it more obvious the common idea between the two. Recall, my premise with the #ifdefs was to have the code *exactly unaltered* when BINARY_DATA_FILE is disabled. Dan |
From: Hans-Bernhard B. <br...@ph...> - 2004-07-10 12:59:56
|
Ethan A Merritt wrote: > The code in datafile.c and elsewhere (df_readline is not so much > the culprit here) makes an assumption that it can deduce the style of > plot based on the number of input columns requested. Could you show a concrete example of that? > I maintain that if a piece of code needs to know the > current plot style it should look in the proper place, the plot structure, > rather than trying to guess it from incomplete information. *If* it needs them. What I'm debating here is whether this need is real, rather than just a consequence of a design weakness. > The second reason arose when I added the "datastrings" code > that allows reading meta-information related to the plot from the datafile, > including axis tic labels, plot key titles, and so on. If you want to > argue that these are a bad idea, or propose a different way to > accomplish this, please feel free. That indeed is a valid reason. I'm just still not convinced that we should be passing in the entire plot struct for that, as opposed to just those parts of it that the routines are actually going to manipulate. We should try to keep those inter-linkages between different parts of the program as narrow and focused as possible. Passing in the entire plot struct wholesale is the opposite of that. -- Hans-Bernhard Broeker (br...@ph...) Even if all the snow were burnt, ashes would remain. |
From: Ethan A M. <merritt@u.washington.edu> - 2004-07-10 19:07:51
|
Hans-Bernhard Broeker <br...@ph...> wrote: > Ethan A Merritt wrote: > > The code in datafile.c and elsewhere (df_readline is not so much > > the culprit here) makes an assumption that it can deduce the style of > > plot based on the number of input columns requested. > > Could you show a concrete example of that? The worst offenders are one level up, in plot[23]d.c in the routines get_data and get_3ddata. Each of these contains a huge switch statement that controls the interpretation of input data columns based on the total number of returned columns. Within each 'case <ncols>:' of the switch statement, the code then tries to run through all possible plot modes that could have generated that number of input columns. I think this is an example of the tail wagging the dog; the main switch statement should rather be over the plot types. The problematic code in df_readline() itself is the bit which decides where to stuff the input values based on the number of requested input columns. It allows for an implicit x coordinate value, but only in the case of 'plot ... using <y> with <something that only takes 2 cols>' /* FIXME EAM - Trap special case of only a single 'using' column. */ /* But really we need to handle general case of implicit column 0 */ if (output == 1) xpos = (axcol == 0) ? df_datum : v[axcol-1]; else xpos = v[axcol]; I even have a real-life example of why this sort of assumption is limiting our options. I would have liked to allow the new 'splot with labels' style to accept the same sort of pm3d coloring options that other plotting styles offer. But I gave up trying to work around the assumptions in the current code that if you have a color value at all, then it must be in the 4th input column. 'splot with labels' needs the 4th specified input column to be the label itself; if there were an additional color value it would be in an optional 5th input column. Yeah, I could hack this to work somehow, but it would be a whole lot easier if the code were organized based on plot style rather than on number of columns. -- Ethan A Merritt Department of Biochemistry & Biomolecular Structure Center University of Washington, Seattle |
From: Daniel J S. <dan...@ie...> - 2004-07-10 20:25:48
|
Ethan A Merritt wrote: >Hans-Bernhard Broeker <br...@ph...> wrote: > > >The problematic code in df_readline() itself is the bit which decides >where to stuff the input values based on the number of requested input >columns. > [...] This is correct. I thought long and hard about the implications of this, trying to get something that made sense. You know, "if this column is present but not this, then one should be routed here and the other there, etc." But then you reach a contradiction that comes to light when it messes up some other demo. Of course, in binary files, there is no end of line to determine how many columns are present. It has to be specified by the user. But still, it's the same idea that Ethan is expressing. Dan |