From: pieter claassen <pieter@cl...> - 2005-02-02 14:03:44

Andre, comments inline.

...
> So let us start with the data. What kind of data do you have at the
> beginning? You collected the histogram data before you started
> with PyX, didn't you? We could start with that, but we don't need to.
> In principle it would be possible to calculate the histogram out of
> some scattered data points *within* the graph style. I'm just not
> sure whether this would be a good idea or not ...

My data is obtained by running a number of simulations (100 to 1000000)
and recording a time result for each simulation (0 to 200). I then
process the data by grouping the values into discrete intervals (between
5 and 10 seconds each).

The result is thus a list of the following format:

    [[interval[0],value[0]],[interval[1],value[1]],.....,[interval[n],value[n]]]

e.g.

    [[0,0],[5,0],[10,0],.......,[95,55],[100,45]]

Here are some use cases:

1. To plot this data so that it can be seen that there were 55 trips that
   took 95 minutes and 45 trips that took 100 minutes.
2. To show that the distribution of the results is mostly on the high end.

To recap the current problem:
When this large number of data points is plotted on a bar graph, the
x-axis labels overwrite each other.

Current suggested solution:
??

Any suggestions on how to proceed?

Cheers,
Pieter
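The grouping step described in this message can be sketched in plain Python. This is only an illustration under assumptions: the function name `bin_times`, the fixed interval width, and the sample values are hypothetical and not taken from the thread.

```python
from collections import Counter


def bin_times(times, width, upper):
    """Count how many values fall into each interval [start, start + width).

    Returns a list of [interval_start, count] pairs, including empty
    intervals, in the same shape as the list described in the message.
    """
    # Map every value to the start of the interval it falls into.
    counts = Counter(int(t // width) * width for t in times if 0 <= t <= upper)
    starts = range(0, int(upper) + 1, width)
    return [[start, counts.get(start, 0)] for start in starts]


# Hypothetical simulation results (times), grouped into 5-unit intervals.
sample = [95.2, 96.0, 99.9, 100.0, 101.5, 12.3]
histogram_data = bin_times(sample, width=5, upper=105)
```

Each entry pairs the start of an interval with the number of results that fell into it, so the list can be handed to a plotting routine as x/y columns.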
From: Andre Wobst <wobsta@us...> - 2005-02-03 09:33:44

Hi Pieter,

On 03.02.05, pieter claassen wrote:
> A few questions if I may (this might be answered earlier in the list, so
> tell me to stop if that is the case):

I think we've never distributed that kind of knowledge. Some things
are described (more or less) in various places, but let me just give
some answers to your questions right away ...

> 1. How does pyx make use of TeX? What is the integration and purpose of
> the integration?

PyX starts a (La)TeX instance as soon as it needs to typeset some
text. PyX can't typeset text without TeX, so the purpose of the TeX
integration is to be able to insert text into the output. PyX typesets
the requested text in a box, analyses the box extents and writes the
box content into TeX's DVI file on a separate page for each box. Later
on it can read the information from that page and insert the
appropriate PostScript code into the output.

> 2. What is the basic idea behind the library design? I can see from the
> documentation the basic workflow, but what about the underlying principles
> for the implementation?

Well, PyX is just a collection of useful stuff:

- a canvas to paint on
- paths with various features (splitting, intersection, etc.)
- decorators which assign stroke or fill operations (and other
  things) to paths
- attributes for the stroke or fill operations (like colors etc.)
- deformers to modify paths (like transformations)
- graphs built on top of those features

Graphs themselves are constructed out of various components:

- graph.graph takes care of the geometric arrangement
- graph.data contains classes to define graph data (by reading from
  a file or calculating it from a function etc.)
- graph.style contains graph styles (lines, symbols, etc.)
- graph.key to create graph keys
- axis for the conversion of data values to graph coordinates
  (graph coordinates are fixed to the interval 0..1)

The axes are themselves built out of various components:

- graph.axis.axis does the conversion and global bookkeeping
- graph.axis.tick are ticks to be placed on axes
- graph.axis.parter calculates ticks for a given axis range
- graph.axis.texter creates the label text to be placed at the ticks
- graph.axis.painter paints the whole axis

I may have missed one or the other, but that's the basic technology.
The idea is to plug everything together to make it work ... ;)

> As to labels and tics: This I think is a difficult subject because in many
> libraries it is not an easy subject and the implementor requires a
> significant amount of knowledge to do anything other than use defaults. To
> make it more user friendly, I have the following suggestions (and will
> help implement where I can)
> 1. To develop a number of usecases and appropriate examples for manual
> override that include:

We intend to use examples for that. Those are complete use cases and
they work as functional tests for us. That's why we try to keep
comprehensive examples out of the manual ...

> 1.1. Example of no labels on all types graphs

Several possibilities here: The easiest variant would be to just
suppress the painting of the labels. Set the labelattrs of the painter
to None ... that's it. You could also create parters which place ticks
at the axis, but no labels.

> 1.2. Example of manual labels on all types of graphs

Set the parter to None and use manualticks.

> 1.3. ...

There will be answers to that too ... ;)

> 2. Would it be possible to provide a manual override parameter to all axis
> that allows users to provide a list of values or [[value,label],..] that
> will then be converted to chartcoordinates and plotted at value location
> on the axis? This would provide ultimate flexibility.

manualticks does that.
But you can also use the axispos methods to get positions at the axis
and do whatever you want:

    ...
    g.dolayout()
    x, y = g.axespos["x"].tickpos(100)
    g.text(x, y, "here we are")

Note that a specialized painter would be able to do such things as
well, and that way you can plug your needs into the graph design. I.e.
the solution shown above will almost always be total crap, although it
might look like the easiest and simplest way to get your needs met at
first. But it's only the simplest way if you need it just once. As
soon as you want to use it twice, a painter will save you time ...

> I have been thinking about a first set of requirements for the histogram
> class and here is my stab at them.
>
> REQUIRED
> 1. To plot steps like gnuplot.

This is absolutely trivial to implement, but my concern is that you
need to adjust the data to what's needed by the plotting system and
not vice versa ...

> 2. To plot multiple series on a graph.

Use several plot commands on a graph or use a list of data in a single
plot command.

> 3. To provide manual overrides on the xaxis.

manualticks will do ...

> 4. To cater for both continuous as well as discreet xvalues (this again
> is a little unclear but I can imagine that not all statistical sampling
> would be agains continues values)

Well, there are continuous variables and those which aren't.
Non-continuous variables are for example "Germany", "US", "Japan",
"Russia" ... to be used on a bar graph. Continuous variables are
numbers. Sure, you can use integer numbers as discrete variables, but
that's almost always a misuse. Also, times and dates are usually
continuous variables, although in some use cases days of months or
such are used as discrete variables.

A histogram should be built on continuous variables. I don't think
it's worth a discussion ...

BTW: you can use continuous and bargraph axes in the same graph. PyX
is able to handle an arbitrary number of axes for each graph
dimension.

> DESIRED
> 1. To provide the facility to swap axis (this I cannot justify, but can
> imagine that it would be valuable.

Definitely. A graph style should never take into account the axis
names and the ordering of the different axis dimensions. A graph style
does not know anything about the graph layout ... and then x and y
axes are exchangeable in the style without any additional effort ...

> http://en.wikipedia.org/wiki/Histogram

Nice idea to look into the wikipedia. It shows exactly the whole
problem of a histogram: To fully describe a histogram, you need three
pieces of information per data point: a position, a width and a
height. The only problem is that typically you do not have all three
kept in a data file. That's why I don't know how to implement such a
style. That's all. So we're back to my original question: What data
should we start with?


André

-- 
 by  _ _      _    Dr. André Wobst
    / \ \    / )   wobsta@..., http://www.wobsta.de/
   / _ \ \/\/ /    PyX - High quality PostScript figures with Python & TeX
  (_/ \_)_/\_/     visit http://pyx.sourceforge.net/
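The point about a histogram needing a position, a width and a height per data point can be illustrated with a small plain-Python sketch. It assumes that only positions and heights are stored and that widths are inferred by splitting at the midpoints between neighbouring positions; the function name `bars_from_points` and the symmetric treatment of the two outermost bars are assumptions made for this example, not something the thread settles.

```python
def bars_from_points(points):
    """Infer (left, right, height) for each (x, y) point.

    Interior boundaries are the midpoints between neighbouring x values.
    Assumption: the first and last bar are symmetric around their x value.
    """
    points = sorted(points)
    if len(points) < 2:
        raise ValueError("need at least two points to infer a width")
    xs = [x for x, _ in points]
    mids = [0.5 * (a + b) for a, b in zip(xs, xs[1:])]
    left_edges = [xs[0] - (mids[0] - xs[0])] + mids
    right_edges = mids + [xs[-1] + (xs[-1] - mids[-1])]
    return [(left, right, y)
            for left, right, (_, y) in zip(left_edges, right_edges, points)]


# Hypothetical excerpt of binned data: position and count only.
bars = bars_from_points([(90, 30), (95, 55), (100, 45)])
```

With unevenly spaced positions the inferred widths differ from bar to bar, which is exactly the information a data file with only two columns does not state explicitly.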
From: pieter claassen <pieter@cl...> - 2005-02-03 14:56:18

Andre,

>> 4. To cater for both continuous as well as discreet xvalues (this again
>> is a little unclear but I can imagine that not all statistical sampling
>> would be agains continues values)
>
> Well, there are continuous variables and those which aren't.
> Non-continuous variables are for example "Germany", "US", "Japan",
> "Russia" ... to be used on a bar graph. Continuous variables are
> numbers. Sure, you can use integer numbers as discrete variables, but
> that's almost always a misuse. Also, times and dates are usually
> continuous variables, although in some use cases days of months or
> such are used as discrete variables.
>
> A histogram should be built on continuous variables. I don't think
> it's worth a discussion ...
>

Point taken.

> Nice idea to look into the wikipedia. It shows exactly the whole
> problem of a histogram: To fully describe a histogram, you need three
> pieces of information per data point: a position, a width and a
> height. The only problem is that typically you do not have all three
> kept in a data file. That's why I don't know how to implement such a
> style. That's all. So we're back to my original question: What data
> should we start with?

In general, the following data is available:

    [[x1,y1],[x2,y2],......,[xn,yn]]
    Graph width/height

Would it be possible to express the step width as follows:

    step_width = axis_length / num_data_points

And the x location of the middle of each step:

    g_x[n] = gmin_xaxis + (n * step_width)/2

The y value of each step:

    g_y = gmin_yaxis + ((f(x) - fmin) / (fmax - fmin)) * gmax_yaxis

[my IQ might let me down here!]

What are your thoughts on this?

Pieter
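One way to read the proposal above is as a linear mapping of equally wide steps onto an axis range. The sketch below is a hedged illustration, not the formulas from the message: it places the middle of step n at `axis_min + (n + 0.5) * step_width`, which differs from the `(n * step_width)/2` term proposed above, and it scales heights to 0..1 (the graph coordinate range mentioned earlier in the thread). The name `step_geometry` and the sample values are hypothetical.

```python
def step_geometry(values, axis_min=0.0, axis_max=1.0):
    """Return (centre, width, height) per value for equally wide steps.

    Heights are scaled linearly so that min(values) maps to 0 and
    max(values) maps to 1.
    """
    n = len(values)
    step_width = (axis_max - axis_min) / n
    fmin, fmax = min(values), max(values)
    span = (fmax - fmin) or 1.0  # avoid division by zero for constant data
    steps = []
    for i, value in enumerate(values):
        centre = axis_min + (i + 0.5) * step_width  # middle of step i
        height = (value - fmin) / span
        steps.append((centre, step_width, height))
    return steps


# Hypothetical counts for four equally wide intervals.
steps = step_geometry([0, 55, 45, 10])
```

Note that this only works when all steps have the same width; for unevenly spaced data the width has to come from the data itself.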
From: Andre Wobst <wobsta@us...> - 2005-02-02 16:02:32

Hi Pieter,

welcome to pyxdev (which is kind of low traffic most of the time, as
is pyxuser) ... ;)

On 02.02.05, pieter claassen wrote:
> 1. To plot this data so that it can be seen that there were 55 trips that
> took 95 minutes and 45 trips that took 100 minutes.
> 2. To show that the distribution of the results is mostly on the high end.
>
> To recap the current problem:
> When this large number of data points is plotted on a bar graph, the
> x-axis labels overwrite each other.
>
> Current suggested solution:
> ??

Well, not really. As I said before, you should try to use a xyplot
for that. A bar graph is just the wrong thing for that.

> Any suggestions on how to proceed?

You'll need a histogram style. There are basically two ways to get it.
You could create a style from graph.style.linestyle, overwrite the
drawpoint method and call graph.style.linestyle's drawpoint
several times for each point with appropriately modified
sharedata.vposi data to get the step-like shape. The other possibility
would be to write your own style from scratch.
A simple, first working example could look like:

    import random
    from pyx import *

    class histogram(graph.style._styleneedingpointpos):

        # shared data fields this style reads
        needsdata = ["vpos", "vposmissing", "vposavailable"]

        defaultlineattrs = []

        def __init__(self, lineattrs=[]):
            self.lineattrs = lineattrs

        def selectstyle(self, privatedata, sharedata, graph, selectindex, selecttotal):
            # pick the line attributes for this data set; None disables stroking
            if self.lineattrs is not None:
                privatedata.lineattrs = attr.selectattrs(self.defaultlineattrs + self.lineattrs, selectindex, selecttotal)
            else:
                privatedata.lineattrs = None

        def initdrawpoints(self, privatedata, sharedata, graph):
            privatedata.path = path.path()
            privatedata.lastvpos = None

        def drawpoint(self, privatedata, sharedata, graph):
            if sharedata.vposavailable:
                if privatedata.lastvpos:
                    # x position halfway between the previous and the current point
                    midvxpos = 0.5 * (privatedata.lastvpos[0] + sharedata.vpos[0])
                    # go to the box edge, down to 0, up to the new height, on to the new point
                    privatedata.path.append(path.lineto_pt(*graph.vpos_pt(midvxpos, privatedata.lastvpos[1])))
                    privatedata.path.append(path.lineto_pt(*graph.vpos_pt(midvxpos, 0)))
                    privatedata.path.append(path.lineto_pt(*graph.vpos_pt(midvxpos, sharedata.vpos[1])))
                    privatedata.path.append(path.lineto_pt(*graph.vpos_pt(*sharedata.vpos)))
                else:
                    # first point of a path segment
                    privatedata.path.append(path.moveto_pt(*graph.vpos_pt(*sharedata.vpos)))
                privatedata.lastvpos = sharedata.vpos[:]
            else:
                # position not available: the next valid point starts a new segment
                privatedata.lastvpos = None

        def donedrawpoints(self, privatedata, sharedata, graph):
            if privatedata.lineattrs is not None and len(privatedata.path.path):
                graph.stroke(privatedata.path, privatedata.lineattrs)


    g = graph.graphxy(width=8)
    g.plot(graph.data.list([(5*i, random.random()) for i in range(1, 20)], x=1, y=2), [histogram()])
    g.writeEPSfile("histogram")

However, it's just a starting point. We do not correctly adjust the
vertical range. We do not know how to handle the edge points
(currently partial boxes are plotted; we could, of course, just
plot steps like gnuplot and others, but I'm not sure whether this is
a good idea). We do not cut the path at the graph border. We can't
exchange x and y axis ... etc.
Any comments on how to proceed? What's needed out there? (I do not
need histograms at all, otherwise I would have implemented such a
style already, but since we're now on the subject, we might get it
done once and for all ...) Although it works well that way, it's not
that easy to make it a robust graph style. As usual with graph styles:
it's easy to implement one, but making it general purpose takes some
more work ...


André
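The path construction in the drawpoint method of the example style can be tried out without PyX. The following plain-Python sketch mirrors that logic on a list of (x, y) pairs and returns the vertices of the outline. The function name `step_outline` and the `drop_to_baseline` switch are hypothetical; the switch is only meant to show the "plain steps like gnuplot" variant mentioned in the message, and nothing here is part of the PyX API.

```python
def step_outline(points, drop_to_baseline=True):
    """Return the vertices of a step-like outline through the given points.

    Between two neighbouring points the outline runs horizontally to the
    midpoint, optionally drops to the baseline (y = 0), rises to the new
    height and continues to the new point.
    """
    vertices = []
    last = None
    for x, y in points:
        if last is None:
            vertices.append((x, y))          # start of the outline
        else:
            mid = 0.5 * (last[0] + x)        # boundary between two boxes
            vertices.append((mid, last[1]))
            if drop_to_baseline:
                vertices.append((mid, 0.0))  # draws the box wall down to 0
            vertices.append((mid, y))
            vertices.append((x, y))
        last = (x, y)
    return vertices


# Two hypothetical points in graph coordinates (0..1).
outline = step_outline([(0.25, 0.5), (0.75, 1.0)])
steps_only = step_outline([(0.25, 0.5), (0.75, 1.0)], drop_to_baseline=False)
```

As in the original example, the first and last box are only half drawn, which is the edge-point question raised above.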
From: Andre Wobst <wobsta@us...> - 2005-02-02 16:16:25

Hi again,

On 02.02.05, Andre Wobst wrote:
> Well, not really. As I said before, you should try to use a xyplot
> for that. A bar graph is just the wrong thing for that.

Just to avoid confusion: A bargraph is a xyplot as well, but it has a
special axis, which doesn't handle continuous variables. In that sense
it's not a xyplot with continuous variables at the axes ...


André
From: pieter claassen <pieter@cl...> - 2005-02-03 08:22:54

Andre,

I had a go at looking at your code, but without understanding the larger
implementation, I am lost. However, I am interested in this and will have
a go at it again later.

A few questions if I may (this might be answered earlier in the list, so
tell me to stop if that is the case):

1. How does pyx make use of TeX? What is the integration and purpose of
   the integration?
2. What is the basic idea behind the library design? I can see from the
   documentation the basic workflow, but what about the underlying
   principles for the implementation?

As to labels and ticks: This I think is a difficult subject because in
many libraries it is not an easy subject and the implementor requires a
significant amount of knowledge to do anything other than use defaults. To
make it more user friendly, I have the following suggestions (and will
help implement where I can):

1. To develop a number of use cases and appropriate examples for manual
   override that include:
   1.1. Example of no labels on all types of graphs
   1.2. Example of manual labels on all types of graphs
   1.3. ...
2. Would it be possible to provide a manual override parameter to all axes
   that allows users to provide a list of values or [[value,label],..] that
   will then be converted to chart coordinates and plotted at the value
   location on the axis? This would provide ultimate flexibility.

I have been thinking about a first set of requirements for the histogram
class and here is my stab at them.

REQUIRED
1. To plot steps like gnuplot.
2. To plot multiple series on a graph.
3. To provide manual overrides on the x-axis.
4. To cater for both continuous as well as discrete x-values (this again
   is a little unclear but I can imagine that not all statistical sampling
   would be against continuous values)

DESIRED
1. To provide the facility to swap axes (this I cannot justify, but can
   imagine that it would be valuable).

http://en.wikipedia.org/wiki/Histogram

Pieter