From: sfeam (E. Merritt) <eam...@gm...> - 2012-10-20 16:41:48
On Friday, 19 October 2012, Dima Kogan wrote:
> > On Wed, 3 Oct 2012 10:21:19 -0700
> > Ethan A Merritt <sf...@us...> wrote:
> >
> > On Wednesday, October 03, 2012 01:39:08 am Dima Kogan wrote:
> > > > On Sat, 29 Sep 2012 17:07:51 -0700
> > > > Ethan Merritt <merritt@u.washington.edu> wrote:
> > > >
> > > > > 6. Removing duplicate messages (such as duplicate consecutive V commands) sounds
> > > > > great. We should do it
> > > > OK. Low priority because it's relatively rare for normal plots.
> > >
> > > I did a first pass at this. Patch attached. Good news is that as expected, the
> > > traffic drops dramatically. Inboard timing drops from about 0.9s to about 0.45s.
> > > The outboard, however, drops from 0.85s to 0.01s! The reason the inboard didn't
> > > drop as much is that it still has to parse the original huge data file. I have
> > > some lingering concerns about the patch I'm attaching.
> >
> > > I use ftell() to check to see that the V commands are indeed consecutive.
> > > This might have a non-negligible cost, so I'd check before committing this.
> >
> > That's clever, but I agree that there is potential cost.
> > Other terminal drivers do the same job without resorting to ftell().
> > The trick is that any command that potentially affects the current
> > active position must either update or invalidate the inboard copy
> > of x_last and y_last. For example, term->put_text() would set
> >     x_last = y_last = INVALID; /* #define INVALID -1 */
> > before leaving.
> >
> > The down side is that unlike your ftell() version this approach requires
> > finding all the places that might affect current position. I've attached a
> > first-pass patch that catches most of them, but I probably missed some.
> > Other terminal drivers can serve as a model.
> >
> >
> > > At this point, my test case is clearly broken since we've been able to optimize
> > > away all its complexity.
> >
> > Right.
> > I modified your original data generation script to produce longer
> > vectors and some gaps, so that the plot would contain move commands as well
> > as vector commands. This gives a more realistic mix of commands:
> >
> > perl -e 'for(0..2000000) \
> >   { print "$_ " . sin($_/1000) . "\n"; print "\n" if $_ % 100 == 0; }' \
> >   > breaks.ascii
> >
> > With this test data the reduction in size from removing redundant commands
> > is less than 10%. I tested using the "uniq" command rather than patching
> > the driver source code. 10% max didn't seem very significant to me,
> > which was why I said it was low priority.
> >
> > > Is the same optimization valid for P commands?
> >
> > I don't think so. But it does apply to M commands.
> >
> > > I'm thinking of just generating a bunch of discrete points, and sending
> > > them over as P commands. That sounds good, right?
> >
> > I don't think that the active position after drawing a point symbol is
> > guaranteed to be at the center of the point. So in the sequence
> >     Move(x,y); Point(x,y); Move(x,y); Vector(x1,y1);
> > the second Move is not redundant.
> >
> > Of course, we could change the code so that Point(x,y) _is_ guaranteed
> > to leave the active position at (x,y).
> >
> OK. I revisited this (duplication suppression). Patch attached. I believe this
> optimization is equally applicable to P, M, and V. Note that this is all purely
> inboard, so there's no active position at all; that's an outboard concept. So
> for instance, a duplicated P command would normally draw the same point glyph
> multiple times in the same exact position, thus removing the duplication doesn't
> change the output. Tell me if I'm misunderstanding.

I think you are not understanding what I was trying to say.
The issue is not whether repeated P commands are redundant;
the question is whether or not an M command is needed after the P.

Scenario: one might think that a series of points connected by lines
could be drawn as
    P(x1,y1) V(x2,y2) P(x2,y2) V(x3,y3) ...
But that doesn't work because P(x1,y1) doesn't leave the current position
at (x1,y1). So instead one needs to do
    P(x1,y1) M(x1,y1) V(x2,y2) P(x2,y2) M(x2,y2) V(x3,y3) ...
My point is that all the M commands may seem redundant but they are not.
[NB: This is not the ordering produced by "with linespoints"]

> Conclusions:
>
> 1. ftell() is way too slow
> 2. the code with the attached patch is significantly faster than before in the
>    best case, and about the same in the worst case

I've been too busy to have a serious look at your non-redundancy patch,
but at first glance it looks more complicated than necessary.
Do you really need to set X11_IPC_LASTDATARUN_NONE for every single command
that doesn't change the current position? Other terminal drivers manage
just fine without this. If the small set of commands that _do_ change the
position (M, V, P, T, ??) track the current position then no one else needs
to care.

	Ethan

> Next, I'm going to look into binary communication again.
>
> dima
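
A minimal sketch of the position-tracking approach described in this message, written as standalone C rather than as gnuplot driver code. Only x_last, y_last and INVALID come from the thread. The function names (send_cmd, emit_move, emit_vector, emit_point, reset_output), the text format of the commands and the `out` buffer are all hypothetical, chosen for illustration. It assumes coordinates are non-negative, so that -1 can serve as the INVALID sentinel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define INVALID -1   /* sentinel from the thread; assumes coordinates >= 0 */

/* Inboard copy of the last known active position, or INVALID if unknown. */
static int x_last = INVALID, y_last = INVALID;

/* Hypothetical buffer standing in for the pipe to the outboard driver. */
static char out[256];

/* Append one command to the buffer (illustrative format, not the real protocol). */
static void send_cmd(char cmd, int x, int y)
{
    size_t used = strlen(out);
    assert(used + 32 < sizeof out);   /* sketch only: no real overflow handling */
    snprintf(out + used, sizeof out - used, "%c%d,%d ", cmd, x, y);
}

/* M: redundant only if we are already known to be at (x,y). */
void emit_move(int x, int y)
{
    if (x == x_last && y == y_last)
        return;
    send_cmd('M', x, y);
    x_last = x;
    y_last = y;
}

/* V: a repeated vector to the known current position is treated as redundant. */
void emit_vector(int x, int y)
{
    if (x == x_last && y == y_last)
        return;
    send_cmd('V', x, y);
    x_last = x;
    y_last = y;
}

/* P: the position after a point symbol is not guaranteed, so invalidate it. */
void emit_point(int x, int y)
{
    send_cmd('P', x, y);
    x_last = y_last = INVALID;
}

/* Clear the buffer and forget the position, e.g. at the start of a plot. */
void reset_output(void)
{
    out[0] = '\0';
    x_last = y_last = INVALID;
}
```

With this bookkeeping, calling emit_move(1,1) twice followed by emit_vector(2,2) twice leaves only "M1,1 V2,2 " in the buffer. The sequence P(1,1) M(1,1) V(2,2) P(2,2) M(2,2) V(3,3) passes through unchanged, because each point invalidates the stored position and so the following move is not treated as a duplicate. Only the commands that change the position touch x_last and y_last; nothing else needs to know about them.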