From: Steve B. <Ste...@an...> - 2007-05-16 09:33:32
Filling in a little on Robin's suggestion...

The general idea would be:

a) to use the current command line interface (or a minor variant
   thereof) to allow the user to specify some convergence criteria.
b) to simply report all runs (not doing extra runs after the criteria
   are met).
c) as a corollary of b), the user makes their own off-line decision
   about how to deal with the data from b) (they may choose the minimum,
   median, mean, or whatever of the N compliant runs).
d) to consider a) to be a "termination criterion" and implement it via
   the existing "callback" mechanism (it simply tells the base harness
   when to stop iterating the benchmark).
e) as a corollary of d), the user has complete freedom to construct
   their own termination criteria by overriding the methods in the
   callback if they don't like the base convergence criteria we
   provide (we'll provide an example).

We will flesh this out in the next short while and present a concrete
proposal RSN. The basic idea is that it would generalize over what we
have now and allow complete customization if the user so desired.
This should not be a lot of work.

Thanks to everyone for their feedback. This is what we like to see,
and it reflects one of the key goals of the project: to be
responsive! :-)

Thanks,

--Steve

On 16/05/2007, at 5:45 PM, Robin Garner wrote:

> Our feeling is that the Right Thing would be to allow the user callback
> to control termination of the benchmark. I'll post again when I have a
> better idea of how that will look.
>
> cheers
>
> Eric Bodden wrote:
>> Yes, thanks that would be great. I think this is exactly what I would
>> need. Still, I would really appreciate some comment on this topic from
>> the DaCapo developers: Would you be open to incorporating such a
>> change to the next DaCapo release?
>>
>> Cheers,
>> Eric
>>
>> On 15/05/07, chris grzegorczyk <gr...@cs...> wrote:
>>> Hi Eric,
>>>
>>> I was confused about the same issues. Your explanation of the existing
>>> approach is correct from what I recall.
>>> My approach was to rewrite parts of the harness to introduce the
>>> following functionality:
>>>
>>> 1) adjust the computation of the coefficient of variation
>>> 2) change the harness to halt only after the coefficient of variation
>>>    accounts for less than 3% of the sample mean over the last
>>>    -window runs.
>>> 3) report the sample mean, unbiased estimator of the variance, and the
>>>    actual sampled execution times (needed for ANOVA)
>>>
>>> If you are interested I would be happy to send you a patch for my
>>> hacked together version of the harness (against dacapo-MR2) in a few
>>> days (things are very crazy right now).
>>>
>>> cheers,
>>> chris
>>>
>>> p.s. I am not on the mailing list
>>>
>>> ---------- Forwarded message ----------
>>> From: Sunil Soman <su...@cs...>
>>> Date: May 15, 2007 5:06 PM
>>> Subject: Fwd: [dacapobench-researchers] Confused about -converge option
>>> To: chris grzegorczyk <de...@li...>
>>>
>>> Hi,
>>>
>>> I think you ran into this issue too. Can you reply to Eric?
>>>
>>> thanks
>>> Sunil
>>>
>>> Begin forwarded message:
>>>
>>> From: "Eric Bodden" < eri...@ma...>
>>> Date: May 14, 2007 11:16:01 AM PDT
>>> To: dac...@li...
>>> Subject: [dacapobench-researchers] Confused about -converge option
>>> Reply-To: dac...@li...
>>>
>>> Hi all.
>>>
>>> I have a question about the -converge option. I thought I had
>>> understood this at some point but I do not seem to understand any
>>> more...
>>>
>>> If I understand correctly, -converge makes the Harness run the
>>> benchmarks n up to MAX times, until all runtime numbers collected so
>>> far show a variance of less than some maximal value (3%?).
>>> This seems to work for me, however I am not 100% sure what to do
>>> with the values I get.
>>>
>>> Right now I am getting output as follows:
>>> ===== DaCapo chart starting warmup =====
>>> ===== DaCapo chart completed warmup in 19323 msec =====
>>> ===== DaCapo chart starting warmup =====
>>> ===== DaCapo chart completed warmup in 16392 msec =====
>>> ===== DaCapo chart starting warmup =====
>>> ===== DaCapo chart completed warmup in 16376 msec =====
>>> ===== DaCapo chart starting warmup =====
>>> ===== DaCapo chart completed warmup in 16368 msec =====
>>> ===== DaCapo chart starting =====
>>> ===== DaCapo chart PASSED in 16331 msec =====
>>>
>>> I guess this now means that the harness stopped and timed the last
>>> iteration because VAR(19323,16392,16376,16368) < 3%, right? My
>>> question is now: Ok, we know that the variance of the first four runs
>>> was under 3% but how do we know that the last run is within that
>>> boundary, too? It could be entirely different, couldn't it?
>>>
>>> Would it now make more sense to just accumulate n measurements until
>>> the variance drops under the given target value and then average
>>> (arithmetic mean) over all those measurements that we got?
>>>
>>> Cheers,
>>> Eric
>>>
>>>
>>> --
>>> Eric Bodden
>>> Sable Research Group
>>> McGill University, Montréal, Canada
>>>
>>> -------------------------------------------------------------------------
>>> This SF.net email is sponsored by DB2 Express
>>> Download DB2 Express C - the FREE version of DB2 express and take
>>> control of your XML. No limits. Just data. Click to get it now.
>>> http://sourceforge.net/powerbar/db2/
>>> _______________________________________________
>>> dacapobench-researchers mailing list
>>> dac...@li...
>>> https://lists.sourceforge.net/lists/listinfo/dacapobench-researchers
>
> --
> Robin Garner
> Dept. of Computer Science
> Australian National University
> http://cs.anu.edu.au/people/Robin.Garner/
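
For illustration only, the termination-callback idea in points d) and e), combined with the windowed coefficient-of-variation check chris describes, might look roughly like the sketch below. Everything in it is an assumption: the names ConvergenceSketch, TerminationCriterion, windowedCov and replay are hypothetical and are not part of the DaCapo harness, and the coefficient of variation is taken here to be the sample standard deviation (n - 1 denominator) divided by the sample mean, which may differ from what the harness computes. Recorded times are replayed rather than running a benchmark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from the DaCapo harness.
final class ConvergenceSketch {

    /** Hypothetical callback hook: asked after every iteration whether to stop. */
    interface TerminationCriterion {
        boolean shouldStop(List<Double> timesMsec);
    }

    /** Arithmetic mean of the samples. */
    static double mean(List<Double> xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        return sum / xs.size();
    }

    /** Unbiased (n - 1) sample variance; needs at least two samples. */
    static double sampleVariance(List<Double> xs) {
        double m = mean(xs);
        double ss = 0.0;
        for (double x : xs) ss += (x - m) * (x - m);
        return ss / (xs.size() - 1);
    }

    /** Coefficient of variation: sample standard deviation over the mean. */
    static double coefficientOfVariation(List<Double> xs) {
        return Math.sqrt(sampleVariance(xs)) / mean(xs);
    }

    /** Assumed base criterion: stop once the last `window` runs have a CoV below `target`. */
    static TerminationCriterion windowedCov(int window, double target) {
        if (window < 2) throw new IllegalArgumentException("window must be >= 2");
        return times -> {
            int n = times.size();
            return n >= window
                && coefficientOfVariation(times.subList(n - window, n)) < target;
        };
    }

    /** Feeds recorded times to the criterion one by one; returns every run seen. */
    static List<Double> replay(double[] recordedMsec, TerminationCriterion criterion) {
        List<Double> seen = new ArrayList<>();
        for (double t : recordedMsec) {
            seen.add(t);
            if (criterion.shouldStop(seen)) break;
        }
        return seen;
    }

    public static void main(String[] args) {
        // Times quoted earlier in this thread, in msec.
        double[] recorded = {19323, 16392, 16376, 16368, 16331};
        List<Double> runs = replay(recorded, windowedCov(3, 0.03));
        System.out.println("runs reported: " + runs);
        List<Double> compliant = runs.subList(runs.size() - 3, runs.size());
        System.out.println("mean of compliant window: " + mean(compliant));
    }
}
```

With a window of 3 and a 3% target, replaying the five times Eric quotes stops after the fourth run. All runs seen are returned, so the choice of minimum, median or mean over the compliant window stays with the user, as in point c). A user who dislikes the base criterion would supply a different TerminationCriterion instead of windowedCov.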