| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2001 | (3) | (1) | (1) | | | | | | | | | |
| 2002 | | | | | (4) | | (2) | | | | | |
|
From: Timothy D. W. <wo...@os...> - 2002-07-29 18:20:07
|
Of course, the systems in STP can be used by anybody who signs up, if you want to test your patch against an RDBMS performance setup.

Tim

On Mon, 2002-07-29 at 11:15, Craig Thomas wrote:
> OSDL has created a database workload that simulates an ecommerce book
> store.
> [snip -- the full message appears in the next post]

--
Timothy D. Witham - Lab Director - wo...@os...
Open Source Development Lab Inc - A non-profit corporation
15275 SW Koll Parkway - Suite H - Beaverton OR, 97006
(503)-626-2455 x11 (office) (503)-702-2871 (cell) (503)-626-2436 (fax) |
|
From: Craig T. <cr...@os...> - 2002-07-29 18:15:46
|
OSDL has created a database workload that simulates an ecommerce book
store. The back end of the database is SAP. The tool is named DBT-1
and is derived from the TPC-W benchmark. The code is Open Source and
can be found on sourceforge:
http://sourceforge.net/projects/osdldbt/
We have posted initial performance statistics on the following web site:
http://www.osdl.org/projects/dbt1prfrns/results
The table will eventually grow as more kernels are tested on different
systems. Currently we have performance characteristics for a 2-way
system with 1 GB of memory, utilizing 10 disk spindles for its database. The
other performance measurements are conducted on an 8-way system with 16 GB of
memory, utilizing 12 disk spindles for its database. These runs involve
the same number of users accessing the database.
In addition to capturing database transaction numbers, data is captured
for I/O, memory usage, and CPU utilization. Currently, the performance
work is being conducted on 2.4 kernels.
However, STP is running a smaller version of this test on 2.5 kernels.
The first results from this run are located at
http://khack.osdlab.org/stp/3441/
It would be interesting to see how this database workload performed on
other types of systems. Is there anyone willing to try this out?
--
Craig Thomas phone: 503-626-2455 ext. 33
Open Source Development Labs email: cr...@os...
|
|
From: Bill B. <bra...@us...> - 2002-05-21 20:11:49
|
SPECcpu is not a good benchmark for the Linux kernel since, during the
benchmark, one is in system state less than 5% of the time.
The SPECcpu benchmarks are most affected by the compiler and by the CPU
organization and speed.
I agree that the Linux community needs its benchmarks to be freely
available to allow anyone to replicate the results.
I am still not sure what your goal is. Please elaborate.
Bill
William C. Brantley, Ph.D.
Linux Technology Center, Performance,
512-838-8505, t/l 678, fax -5573
From: Hiro Yoshioka <hyoshiok@miraclelinux.com>
To: Bill Brantley/Austin/IBM@IBMUS
cc: lbs...@li..., hyo...@mi...
Date: 05/20/2002 09:02 PM
Subject: Re: [Lbs-tech] free spec benchmark?
Bill,
Thanks for your help.
SPEC benchmarks are not free.
http://www.spec.org/cgi-bin/osgorder
They are inexpensive.
I'd like to run SPEC CPU95 (it is an obsolete benchmark) because there
are a lot of detailed results published, for example, in research papers.
What I'd like to do is 1) to compare the published results with runs on
the Linux kernel, 2) to measure the detailed behavior of the kernel,
e.g., CPI, cache misses, memory traffic, TLB misses, and so on, and
3) to exchange benchmark results with the Linux community.
My concern is that the Linux community may not have
the SPEC benchmarks. If so, exchanging the benchmark
results might be difficult.
I think the motivation for the development of OSDL-DBT is
very similar.
http://www.osdl.org/projects/performance/osdldbt.html
Doesn't it make sense to have an open source CPU benchmark suite?
Regards,
Hiro
|
|
From: Hiro Y. <hyo...@mi...> - 2002-05-21 02:09:19
|
Bill,

Thanks for your help.

SPEC benchmarks are not free:
http://www.spec.org/cgi-bin/osgorder
They are inexpensive.

I'd like to run SPEC CPU95 (it is an obsolete benchmark) because there are a lot of detailed results published, for example, in research papers. What I'd like to do is 1) to compare the published results with runs on the Linux kernel, 2) to measure the detailed behavior of the kernel, e.g., CPI, cache misses, memory traffic, TLB misses, and so on, and 3) to exchange benchmark results with the Linux community.

My concern is that the Linux community may not have the SPEC benchmarks. If so, exchanging the benchmark results might be difficult.

I think the motivation for the development of OSDL-DBT is very similar.
http://www.osdl.org/projects/performance/osdldbt.html

Doesn't it make sense to have an open source CPU benchmark suite?

Regards,
Hiro

From: "Bill Brantley" <bra...@us...>
Subject: Re: [Lbs-tech] free spec benchmark?
Date: Mon, 20 May 2002 09:26:18 -0500
Message-ID: <OFA...@po...>

> Hiro, SPEC benchmarks are not free. They are licensed. But there are
> inexpensive licenses for educational institutions and non-profit
> organizations. But even the full-price licenses for some of the SPEC
> benchmarks are cheaper than a full single-user license for popular office
> suites.
>
> In which specific SPEC benchmark are you interested?
> Bill
>
> [snip -- the rest of this message appears in full in the next post] |
|
From: Bill B. <bra...@us...> - 2002-05-20 14:28:19
|
Hiro, SPEC benchmarks are not free. They are licensed. But there are
inexpensive licenses for educational institutions and non-profit
organizations. But even the full-price licenses for some of the SPEC
benchmarks are cheaper than a full single-user license for popular office
suites.
In which specific SPEC benchmark are you interested?
Bill
William C. Brantley, Ph.D.
Linux Technology Center, Performance,
512-838-8505, t/l 678, fax -5573
----- Forwarded by Bill Brantley/Austin/IBM on 05/20/2002 09:14 AM -----
From: Sandra J Baylor/Silicon Valley/IBM@IBMUS
To: LinuxPerformance
Date: 05/20/2002 08:51 AM
Subject: Re: [Lbs-tech] free spec benchmark? (Document link: WBrantley mail)
Is everyone on this mailing list? Can anyone on our team respond to Hiro's
question?
Regards,
Sandra Johnson Baylor, Ph.D.
Manager, Linux Performance
Linux Technology Center
(512) 838-4983, T/L 678-4983
(512) 838-4663 - FAX, T/L 678-4663
san...@us...
From: Hiro Yoshioka <hyo...@mi...>
Sent by: lbs...@li...
To: lbs...@li...
cc: hyo...@mi...
Date: 05/20/2002 04:54 AM
Subject: [Lbs-tech] free spec benchmark?
Hi,
I have a question.
Do you know of a free implementation of the SPEC benchmarks?
I know the SPEC benchmarks are a set of free software and
some proprietary specifications.
I'd like to run SPEC-compatible benchmarks.
Thanks in advance,
Hiro
|
|
From: Hiro Y. <hyo...@mi...> - 2002-05-20 10:01:18
|
Hi,

I have a question.

Do you know of a free implementation of the SPEC benchmarks? I know the SPEC benchmarks are a set of free software and some proprietary specifications.

I'd like to run SPEC-compatible benchmarks.

Thanks in advance,
Hiro |
|
From: Shailabh N. <na...@us...> - 2001-03-06 15:25:10
|
Thanks for the updates, John!

2) is a useful reminder that there are other machines out there :-)

3) The global variable increments inside local_exec() are a hangover from
code testing. Removing them is good for getting cleaner code (from a cache
viewpoint). But it might also be interesting to introduce a *controlled*
amount of cache pollution in that (or some other) function to better model
realistic applications. Any ideas?
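For illustration, one shape such a controlled polluter could take is sketched
below; the footprint and stride are assumed tunable values, not anything from
reflex itself, and the buffer must be thread-private so the pollution stays
controlled instead of reintroducing the cross-CPU ping-ponging John just removed.

/* Sketch: a local_exec() variant that displaces a tunable amount of
 * cache per call. POLLUTE_BYTES and CACHELINE are illustrative
 * assumptions; each thread must get its own buffer. */
#define POLLUTE_BYTES (16 * 1024)   /* per-thread cache footprint */
#define CACHELINE 32                /* bytes stepped per touch */

int local_exec_polluting(char *buf) /* buf: thread-private, POLLUTE_BYTES big */
{
    int i, sum = 0;

    for (i = 0; i < POLLUTE_BYTES; i += CACHELINE)
        sum += ++buf[i];            /* dirty one cache line per iteration */
    return sum;                     /* returned so the loop can't be optimized away */
}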
I'll update the benchmark on the website along with some minor code cleanup
(mostly to make it more readable)...
Shailabh Nagar
Enterprise Linux Group, IBM TJ Watson Research Center,
914-945-2851
John Hawkes <ha...@ba...> wrote on 03/05/2001 02:29:05 PM:
Sent by: lse...@li...
To: lse...@li...
cc:
Subject: [Lse-tech] patch for "reflex" v1.1 benchmark
I attach a patch to Shailabh Nagar's (na...@us...) "reflex" benchmark
to remedy several flaws and features of version 1.1.

1) Threads are now cloned without the CLONE_FILES directive. That eliminates
the universal sharing of "struct files", which produces an awful cacheblock
ping-ponging in fget() as every I/O flogs at the rwlock_t in that struct.

2) The total[] array elements are now padded out to 128 bytes, rather than
32 bytes, in order to avoid usermode cacheblock ping-ponging on systems with
128-byte L2 cachelines (like the SGI mips64 Origin2000).

3) The local_exec() cputime-eater was previously sharing a rapidly
incrementing counter among all the cloned threads, which produced massive
usermode cacheblock ping-ponging. The new local_exec() now touches only
thread-private memory.

4) The calibration() routine now accurately calibrates the "inner loop",
which means that the -r argument more accurately declares the microsecond
length of each "round".

5) And I did some trivial other changes that don't materially affect the
behavior of the benchmark.

This version of "reflex" is (in my opinion) a much better vehicle for
examining cpu scheduling behavior.

John Hawkes
ha...@en...
diff --exclude-from=ignore.reflex -Naur reflex_1.1/reflex.c reflex_1.2/reflex.c
--- reflex_1.1/reflex.c	Mon Feb 12 14:13:09 2001
+++ reflex_1.2/reflex.c	Mon Mar  5 10:28:29 2001
@@ -1,3 +1,5 @@
+/* #define DEBUG_CALIBRATION 1 */
+/* #define DEBUG_TIMING 1 */
 /*
  * reflex.c - flexible benchmark for Linux SMP scheduler
  *
@@ -47,7 +49,7 @@
 #define STACK_SIZE (8192)
-#define CLONE_FLAGS (CLONE_VM | CLONE_SIGHAND | CLONE_FS | CLONE_FILES)
+#define CLONE_FLAGS (CLONE_VM | CLONE_SIGHAND | CLONE_FS)
 #define DEF_PERCENT (1)
 #define NUM_WARMUP (1)
 #define MIN_TRIALS (NUM_WARMUP+5)
@@ -83,9 +85,9 @@
 void run_test_time(void);
 int bouncer(void *arg);
 int (*worker) (void *arg);
-double local_exec(void) ;
+int local_exec(void) ;
 double probrange(unsigned long top);
-void calibration(void) ;
+int calibration(void) ;
 float variance(int n, float sum, float sum2);
 double child_avg();
 double child_var();
@@ -96,9 +98,7 @@
 char *child_stack ;
-struct timezone tz1;
 struct timeval tv1;
-struct timezone tz2;
 struct timeval tv2;
 struct timeval tvr;
@@ -118,7 +118,6 @@
 int valid_test = 1;
 int TOKENSZ; /* size of message treated as token */
 double rounds_per_microsecond = 0.0 ; /* obtained through calibration */
-int local_exec_count = 0; /* unused */
 int comp_nonyield_rounds ; /* number of rounds of execloop for nonyield*/
 int comp_yield_rounds ; /* number of rounds of execloop for yield*/
@@ -153,7 +152,7 @@
 struct _total
 {
     unsigned long long count;
-    char pad[24];
+    char pad[120];
 } *total;
@@ -185,9 +184,6 @@
         }
     }
-
-
-
     if ((num_seconds <= 0) ||
         (num_children <= 0) || (num_children > MAX_CHILDREN) ||
         (num_active <= 0) || (num_active > num_children) ||
@@ -231,13 +227,14 @@
     }
     /* calibrate internal loops */
-    calibration();
-    probyield = local_exec_count ;
+    exit_rc = calibration();
+    if (exit_rc) {
+        goto exit_main3;
+    }
     TOKENSZ = sizeof(char);
     probyield = (100.0-(double)weight_reschedule_idle)/100.0 ;
     probyieldmult = probyield / ((1-probyield)*(1-probyield));
-
     /*
     comp_nonyield_rounds = (int) (uniform(rnd_compute_time) * rounds_per_microsecond) ;
     comp_yield_rounds = (int) (comp_nonyield_rounds * probyieldmult) ;
@@ -560,14 +557,17 @@
 restart:
-    if (!fquiet) printf (".");
+    if (!fquiet) {
+        printf (".");
+        fflush(stdout);
+    }
     prev_y = 0;
     y = 0;
     /* get the start time */
-    rc = gettimeofday (&tv1, &tz1);
+    rc = gettimeofday (&tv1, NULL);
     if (rc) {
         stop_test = 1;
         exit_rc = errno;
@@ -580,11 +580,10 @@
     for (i = 0 ; i < num_children ; i++) prev_y += total[i].count;
     sleep (num_seconds);
     for (i = 0 ; i < num_children ; i++) y += total[i].count;
-    // printf("Rerun : Across children : Avg %15.2f \t Var %15.2f\n",child_avg(),child_var());
     /* get end time */
-    rc = gettimeofday (&tv2, &tz2);
+    rc = gettimeofday (&tv2, NULL);
     if (rc) {
         stop_test = 1;
         exit_rc = rc;
@@ -596,6 +595,8 @@
     /* compute microseconds per yield */
+    // printf("Rerun : Across children : Avg %15.2f \t Var %15.2f\n",child_avg(),child_var());
+
     timersub(&tv2, &tv1, &tvr); /* tvr now contains result of tv2-tv1 */
     x = (unsigned long long)tvr.tv_sec * 1000000;
@@ -603,6 +604,14 @@
     results[iterations].data = (float)x;
     results[iterations].data /= (float)(y - prev_y);
+#ifdef DEBUG_TIMING
+    printf("Counts:");
+    for (i = 0 ; i < num_children ; i++) {
+        printf(" %d:%d",i,(int)total[i].count);
+    }
+    printf("\nTotalCount:%d\n",(int)(y-prev_y));
+#endif /* DEBUG_TIMING */
+
     iterations++;
     if (confidence(iterations)) {
         stop_test = 1;
@@ -612,6 +621,8 @@
     if (!fquiet) printf (" Test Completed.\n");
+    // process_data();
+
     switch (foutput) {
     case 1:
@@ -644,40 +655,77 @@
 }
-double local_exec()
+int local_exec()
 {
     unsigned int a = 0, b=0;
     // memcpy(&a,&b,1);
-    local_exec_count++;
+    return a+b;
 }
-void calibration(void)
+int calibration(void)
 {
     /* figure out how many loops we can execute per micro second */
     int i;
-    int count = 0;
-    unsigned long n_initial = 100000;
-    unsigned long clock1, clock2, clockterm;
-
-    clock1 = clock() ;
-    clockterm = clock1 + 5*CLOCKS_PER_SEC;
-    do {
-        for(i=0; i< n_initial; i++)
-        {
-            local_exec();
-        }
-        clock2 = clock();
-        count++;
-    } while (clock2 < clockterm);
+    int count;
+    unsigned long microsecs;
+    int rc, exit_rc;
+
+    count = 10000000;
+    rc = gettimeofday (&tv1, NULL);
+    if (rc) {
+        exit_rc = errno;
+        perror ("gettimeofday failed on tv1 ");
+        return exit_rc;
+    }
+    for (i=1; i<=count; i++) {
+        local_exec();
+    }
+    rc = gettimeofday (&tv2, NULL);
+    if (rc) {
+        exit_rc = errno;
+        perror ("gettimeofday failed on tv2 ");
+        return exit_rc;
+    }
+    timersub(&tv2, &tv1, &tvr); /* tvr now contains result of tv2-tv1 */
-    n_initial *= count;
+    microsecs = (tvr.tv_sec * 1000000) + tvr.tv_usec;
+    rounds_per_microsecond = (double)count / (double)microsecs;
-    rounds_per_microsecond = ((double)n_initial*CLOCKS_PER_SEC) / ((double)(clock2-clock1) * 100000.0);
+#ifdef DEBUG_CALIBRATION
+    {
+        int test_rounds = rnd_compute_time * rounds_per_microsecond;
+        printf("%d rounds in %u usec is rounds_per_microsecond:%f\n",
+               count, microsecs, rounds_per_microsecond);
+        gettimeofday (&tv1, NULL);
+        for (i=0; i<test_rounds; i++) {
+            local_exec();
+        }
+        gettimeofday (&tv2, NULL);
+        timersub(&tv2, &tv1, &tvr);
+        printf("test: %d rounds in %d secs, %u usecs\n",
+               test_rounds, tvr.tv_sec, tvr.tv_usec);
+        gettimeofday (&tv1, NULL);
+        for (i=0; i<test_rounds; i++) {
+            local_exec();
+        }
+        gettimeofday (&tv2, NULL);
+        timersub(&tv2, &tv1, &tvr);
+        printf("test: %d rounds in %d secs, %u usecs\n",
+               test_rounds, tvr.tv_sec, tvr.tv_usec);
+        gettimeofday (&tv1, NULL);
+        for (i=0; i<test_rounds; i++) {
+            local_exec();
+        }
+        gettimeofday (&tv2, NULL);
+        timersub(&tv2, &tv1, &tvr);
+        printf("test: %d rounds in %d secs, %u usecs\n",
+               test_rounds, tvr.tv_sec, tvr.tv_usec);
+    }
+#endif
-// rounds_per_microsecond = (n_initial*1000000) / ((double)(clock2-clock1)*CLOCKS_PER_SEC);
-    /* printf(">> [%ld] %ld %ld %lf\n",CLOCKS_PER_SEC,n_initial,(clock2-clock1),rounds_per_microsecond); */
+    return 0;
 }
 /******************** Statistical functions **********************************/
|
|
From: Shailabh N. <na...@us...> - 2001-02-12 21:27:15
|
As a step towards getting more fine-tuned benchmarks to compare schedulers,
here's a new benchmark that I wrote.
Briefly, it attempts to address some of the problems seen with the
chat_room benchmark and gives some more parameters to play
with.
I'll post some numbers comparing 2.4.1-pre8 and the current MQ-scheduler
using an earlier version of this benchmark soon.
Comments please....
Shailabh Nagar
Enterprise Linux Group, IBM T.J.Watson Research Center,
914-945-2851
reflex 1.0.0 : benchmark for evaluating the Linux scheduler
===========================================================
by Shailabh Nagar (na...@us...)
Based on sched_test_yield by Bill Hartner, bha...@us...
Algorithm :
-----------
The benchmark's goal is to provide a sufficient number of tunable knobs
with which to test kernel scheduler performance, while introducing minimal
overheads and dependencies.
The program begins with the parent cloning several threads. The threads group
themselves into "active sets". Only one thread from each active set is on the
runqueue at any given time (except when a handoff of the token is performed -
see below). By choosing the number of threads spawned and number of active
sets, the average number of runnable tasks can be controlled.
Each thread executes a "round" consisting of
Either Path I :
1. compute
2. yield
Or Path II :
1. compute
2. send token to successor
3. block (by waiting for token from predecessor)
The probability of following Path I or II
is decided by the input parameter w, which is the weight to be assigned to
reschedule_idle(). Path II causes more calls to reschedule_idle() as it
causes more sleeps/wakeups (on messages in a pipe), while Path I exercises the
schedule() part of the scheduler.
Tokens are one-byte messages and are handed off to other members of the
active set through pipes. Pipes are used instead of semaphores/sockets etc.
because the lock contention on them is lower (it's restricted to contention
among the readers/writers of the pipe alone, not a system-wide IPC lock as
is the case with semaphores etc.).
The compute phase of a round consists of a small calibrated function being
executed repeatedly for a user-specified amount of time (in the microsecond
range).
The compute part is present to model realistic workloads and to control the
number of scheduler invocations (caused by the yield/block parts of the
round). Slight randomness is introduced in the compute time for each thread
to avoid scheduling patterns forming.
The overall performance metric is the total number of rounds completed by all
children. This is expressed as microseconds/round: the higher the number,
the worse the scheduler's performance. Naturally, this metric should be used
carefully if it is the sole arbiter of scheduler performance, as it may not
adequately reflect "desired" scheduler behaviour.
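To make the algorithm concrete, here is a minimal sketch of one round. The
function and variable names are illustrative assumptions, not reflex's actual
code; p_yield = (100-w)/100 as defined under the runtime parameters below.

/* Sketch of one reflex round (illustrative names and plumbing, not the
 * benchmark's actual code). in_fd/out_fd are this thread's pipe ends to
 * its predecessor/successor in the active set; p_yield = (100-w)/100. */
#include <sched.h>
#include <stdlib.h>
#include <unistd.h>

extern int local_exec(void);       /* the small calibrated compute function */

void one_round(int in_fd, int out_fd, double p_yield, int compute_rounds)
{
    char token = 0;
    int i;

    for (i = 0; i < compute_rounds; i++)
        local_exec();                    /* compute phase, ~r microseconds */

    if (drand48() < p_yield) {
        sched_yield();                   /* Path I: exercises schedule() */
    } else {
        (void) write(out_fd, &token, 1); /* Path II: hand token to successor... */
        (void) read(in_fd, &token, 1);   /* ...and block until it comes back */
    }
}

Note how Path II keeps at most one member of each active set runnable: the
writer blocks in read() immediately after handing off the token, which is how
the number of active sets controls the average runqueue length.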
Runtime parameters :
--------------------
1) w : legal values 0-100
Runtime parameter w determines the weight of "reschedule_idle" as follows:
probability of sched_yield = (100-w)/100
For example, w = 90 gives a 0.1 probability of yielding. The larger w is, the
less often sched_yield is called, and hence the lower the number of
invocations of schedule(). The relative number of calls to reschedule_idle
and schedule has to be determined experimentally for a given number of
threads by tweaking the w parameter.
2) r : legal values 0..infinity
Runtime parameter r determines the number of microseconds to be spent in the
compute phase.
3) c : number of children/threads to be launched
Currently this has to be even (see below)
4) a : number of active children == number of active sets
The subset of c which are going to be on the runqueue. If an active set has
only one member in it, that thread will never block. For a very brief interval,
there could be two threads on the runqueue from the same active set (since
handoff of a token isn't atomic) but the path length from token handoff to
blocking is short enough that this shouldn't be a problem.
5) t : Time (in seconds) for which each run of the test should be performed.
Several runs are done until the primary metric, microseconds/round, reaches a
95% confidence level (a sketch of one such stopping rule follows this list).
6) o : output format
2 is most suitable for processing output using scripts.
7) q : quiet mode (no extraneous printfs)
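For the 95% criterion in parameter t, one plausible stopping rule is sketched
below. This is an assumed shape using a normal approximation, not necessarily
the confidence() routine reflex actually implements.

/* Sketch of a 95% stopping rule (assumed, not necessarily reflex's actual
 * confidence() routine): stop once the half-width of the 95% confidence
 * interval for the mean us/round falls below 5% of the mean. */
#include <math.h>

int confidence_reached(int n, float sum, float sum2)
{
    float mean, var, halfwidth;

    if (n < 6)                                 /* MIN_TRIALS = NUM_WARMUP+5 */
        return 0;
    mean = sum / n;
    var = (sum2 - n * mean * mean) / (n - 1);  /* sample variance */
    halfwidth = 1.96f * sqrtf(var / n);        /* z for 95%, normal approx. */
    return halfwidth <= 0.05f * mean;
}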
Improvements/todo's :
---------------------
The effect of the parameters w, r, c, a on the metric us/round has been
tested on a 4-way Pentium-II SMP using simple profiling of the kernel
scheduler. Lower values of w cause schedule() calls to increase, and
values around 75-90 cause the number of schedule() and reschedule()
calls to be roughly the same. Higher r values increase us/round.
Values of t in the range 20-50 seconds are adequate to reach 95%
confidence in a reasonably small number of runs; this might need to increase
on 8-ways.
Todos :
- verifying that the dependencies of token passing in a set do not affect
the scheduler (other than the desired effects of causing wakeups, regulating
the number of threads on the runqueue, etc.)
===============================================================================
Sample run :
reflex -c 80 -a 40 -t 40 -w 90 -r 1 -o 2 -q
results in output that looks like
80, 40, 50.65
where the first two numbers identify the parameter set and the last one is
the metric, us/round.
Automated runs (over several values of c) can be done by running:
runreflex <directory_for_results>
and modifying the "for tc in ......." line appropriately (to run over the
set of thread counts for which the test should be run). The runreflex script
needs some work.
===============================================================================
Comments on assumptions/performance are welcome. So is data from running this
benchmark on different SMP systems.
- Shailabh Nagar
na...@us...
(914) 945-2851
(See attached file: reflex.tar)
|
|
From: cardente, j. <car...@em...> - 2001-01-26 19:41:09
|
> [snip]
> I can think of a totally synthetic benchmark in which threads do a
> (possibly random) amount of compute followed (randomly) by
> a sleep (for a random time) or a yield or even some I/O. The computation
> could be constructed to pollute the cache a bit etc.
> The number of threads spawned would of course, be parameterized as well.
>
> I guess the key would be in the choice of the parameters, but what would it
> mean in terms of acceptability as a kernel scheduler benchmark ?
>
> Shailabh Nagar
> (914) 945 2851
> na...@us...

I ran into a similar benchmark "appropriateness" issue while doing some (rudimentary) performance evaluations of the 2.2.x kernel's affinity support in SMP systems. Specifically, I was trying to quantify how cache pollution affected performance and how effective the static bonus given in the goodness value was at minimizing cache pollution. Unfortunately, I was under a time limit and couldn't find an off-the-shelf multi-threaded compute-bound benchmark, so I coded up a synthetic one. I ended up seeing some interesting behaviors (i.e., the affinity bonus reducing performance at #threads=#cpus), but because it was synthetic, judging the worth of the results was not easy. I've yet to revisit the experiment with the 2.4 kernel and better benchmarks.

Perhaps a reasonable approach to this problem would be the development of a configurable synthetic benchmark that could be used to emulate the behavior of a target real-world app. That might allow a way to preserve the scheduler-specific behaviors of a workload without dragging along any parasitic issues (i.e., like the networking problems that have been brought up). Of course, the degrees of configurability may be too large and complex to completely emulate all possible workloads, but I'm guessing something could be banged together that would render a good first-order approximation. Wrap that with scripts that could create a config file from performance tools profiling a running app, and the world would be a better place....... well, easier to benchmark at least... ;-)

I personally regard micro-benchmarks as precision tools for chasing subsystem-specific issues identified by running more complex real-world benchmarks, and therefore not the proper tool for estimating performance. Re-running the real-world benchmark is the only way to determine the impact of any changes, and periodic check-pointing avoids chasing micro-benchmark-specific behaviors (at least not for too long). Why use them at all then? Well, to avoid unrelated issues (again, the network stuff) and any overhead of setting up the real benchmark (maybe not significant for the chat script, but what about a TPC-C/D workload?). Of course, I'm probably preaching to the choir....

Thanks
John

P.S. If I did return to my affinity experiments, anybody got any suggestions for a workload? How about monitoring tools? I basically hacked the kernel with stats and threw in a /proc interface, but I'm sure there are tools already available.

----------------------------------------------------------------------------
John Cardente          car...@em...
Principal Engineer     508-898-7340
EMC Enterprise Engineering
4400 Computer Dr, Westboro, MA 01580 |
|
From: Shailabh N. <na...@us...> - 2001-01-26 15:49:47
|
A few thoughts on the nature of chat room in the context of its use as a scheduler benchmark:

Why are messages being exchanged between the server and client threads? If it is primarily to cause threads to do a sleep/wakeup (and thereby exercise reschedule_idle()), can we achieve the same result by having threads do timed sleeps, with the sleep times randomly chosen from an appropriate distribution? This would probably eliminate the TCP-related problems we're seeing, but would it still be meaningful?

I can think of a totally synthetic benchmark in which threads do a (possibly random) amount of compute followed (randomly) by a sleep (for a random time) or a yield or even some I/O. The computation could be constructed to pollute the cache a bit, etc. The number of threads spawned would, of course, be parameterized as well.

I guess the key would be in the choice of the parameters, but what would it mean in terms of acceptability as a kernel scheduler benchmark?

Shailabh Nagar
(914) 945 2851
na...@us... |
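A sketch of the synthetic worker loop described in the post above; every name,
bound, distribution, and the busy_loop_usecs() helper here is an illustrative
assumption, not existing code.

/* Sketch of the proposed synthetic worker; the bounds, the 50/50
 * sleep-vs-yield split, and busy_loop_usecs() are assumptions. */
#include <sched.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_COMPUTE_US 500      /* upper bound on a compute burst */
#define MAX_SLEEP_US   2000     /* upper bound on a sleep */

extern void busy_loop_usecs(int us);  /* assumed calibrated spin helper */

void synthetic_worker(unsigned int seed)
{
    for (;;) {
        busy_loop_usecs(rand_r(&seed) % MAX_COMPUTE_US); /* random compute */
        if (rand_r(&seed) & 1)
            usleep(rand_r(&seed) % MAX_SLEEP_US);        /* random sleep... */
        else
            sched_yield();                               /* ...or a yield */
    }
}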
|
From: Ray B. <ra...@au...> - 2001-01-22 21:15:23
|
Oops, I posted this to the wrong list. I meant to send it to lbs-tech, but my fingers took over and posted it to lse-tech. Sorry.

----
Best Regards,
Ray Bryant
IBM Linux Technology Center
ra...@au...
512-838-8538
http://oss.software.ibm.com/developerworks/opensource/linux

We are Linux. Resistance is an indication that you missed the point.
"...the Right Thing is more important than the amount of flamage you need to go through to get there" --Eric S. Raymond |