From: Alon K. <al...@cs...> - 2002-07-31 18:20:51
Hi Perry (and everyone else),

As I understand the structure of the deterministic thread switching compiler, it is not preemptive. That is, the yield points are explicit, and a switch only occurs after 1000 "ticks" of the thread switch counter, which is decremented on backedges and epilogues. Could you please clarify what you meant about the 10 ms quantum?

The problem I'm hitting is this: when I run a suite of benchmarks and profile the time it takes to complete 10,000 context switches, such a block may take anywhere from 5 seconds to 13 seconds on a 1.7 GHz processor. I attribute this to the benchmarks being radically different in structure. I'd like these blocks to be more consistent in run time.

Any advice would be appreciated.

Thanks,
-Alon.

>The distribution of the yield points does affect the promptness with which
>a thread responds to a request to yield. You are right that a large loop
>body or any large piece of non-cyclic basic blocks would respond more
>slowly than a small loop body but keep in mind that the scheduling quantum
>is, by default, 10 ms which is probably much bigger than even executing a
>large call-less loop body. For example, assuming the large loop body has
>10,000 instructions with an average CPI of 2.0, then the delay is still 20K
>cycles, which on a 1 GHz machine, is 20 us or 0.2% of the time slice.
>
>How large of a disparity are you observing?
>
>Perry
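
For reference, the arithmetic in the quoted example, and the reason a counter-based switch gives uneven wall-clock blocks, can be sketched in a few lines of Python. This is only a back-of-the-envelope model, not anything taken from the compiler: the function names and the two loop-body sizes (20 and 2,000 instructions) are made up for illustration, and it assumes one counter tick per loop iteration at a constant CPI. Under the quoted assumptions (10,000 instructions, CPI of 2.0, 1 GHz, 10 ms quantum) the delay comes to 20K cycles, i.e. 20 us or 0.2% of the quantum.

```python
def yield_delay(instructions, cpi, clock_hz, quantum_s):
    """Worst-case delay before a straight-line body reaches its next yield point."""
    cycles = instructions * cpi
    delay_s = cycles / clock_hz
    return cycles, delay_s, delay_s / quantum_s


def time_per_switch(ticks_per_switch, instructions_per_tick, cpi, clock_hz):
    """Wall time between switches when a switch fires every N counter ticks."""
    return ticks_per_switch * instructions_per_tick * cpi / clock_hz


# Figures from the quoted example: 10,000 instructions, CPI 2.0, 1 GHz, 10 ms quantum.
cycles, delay_s, fraction = yield_delay(10_000, 2.0, 1e9, 10e-3)
print(f"{cycles:.0f} cycles, {delay_s * 1e6:.0f} us, {fraction:.1%} of the quantum")

# Hypothetical loop-body sizes (assumed, not measured), 1000 ticks per switch, 1.7 GHz.
for name, body in [("tight loop", 20), ("large loop", 2_000)]:
    t = time_per_switch(1000, body, 2.0, 1.7e9)
    print(f"{name}: {t * 1e6:.1f} us between switches")
```

In this model the time between switches scales linearly with the amount of work per tick, so a fixed number of switches takes a different wall-clock time for each benchmark, whereas a timer-driven quantum would bound it regardless of loop structure.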