Thread: RE: [Linux-hls-devel] HLS errors appearing with cpu reservations
From: Paul K. <pko...@au...> - 2003-08-29 04:24:57
Hi Luca,
Yes, the patch to bottom.c seems to have fixed the problem with
hourglass.
I'm using 2.4.18 with HLS gensched patch, and (still for now) RML's
variable-hz, preemption and lock-break patches.
Typing
hourglass -n 2 -t 1 -rh 10ms 50ms -w PERIODIC 10ms 50ms
gave the following output:
...
thread 1 will have hard res of amount 10000 and period 50000
thread 1 will run workload PERIODIC
thread 1 will have amount 10000
thread 1 will have period 50000
5.722046 MB allocated for trace records
Hourglass 0.5 : 2 threads; 10.000000 seconds; 1194.408907 MHz processor
max gap is 2388 cycles
this test will last for 10.000000 seconds
numthreads: 2
work done by thrd 0 : 125491982
work done by thrd 1 : 25940366
thread 1: missed 170 deadlines, hit 20
there were 10009 out of 300000 trace records; 3.336333 % used.
in cycles, trace start 718129657440, end 730083432498, duration
11953775058
trace duration 10.008109 seconds
thread 0 recorded 7.990135 seconds
thread 1 recorded 1.890859 seconds
total thread time 9.880994 seconds
time slots in ms: thread start end duration gap:
tracerec: 1 1.012524 1.987656 0.975132 601243.730409
tracerec: 1 1.999265 2.988340 0.989075 0.011609
tracerec: 1 2.999037 3.989064 0.990026 0.010697
tracerec: 1 3.999727 4.989773 0.990046 0.010663
...
The 10s trace looks as I expected apart from a 500ms portion near the
start. Task 1 gets the CPU for its first period and then nothing until
550ms; thereafter it gets 10ms every 50ms.
(BTW, the missed deadlines count drops to only 1 if 11ms is reserved for
task 1 rather than exactly 10ms; John pointed out the need to do this in
an hourglass conference paper I have.)
However, dmesg still shows us some HLS issues:
HLS MP initializing (HLS_DBG_PRINT_LEVEL = 1).
[738959619], 801 : sched 'ROOT' registered in slot 0
[812802016], 801 : sched 'JOIN' registered in slot 1
[886628438], 801 : sched 'TH' registered in slot 2
[957733126], 801 : sched 'RR' registered in slot 3
[1028835251], 801 : sched 'PS' registered in slot 4
[1101303379], 801 : sched 'RES' registered in slot 5
already PRIVATE_DATA != NULL???
already PRIVATE_DATA != NULL???
HLS Error: 902 has private_data = NULL???
../hls/hls_hooks.c:56: failed HLS assertion: "(State ==
TASK_UNINTERRUPTIBLE) || (State == TASK_INTERRUPTIBLE)".
../hls/hls_hooks.c:56: failed HLS assertion: "(State ==
TASK_UNINTERRUPTIBLE) || (State == TASK_INTERRUPTIBLE)".
HLS ERROR: not panicing, but continuing...
HLS ERROR: not panicing, but continuing...
hls_ctl: Moving to res1
...and setting the parameters!
When I get a chance I'll rebuild the 2.4.18 kernel without RML's
preemption and lock breaking patches and see if the HLS error msgs go
away.
Regards,
Paul Koufalas
Senior Communications Engineer
AUSPACE Limited
Level 1 Innovation House
Technology Park MAWSON LAKES SA 5095
AUSTRALIA
T +61 8 8260 8236
F +61 8 8260 8226
M +61 404 837 122
www.auspace.com.au
-----Original Message-----
From: Luca Abeni [mailto:luc...@em...]
Sent: Friday, August 29, 2003 2:38 PM
To: lin...@li...
Subject: Re: [Linux-hls-devel] HLS errors appearing with cpu
reservations
Hi Paul,
I reproduced the bug, and I think I fixed it.
This was the problem: hourglass first does a sched_setscheduler(p,
SCHED_FIFO, ...), and then does another sched_setscheduler() to schedule
the task with the RES scheduler. But the first setsched() turned the task
into a regular Linux task, and the second one just complained about this
without converting the task back to HLS...
Now, it should work. Have a look at the patch on the cvs commit mailing
list (it will appear in the anonymous cvs in 1 or 2 days).
This brings up a question: what should we do when a sched_setscheduler()
is performed on an HLS task changing the policy to SCHED_FIFO, SCHED_RR,
or SCHED_OTHER? Two possibilities:
1) do nothing (the task remains an HLS task)
2) transform the task into a regular Linux task. This is a little bit
counterintuitive, because regular Linux tasks are scheduled in the
background with respect to HLS tasks...
Other possibilities would require knowing the scheduler hierarchy in
advance...
Ideas and suggestions are welcome.
Thanks,
Luca
From: Luca A. <luc...@em...> - 2003-08-29 05:25:49
Hi Paul,
On Thu, 2003-08-28 at 21:24, Paul Koufalas wrote:
> Hi Luca,
>
> Yes, the patch to bottom.c seems to have fixed the problem with
> hourglass.
Good to know. Thanks for reporting the problem and testing the fix.
> Typing
>
> hourglass -n 2 -t 1 -rh 10ms 50ms -w PERIODIC 10ms 50ms
>
> gave the following output:
>
> ...
> thread 1 will have hard res of amount 10000 and period 50000
> thread 1 will run workload PERIODIC
> thread 1 will have amount 10000
> thread 1 will have period 50000
[...]
> time slots in ms: thread start end duration gap:
> tracerec: 1 1.012524 1.987656 0.975132 601243.730409
> tracerec: 1 1.999265 2.988340 0.989075 0.011609
> tracerec: 1 2.999037 3.989064 0.990026 0.010697
> tracerec: 1 3.999727 4.989773 0.990046 0.010663
> ...
>
> The 10s trace looks as I expected apart from a 500ms portion near the
> start. Task 1 gets the CPU for its first period and then nothing until
> 550ms; thereafter it gets 10ms every 50ms.
I think this can be due to the reservation period and the task period not
being synchronized (if I understand correctly, hourglass has no way to
ensure that the two periods are in phase... Or am I wrong? John, please
correct me...).
> (BTW, the missed deadlines count drops to only 1 if 11ms is reserved for
> task 1 rather than exactly 10ms; John pointed out the need to do this in
> an hourglass conference paper I have.)
In fact... Scheduling a (10, 50) task with a (10, 50) reservation is a
little risky, because there can always be up to 1 jiffy (1ms if HZ=1000)
of allocation error. A workaround is to increase HZ; the "right way" to
address the issue is to use high-resolution timers.
Integrating HLS with high-resolution-timers is on my todo list, but it
requires some important changes to the code... It will take some time
(although I know how to integrate the two things, because I already did
something similar for another project).
> However, dmesg still shows us some HLS issues:
[...]
> already PRIVATE_DATA != NULL???
> already PRIVATE_DATA != NULL???
Up to here, everything is OK, I think.
> HLS Error: 902 has private_data = NULL???
This is expected, as I did not remove the printk()
> ../hls/hls_hooks.c:56: failed HLS assertion: "(State ==
> TASK_UNINTERRUPTIBLE) || (State == TASK_INTERRUPTIBLE)".
> ../hls/hls_hooks.c:56: failed HLS assertion: "(State ==
> TASK_UNINTERRUPTIBLE) || (State == TASK_INTERRUPTIBLE)".
> HLS ERROR: not panicing, but continuing...
> HLS ERROR: not panicing, but continuing...
I do not know about this, but it can be due to my latest change (and is
probably harmless). I'll double-check this evening.
> When I get a chance I'll rebuild the 2.4.18 kernel without RML's
> preemption and lock breaking patches and see if the HLS error msgs go
> away.
That would be great.
Anyway, at this point I would say that things are working OK even with
preemption and lock breaking.
Thanks,
Luca
From: John R. <re...@cs...> - 2003-08-29 06:14:42
> > The 10s trace looks as I expected apart from a 500ms portion near the
> > start. Task 1 gets the CPU for its first period and then nothing until
> > 550ms; thereafter it gets 10ms every 50ms.
>
> I think this can be due to the fact that the reservation period and the
> task period are not synchronized (if I understand well, hourglass has no
> way to ensure the fact that the two periods are in phase... Or am I
> wrong? John, please correct me...).
It's true that without help from the scheduler, Hourglass cannot ensure
that a thread period is synched up with its reservation. However, this
550 ms anomaly sounds like it's too large to be explained by a phase
difference, so I have no idea what's happening.
I changed offices this summer and my HLS test box is still sitting in a
corner turned off. I'll get it running in the near future and try to
reproduce this.
Paul, is the HLS code in the Hourglass pre-release working for you? If
so, I should roll out Hourglass 0.6. There were a few other issues I was
trying to iron out relating to CPU speed detection on laptops and other
difficult machines, but that was so long ago I've forgotten the
details...
> Integrating HLS with high-resolution-timers is on my todo list, but it
> requires some important changes to the code... It will take some time
> (although I know how to integrate the two things, because I already did
> something similar for another project).
Cool. Was this the CBS scheduler you guys put into Linux?
John
From: Luca A. <luc...@em...> - 2003-08-29 07:22:46
Hi John,
> It's true that without help from the scheduler, Hourglass cannot ensure
> that a thread period is synched up with its reservation.
>
> However, this 550 ms anomaly sounds like it's too large to be explained
> by a phase difference,
Yes, you are right... 550ms is too much. This seems to be some kind of
transient phenomenon happening at the startup of a periodic task (I do
not see it if the hourglass task is not periodic). I am wondering if
hourglass periodic tasks do some kind of synchronization on startup...
OK, I'll have a look at the code.
> Paul, is the HLS code in the Hourglass pre-release working for you?
I just sent it to him 1 minute ago. Anyway, it is working fine for me.
If you can wait some days before the release, I have some cleanups to
do, and I have a patch for PPC support (!!!). (Yes, I tested it on my
ibook ;).
> > Integrating HLS with high-resolution-timers is on my todo list, but it
> > requires some important changes to the code... It will take some time
> > (although I know how to integrate the two things, because I already did
> > something similar for another project).
>
> Cool. Was this the CBS scheduler you guys put into Linux?
Yes; you can compare
http://feanor.sssup.it/cgi-bin/cvsweb.cgi/cbs/src/timers.c?rev=1.8&content-type=text/x-cvsweb-markup
and
http://feanor.sssup.it/cgi-bin/cvsweb.cgi/cbs/src/hrt.c?rev=1.3&content-type=text/x-cvsweb-markup
to see the modifications that are needed. Unfortunately, the latest
version of the high-resolution-timers patch still does not export some
symbols that are needed to use hr timers from a module, so an additional
patch is needed.
Luca