linux-hls-devel Mailing List for Hierarchical Loadable Schedulers
Status: Pre-Alpha
Brought to you by:
lucabe
You can subscribe to this list here.
| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2002 | | | | | | | | | | | | 1 |
| 2003 | | | | | | | | 17 | 19 | | 3 | |
| 2004 | | | 1 | | | | | | | | | |
| 2009 | | | 4 | | 2 | 10 | 10 | | 1 | | | 1 |
| 2010 | 1 | 1 | 1 | | | 1 | | 2 | | 1 | | |
| 2011 | | 1 | 1 | 1 | | 1 | 2 | 2 | 1 | | | |
| 2012 | | | | | 1 | | | | | | | |
| 2013 | | | | | | | | | 2 | | 1 | 1 |
| 2014 | | | 2 | | | | 1 | | | 1 | | 1 |
| 2016 | | | | | | 1 | | | | | | |
From: <lin...@li...> - 2009-06-16 15:28:56
|
This list is now closed, please use the linux-hotplug mailing list as described at http://vger.kernel.org/vger-lists.html#linux-hotplug instead. |
From: Luca A. <luc...@em...> - 2004-03-27 12:29:20
Hi all,

I finally got an x86 machine for testing, and here are the first results:

1) I verified that hls works ok with kernel 2.4.18. I tried both gcc 2.95 and gcc 3.2; 2.4.18 does not compile with gcc 3.3. (BTW, I am on debian unstable: does anyone know how to switch the default compiler from 2.95 to 3.* in debian? I am currently changing the /usr/bin/gcc symlink by hand.)
2) I ported the gensched patch to 2.4.25 and tested it. From a quick test, hls seems to work ok. I created a "patches" directory containing the 2.4.18 and 2.4.25 gensched patches.
3) I noticed that when modular compilation of the hls schedulers is enabled, the scheduler modules were missing the license, and I added it.

I also noticed that the hls-cvs mailing list is not working anymore: maybe sf changed some settings, and the cvs configuration must be updated... Unfortunately, I have no time for this kind of thing right now (nor do I have time for updating the web site, or putting files in the sf download area...).

If anyone has some time for administering the sf project (fixing the cvs ml, updating the web site, and so on) and is willing to do it, let me know...

Thanks,
Luca
From: Luca A. <luc...@em...> - 2003-11-03 11:08:10
Hi John,

> > 1) is HLS_MAXIMUM_PRIORITY a valid RR priority? Looking at the current
> > implementation of hls_sched_rr:RR_B_CallbackMsg(), it seems that it is
> > (there is a test "if (m->pri < 1 || m->Pri > HLS_MAXIMUM_PRIORITY)"...),
> > but if I try to use it, I obtain an oops... Probably I did something
> > wrong when porting from Windows to Linux? John, if you still have your
> > old Windows source, can you have a look?
>
> I'll take a look in the morning.

Ok, thanks.

[...]

> I think that my reasoning here was that a RES scheduler can't make
> guarantees effectively if the CPU gets revoked from it. Also, a
> reservation anywhere in the hierarchy should be able to be moved to a RES
> scheduler near the root without affecting overall allocation.

Uhmm... This is a very interesting theoretical problem... Some time ago, I was working on hierarchical reservations (a res scheduler attached to a "father" res scheduler), and I argued that, from the rt guarantee point of view, a hierarchy of reservations can be modeled as a single reservation (computing the proper values for such a rsv)... But I was unable to prove this fact (there are weird cases in which the father reservation is active but the child is depleted, or the two rsv are not synchronized...).

Anyway, I think that having the possibility to construct hierarchies with res schedulers in places other than the root would increase the flexibility of HLS.

> Basically, you only use a reservation scheduler when
> you need a fairly hard guarantee, so in this case why would you revoke the
> CPU?

Well, this is what is happening right now:

1) the hourglass main process calls sched_setscheduler(), assigning the highest rt priority to itself
2) as a result, the hls module moves the hourglass process to rr1, with the highest rt priority (this is the effect of my latest changes ;-)
3) the hourglass "worker threads" are moved to the res scheduler (because I selected -rh 10ms 50ms, or something like that)
4) when the main hourglass process is woken up, the assertion is triggered...

[BTW... You know, I don't like hard guarantees and hrt too much ;)]

When the assertion was disabled (because of my debugging configuration mess), the Revoke callback was doing nothing, and everything was working fine (I think this is because the hourglass main process does not really consume much time...). Hence, I suppose that leaving the callback empty (or printing a warning) would be ok for the moment...

I am thinking about implementing the callback in a proper way. My current idea is that we should try to avoid putting too much policy in the HLS module (urgh... I am using the policy vs mechanism argument!!! Shame on me... ;). In other words, I'd be for "you get what you deserve": if a user attaches the RES scheduler to another scheduler, he must know what he is doing... And if he knows, maybe revoking the CPU from the RES scheduler is the right thing to do...

I think a proper Revoke callback should:

1) account the executed time to the proper res task
2) stop the depletion timer

Of course, a proper Grant callback should be implemented too... Is this ok? John, what do you think about this?

Luca

--
Copy this in your signature, if you think it is important:
N O W A R ! ! !
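The two steps proposed for the Revoke callback (account the executed time, stop the depletion timer) can be sketched in plain user-space C. This is only an illustration under stated assumptions: `struct res_task`, `res_grant()`, `res_revoke()` and all field names are hypothetical and do not come from the HLS sources, and the timer is modeled as a flag plus an expiry time rather than a real kernel timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-reservation state; none of these names come from HLS. */
struct res_task {
    int64_t budget_us;      /* budget left in the current period */
    int64_t granted_at_us;  /* time of the last Grant (valid while running) */
    int64_t deplete_at_us;  /* absolute expiry time of the depletion timer */
    bool    running;        /* true between a Grant and the next Revoke */
    bool    timer_armed;    /* true while the depletion timer is pending */
};

/* Grant: the parent scheduler hands the CPU to the reservation. */
void res_grant(struct res_task *t, int64_t now_us)
{
    assert(!t->running);
    t->granted_at_us = now_us;
    t->deplete_at_us = now_us + t->budget_us;
    t->timer_armed = t->budget_us > 0;  /* nothing to arm if already depleted */
    t->running = true;
}

/* Revoke: the parent takes the CPU back before the budget is depleted. */
void res_revoke(struct res_task *t, int64_t now_us)
{
    assert(t->running);
    int64_t used = now_us - t->granted_at_us;

    /* 1) account the executed time to the reservation (clamped at zero) */
    t->budget_us = used < t->budget_us ? t->budget_us - used : 0;

    /* 2) stop the depletion timer; the next Grant re-arms it */
    t->timer_armed = false;
    t->running = false;
}
```

With a 10 ms budget, a Grant at time 0 followed by a Revoke at 4 ms leaves 6 ms of budget and no pending timer; a later Grant re-arms the timer for the remaining 6 ms only. Whether a real RES scheduler can still honour its guarantee after such a revocation is exactly the open question in this thread, and the sketch does not address it.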
From: John R. <re...@cs...> - 2003-11-03 02:46:00
Hi Luca,

It may take me a little while to page this in...

> 1) is HLS_MAXIMUM_PRIORITY a valid RR priority? Looking at the current
> implementation of hls_sched_rr:RR_B_CallbackMsg(), it seems that it is
> (there is a test "if (m->pri < 1 || m->Pri > HLS_MAXIMUM_PRIORITY)"...),
> but if I try to use it, I obtain an oops... Probably I did something
> wrong when porting from Windows to Linux? John, if you still have your
> old Windows source, can you have a look?

I'll take a look in the morning.

> 2) in hls_sched_res, the Revoke Callback is implemented as an ASSERT(0).
> Since in the past I broke the whole debug & assert mechanism, this
> assertion was never triggered, but now that I fixed the debugging stuff
> hourglass is triggering the assertion. What's the reason for having
> ASSERT(0) in the Revoke Callback? Can I assume that it is just because
> it still has to be implemented? In this case, I'll change it in an
> hls_printk("FIXME: Implement the revoke callback...").

Argh, this is one of the dirty little secrets of HLS... some schedulers work only on uniprocessor machines and some only work when connected directly to a root scheduler. Probably both restrictions are true of the RES scheduler.

I think that my reasoning here was that a RES scheduler can't make guarantees effectively if the CPU gets revoked from it. Also, a reservation anywhere in the hierarchy should be able to be moved to a RES scheduler near the root without affecting overall allocation. Or something like that. Basically, you only use a reservation scheduler when you need a fairly hard guarantee, so in this case why would you revoke the CPU?

John
From: Luca A. <luc...@em...> - 2003-11-01 11:43:02
Hi all,

I am going to commit some changes that I made to implement the "rt scheduler" (similar to the default scheduler) discussed in my previous mail and to clean up the sources a little bit. While working on these modifications, I found some strange things in the code, hence I have some questions for John:

1) is HLS_MAXIMUM_PRIORITY a valid RR priority? Looking at the current implementation of hls_sched_rr:RR_B_CallbackMsg(), it seems that it is (there is a test "if (m->pri < 1 || m->Pri > HLS_MAXIMUM_PRIORITY)"...), but if I try to use it, I obtain an oops... Probably I did something wrong when porting from Windows to Linux? John, if you still have your old Windows source, can you have a look?
2) in hls_sched_res, the Revoke Callback is implemented as an ASSERT(0). Since in the past I broke the whole debug & assert mechanism, this assertion was never triggered, but now that I fixed the debugging stuff hourglass is triggering the assertion. What's the reason for having ASSERT(0) in the Revoke Callback? Can I assume that it is just because it still has to be implemented? In this case, I'll change it to an hls_printk("FIXME: Implement the revoke callback...").

Ok, that's all for now... I hope to commit my changes in the next few days.

Luca
From: Luca A. <luc...@em...> - 2003-09-22 08:56:51
Ok, I've been able to reproduce the bug (thanks to Tony for the test case!!!), and I can confirm that it happens only when the kernel preemption patch is applied.

Here is my understanding of the problem:

1) the system enters kernel mode, for some reason;
2) the kernel decides to unblock task A
3) before returning to user mode (and hence, before calling schedule()), the kernel decides to unblock task B also
4) before switching to user mode, schedule() is invoked... At this point, the Unblock hook is invoked (first for A, and then for B...)
4a) When the Unblock hook is invoked for task A, hls does not know that B is going to block before returning to user mode, hence an HLS scheduler decides to schedule B
4b) but the state of task B is not TASK_RUNNING anymore, hence HLS complains...

If this understanding of the problem is correct, then the check in linux/bottom.c:show_tasks() is too strict... Just removing the call to show_tasks() in linux/bottom.c:HLSScheduleThread() should solve the problem (at least, without that call everything seems to work well).

Tony, can you try, when you have some time? (You will also have to remove the debugging patch that I sent a few days ago.)

I am still testing, and I hope to come up with a better solution in 2 or 3 days.

Thanks,
Luca
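The situation in steps 4a and 4b (a scheduler choosing a task whose state is no longer TASK_RUNNING) can be illustrated with a small user-space C sketch. Note that this is not the fix proposed above, which is simply to drop the show_tasks() call; it shows a different, hypothetical alternative in which stale candidates are skipped at dispatch time instead of being reported as errors. `struct task` and `pick_runnable()` are made-up names; only the state values 0 (TASK_RUNNING) and 1 (TASK_INTERRUPTIBLE) match the Linux 2.4 values seen in the logs.

```c
#include <assert.h>
#include <stddef.h>

/* State values as in Linux 2.4; "state = 1" in the logs is TASK_INTERRUPTIBLE. */
enum { TASK_RUNNING = 0, TASK_INTERRUPTIBLE = 1 };

/* Hypothetical, minimal stand-in for a kernel task. */
struct task {
    int pid;
    int state;
};

/*
 * Return the first candidate that is still runnable at dispatch time,
 * or NULL if none is. A candidate that was unblocked earlier but has
 * gone back to sleep in the meantime is treated as stale and skipped.
 */
const struct task *pick_runnable(const struct task *const *cands, size_t n)
{
    assert(cands != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (cands[i]->state == TASK_RUNNING)
            return cands[i];
    }
    return NULL;
}
```

In a real scheduler a NULL result would have to fall back to some idle or default choice, and the stale task would still need to be removed from the scheduler's internal ready queue; both are left out of this sketch.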
From: Luca A. <luc...@em...> - 2003-09-18 08:59:43
Hi Tony,

thanks for the help!!! This will be very useful.

> Sep 18 17:29:00 wally kernel: HLS MP initializing (no debugging).
> Sep 18 17:29:00 wally kernel: already PRIVATE_DATA != NULL???
> Sep 18 17:29:00 wally kernel: already PRIVATE_DATA != NULL???
> Sep 18 17:29:09 wally kernel: hls_ctl: Moving to res1
> Sep 18 17:29:09 wally kernel: ...and setting the parameters!
> Sep 18 17:29:11 wally kernel: HLS ERROR: Task 1 has rt_priority = 100 and state = 1 --- WAI = 5 WUP = 1 Last = 890
> Sep 18 17:29:11 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
> Sep 18 17:29:11 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
> Sep 18 17:29:11 wally kernel: HLS ERROR: Task 1 has rt_priority = 100 and state = 1
> Sep 18 17:29:11 wally kernel: HLS ERROR: Trying to schedule 1 with state = 1

Ok, so this is the problem... A scheduler (the RR scheduler, in this case) tried to schedule a task (task 1) when it is in the TASK_INTERRUPTIBLE state... I'll have a deeper look at the scheduler code to understand why...

Thanks again,
Luca
From: Tony L. <tl...@au...> - 2003-09-18 08:19:38
Hi Luca,

I've applied the patch you sent and re-run the test. Debug output and a test program are attached - hope it's helpful.

When I ran the program, it froze after a few seconds, but the machine didn't. I was able to use CTRL-C to exit the program. The only difference between this code and my test code in relation to this problem is that there is a 1ms sleep in the main loop in the test code.

To compile use:

g++ -o hlstest hlstest.cpp -g -I/usr/src/linux/include/ -I/usr/src/hls/include/

Cheers,
Tony
From: Luca A. <luc...@em...> - 2003-09-18 05:25:33
Hi all,

I am still investigating this bug... And unfortunately I am still not able to reproduce it :( (yesterday evening, I ran HLS for more than 4 hours doing CPU-intensive activities, and creating a lot of 5/6.625 reservations, but nothing happened...).

Hence, I suspect the problem is a bad interaction with the preemption patch. I hope to install it as soon as possible and to do some other testing.

In the meanwhile, can someone that is able to trigger the bug run HLS with the attached debugging patch, and let me know the dmesg output? A simple recipe to trigger the bug would also be useful ;-)

Thanks,
Luca
From: Luca A. <luc...@em...> - 2003-09-12 15:18:12
Hi Tony,

[...]

> >Does your machine crash even without using the RTC
> Yes. I sent the first message at the end of the day after only running a
> couple of tests, but I then ran more tests the next day which showed the
> problem to occur even without the RTC. Sorry about the confusion - I
> thought the first email was lost but our mail server sent it out a day
> late!

Ok, so RTC is not the cause of the problem... Hence, since I am not able to reproduce the crash, I suspect that the problem is in a bad interaction with the preemption patch.

> >Was the test performed with or without the kernel preemption patch?
> Yes, we have been using the kernel preemption patch.

I think this is the only difference between your configuration and mine... I hope to be able to get an x86 test machine in the next week, so that I can install the preemption patch and try to reproduce the problem.

> I did plan to
> re-test without this patch but I've been busy with other work.

This would be very useful. Anyway, there is no hurry (I still have to set up a test environment with a preemptive kernel).

> >If I understand well, the "HLS ERROR: Scheduler xxx posted a timer
> >twice" is not the first error in your log... Is my understanding
> >correct?
>
> Yes, here is the log from when the hls_module was inserted:

Ok. So the timer was not the cause of the problem...

[...]

> Sep 9 09:36:31 wally kernel: [1938629664], 803 : sched 'PS' registered in slot 4
> Sep 9 09:36:31 wally kernel: [2011100543], 803 : sched 'RES' registered in slot 5
> Sep 9 09:36:31 wally kernel: already PRIVATE_DATA != NULL???
> Sep 9 09:36:31 wally kernel: already PRIVATE_DATA != NULL???
> Sep 9 09:36:36 wally kernel: hls_ctl: Moving to res1
> Sep 9 09:36:36 wally kernel: ...and setting the parameters!
> Sep 9 09:36:52 wally kernel: HLS ERROR: Task 756 has rt_priority = 100 and state = 1
> Sep 9 09:36:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [505]
> Sep 9 09:36:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [505]

It seems that about 15 seconds after moving a task to the res1 scheduler, the internal HLS status gets corrupted... I think the last two messages are just the log daemon waking up due to the "HLS ERROR:" printk. Hence, the important message is "HLS ERROR: Task 756 has rt_priority = 100 and state = 1". I'll have a look...

> >Is the HLS module failing when compiled with CLI=1 CREATE=1
> >INT_SCHED=1 DEBUG=0 MULDIV=1?
> I've just tried this and it builds OK. My machine still froze - here's
> the output ... (I don't think that anything new was logged)

Looks like a similar output... BTW, what does your program do? (How many reservations does it create, how big are those reservations, and so on...) I think Paul tested both the schedtest example and hourglass, and it worked... Can you provide a simple program that causes this freeze?

Thanks,
Luca
From: Tony L. <tl...@au...> - 2003-09-12 03:46:03
Hi Luca,

Sorry for the late reply - our mail server is finally working correctly (I hope!).

>Does your machine crash even without using the RTC

Yes. I sent the first message at the end of the day after only running a couple of tests, but I then ran more tests the next day which showed the problem to occur even without the RTC. Sorry about the confusion - I thought the first email was lost but our mail server sent it out a day late!

>Was the test performed with or without the kernel preemption patch?

Yes, we have been using the kernel preemption patch. I did plan to re-test without this patch but I've been busy with other work.

>If I understand well, the "HLS ERROR: Scheduler xxx posted a timer twice" is not the first error in your log... Is my understanding correct?

Yes, here is the log from when the hls_module was inserted:

Sep 9 09:36:31 wally kernel: HLS MP initializing (HLS_DBG_PRINT_LEVEL = 1).
Sep 9 09:36:31 wally kernel: [1643284912], 803 : sched 'ROOT' registered in slot 0
Sep 9 09:36:31 wally kernel: [1718493520], 803 : sched 'JOIN' registered in slot 1
Sep 9 09:36:31 wally kernel: [1793690865], 803 : sched 'TH' registered in slot 2
Sep 9 09:36:31 wally kernel: [1866161544], 803 : sched 'RR' registered in slot 3
Sep 9 09:36:31 wally kernel: [1938629664], 803 : sched 'PS' registered in slot 4
Sep 9 09:36:31 wally kernel: [2011100543], 803 : sched 'RES' registered in slot 5
Sep 9 09:36:31 wally kernel: already PRIVATE_DATA != NULL???
Sep 9 09:36:31 wally kernel: already PRIVATE_DATA != NULL???
Sep 9 09:36:36 wally kernel: hls_ctl: Moving to res1
Sep 9 09:36:36 wally kernel: ...and setting the parameters!
Sep 9 09:36:52 wally kernel: HLS ERROR: Task 756 has rt_priority = 100 and state = 1
Sep 9 09:36:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [505]
Sep 9 09:36:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [505]
Sep 9 09:36:52 wally kernel: HLS ERROR: Task 756 has rt_priority = 100 and state = 1
Sep 9 09:36:52 wally kernel: HLS ERROR: Task 756 has rt_priority = 100 and state = 1
Sep 9 09:36:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [505]
Sep 9 09:36:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [505]
Sep 9 09:36:52 wally kernel: HLS ERROR: Task 756 has rt_priority = 100 and state = 1
Sep 9 09:36:52 wally kernel: HLS ERROR!!! Task state changed during the hook??? 0 != 1!!! Correcting...

Lots of the above messages, then ...

Sep 9 09:38:42 wally kernel: HLS ERROR: Task 635 has rt_priority = 100 and state = 1
Sep 9 09:38:42 wally kernel: HLS ERROR: Task 635 has rt_priority = 100 and state = 1
Sep 9 09:38:42 wally kernel: HLSUnblockThreadHook --- WAI = 5 [505]
Sep 9 09:38:42 wally kernel: HLSUnblockThreadHook --- WAI = 5 [505]
Sep 9 09:38:42 wally kernel: HLS ERROR: Task 635 has rt_priority = 100 and state = 1
Sep 9 09:38:42 wally last message repeated 34 times
Sep 9 09:39:38 wally kernel: hls_ctl: Moving to res1
Sep 9 09:39:38 wally kernel: ...and setting the parameters!
Sep 9 09:39:47 wally kernel: HLS ERROR: Scheduler res1 posted a timer twice!!! WAI = 0
Sep 9 09:40:04 wally last message repeated 2 times
Sep 9 09:40:05 wally kernel: HLS ERROR: Task 1130 has rt_priority = 100 and state = 1

>Is the HLS module failing when compiled with CLI=1 CREATE=1 INT_SCHED=1 DEBUG=0 MULDIV=1?

I've just tried this and it builds OK. My machine still froze - here's the output ... (I don't think that anything new was logged)

Sep 12 13:03:02 wally kernel: HLS MP initializing (no debugging).
Sep 12 13:03:02 wally kernel: already PRIVATE_DATA != NULL???
Sep 12 13:03:02 wally kernel: already PRIVATE_DATA != NULL???
Sep 12 13:04:26 wally kernel: hls_ctl: Moving to res1
Sep 12 13:04:26 wally kernel: ...and setting the parameters!
Sep 12 13:04:49 wally kernel: HLS ERROR: Task 1105 has rt_priority = 100 and state = 1
Sep 12 13:04:49 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
Sep 12 13:04:49 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
Sep 12 13:04:49 wally kernel: HLS ERROR: Task 1105 has rt_priority = 100 and state = 1
Sep 12 13:04:49 wally kernel: HLS ERROR: Task 1105 has rt_priority = 100 and state = 1
Sep 12 13:04:49 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
Sep 12 13:04:49 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
Sep 12 13:04:49 wally kernel: HLS ERROR: Task 1105 has rt_priority = 100 and state = 1
Sep 12 13:04:49 wally kernel: HLS ERROR!!! Task state changed during the hook??? 0 != 1!!! Correcting...
Sep 12 13:04:49 wally kernel: HLS ERROR: Task 498 has rt_priority = 100 and state = 1
Sep 12 13:04:49 wally kernel: HLS ERROR: Task 498 has rt_priority = 100 and state = 0
Sep 12 13:04:52 wally kernel: HLS ERROR: Task 1105 has rt_priority = 100 and state = 1
Sep 12 13:04:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
Sep 12 13:04:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
Sep 12 13:04:52 wally kernel: HLS ERROR: Task 1105 has rt_priority = 100 and state = 1
Sep 12 13:04:52 wally kernel: HLS ERROR: Task 1105 has rt_priority = 100 and state = 1
Sep 12 13:04:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
Sep 12 13:04:52 wally kernel: HLSUnblockThreadHook --- WAI = 5 [498]
Sep 12 13:04:52 wally kernel: HLS ERROR: Task 1105 has rt_priority = 100 and state = 1
Sep 12 13:04:52 wally kernel: HLS ERROR!!! Task state changed during the hook??? 0 != 1!!! Correcting...
... etc ...

Thanks,
Tony
From: Luca A. <luc...@em...> - 2003-09-09 07:48:46
Hi Tony,

thanks for all the information... I am going to parse it, and I'll let you know as soon as I discover something.

Just a few questions, to be sure that I understood everything correctly:

- Does your machine crash even without using the RTC (in your first mail, I read "The good news is that the problem seems to be caused by the rtc as you suspected", but in the second one I seem to understand the opposite)? If yes, I would say that we are in big trouble... I cannot understand why HLS is having problems on your machine while I am not able to reproduce them...
- Was the test performed with or without the kernel preemption patch?
- If I understand well, the "HLS ERROR: Scheduler xxx posted a timer twice" is not the first error in your log... Is my understanding correct? If yes, can you post the first error that you see in your log, and the lines coming immediately before it in the log?
- Is the HLS module failing when compiled with CLI=1 CREATE=1 INT_SCHED=1 DEBUG=0 MULDIV=1? If no, can you try it and send a log (when you have time)?

Thanks,
Luca
From: Tony L. <ton...@ya...> - 2003-09-09 03:39:21
Hi Luca,

(You may get this email twice due to problems with my mail server - sorry).

I applied the patch you sent and also uncommented the __DO_CLI__ macros from hls_timers.c (Note, I had to add "unsigned long flags;" to hls_timers.c to get the code to build with the macro enabled). I then ran my application with and without the rtc.

The good news is that the problem seems to be caused by the rtc as you suspected; unfortunately I didn't see any of the debug messages from the patch that you sent. I only ran the test a couple of times (I have to reboot in between tests when it fails), and I will run it a few more times tomorrow and let you know if I get any output.

Thanks,
Tony
From: Tony L. <tl...@au...> - 2003-09-09 02:35:54
Hi Luca,

We've had problems with our mail server, so this email is a re-post of the one I sent yesterday with some added comments....

I applied the patch you sent and also uncommented the __DO_CLI__ macros from hls_timers.c (Note, I had to add "unsigned long flags;" to hls_timers.c to get the code to build with the macro enabled). I then ran my program a few times (~10) with and without the rtc. The machine froze independently of running my program with the rtc. On one occasion, the machine froze when I ran "init 6" to reboot it (when I couldn't kill off my program).

I suspected that the amount of logging to /var/log/messages was the cause of the freeze, so I rebuilt the HLS scheduler with DEBUG=0, but then the module wouldn't install and locked up my machine!

I then rebuilt the HLS scheduler with CLI=1 CREATE=1 INT_SCHED=1 DEBUG=1 MULDIV=1 (ie normal build) and stopped syslogd. I then tested my program with no syslogd and with the rtc - which froze my program and also the machine, so there does seem to be a problem with the rtc and hls.

During all of the rtc tests, I only saw the debug statement from your last patch once, eg:

Sep 9 09:38:42 wally kernel: HLS ERROR: Task 635 has rt_priority = 100 and state = 1
Sep 9 09:38:42 wally last message repeated 34 times
Sep 9 09:39:38 wally kernel: hls_ctl: Moving to res1
Sep 9 09:39:38 wally kernel: ...and setting the parameters!
Sep 9 09:39:47 wally kernel: HLS ERROR: Scheduler res1 posted a timer twice!!! WAI = 0
Sep 9 09:40:04 wally last message repeated 2 times
Sep 9 09:40:05 wally kernel: HLS ERROR: Task 1130 has rt_priority = 100 and state = 1
Sep 9 09:40:05 wally kernel: HLSUnblockThreadHook --- WAI = 5 [505]

I then tested with no syslogd and no rtc. My program froze after running for a few minutes, but the machine did not lock up, so I was able to stop my program by removing the HLS module. This produces the following errors, but I'm able to access the machine OK after removing the module: (task 861 is my program)

Removing task 861 from HLS
../hls/hls_hooks.c:329: failed HLS assertion: "current_task() == Thread".
../hls/hls_hooks.c:329: failed HLS assertion: "current_task() == Thread".
HLS ERROR: not panicing, but continuing...
HLS ERROR: not panicing, but continuing...

During both rtc/non-rtc tests, either nothing was logged before the program froze, or messages similar to the following appeared in the messages file while the program was frozen (a lot of them!):

Sep 9 10:26:46 wally kernel: HLSUnblockThreadHook --- WAI = 5 [503]
Sep 9 10:26:46 wally kernel: HLS ERROR: Task 754 has rt_priority = 100 and state = 1
Sep 9 10:26:46 wally kernel: HLS ERROR!!! Task state changed during the hook??? 0 != 1!!! Correcting...

Hope this was helpful.

Thanks,
Tony
From: Luca A. <luc...@em...> - 2003-09-07 09:47:43
|
Hi Tony,

from your description, I think that the RTC is part of the problem... It probably generates a high priority interrupt that interferes with HLS (there is probably a bug in the HLS locking code)... I'll double-check that.

> On Monday I will :
> - apply the debug patch from your last email
> - uncomment the __DO_CLI__ macros
> - disable the rtc interrupts in my application
> - run my application

Ok, thanks!!! These results will be very useful.

Thanks again for your testing,
Luca

--
Copy this in your signature, if you think it is important:
N O W A R ! ! ! |
From: Tony L. <ton...@ya...> - 2003-09-06 03:14:28
|
Hi Luca,

Thanks for the suggestions!

Yes, the kernel we are using does have the pre-emptive kernel patch in it. I removed the patch yesterday but had problems building the driver for the custom PCI card we are using. I am looking at rebuilding it and then running the test without the pre-emptive patch.

> Looks like someone is unblocking ...

I ran our application a few more times and observed the machine freeze almost immediately after starting it. Given your comment above, the only thing that is running from the start of the program is the real time clock interrupt read() inside an infinite while() loop. It may be that this is causing the unblocking behaviour you identified.

On Monday I will:
- apply the debug patch from your last email
- uncomment the __DO_CLI__ macros
- disable the rtc interrupts in my application
- run my application

If the problem doesn't occur then the rtc may be the cause. Otherwise I will re-enable the rtc and send you the output of dmesg.

Following that test I will go back to getting the kernel running without the pre-emptive kernel patch and then let you know if the machine still freezes (and send dmesg output if needed).

Thanks again for your help.
Tony |
From: Luca A. <luc...@em...> - 2003-09-05 05:38:06
|
Hi guys,

I had a look at the problem, and here is my interpretation:

[...]
> hls_ctl: Moving to res1
> ...and setting the parameters!
> bug: kernel timer added twice at e24c1689.
> HLSUnblockThreadHook --- WAI = 2 [512]
> HLSUnblockThreadHook --- WAI = 2 [512]

Looks like someone is unblocking, and its scheduler posts the scheduler timer when it is already active... :(
This causes the kernel to scream, performing a printk() that wakes up the log daemon klogd ---> klogd unblocks during a hook. From this point, the HLS internal state is screwed, and crazy things happen.

Hence, first of all we need to understand why a scheduler is posting its timer twice. Unfortunately, I am not able to reproduce the problem here, hence I'll need your help. Can you please apply the attached patch and reproduce the problem, sending me the dmesg output? The patch should make the module a little more verbose, giving information about the scheduler that is causing the problem.

Also, there are some #ifdef __DO_CLI__ in linux/hls_timers.c that are currently commented out... Can you try to remove the comments, and see if this fixes the problem?

Finally, are you using the preemptive patch? If yes, can you confirm that the problem still happens without the patch installed? If it still happens, it is likely a locking problem...

Thanks,
Luca |
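The failure mode described in this message (a timer posted again while it is still pending) can be illustrated with a toy model. This is only a sketch in Python, not HLS or kernel code: the `TimerList` class and its `add_timer`/`mod_timer` methods are hypothetical names chosen to echo the log message, and the warning text is made up for the example.

```python
class TimerList:
    """Toy model of a timer list; hypothetical, not the Linux or HLS API."""

    def __init__(self):
        self.pending = {}   # timer name -> expiry time
        self.warnings = []  # messages the "kernel" would print

    def add_timer(self, name, expires):
        # Adding a timer that is already pending is reported as a bug
        # instead of being queued a second time.
        if name in self.pending:
            self.warnings.append(f"bug: timer {name!r} added twice")
            return False
        self.pending[name] = expires
        return True

    def mod_timer(self, name, expires):
        # Re-arm the timer whether or not it is pending; never warns.
        was_pending = name in self.pending
        self.pending[name] = expires
        return was_pending


timers = TimerList()
timers.add_timer("res1", expires=5)
timers.add_timer("res1", expires=7)   # second post while still active
print(timers.warnings)
timers.mod_timer("res1", expires=7)   # re-arming instead of re-adding
print(timers.pending)
```

In this model the second `add_timer` call is what triggers the message; whether the real fix is to re-arm the timer or to avoid the second post altogether depends on why the scheduler posts twice, which the thread has not yet established.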
From: Luca A. <luc...@em...> - 2003-09-04 10:09:20
|
Hi Paul,

just a quick mail (I am in a bit of a hurry right now); I'll send a longer reply this evening.

[...]
> I noted in one of John's HLS papers a scheduler hierarchy that looked
> like this:
>
> ROOT---|
>   |    |
>  RES--JOIN
>        |
>        PS
>
> Does this hierarchy give stronger guarantees to a RES task than the
> standard hierarchy in HLS? I assume the answer is "no" iff there are no
> rr1 tasks with priority >= 20?

I don't know about this (I'll need to re-read that paper)...

[...]
> I'm trying to understand whether or not the standard hierarchy will
> suffice for our application, or whether we need to compose a (more)
> appropriate hierarchy. We want to guarantee our application meets its
> real time constraints (no missed deadlines). More below...

I think that the standard hierarchy should be enough... You may have to change some tasks' priorities. If I remember correctly, the standard hierarchy is:

            root
             |
             |
     -------rr1-------
     |       |       |
     |       |       |
    res1    rr2     ps1
    P=20    P=10    P=9

(rr2 is the default scheduler)

I assume you are not using the proportional share scheduler ps1. Anyway, you have a lot of freedom in scheduling your tasks:
1) you can leave a task under the rr2 scheduler, changing its priority to make it more or less important
2) you can move a task to the rr1 scheduler, giving it priority < 10, so that it is scheduled in background with respect to all the rr2 tasks
3) you can move a task to the rr1 scheduler, giving it priority 20 > P > 10, so that it is scheduled in foreground with respect to all the rr2 tasks, but in background with respect to all the reserved tasks (res1)
4) you can move a task to the rr1 scheduler, giving it priority > 20, so that it is scheduled in foreground with respect to all the rr2 and res1 tasks

> I thought that all tasks were converted to HLS tasks (under rr2) when
> the scheduler is first loaded into the kernel?

Yes, this is correct: when the module is inserted all the tasks are moved to the default scheduler (rr2), and when a task is created, it is scheduled by the default scheduler. But if you explicitly call sched_setscheduler() choosing the SCHED_RR, SCHED_FIFO, or SCHED_OTHER policy, then the task goes back to being a "regular linux task", and is scheduled in background with respect to all the HLS tasks.

> When you say "background", do you mean rr2?

No, I meant that when a task decides to go back to being a linux task (by selecting the SCHED_RR, SCHED_FIFO, or SCHED_OTHER policy), it is not scheduled anymore unless all the HLS tasks are idle.

> I assumed that the HLS rr2 scheduler is basically playing the role
> SCHED_OTHER did before HLS was loaded?

Yes, this is almost correct. The only problem is that if a task explicitly selects SCHED_OTHER, it currently goes back to being a non-HLS task. Mapping SCHED_OTHER ---> default scheduler is a good idea, and I will do it.

> The
> rr1 scheduler in the hierarchy is confusing me a little bit
> (apologies!).

Well, let's see if I remember correctly (John, please correct me...). If I am not wrong, a root scheduler can have only a single scheduler as a child. rr1 is the child of the root scheduler, and it is used to "schedule the other schedulers" in a prioritized way. I hope this clarifies things a little bit...

> > Yes... You can for example set "rt scheduler = rr1" (for the
> > standard hierarchy), so that changing a task to SCHED_FIFO or
> > SCHED_RR will really increase its priority.
>
> What happens to res1 tasks in this case?

If you set the task priorities < 20, res1 is not affected.

> Will the sched_setscheduler()
> call for a res1 task then fail if there are "demanding" SCHED_FIFO/RR
> tasks or vice versa?

No, we do not implement hierarchical guarantees. If some time-demanding task is scheduled on rr1 with priority > 20, then the res1 tasks can fail to get their reserved time even if sched_setscheduler() did not fail.

Luca |
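The priority rules in this message can be restated as a small Python sketch. It only encodes what is written above (res1 at priority 20, rr2 at 10, ps1 at 9, all children of rr1); the function names are hypothetical, and the case of a task whose priority equals a child scheduler's priority is deliberately left out because the message does not say how ties are handled.

```python
# Priorities of the child schedulers on rr1, as given in the message.
CHILD_PRIORITY = {"res1": 20, "rr2": 10, "ps1": 9}


def runs_in_foreground_of(task_priority):
    """Child schedulers that a task placed directly on rr1 runs ahead of."""
    return sorted(n for n, p in CHILD_PRIORITY.items() if p < task_priority)


def runs_in_background_of(task_priority):
    """Child schedulers that run ahead of a task placed directly on rr1."""
    return sorted(n for n, p in CHILD_PRIORITY.items() if p > task_priority)


# One example priority for each of the three rr1 placements in the message.
for priority in (5, 15, 25):
    print(priority,
          "foreground of", runs_in_foreground_of(priority),
          "background of", runs_in_background_of(priority))
```

The last case (priority above 20) is the one that can starve res1, which is why a successful sched_setscheduler() call is not by itself a guarantee for reserved tasks.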
From: Paul K. <pko...@au...> - 2003-09-04 07:11:23
|
Hi all,

> > [Aside: from this I take it that CPU reservations are stronger than
> > SCHED_RR/SCHED_FIFO?
> In the default hierarchy that is built when the HLS module is
> loaded, yes. There is a RR scheduler (rr1), and the res1
> scheduler is scheduled on rr1 with priority 20. All the
> "regular tasks" are scheduled on another RR scheduler (rr2),
> which is scheduled on rr1 with priority 10. Hence, res tasks
> are scheduled in foreground respect to all the other tasks.
> Of course, you can change a task to rr1 (with priority > 20)
> to schedule 1 in foreground respect to res tasks...

I noted in one of John's HLS papers a scheduler hierarchy that looked like this:

ROOT---|
  |    |
 RES--JOIN
       |
       PS

Does this hierarchy give stronger guarantees to a RES task than the standard hierarchy in HLS? I assume the answer is "no" iff there are no rr1 tasks with priority >= 20?

I recall you said the join scheduler is currently broken. Is the following hierarchy valid and similar (only hard reservations allowed, no soft ones)?

ROOT---|
  |    |
 RES   PS

I'm trying to understand whether or not the standard hierarchy will suffice for our application, or whether we need to compose a (more) appropriate hierarchy. We want to guarantee our application meets its real time constraints (no missed deadlines). More below...

> > With the latest talk about linux tasks vs HLS
> > tasks, what is the relationship between HLS round-robin and linux's
> > SCHED_OTHER, SCHED_FIFO and SCHED_RR?
> Currently, all linux tasks are scheduled in background
> respect to HLS tasks (a non-HLS task is scheduled only when
> no HLS tasks are ready). Hence, doing

I thought that all tasks were converted to HLS tasks (under rr2) when the scheduler is first loaded into the kernel? This seems to be what /proc/HLS/tasks shows. When any new tasks appear, I assumed they end up on the default HLS scheduler (rr2) unless otherwise directed.

When you say "background", do you mean rr2? I assumed that the HLS rr2 scheduler is basically playing the role SCHED_OTHER did before HLS was loaded? The rr1 scheduler in the hierarchy is confusing me a little bit (apologies!).

> sched_setsched(SCHED_RR) can be dangerous, because it results
> to do the opposite of what the user expects... This is what I
> am trying to fix.
>
> > Do SCHED_OTHER tasks -> HLS
> > round-robin? What about SCHED_RR/SCHED_FIFO tasks?
> This is exactly what we have to decide right now... The
> implemented solution (SCHED_OTHER, SCHED_RR, SCHED_FIFO -->
> background respect to
> HLS) is not good.
>
> > Is that what you're
> > grappling with at the moment, thinking about a HLS rt scheduler?
> Yes... You can for example set "rt scheduler = rr1" (for the
> standard hierarchy), so that changing a task to SCHED_FIFO or
> SCHED_RR will really increase its priority.

What happens to res1 tasks in this case? Will the sched_setscheduler() call for a res1 task then fail if there are "demanding" SCHED_FIFO/RR tasks or vice versa?

> > Based on the third simulation, I would've thought our application
> > would be okay. However it continues to show missed deadline problems.
> > It could be we need to revisit reservation requirements, but I'm also
> > keen to understand interrupt latency issues and scheduling issues.
> Does your application do some setsched(SCHED_FIFO) (or
> SCHED_RR)? Does it create a high I/O load?

Our application only schedules itself under res1 (5ms/6.625ms). It does a fair bit of I/O: in each 6.625ms period, it does some number crunching, sets up 3 DMA transfers to a cPCI card (1 read, 2 writes, tens of KB) and does read/writes to 2 Ethernet cards (low throughput, < 1MB/s); once all that is done, it blocks. It gets woken up by the RTC interrupts every 1ms and polls the cPCI card time counters to check if they've ticked over into a new 6.625ms period; if so, it does all its processing again, etc; if not, it immediately blocks again.

Shortly we'll forget the RTC 1ms interrupt and polling mechanism, and get the cPCI to interrupt every 6.625ms.

I hope that made sense!

Regards,
Paul. |
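The polling scheme described in this message (wake every 1 ms, notice a new 6.625 ms period at the next wake-up) implies a variable detection delay. The Python sketch below only does that arithmetic. It assumes, purely for illustration, that wake-ups are perfectly periodic, that one wake-up coincides with a period boundary at time 0, and that there is no interrupt latency; none of these assumptions come from the thread, and the function name is hypothetical.

```python
PERIOD_US = 6625   # card period from the message (6.625 ms), in microseconds
WAKE_US = 1000     # RTC wake-up interval from the message (1 ms)


def detection_delays(n_periods):
    """Delay in microseconds between each period boundary and the first
    wake-up at or after it, under the idealised assumptions stated above."""
    delays = []
    for n in range(1, n_periods + 1):
        boundary = n * PERIOD_US
        # Integer ceiling division: round the boundary up to a wake-up time.
        first_wake = -(-boundary // WAKE_US) * WAKE_US
        delays.append(first_wake - boundary)
    return delays


delays = detection_delays(8)
print(delays)        # [375, 750, 125, 500, 875, 250, 625, 0]
print(max(delays))   # 875
```

Under these assumptions the pattern repeats every 8 periods (53 ms), and a new period can be noticed up to 875 microseconds late before any scheduling or interrupt latency is added on top.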
From: Luca A. <luc...@em...> - 2003-09-04 05:40:28
|
Hi Paul,

> To answer one question it looks like 5 seconds pass before the timer bug
> message appears.
>
> Sep 3 18:50:17 wally kernel: hls_ctl: Moving to res1
> Sep 3 18:50:17 wally kernel: ...and setting the parameters!
> Sep 3 18:50:22 wally kernel: bug: kernel timer added twice at e24c1689

Ok, thanks! Unfortunately, I have not understood what is going on yet, but I am working on this...

> I don't know if it's relevant but we use the RTC to generate 1 ms (1
> kHz) interrupts to our process.

This should not be a problem.

> I assume that the 1 kHz kernel timer is
> generated by a different timer (the PIC/APIC?)

Yes, it is generated by the PIT.

> We tried to specify a CPU reservation of 5ms out of 6.625ms. Before
> handballing to Tony I previously tested a "simulation" using hourglass
> (v0.6) to check on interrupt latency and deadline misses.
>
> Typing:
>
> hourglass -n 1 -t 0 -rh 5ms 6.625ms -w LAT
> while simultaneously running top with update period 0.01 gave latencies
> of around 1.3ms
>
> hourglass -n 1 -t 0 -rh 5ms 6.625ms -w LAT -i RTC
> while simultaneously running top with update period 0.01 gave latencies
> of around 0.3ms
>
> hourglass -n 5 -t 0 -rh 5ms 6.625ms -w PERIODIC 4ms 6.625ms -t 1 -p
> RTHIGH -w PERIODIC 1ms 10ms
> showed that task 0 met all its deadlines but 1 (missed 1 at the
> beginning, a synchronisation/startup problem still?)

I'll try to have a look and investigate these latencies. One thing I can say for sure is that a period of 6.625 ms can cause some problems, since it is not a multiple of the system tick (I expect some additional latencies due to this, but I don't think it is the reason for the problem you are seeing).

> [Aside: from this I take it that CPU reservations are stronger than
> SCHED_RR/SCHED_FIFO?

In the default hierarchy that is built when the HLS module is loaded, yes. There is a RR scheduler (rr1), and the res1 scheduler is scheduled on rr1 with priority 20. All the "regular tasks" are scheduled on another RR scheduler (rr2), which is scheduled on rr1 with priority 10. Hence, res tasks are scheduled in foreground with respect to all the other tasks. Of course, you can move a task to rr1 (with priority > 20) to schedule it in foreground with respect to res tasks...

> With the latest talk about linux tasks vs HLS
> tasks, what is the relationship between HLS round-robin and linux's
> SCHED_OTHER, SCHED_FIFO and SCHED_RR?

Currently, all linux tasks are scheduled in background with respect to HLS tasks (a non-HLS task is scheduled only when no HLS tasks are ready). Hence, doing sched_setsched(SCHED_RR) can be dangerous, because it ends up doing the opposite of what the user expects... This is what I am trying to fix.

> Do SCHED_OTHER tasks -> HLS
> round-robin? What about SCHED_RR/SCHED_FIFO tasks?

This is exactly what we have to decide right now... The implemented solution (SCHED_OTHER, SCHED_RR, SCHED_FIFO --> background with respect to HLS) is not good.

> Is that what you're
> grappling with at the moment, thinking about a HLS rt scheduler?

Yes... You can for example set "rt scheduler = rr1" (for the standard hierarchy), so that changing a task to SCHED_FIFO or SCHED_RR will really increase its priority.

> Based on the third simulation, I would've thought our application would
> be okay. However it continues to show missed deadline problems. It could
> be we need to revisit reservation requirements, but I'm also keen to
> understand interrupt latency issues and scheduling issues.

Does your application do some setsched(SCHED_FIFO) (or SCHED_RR)? Does it create a high I/O load?

I am continuing to study the problem reported yesterday (unfortunately, I cannot reproduce it), and I'll let you know as soon as I discover something...

Luca |
From: Paul K. <pko...@au...> - 2003-09-04 01:29:48
|
Hi Luca/John (and anyone else who's listening),

To answer one question it looks like 5 seconds pass before the timer bug message appears.

Sep 3 18:50:17 wally kernel: hls_ctl: Moving to res1
Sep 3 18:50:17 wally kernel: ...and setting the parameters!
Sep 3 18:50:22 wally kernel: bug: kernel timer added twice at e24c1689

I don't know if it's relevant but we use the RTC to generate 1 ms (1 kHz) interrupts to our process. I assume that the 1 kHz kernel timer is generated by a different timer (the PIC/APIC?) I'm a little confused by the different timing mechanisms and timer-related patches (e.g. hi-res).

We tried to specify a CPU reservation of 5ms out of 6.625ms. Before handballing to Tony I previously tested a "simulation" using hourglass (v0.6) to check on interrupt latency and deadline misses.

Typing:

hourglass -n 1 -t 0 -rh 5ms 6.625ms -w LAT
while simultaneously running top with update period 0.01 gave latencies of around 1.3ms

hourglass -n 1 -t 0 -rh 5ms 6.625ms -w LAT -i RTC
while simultaneously running top with update period 0.01 gave latencies of around 0.3ms

hourglass -n 5 -t 0 -rh 5ms 6.625ms -w PERIODIC 4ms 6.625ms -t 1 -p RTHIGH -w PERIODIC 1ms 10ms
showed that task 0 met all its deadlines but 1 (missed 1 at the beginning, a synchronisation/startup problem still?)

[Aside: from this I take it that CPU reservations are stronger than SCHED_RR/SCHED_FIFO? With the latest talk about linux tasks vs HLS tasks, what is the relationship between HLS round-robin and linux's SCHED_OTHER, SCHED_FIFO and SCHED_RR? Do SCHED_OTHER tasks -> HLS round-robin? What about SCHED_RR/SCHED_FIFO tasks? Is that what you're grappling with at the moment, thinking about a HLS rt scheduler? Where does it fit in the default HLS hierarchy? Do any of these questions make sense?!]

Based on the third simulation, I would've thought our application would be okay. However it continues to show missed deadline problems. It could be we need to revisit reservation requirements, but I'm also keen to understand interrupt latency issues and scheduling issues.

Regards,

Paul Koufalas
Senior Communications Engineer
AUSPACE Limited
Level 1 Innovation House
Technology Park
MAWSON LAKES SA 5095
AUSTRALIA
T +61 8 8260 8236
F +61 8 8260 8226
M +61 404 837 122
www.auspace.com.au

-----Original Message-----
From: Tony Lupoi
Sent: Wednesday, September 03, 2003 7:00 PM
To: lin...@li...
Subject: [Linux-hls-devel] Processes freezing up

Hi Luca/John,

I'm a work colleague of Paul Koufalas and have been looking with him at running our application with the HLS scheduler.

I've encountered an issue where processes seem to freeze up after running for any amount of time. Sometimes it's my application and I've also noticed it happen with top.

I've included the dmesg output below.

HLS MP initializing (HLS_DBG_PRINT_LEVEL = 1).
[-1961706252], 833 : sched 'ROOT' registered in slot 0
[-1885129302], 833 : sched 'JOIN' registered in slot 1
[-1808565821], 833 : sched 'TH' registered in slot 2
[-1734729187], 833 : sched 'RR' registered in slot 3
[-1660892643], 833 : sched 'PS' registered in slot 4
[-1587056079], 833 : sched 'RES' registered in slot 5
already PRIVATE_DATA != NULL???
already PRIVATE_DATA != NULL???
hls_ctl: Moving to res1
...and setting the parameters!
bug: kernel timer added twice at e24c1689.
HLSUnblockThreadHook --- WAI = 2 [512]
HLSUnblockThreadHook --- WAI = 2 [512]
HLS ERROR: Task 687 has rt_priority = 100 and state = 1
HLSUnblockThreadHook --- WAI = 5 [512]
HLSUnblockThreadHook --- WAI = 5 [512]
HLS ERROR: Task 687 has rt_priority = 100 and state = 1
HLS ERROR: Task 687 has rt_priority = 100 and state = 1
HLSUnblockThreadHook --- WAI = 5 [512]
HLSUnblockThreadHook --- WAI = 5 [512]
HLS ERROR: Task 687 has rt_priority = 100 and state = 1
HLS ERROR!!! Task state changed during the hook??? 0 != 1!!! Correcting...
HLS ERROR: Task 512 has rt_priority = 100 and state = 1
HLS ERROR: Task 512 has rt_priority = 100 and state = 0
HLS ERROR: Task 687 has rt_priority = 100 and state = 1
HLSUnblockThreadHook --- WAI = 5 [512]
HLSUnblockThreadHook --- WAI = 5 [512]
HLS ERROR: Task 687 has rt_priority = 100 and state = 1
HLS ERROR: Task 507 has rt_priority = 100 and state = 1
HLSUnblockThreadHook --- WAI = 5 [512]
HLSUnblockThreadHook --- WAI = 5 [512]
HLS ERROR: Task 507 has rt_priority = 100 and state = 1

Here are the processes that are in error:

root 687 685 0 18:37 ? 00:00:00 in.telnetd: dm3
root 507 1 0 18:37 ? 00:00:00 syslogd -m 0
root 512 1 0 18:37 ? 00:00:00 klogd -x

On a previous run, I was seeing error messages with the init process, i.e.:

Sep 3 17:58:16 wally kernel: HLS ERROR: Task 1 has rt_priority = 100 and state = 1
Sep 3 17:58:16 wally kernel: HLS ERROR: Task 1 has rt_priority = 100 and state = 1
Sep 3 17:58:16 wally kernel: HLSUnblockThreadHook --- WAI = 5 [511]
Sep 3 17:58:16 wally kernel: HLSUnblockThreadHook --- WAI = 5 [511]
Sep 3 17:58:16 wally kernel: HLS ERROR: Task 1 has rt_priority = 100 and state = 1
Sep 3 17:58:16 wally kernel: HLS ERROR!!! Task state changed during the hook??? 0 != 1!!! Correcting...

And also, when my process freezes, I'm unable to kill it, even with kill -9 <pid>!

I don't know if it's related, but I've also seen the following messages in the syslog:

Sep 3 18:50:17 wally kernel: hls_ctl: Moving to res1
Sep 3 18:50:17 wally kernel: ...and setting the parameters!
Sep 3 18:50:22 wally kernel: bug: kernel timer added twice at e24c1689.

Any help would be appreciated, and let me know if I can enable any more debugging to give more information.

Thanks and regards,
Tony |
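For anyone collecting similar dmesg output, one way to see which tasks are affected is to count the "HLS ERROR: Task N" lines per task id. The Python sketch below is only an illustration: the regular expression is inferred from the message format shown in this thread, the sample text is a shortened excerpt with the mail encoding already removed, and the function name is hypothetical.

```python
import re
from collections import Counter

# Shortened excerpt of the dmesg output from this thread.
SAMPLE = """\
HLS ERROR: Task 687 has rt_priority = 100 and state = 1
HLSUnblockThreadHook --- WAI = 5 [512]
HLS ERROR: Task 512 has rt_priority = 100 and state = 0
HLS ERROR: Task 687 has rt_priority = 100 and state = 1
HLS ERROR: Task 507 has rt_priority = 100 and state = 1
"""

# Pattern inferred from the log lines above; an assumption, not an HLS spec.
TASK_ERROR = re.compile(
    r"HLS ERROR: Task (\d+) has rt_priority = (\d+) and state = (\d+)"
)


def count_task_errors(text):
    """Return a Counter mapping task id -> number of 'HLS ERROR: Task' lines."""
    counts = Counter()
    for match in TASK_ERROR.finditer(text):
        counts[int(match.group(1))] += 1
    return counts


counts = count_task_errors(SAMPLE)
for task_id, n in counts.most_common():
    print(task_id, n)
```

The task ids can then be matched against ps output, as done by hand in the message above (687 is in.telnetd, 507 is syslogd, 512 is klogd).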