[LTP] [PATCH] nanosleep02


Hi,

Please copy me on replies as I have not subscribed to this list.

I am seeing the nanosleep02 testcase fail on RHEL3 machines (kernel 
2.4.21-32.11.EL and other RHEL3 kernels). I believe the accuracy the 
testcase expects from the 'nanosleep' call is too high, and that this 
is the cause of the failure. I explain the reasoning and provide a patch 
for the testcase below. The failure looks like this:

nanosleep02    1  FAIL  :  Remaining sleep time 4010000 usec doesn't match with the expected 3999622 usec time
nanosleep02    1  FAIL  :  child process exited abnormally

Explanation:
------------
The nanosleep02 testcase does the following:
gettimeofday - before
nanosleep - the sleeping process is interrupted by sending it SIGINT
gettimeofday - after

It then compares the remaining time returned by nanosleep with the 
expected remainder, that is, the requested sleep time minus the elapsed 
time measured by the two gettimeofday calls. It expects the two to 
differ by no more than USEC_PRECISION, which is 2200 microseconds.
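
To make the sequence concrete, here is a minimal single-process sketch 
in C. It is not the testcase's code: nanosleep02 interrupts a child 
process with SIGINT, whereas this sketch arms an interval timer so that 
SIGALRM interrupts the sleep. The names sleep_sample, on_alarm and 
sample_interrupted_sleep are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* One interrupted sleep; all durations are in microseconds. */
struct sleep_sample {
	long req;         /* requested sleep time */
	long rem;         /* remaining time reported by nanosleep() */
	long elapsed;     /* time between the two gettimeofday() calls */
	int interrupted;  /* 1 if nanosleep() failed with EINTR */
};

/* The handler only has to exist so that the signal interrupts the sleep. */
static void on_alarm(int sig)
{
	(void)sig;
}

/*
 * Hypothetical helper: request a sleep of req_ms milliseconds and have
 * SIGALRM delivered after interrupt_ms milliseconds.
 * Returns 0 on success, -1 if the signal or timer setup failed.
 */
static int sample_interrupted_sleep(long req_ms, long interrupt_ms,
				    struct sleep_sample *out)
{
	struct sigaction sa;
	struct itimerval it;
	struct timespec req, rem = { 0, 0 };
	struct timeval before, after;
	int rc;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -1;

	memset(&it, 0, sizeof(it));	/* it_interval = 0: one-shot timer */
	it.it_value.tv_sec = interrupt_ms / 1000;
	it.it_value.tv_usec = (interrupt_ms % 1000) * 1000;

	req.tv_sec = req_ms / 1000;
	req.tv_nsec = (req_ms % 1000) * 1000000L;

	if (setitimer(ITIMER_REAL, &it, NULL) != 0)
		return -1;

	gettimeofday(&before, NULL);		/* gettimeofday - before */
	rc = nanosleep(&req, &rem);		/* interrupted by SIGALRM */
	out->interrupted = (rc == -1 && errno == EINTR);
	gettimeofday(&after, NULL);		/* gettimeofday - after */

	out->req = req_ms * 1000;
	out->rem = (long)rem.tv_sec * 1000000L + rem.tv_nsec / 1000;
	out->elapsed = (long)(after.tv_sec - before.tv_sec) * 1000000L
		       + ((long)after.tv_usec - (long)before.tv_usec);
	return 0;
}
```

After a call such as sample_interrupted_sleep(1000, 100, &s), the value 
s.rem - (s.req - s.elapsed) is the quantity that the testcase compares 
against its precision limit.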

Compare this to the nanosleep01 testcase. It is a simple test that lets 
the process sleep with nanosleep and checks how long it has slept. Its 
SLOP_MS value, the allowed error margin, is 250 milliseconds, and it 
does all calculations and comparisons in milliseconds.

nanosleep02, by contrast, does all calculations in microseconds. 
However, the 'nanosleep' call is limited by the kernel timer mechanism, 
which offers a resolution of 1/HZ at best. From the manpage of 
'nanosleep':
"The current implementation of nanosleep is based on the normal kernel 
timer mechanism, which has a resolution of 1/HZ s (i.e, 10 ms on 
Linux/i386 and 1ms on Linux/Alpha). Therefore, nanosleep pauses always 
for at least the specified time, however it can take up to 10 ms longer 
than specified until the process becomes runnable again. For the 
same reason, the value returned in case of a delivered signal in *rem 
is usually rounded to the next larger multiple of 1/HZ s."

So when checking the remaining time returned by an interrupted 
'nanosleep' call, the error margin we should be ready to allow is (1/HZ 
* 2), which is 20 milliseconds with HZ=100. By that measure, 
'nanosleep01' is quite generous!
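
The arithmetic can be checked against the numbers in the failure output 
above. The following C sketch assumes HZ=100 (the Linux/i386 figure 
quoted from the manpage); ASSUMED_HZ and the function names are 
illustrative and do not come from the testcase.

```c
#include <stdlib.h>

/* Assumption: HZ = 100, i.e. one timer tick is 10 ms (10000 usec). */
#define ASSUMED_HZ	100
#define USEC_PER_TICK	(1000000L / ASSUMED_HZ)

/* Round a duration in microseconds up to the next multiple of one tick. */
static long round_up_to_tick(long usec)
{
	return ((usec + USEC_PER_TICK - 1) / USEC_PER_TICK) * USEC_PER_TICK;
}

/* Signed error between the reported and the expected remaining time. */
static long remaining_error_usec(long rem_usec, long expected_usec)
{
	return rem_usec - expected_usec;
}

/* Is the error within the given number of timer ticks, in either direction? */
static int within_ticks(long error_usec, long ticks)
{
	return labs(error_usec) <= ticks * USEC_PER_TICK;
}
```

For the reported failure, remaining_error_usec(4010000, 3999622) is 
10378 usec. That exceeds the 2200 usec limit in the testcase, but it is 
inside the two-tick margin of 20000 usec.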

The behavior shown by the 'nanosleep' call is also acceptable under the 
POSIX standard, which says 
(http://www.opengroup.org/onlinepubs/007908799/xsh/nanosleep.html): "The 
suspension time may be longer than requested because the argument value 
is rounded up to an integer multiple of the sleep resolution or because 
of the scheduling of other activity by the system. But, except for the 
case of being interrupted by a signal, the suspension time will not be 
less than the time specified by rqtp, as measured by the system clock, 
CLOCK_REALTIME."

So I think we should change the testcase. I am attaching a patch with 
the suggested changes. Please note that I have set the error margin to 
be the same as in nanosleep01, which is 250 milliseconds. It could be 
set as low as 20 milliseconds, but on a heavily loaded system the test 
might then fail again.


Thanks and regards,
Sripathi.

Patch:
------

--- testcases/kernel/syscalls/nanosleep/nanosleep02.c	2005-08-02 13:48:29.000000000 -0500
+++ /home/sripathi/17215/nanosleep02.c	2005-08-02 13:40:38.000000000 -0500
@@ -101,7 +101,7 @@ void sig_handler();		/* signal catching
   * the "rem" field would never change without the increased
   * usec precision in the -aa tree.
   */
- #define USEC_PRECISION 2200  /* Originally set at 100 max but this compiler bug has been around for years. */
+#define MSEC_PRECISION 250      /* Error margin allowed in milliseconds */

  int
  main(int ac, char **av)
@@ -185,7 +185,7 @@ main(int ac, char **av)
  void
  do_child()
  {
-	unsigned long req, rem, before, after, elapsed; /* usec */
+	unsigned long req, rem, before, after, elapsed; /* msec */
  	struct timeval otime;		 /* time before child execution suspended */
  	struct timeval ntime;		 /* time after child resumes execution */

@@ -208,15 +208,15 @@ do_child()
  	 * The time remaining should be equal to the
  	 * Total time for sleep - time spent on sleep bfr signal
  	 */
-	req = timereq.tv_sec * 1000000 + timereq.tv_nsec / 1000;
-	rem = timerem.tv_sec * 1000000 + timerem.tv_nsec / 1000;
-	before = otime.tv_sec * 1000000 + otime.tv_usec;
-	after = ntime.tv_sec * 1000000 + ntime.tv_usec;
+	req = timereq.tv_sec * 1000 + timereq.tv_nsec / 1000000;
+	rem = timerem.tv_sec * 1000 + timerem.tv_nsec / 1000000;
+	before = otime.tv_sec * 1000 + otime.tv_usec/1000;
+	after = ntime.tv_sec * 1000 + ntime.tv_usec/1000;
  	elapsed = after - before;

-	if (rem - (req - elapsed) > USEC_PRECISION) {
-		tst_resm(TFAIL, "Remaining sleep time %lu usec doesn't "
-			 "match with the expected %lu usec time",
+	if (rem - (req - elapsed) > MSEC_PRECISION) {
+		tst_resm(TFAIL, "Remaining sleep time %lu msec doesn't "
+			 "match with the expected %lu msec time",
  			 rem, (req - elapsed));
  		exit(1);
  	}
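
One point about the comparison in the patch: req, rem and elapsed are 
unsigned long, so rem - (req - elapsed) wraps around to a very large 
value whenever rem is smaller than the expected remainder, for example 
after truncation to whole milliseconds. The following is only a sketch 
of a signed variant, not part of the patch; the helper names 
timespec_to_ms, elapsed_ms and remaining_matches are hypothetical.

```c
#include <stdlib.h>
#include <sys/time.h>
#include <time.h>

#define MSEC_PRECISION 250	/* same margin as in the patch, in msec */

/* Convert a timespec holding a duration to whole milliseconds. */
static long timespec_to_ms(const struct timespec *ts)
{
	return (long)ts->tv_sec * 1000 + ts->tv_nsec / 1000000;
}

/*
 * Milliseconds between two gettimeofday() results. The seconds are
 * subtracted first so that the product stays small even where long
 * is 32 bits wide.
 */
static long elapsed_ms(const struct timeval *before,
		       const struct timeval *after)
{
	return (long)(after->tv_sec - before->tv_sec) * 1000
	       + ((long)after->tv_usec - (long)before->tv_usec) / 1000;
}

/*
 * Return 1 if the reported remaining time is within MSEC_PRECISION of
 * (requested - elapsed) in either direction, 0 otherwise.
 */
static int remaining_matches(const struct timespec *req,
			     const struct timespec *rem,
			     const struct timeval *before,
			     const struct timeval *after)
{
	long expected = timespec_to_ms(req) - elapsed_ms(before, after);
	long diff = timespec_to_ms(rem) - expected;

	return labs(diff) <= MSEC_PRECISION;
}
```

With signed arithmetic, a remainder that is 1 msec below the expected 
value counts as a 1 msec error instead of wrapping around.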



