From: SourceForge.net <no...@so...> - 2009-04-13 09:53:00
Bugs item #2756909, was opened at 2009-04-12 22:23
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=622063&aid=2756909&group_id=98788

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: v0.8.x (devel)
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: sleep crashing with "Assertion `0 <= seconds' failed."

Initial Comment:
While researching bug 2748015
(http://sourceforge.net/tracker/?func=detail&aid=2748015&group_id=98788&atid=622063),
we came across another problem with the 0.7.4-rc1 and 0.8.0 code base.

When the command "while true; do sleep 0.1; done" is running in one session
and the command "openssl genrsa -out /dev/null 4096" is started in another
session, the sleep command in the first session occasionally aborts with:

    sleep: xnanosleep.c:67: xnanosleep: Assertion `0 <= seconds' failed.
    Aborted

The problem can be reproduced even when the two commands are started by
different unprivileged users. It looks as if the kernel is changing the memory
of random processes.

Tested versions:
  colinux 0.8.0:     suffers this bug
  colinux 0.7.4-rc1: suffers this bug
  colinux 0.7.3:     does not suffer this bug

Test system:
  AMD AthlonXP 3800+
  Windows XP SP3 + all updates to date
  Guest OS is ArchLinux (ver 2009.02)
  (using only prebuilt packages from the ArchLinux repositories)

While researching this further, I discovered this thread, which describes a bug
in the User Mode Linux kernel from almost a year ago:
http://fixunix.com/openssl/518688-re-uml-devel-dev-random-problems-fp-registers-corruption.html

I have not been able to link this to a bug on the UML Sourceforge.net
development page.
Keith

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2009-04-13 09:52

Message:
To test this problem I simplified the program listed in the fixunix thread
above.

dblchange.c:

    #include <stdio.h>
    #include <unistd.h>   /* sleep() */

    #define true 1
    #define false 0

    int main(int argc, char* argv[]){
        double theDouble;
        double theLastDouble;

        theDouble = 1;
        while(true){
            theLastDouble = theDouble;
            theDouble += 1;
            /* Should never be true unless a value changes behind our back. */
            if(theLastDouble + 1 != theDouble){
                printf("Double test fails!\n");
                /* As posted: %LX is handed a double. The reported output shows
                   the raw bits being printed, but this is not portable. */
                printf("- previous double: %f (%LX)\n", theLastDouble, theLastDouble);
                printf("- current double: %f (%LX)\n", theDouble, theDouble);
                break;
            }
            sleep(1);
        }
        return 0;
    }

intchange.c:

    #include <stdio.h>
    #include <unistd.h>   /* sleep() */

    #define true 1
    #define false 0

    int main(int argc, char* argv[]){
        /* As posted: both variables are declared double, not int. */
        double theInteger;
        double theLastInteger;

        theInteger = 1;
        while(true){
            theLastInteger = theInteger;
            theInteger += 1;
            if(theLastInteger + 1 != theInteger){
                printf("Integer test fails!\n");
                /* As posted: %d and %X are handed doubles. */
                printf("- previous int: %d (%X)\n", theLastInteger, theLastInteger);
                printf("- current int: %d (%X)\n", theInteger, theInteger);
                break;
            }
            sleep(1);
        }
        return 0;
    }

Judging from the assertion that sleep fails on, the double that specifies how
long to sleep gets changed outside of the program's control.

First I adapted the fixunix program to test for doubles being changed. It runs
smoothly until I start the openssl key generation. Then it fails after several
seconds:

    Double test fails!
    - previous double: nan (FFF8000000000000)
    - current double: nan (FFF8000000000000)

With some extra printfs added, I can see that in the failing iteration the
second read of the previous double goes wrong. This does not matter much,
because the previous double is then overwritten with the current double. After
that both variables are good again, but incrementing the current variable
yields the NaN value.
Output below:

    +++
    - previous double: 4.000000 (FFF8000000000000)
    - current double: 5.000000 (4014000000000000)
    theLastDouble = theDouble;
    - previous double: 5.000000 (4014000000000000)
    - current double: 5.000000 (4014000000000000)
    theDouble += 1;
    - previous double: 5.000000 (4014000000000000)
    - current double: nan (FFF8000000000000)
    Double test fails!
    - previous double: 5.000000 (4014000000000000)
    - current double: nan (FFF8000000000000)

Note: at the "Double test fails!" step the previous double does not have a NaN
value. This only happens when I add the printfs, so I attribute it to the
printfs doing work in between, which changes the data flow.

A little further investigation shows that the previous double gets corrupted
because, in the final iteration, the second read of any double returns the NaN
value. This means the current value will be read as NaN and then copied to the
previous value.

After the double catastrophe I was curious whether integers would also be
affected, so I wrote intchange.c. This showed that even integers are affected
by this bug. Output below:

    Integer test fails!
    - previous int: 0 (FFF80000)
    - current int: 0 (FFF80000)

The hex pattern seems to be the same as the corruption that the doubles get.

Keith

----------------------------------------------------------------------