From: <0...@pe...> - 2002-10-29 23:58:00
|
> I think I found the problem. Apparently my GCC 3.2 or
> something miscompiles it.

Well, I tried a few recompiles of 5.0-16 with my GCC 3.2, and
replacing -O2 by -Os apparently fixes the miscompilation. I
just don't know what's going on with optimization, as almost
all my system was compiled with GCC 3.2 and everything seems to
be very stable, and my current uptime is 32 days. Here is what
I tried:

1- I removed -fsigned-char

No changes.

2- I replaced -O2 by -O

No changes.

I mean no changes in general, but when miscompiled, 'SMART Attributes
Data Structure revision number' always changes.

3- I replaced -O by -O3 -march=athlon -mcpu=athlon

  5 Reallocated_Sector_Ct 0x1e08 000 000 020 Old_age FAILING_NOW 131072

Which miscompiled with -O2 or -O would be:

  5 Reallocated_Sector_Ct 0x0008 133 005 020 Old_age In_the_past 847118427

# 1 Off-line Completed 00% 35616 0x00020008

which miscompiled with -O2 or -O would be

# 1 Off-line Completed 00% 35584 0x327e0008

but apparently LifeTime(hours) also changes.

And with just -O3 it printed:

# 1 Off-line Completed 00% 39840 0x00020008

--
0@pervalidus.{net, {dyndns.}org}
From: Bruce A. <ba...@gr...> - 2002-10-30 00:35:44
|
Hi Frédéric,

Sorry, I had a bit of trouble following this. Could you please just send
me a list of compile lines, each marked "WORKED" or "FAILED"?

Something like

gcc2.96 -O3 -s WORKED
gcc3.2 -O1 FAILED

and so on.

Cheers,
	Bruce
From: <0...@pe...> - 2002-10-30 00:48:56
|
On Tue, 29 Oct 2002, Bruce Allen wrote:

> Sorry, I had a bit of trouble following this. Could you
> please just send me a list of compile lines and "WORKED" or
> "FAILED"
>
> something like
>
> gcc2.96 -O3 -s WORKED
> gcc3.2 -O1 FAILED
>
> and so on.

The following worked:

2.95.4 CVS with the default options.
3.2 with -Os

The following failed:

3.2 with the default options.
3.2 with -O
3.2 without -O2 or -O

3.2 with -O3 and -O3 -march=athlon -mcpu=athlon also failed,
but gave some different results with smartctl -a.

What's stranger is that I built my GCC 3.2 myself, while
smartmontools 5.0-10 from Slackware was compiled with their
3.2, and it works. I don't know if my compiler is broken, but
it's the first time I have run into (visible) problems with it.

--
0@pervalidus.{net, {dyndns.}org}
From: Bruce A. <ba...@gr...> - 2002-10-30 00:58:40
|
I just checked in a new version of atacmds.c. Could you take that out of
CVS and see if it helps matters? (I added one comma -- a long shot.)

Also, please, what does -Os do?

Cheers,
	Bruce
From: <0...@pe...> - 2002-10-30 01:08:20
|
On Tue, 29 Oct 2002, Bruce Allen wrote:

> I just checked in a new version of atacmds.c. Could you take
> that out of CVS and see if it helps matters. (I added one
> comma -- a long shot).

Still the same miscompilations shown by smartctl -a.

> Also, please, what does -Os do?

From my gcc.1:

    Optimize for size. -Os enables all -O2 optimizations that do not
    typically increase code size. It also performs further optimizations
    designed to reduce code size.

--
0@pervalidus.{net, {dyndns.}org}
From: Bruce A. <ba...@gr...> - 2002-10-30 02:36:03
|
> > I just checked in a new version of atacmds.c. Could you take
> > that out of CVS and see if it helps matters. (I added one
> > comma -- a long shot).
>
> Still the same miscompilations shown by smartctl -a.

OK, at least this is consistent. We would have seen similar warning
messages in smartd if the problem had been the extra comma.

> > Also, please, what does -Os do?
>
> From my gcc.1:
>
> Optimize for size. -Os enables all -O2 optimizations that do not
> typically increase code size. It also performs further optimizations
> designed to reduce code size.

OK. I'm hoping you can see the bug with -g. If so we can kill it fast...

Bruce
From: Bruce A. <ba...@gr...> - 2002-10-30 02:33:04
|
> 2.95.4 CVS with the default options.
> 3.2 with -Os
>
> The following failed:
>
> 3.2 with the default options.
> 3.2 with -O
> 3.2 without -O2 or -O

This last one is exciting. Can you see if 3.2 with -g fails? If so you
can email me a binary and I'll track it down REAL fast.

> 3.2 with -O3 and -O3 -march=athlon -mcpu=athlon also failed,
> but gave some different results with smartctl -a.

Thanks. I'm assuming that "failed" means specifically with smartctl,
right?

> What's more strange is that I compiled my GCC 3.2, and smartmontools
> 5.0-10 from Slackware was compiled with their 3.2, and it works. I
> don't know if my compiler is broken, but it's the first time I run
> into (visible) problems with it.

Well, this might be because the code's evolved a lot since 5.0-10. You
could try downloading the 5_0_10 tag, compiling that with your gcc3.2 -O2,
and seeing if it works OK. If it does we could do a binary search for the
first tag that breaks your gcc3.2 -O2 and look at the diffs between that
tag and the previous (working) tag.

What time zone are you in?

Bruce
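The binary search over release tags suggested in this message can be sketched in C. This is only an illustration under stated assumptions: releases are represented by plain indices rather than real CVS tag names, `first_bad_release` and `breaks_from_17` are hypothetical names, and the search presumes that every release before the first failing one is good and every release after it is bad.

```c
#include <stddef.h>

/* Hypothetical predicate standing in for "check out release i, build it
 * with gcc 3.2 -O2 and compare the smartctl output". Here it simply
 * pretends that release 17 is the first one that breaks. */
int breaks_from_17(size_t release)
{
    return release >= 17;
}

/* Returns the index of the first release for which is_bad() is non-zero.
 * Precondition: is_bad(lo) == 0 and is_bad(hi) != 0, with all good
 * releases coming before all bad ones. */
size_t first_bad_release(size_t lo, size_t hi, int (*is_bad)(size_t))
{
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (is_bad(mid))
            hi = mid;   /* first bad release is mid or earlier */
        else
            lo = mid;   /* first bad release is after mid */
    }
    return hi;
}
```

Calling `first_bad_release(10, 24, breaks_from_17)` narrows the range in four builds instead of fourteen; the diff between the returned release and the one before it is then the place to look.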
From: <0...@pe...> - 2002-10-30 02:52:53
|
On Tue, 29 Oct 2002, Bruce Allen wrote:

> > 2.95.4 CVS with the default options.
> > 3.2 with -Os
> >
> > The following failed:
> >
> > 3.2 with the default options.
> > 3.2 with -O
> > 3.2 without -O2 or -O
>
> This last one is exciting. Can you see if 3.2 with -g fails?

Yes, I just tried smartctl -a /dev/hda and it failed.

> If so you can email me a binary and I'll track it down REAL
> fast.

I just sent it to you privately.

> > 3.2 with -O3 and -O3 -march=athlon -mcpu=athlon also failed,
> > but gave some different results with smartctl -a.
>
> Thanks. I'm assuming that "failed" means specifically with
> smartctl, right?

Yes, like printing

  5 Reallocated_Sector_Ct 0x1e08 000 000 020 Old_age FAILING_NOW 131072

when with the default options it was

  5 Reallocated_Sector_Ct 0x0008 133 005 020 Old_age In_the_past 847118427

> Well, this might be because the code's evolved a lot since
> 5.0-10. You could try downloading the 5_0_10 tag, compiling
> that with your gcc3.2 -O2, and seeing if it works OK. If it
> does we could do a binary search for the first tag that
> breaks your gcc3.2 -O2 and look at the diffs between that tag
> and the previous (working) tag.

I did. My first smartmontools version was 5.0-10, which also
failed with "255" etc. It has been miscompiled since smartsuite
2.1.

> What time zone are you in?

Brazil. BRT (GMT-3). 23:45 now.

--
0@pervalidus.{net, {dyndns.}org}
From: Bruce A. <ba...@gr...> - 2002-10-30 06:11:37
|
> > Well, this might be because the code's evolved a lot since
> > 5.0-10. You could try downloading the 5_0_10 tag, compiling
> > that with your gcc3.2 -O2, and seeing if it works OK. If it
> > does we could do a binary search for the first tag that
> > breaks your gcc3.2 -O2 and look at the diffs between that tag
> > and the previous (working) tag.
>
> I did it. My first smartmontools version was 5.0-10, which it
> also failed with "255" etc. It's miscompiling since smartsuite
> 2.1.

I didn't understand what you meant. Does smartsuite 2.1 produce the same
effect, namely different output depending upon the compilation options?

> Brazil. BRT (GMT-3). 23:45 now.

OK, I'm in Germany right now, GMT +1.

I've had another idea about what is going wrong. I use a varargs function
in my code, to control where the printout goes. The

[From
GLIBC Manual: Since the prototype doesn't specify types for
optional arguments, in a call to a variadic function the default
argument promotions are performed on the optional argument
values. This means the objects of type char or short int (whether
signed or not) are promoted to either int or unsigned int, as
appropriate.]

So I've now explicitly converted all the print format statements into
explicit integer types (except in a couple of places where I had to go to
long longs). Could you try the latest code snapshot from CVS, 5.0-23,
please?

I'm off to sleep.

Bruce
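The promotion rule quoted from the GLIBC manual can be illustrated with a short C sketch. It is not code from smartmontools: `sum_promoted` and `format_byte` are hypothetical names, and the explicit cast only mirrors the kind of change described above (making each printed argument an explicit integer type).

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: adds up `count` optional arguments. Every optional
 * argument is read back as int, because char and short values are promoted
 * to int by the default argument promotions before the call is made. */
long sum_promoted(int count, ...)
{
    va_list ap;
    long total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);  /* va_arg(ap, char) would be undefined */
    va_end(ap);
    return total;
}

/* Hypothetical helper: prints one byte with an explicit cast, so the
 * argument type matches the %d conversion no matter how the field that
 * supplied the value was declared. */
int format_byte(char *buf, size_t len, unsigned char value)
{
    return snprintf(buf, len, "%d", (int)value);
}
```

With these definitions, `sum_promoted(3, (char)1, (short)2, 3)` reads all three values back correctly as `int`, which is the behaviour the manual describes; the cast in `format_byte` is redundant for an `unsigned char` but documents the intended type at the call site.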
From: <0...@pe...> - 2002-10-30 06:40:49
|
On Wed, 30 Oct 2002, Bruce Allen wrote:

> > I did it. My first smartmontools version was 5.0-10, which it
> > also failed with "255" etc. It's miscompiling since smartsuite
> > 2.1.
>
> I didn't understand what you meant. Does smartsuite 2.1
> produce the same effect, namely different output depending
> upon the compilation options?

Yes. A miscompiled (with default options) smartsuite 2.1:

smartd:

Device: /dev/hda, S.M.A.R.T. Attribute: 1 Changed -66

smartctl -a /dev/hda:

Revision Number: 3
...
SMART Error Log:
SMART Error Logging Version: 3
No Errors Logged

With GCC 2.95.4:

No erroneous smartd line.

smartctl -a /dev/hda:

Revision Number: 11
...
+( 1)Raw Read Error Rate 0x0029 100 253 020 0
SMART Error Log:
SMART Error Logging Version: 1
No Errors Logged

> > Brazil. BRT (GMT-3). 23:45 now.
>
> OK, I'm in Germany right now, GMT +1.
>
> I've had another idea about what is going wrong. I use a varargs function
> in my code, to control where the printout goes. The
>
> [From
> GLIBC Manual: Since the prototype doesn't specify types for
> optional arguments, in a call to a variadic function the default
> argument promotions are performed on the optional argument
> values. This means the objects of type char or short int (whether
> signed or not) are promoted to either int or unsigned int, as
> appropriate.]
>
> So I've now explicitly converted all the print format statements into
> explicit integer types (except in a couple of places where I had to go to
> long longs.). Could you try the latest code snapshot from CVS, 5.0-23,
> please?

Just tested. Still the same.

SMART Attributes Data Structure revision number: 23936
...
  5 Reallocated_Sector_Ct 0x6008 135 005 020 Old_age In_the_past 847118427
  5 Reallocated_Sector_Ct <== Data Page      | WARNING: PREVIOUS ATTRIBUTE HAS TWO
  1 Raw_Read_Error_Rate   <== Threshold Page | INCONSISTENT IDENTITIES IN THE DATA
...
SMART Error Log Version: 3
No Errors Logged

SMART Self-test log, version number 3
Warning - structure revision number does not match spec!
Num Test_Description Status    Remaining LifeTime(hours) LBA_of_first_error
# 1 Off-line         Completed 00%       36192

> I'm off to sleep.

Me too.

--
0@pervalidus.{net, {dyndns.}org}
From: Bruce A. <ba...@gr...> - 2002-10-30 10:57:54
|
> > I'm off to sleep.
>
> Me too.

Hi Frédéric L. W. Meunier,

Thanks very much for the gcc 3.2 binary.

As I told you in my earlier email, the bug takes place when a function is
called with three arguments, a, b, and c. a and b are both 512-byte data
structures full of badly-aligned objects. c is just a 4-byte integer.
What happens is that the first 7 bytes of a get scrambled in the call.

I never liked passing these huge structures by value rather than by
reference, so I decided that the easiest thing to do was to simply modify
the calling semantics to pass pointers to the objects rather than the
objects themselves. It's how it should have been done in the first place.

The fact that smartsuite-2.1 exhibits the same bug also suggests that this
may be the problem.

I've posted a new release -- 5.0-24. Please give that a try and see if
gcc 3.2 -O2 now produces a fully-functional executable...

Cheers,
	Bruce

PS: I added another config file option for smartd:
-C <N>
where N is an integer >=10. It's how long the system sleeps between smart
checks. Very nice for debugging. Default value of N is 1800 sec.
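The change of calling convention described in this message can be sketched as follows. This is a minimal illustration, not the smartmontools source and not the bug.c test program mentioned elsewhere in the thread: the struct layout, the field names and both function names are hypothetical, and `__attribute__((packed))` is a GCC/Clang extension used here only to produce misaligned members in a 512-byte structure.

```c
#include <stdint.h>

/* Hypothetical 512-byte sector image with deliberately misaligned fields.
 * The real smartmontools structures differ; this layout is illustrative. */
struct sector_image {
    uint16_t revision;   /* offset 0 */
    uint8_t  id;         /* offset 2 */
    uint32_t raw;        /* offset 3, misaligned on purpose */
    uint8_t  rest[505];  /* padding up to 512 bytes */
} __attribute__((packed));

/* Compile-time check that the structure really is 512 bytes. */
typedef char sector_image_is_512_bytes[sizeof(struct sector_image) == 512 ? 1 : -1];

/* Old style: both 512-byte structures are copied for the call, followed
 * by a small integer argument. */
unsigned revision_by_value(struct sector_image a, struct sector_image b, int which)
{
    return which ? b.revision : a.revision;
}

/* New style: only two pointers and an integer are passed, so no large
 * copy has to be generated by the compiler. */
unsigned revision_by_pointer(const struct sector_image *a,
                             const struct sector_image *b, int which)
{
    return which ? b->revision : a->revision;
}
```

On a correctly working compiler both functions return the same value for the same inputs. Comparing the two results for a structure with a known `revision` is one way to check whether a given compiler and flag combination copies the first bytes of the by-value argument correctly.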
From: <0...@pe...> - 2002-10-30 18:46:03
|
On Wed, 30 Oct 2002, Bruce Allen wrote:

> I never liked passing these huge structures by value rather than by
> reference, so I decided that the easiest thing to do was to simply modify
> the calling semantics to pass pointers to the objects rather than the
> objects themselves. It's how it should have been done in the first place.
>
> The fact that smartsuite-2.1 exhibits the same bug also suggests that this
> may be the problem.
>
> I've posted a new release -- 5.0-24. Please give that a try and see if
> gcc 3.2 -O2 now produces a fully-functional executable...

I just read
http://www.uwsg.iu.edu/hypermail/linux/kernel/0011.3/0036.html

Here's some more information, and maybe why the
Slackware 5.0-10 binaries produced by their GCC 3.2 are good.

Their compiler was configured for i386-slackware-linux. From
what I've read, then it defaults to -march=i386 -mcpu=i386.

My 3.2 was compiled with what configure found, which is
i686-pc-linux-gnu. Then it defaults to -march=i686 -mcpu=i686.

I just downgraded to RELEASE_5_0_22, added -march=i386
-mcpu=i386 after -O2, and yes, it produced good binaries.

Then I tried with

-march=i486 -mcpu=i486: Good
-march=i586 -mcpu=i586: Good
-march=k6 -mcpu=k6: Good

And to be sure I tested with

-march=i686 -mcpu=i686: Bad
-march=athlon -mcpu=athlon: Bad

--
0@pervalidus.{net, {dyndns.}org}
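Since the result depends on which -march default a given GCC build uses, it can help to have the compiler report its own settings. The sketch below relies on predefined macros that GCC documents (`__VERSION__`, `__OPTIMIZE__`, `__OPTIMIZE_SIZE__`, and x86 target macros such as `__i386__`, `__i686__`, `__k6__` and `__athlon__`); treat their availability on other compilers or targets as an assumption. `describe_build` is a hypothetical name, not part of smartmontools.

```c
#include <stdio.h>

/* Hypothetical helper: writes a one-line summary of the compiler version,
 * the optimization mode and the x86 target selected by -march into buf.
 * Returns what snprintf returns (the length the full summary needs). */
int describe_build(char *buf, size_t len)
{
    const char *version = "unknown";
    const char *opt = "none";
    const char *arch = "unknown";

#if defined(__VERSION__)
    version = __VERSION__;
#endif

    /* -Os defines both macros, so the size check has to come first. */
#if defined(__OPTIMIZE_SIZE__)
    opt = "size (-Os)";
#elif defined(__OPTIMIZE__)
    opt = "speed (-O1 or higher)";
#endif

    /* Most specific target first; __i386__ is set for any 32-bit x86. */
#if defined(__athlon__)
    arch = "athlon";
#elif defined(__k6__)
    arch = "k6";
#elif defined(__i686__)
    arch = "i686";
#elif defined(__i586__)
    arch = "i586";
#elif defined(__i486__)
    arch = "i486";
#elif defined(__i386__)
    arch = "i386";
#endif

    return snprintf(buf, len, "compiler: %s, optimization: %s, arch: %s",
                    version, opt, arch);
}
```

Printing the string produced by `describe_build` from builds made with different flags shows at a glance whether a binary came from an i386-default or an i686-default configuration, which is the distinction that separated the good and bad results listed above.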
From: Bruce A. <ba...@gr...> - 2002-10-30 19:03:49
|
If you can, you should add this information to the GCC bug report. I
think it's still valid. My laptop is an i686, and I saw the bug on it.

Bruce
From: Bruce A. <ba...@gr...> - 2002-10-30 15:11:18
|
Hi Frédéric,

I've posted a new release - could you try it and see if this fixes the
problem? Now all I pass to functions are simple pointers. Hopefully gcc
3.2 gets that right.

Also, I am very curious if the bug.c program I sent you misbehaves or not.
If it does we need to file a bug report for gcc 3.2.

Cheers,
	Bruce
From: <0...@pe...> - 2002-10-30 15:26:09
|
Sorry for the delay.

On Wed, 30 Oct 2002, Bruce Allen wrote:

> I've posted a new release - could you try it and see if this
> fixes the problem? Now all I pass to functions are simple
> pointers. Hopefully gcc 3.2 gets that right.

Yes. smartctl -a /dev/hda seems right. I compiled with the
default options:

smartctl version 5.0-24 Copyright (C) 2002 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     MAXTOR 6L060J3
Serial Number:    663200252994
Firmware Version: A93.0500
ATA Version is:   5
ATA Standard is:  ATA/ATAPI-5 T13 1321D revision 1
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Off-line data collection status: (0x02)
    Offline data collection activity completed without error.
Self-test execution status: ( 0)
    The previous self-test routine completed without error
    or no self-test has ever been run.
Total time to complete off-line data collection: ( 35) seconds.
Offline data collection capabilities: (0x1b)
    SMART execute Offline immediate.
    Automatic timer ON/OFF support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 30) minutes.

SMART Attributes Data Structure revision number: 11
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0029 100   253   020    Pre-fail -           0
  3 Spin_Up_Time            0x0027 066   066   020    Pre-fail -           4287
  4 Start_Stop_Count        0x0032 100   100   008    Old_age  -           24
  5 Reallocated_Sector_Ct   0x0033 100   100   020    Pre-fail -           0
  7 Seek_Error_Rate         0x000b 100   100   023    Pre-fail -           0
  9 Power_On_Hours          0x0012 093   093   001    Old_age  -           4962
 10 Spin_Retry_Count        0x0026 100   100   000    Old_age  -           0
 11 Calibration_Retry_Count 0x0013 100   100   020    Pre-fail -           0
 12 Power_Cycle_Count       0x0032 100   100   008    Old_age  -           22
 13 Read_Soft_Error_Rate    0x000b 100   100   023    Pre-fail -           0
194 Temperature_Centigrade  0x0022 079   074   042    Old_age  -           54
195 Hardware_ECC_Recovered  0x001a 100   020   000    Old_age  -           499481
196 Reallocated_Event_Count 0x0010 100   100   020    Old_age  -           0
197 Current_Pending_Sector  0x0032 100   100   020    Old_age  -           0
198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  -           0
199 UDMA_CRC_Error_Count    0x001a 200   200   000    Old_age  -           0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log, version number 1
Num Test_Description Status    Remaining LifeTime(hours) LBA_of_first_error
# 1 Short off-line   Completed 00%       4924

> Also, I am very curious if the bug.c program I sent you
> misbehaves or not. If it does we need to file a bug report
> for gcc 3.2

These two values should be equal: 62104 16

--
0@pervalidus.{net, {dyndns.}org}
From: Bruce A. <ba...@gr...> - 2002-10-30 15:31:50
|
Hi Frédéric,

> > I've posted a new release - could you try it and see if this
> > fixes the problem? Now all I pass to functions are simple
> > pointers. Hopefully gcc 3.2 gets that right.
>
> Yes. smartctl -a /dev/hda seems right. I compiled with the
> default options:

Hurray!! I hope smartd is also working OK.

> > Also, I am very curious if the bug.c program I sent you
> > misbehaves or not. If it does we need to file a bug report
> > for gcc 3.2
>
> These two values should be equal: 62104 16

Are you there for a few more minutes? Let me send you something
self-contained to try. Assuming that it breaks gcc 3.2, do you want to
file the bug report or should I do it?

Cheers,
	Bruce
From: <0...@pe...> - 2002-10-30 15:35:38
|
On Wed, 30 Oct 2002, Bruce Allen wrote:

> Hurray!! I hope smartd is also working OK.

Apparently it is.

> > > Also, I am very curious if the bug.c program I sent you
> > > misbehaves or not. If it does we need to file a bug report
> > > for gcc 3.2
> >
> > These two values should be equal: 62104 16
>
> Are you there for a few more minutes? Let me send you
> something self-contained to try. Assuming that it breaks gcc
> 3.2, do you want to file the bug report or should I do it?

Sure. For hours. I think you should file the bug (I have never
filed one), but are you sure it's a bug and not a problem with my
GCC 3.2, as others (like the 3.2 from Slackware) seem to work?

--
0@pervalidus.{net, {dyndns.}org}