From: Andrei S. <as...@gm...> - 2006-12-19 04:35:03
|
Hello,

I have an HP DL585 DB server with 4 HDDs installed as RAID 1+0 on a Smart Array controller (Device: COMPAQ Smart Array 5i Version: 2.62). I can easily access/test cciss,0 and cciss,1 (the first 2 drives in the array):

- smartctl -a -d cciss,0 /dev/cciss/c0d0
- smartctl -a -d cciss,1 /dev/cciss/c0d0

When querying the other 2 I get the following error:

# smartctl -a -d cciss,2 /dev/cciss/c0d0 (or smartctl -a -d cciss,3 /dev/cciss/c0d0)
smartctl version 5.37 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: COMPAQ Smart Array 5i Version: 2.62
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

This is the log file (/var/log/smartd/smartd.log):

Dec 18 15:17:01 lnx smartd[16124]: Device: /dev/cciss/c0d0 [cciss_disk_02], opened
Dec 18 15:17:01 lnx smartd[16124]: Device: /dev/cciss/c0d0 [cciss_disk_02], Bad IEC (SMART) mode page, err=5, skip device

I am using the latest CVS version as of today (18 Dec 06).

Any help will be appreciated.

Thanks,
Andrei.
|
From: Michael M. <mi...@np...> - 2006-12-19 04:56:10
|
Hi Andrei,

> Hello,
>
> I have HP DL585 DB server with 4 HDDs installed as Raid 1+0 on Smart
> Array controller (Device: COMPAQ Smart Array 5i Version: 2.62). I
> can easily access/test cciss,0 and cciss,1 (first 2 drives in array):
>
> - smartctl -a -d cciss,0 /dev/cciss/c0d0
> - smartctl -a -d cciss,1 /dev/cciss/c0d0
>
> When querying the other 2 I get the following error:
>
> # smartctl -a -d cciss,2 /dev/cciss/c0d0 (or smartctl -a -d cciss,3
> /dev/cciss/c0d0)
> smartctl version 5.37 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6
> Bruce Allen Home page is http://smartmontools.sourceforge.net/
>
> Device: COMPAQ Smart Array 5i Version: 2.62
> >> Terminate command early due to bad response to IEC mode page
> A mandatory SMART command failed: exiting. To continue, add one or
> more '-T permissive' options.
>
> This is the log file (/var/log/smartd/smartd.log)
> Dec 18 15:17:01 lnx smartd[16124]: Device: /dev/cciss/c0d0
> [cciss_disk_02], opened
> Dec 18 15:17:01 lnx smartd[16124]: Device: /dev/cciss/c0d0
> [cciss_disk_02], Bad IEC (SMART) mode page, err=5, skip device
>
> I am using the latest CVS version as of today (18 Dec 06).

I've experienced this exact same problem on ProLiant BL40 blades, also using the latest smartmontools CVS. 4 disks in RAID 5: the first two can be seen but the next two can't.

I haven't reported this problem to the list yet as I'm still troubleshooting, because the drive allocation I see from smartmontools doesn't make sense in my scenario.

To explain a bit, what I can see from using:

smartctl -a -d cciss,0 /dev/cciss/c0d0
smartctl -a -d cciss,1 /dev/cciss/c0d0

is the output of the first two disks, yet when I boot the server and enter the Smart Array BIOS screen, I see:

slot 0 MISSING
slot 1 36.4gb
slot 2 36.4gb
slot 3 36.4gb

Yes, "missing" is what I see for the first disk. What I find baffling is this: if the disk is in fact missing, why does it display correctly using "cciss,0"?

I get the same output as you do for cciss,2 and cciss,3.

I had an engineer go to the data centre yesterday to physically look at the server to see if this is a failed or missing disk (maybe the Smart Array controller is not reporting the disk properly).

So from my setup, I experienced this on two BL40s with the same config.

I'm now on leave from work till 2nd of January and have passed the problem to a colleague who's handling it, so I won't find out what happened till early next year.

What I'm saying is that I've experienced the exact same problem, but because my results are so far inconclusive (whether it's smartmontools or the Smart Array controller or something else) I didn't think reporting this as a smartmontools problem was the right thing to do.

It looks like I'm not the only one now, and if I do have a missing disk in slot 0, then cciss,0 should be reporting that, or maybe it's querying something different (slot assignments wrong, etc.)?

Regards,

Michael.

> Any help will be appreciated.
>
> Thanks,
> Andrei.
|
From: B. <fbo...@ca...> - 2006-12-19 07:49:06
|
On Mon 18 Dec 2006 23:35:00 CET, "Andrei Sereda" <as...@gm...> wrote:

> Hello,
>
> I have HP DL585 DB server with 4 HDDs installed as Raid 1+0 on Smart
> Array controller (Device: COMPAQ Smart Array 5i Version: 2.62). I
> can easily access/test cciss,0 and cciss,1 (first 2 drives in array):
>
> - smartctl -a -d cciss,0 /dev/cciss/c0d0
> - smartctl -a -d cciss,1 /dev/cciss/c0d0
>
> When querying the other 2 I get the following error:
>
> # smartctl -a -d cciss,2 /dev/cciss/c0d0 (or smartctl -a -d cciss,3
> /dev/cciss/c0d0)
> smartctl version 5.37 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> Device: COMPAQ Smart Array 5i Version: 2.62
> >> Terminate command early due to bad response to IEC mode page
> A mandatory SMART command failed: exiting. To continue, add one or
> more '-T permissive' options.

Hello,

I don't know the DL585 DB model, but on a DL380G4 you have 6 disk bays on the front. If your disks aren't in the first 4, that could explain your problem: simply try 'cciss,4', 'cciss,5' and higher to be sure...

Frédéric.
|
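The suggestion above amounts to sweeping the drive number in '-d cciss,N'. A minimal sketch of that sweep follows; it only builds and prints the command lines, using the smartctl options already shown in this thread. The helper name probe_commands and the upper bound of 8 are illustrative assumptions, not part of smartmontools.

```python
def probe_commands(device, count):
    """Build one 'smartctl -a -d cciss,N DEVICE' argv per drive number N."""
    return [["smartctl", "-a", "-d", f"cciss,{n}", device] for n in range(count)]


# Print the commands so they can be reviewed (or pasted into a root shell).
for argv in probe_commands("/dev/cciss/c0d0", 8):
    print(" ".join(argv))
```

Printing rather than executing keeps the sketch harmless on machines without a Smart Array controller; each line can be run by hand and its output compared.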
From: Andrei S. <as...@gm...> - 2006-12-22 16:55:51
|
I am sure I'm using all 4 HDD sockets for RAID.

smartctl -a -d cciss,4 /dev/cciss/c0p0
smartctl -a -d cciss,5 /dev/cciss/c0p0
smartctl -a -d cciss,6 /dev/cciss/c0p0

didn't help either.

Andrei.

---------- Forwarded message ----------
From: Andrei Sereda <as...@gm...>
Date: Dec 19, 2006 4:52 PM
Subject: Re: [smartmontools-support] problems with CCISS as Raid 1+0 (4 disks)
To: Michael Mansour <mi...@np...>

Mike,

What Smart Array controller are you using (version, firmware, etc.)? Also, what are your HDDs (manufacturer, size, interface, etc.)?

I have used Compaq arrayprobe (http://www.strocamp.net/opensource/arrayprobe.php) to find out if there are any problems, but everything seems to be OK. Unfortunately it doesn't tell you the type and number of discs in the RAID configuration.

I can confirm this problem on 2 DB servers (ProLiant DL585) with the same RAID configuration (4 x Seagate 15K 300G; RAID 0+1). However, it works fine on our blades (BL 25p) with RAID 1 (2 x Seagate 10K 300G).

Besides the difference in RAID configuration, the controllers are not the same:

- DL585 : Smart Array 5i Version: 2.62
- BL 25p : HP SA 6i Version 2.68

So it is difficult to tell what the problem is: CCISS driver, smartmontools bug, RAID controller?

Thanks,
Andrei.
|
From: Andrei S. <as...@gm...> - 2007-04-05 17:55:50
|
Hi Sergey,

Yes, I still have this problem. Thank you for your help. Here you go:

-------------------------------
[root@moz ~]# smartctl -a -d cciss,2 -r ioctl,2 /dev/cciss/c0d0
smartctl version 5.37 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: COMPAQ Smart Array 5i Version: 2.62
scsiModePageOffset: response length too short, resp_len=1 offset=4 bd_len=0
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

[root@moz ~]# smartctl -a -d cciss,3 -r ioctl,2 /dev/cciss/c0d0
smartctl version 5.37 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: COMPAQ Smart Array 5i Version: 2.62
scsiModePageOffset: response length too short, resp_len=1 offset=4 bd_len=0
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
------------------------------------------

If it will help, here is the output of arrayprobe:

-------------------------------
[root@moz ~]# ./arrayprobe -r -f /dev/cciss/c0d0
Retrieving logical drive information from controller /dev/cciss/c0d0
Number of logical volumes (00 00 00 08) : 1
Controller /dev/cciss/c0d0 reports 1 logical drives
Logical drive 0 found on controller /dev/cciss/c0d0
Event code 5/0/0 at 2-5-2006 02:12:31 with message: State change, logical drive 0
logical drive 0, changed from state 2 to 0
state 2: Logical drive is not configured
state 0: Logical drive is ok
Event code 5/2/0 at 2-9-2006 02:10:27 with message: Parity/consistency initialization complete, logical drive 0
Event code 5/0/0 at 2-4-2007 20:56:28 with message: State change, logical drive 0
logical drive 0, changed from state 0 to 2
state 0: Logical drive is ok
state 2: Logical drive is not configured
Event code 5/0/0 at 2-4-2007 20:56:44 with message: State change, logical drive 0
logical drive 0, changed from state 2 to 0
state 2: Logical drive is not configured
state 0: Logical drive is ok
Event code 5/2/0 at 2-5-2007 01:15:46 with message: Parity/consistency initialization complete, logical drive 0
Event code 0/0/0 with message: No events to report.
Logical drive 0 on controller /dev/cciss/c0d0 has state 0
OK Arrayprobe All controllers ok

On 4/5/07, Sergey Svishchev <sv...@ro...> wrote:
> On Mon, Dec 18, 2006 at 11:35:00PM -0500, Andrei Sereda wrote:
> >Hello,
> >
> >I have HP DL585 DB server with 4 HDDs installed as Raid 1+0 on Smart
> >Array controller (Device: COMPAQ Smart Array 5i Version: 2.62). I
> >can easily access/test cciss,0 and cciss,1 (first 2 drives in array):
> >
> >- smartctl -a -d cciss,0 /dev/cciss/c0d0
> >- smartctl -a -d cciss,1 /dev/cciss/c0d0
> >
> >When querying the other 2 I get the following error:
> >
> ># smartctl -a -d cciss,2 /dev/cciss/c0d0 (or smartctl -a -d cciss,3
> >/dev/cciss/c0d0)
> >smartctl version 5.37 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> >Home page is http://smartmontools.sourceforge.net/
> >
> >Device: COMPAQ Smart Array 5i Version: 2.62
> >>> Terminate command early due to bad response to IEC mode page
>
> Do you still have this problem? If you do, please add '-r ioctl,2' to
> command line and mail me the output.
>
> --
> Sergey Svishchev
|
From: Sergey S. <sha...@us...> - 2007-04-11 18:57:59
Attachments:
cciss-abs.diff
|
On Mon, Dec 18, 2006 at 11:35:00PM -0500, Andrei Sereda wrote:
>Hello,
>
>I have HP DL585 DB server with 4 HDDs installed as Raid 1+0 on Smart
>Array controller (Device: COMPAQ Smart Array 5i Version: 2.62). I
>can easily access/test cciss,0 and cciss,1 (first 2 drives in array):
>
>- smartctl -a -d cciss,0 /dev/cciss/c0d0
>- smartctl -a -d cciss,1 /dev/cciss/c0d0
>
>When querying the other 2 I get the following error:
>
># smartctl -a -d cciss,2 /dev/cciss/c0d0 (or smartctl -a -d cciss,3
>/dev/cciss/c0d0)
>smartctl version 5.37 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
>Home page is http://smartmontools.sourceforge.net/
>
>Device: COMPAQ Smart Array 5i Version: 2.62
>>> Terminate command early due to bad response to IEC mode page

As far as I understand from Open_CISS_Spec.pdf (apparently the only public specification there is), CISS LUN addresses are 8 bytes long. The current code is rather ad hoc: it matches the 7th of these 8 bytes against the drive number from the command line, and then uses the first matching LUN address. This breaks in Andrei's case:

===== [LUN DATA] DATA START (BASE-16) =====
000-015: 00 00 00 30 00 00 00 00 00 00 00 c0 00 00 00 01
016-031: 00 00 00 c0 00 00 01 01 00 00 00 c0 00 00 0f 01
032-047: 00 00 00 c0 00 00 00 02 00 00 00 c0 00 00 01 02
048-063: 00 00 00 c0 00 00 0f 02 00 00 00 00 00 00 00 00

Until a better solution is found, I propose to treat the 'drive number' as an index into the array of LUN addresses (see attached diff).

--
Sergey Svishchev
|
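To make the difference between the two lookups concrete, here is a small sketch run against the dump quoted above. It assumes the buffer is laid out like a SCSI REPORT LUNS reply (a 4-byte big-endian list length, 4 reserved bytes, then 8-byte entries); that matches the 0x30 length and the six entries visible in the dump, but it is an assumption. The function names are illustrative and are taken neither from smartmontools nor from the attached diff.

```python
# Hex dump copied from the message above (64 bytes).
DUMP = """
00 00 00 30 00 00 00 00 00 00 00 c0 00 00 00 01
00 00 00 c0 00 00 01 01 00 00 00 c0 00 00 0f 01
00 00 00 c0 00 00 00 02 00 00 00 c0 00 00 01 02
00 00 00 c0 00 00 0f 02 00 00 00 00 00 00 00 00
"""
RAW = bytes.fromhex(" ".join(DUMP.split()))


def parse_lun_list(raw):
    """Split the buffer into 8-byte LUN addresses (assumed layout)."""
    length = int.from_bytes(raw[0:4], "big")   # list length in bytes
    body = raw[8:8 + length]                   # skip length + reserved bytes
    return [body[i:i + 8] for i in range(0, len(body), 8)]


def lookup_by_seventh_byte(luns, drive):
    """Lookup as described above: first entry whose 7th byte equals the drive number."""
    for lun in luns:
        if lun[6] == drive:
            return lun
    return None


def lookup_by_index(luns, drive):
    """Proposed lookup: the drive number is an index into the LUN list."""
    return luns[drive] if 0 <= drive < len(luns) else None


LUNS = parse_lun_list(RAW)
for drive in range(len(LUNS)):
    old = lookup_by_seventh_byte(LUNS, drive)
    new = lookup_by_index(LUNS, drive)
    print(drive, old.hex() if old else "no match", new.hex())
```

In this dump the 7th bytes are 00, 01, 0f, 00, 01, 0f, so drive numbers 2 and 3 match no entry at all, and 0 and 1 each match two entries of which only the first is ever used. With index-based lookup every one of the six entries is reachable, although the sketch says nothing about which entries are disks.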
From: Michael M. <mi...@np...> - 2007-04-12 01:02:46
|
Hi,

> On Mon, Dec 18, 2006 at 11:35:00PM -0500, Andrei Sereda wrote:
> >Hello,
> >
> >I have HP DL585 DB server with 4 HDDs installed as Raid 1+0 on Smart
> >Array controller (Device: COMPAQ Smart Array 5i Version: 2.62). I
> >can easily access/test cciss,0 and cciss,1 (first 2 drives in array):
> >
> >- smartctl -a -d cciss,0 /dev/cciss/c0d0
> >- smartctl -a -d cciss,1 /dev/cciss/c0d0
> >
> >When querying the other 2 I get the following error:
> >
> ># smartctl -a -d cciss,2 /dev/cciss/c0d0 (or smartctl -a -d cciss,3
> >/dev/cciss/c0d0)
> >smartctl version 5.37 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> >Home page is http://smartmontools.sourceforge.net/
> >
> >Device: COMPAQ Smart Array 5i Version: 2.62
> >>> Terminate command early due to bad response to IEC mode page

A few months back I bought two new DL380 G4 SAS servers, identical machines in every way. Both have two hardware-mirrored SAS drives.

On one of them I can query devices 0 and 1 with smartctl; on the other I get the error above.

One of them, querying 0 and 1:

# smartctl -a -dcciss,0 /dev/cciss/c0d0
smartctl version 5.37 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP DG146ABAB4 Version: HPD2
Serial number: 3NM01DML000087049405
Device type: disk
Transport protocol: SAS
Local Time is: Thu Apr 12 10:55:47 2007 EST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK

Current Drive Temperature: 24 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 3901854737
  Blocks received from initiator = 373625747
  Blocks read from cache and sent to initiator = 864717603
  Number of read and write commands whose size <= segment size = 158286194
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 3401.55
  number of minutes until next internal SMART test = 13

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:         0        0         0         0          0          0.000           0

Non-medium error count: 0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -    3392                 - [-   -    -]
# 2  Background short  Completed                   -    3368                 - [-   -    -]
# 3  Background short  Completed                   -    3344                 - [-   -    -]
# 4  Background short  Completed                   -    3320                 - [-   -    -]
# 5  Background short  Completed                   -    3296                 - [-   -    -]
# 6  Background long   Completed                   -    3274                 - [-   -    -]
# 7  Background short  Completed                   -    3272                 - [-   -    -]
# 8  Background short  Completed                   -    3248                 - [-   -    -]
# 9  Background short  Completed                   -    3224                 - [-   -    -]
#10  Background short  Completed                   -    3200                 - [-   -    -]
#11  Background short  Completed                   -    3176                 - [-   -    -]
#12  Background short  Completed                   -    3152                 - [-   -    -]
#13  Background short  Completed                   -    3128                 - [-   -    -]
#14  Background long   Completed                   -    3106                 - [-   -    -]
#15  Background short  Completed                   -    3104                 - [-   -    -]
#16  Background short  Completed                   -    3080                 - [-   -    -]
#17  Background short  Completed                   -    3056                 - [-   -    -]
#18  Background short  Completed                   -    3032                 - [-   -    -]
#19  Background short  Completed                   -    3008                 - [-   -    -]
#20  Background short  Completed                   -    2984                 - [-   -    -]

Long (extended) Self Test duration: 2070 seconds [34.5 minutes]

# smartctl -a -dcciss,1 /dev/cciss/c0d0
smartctl version 5.37 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP DG146ABAB4 Version: HPD2
Serial number: 3NM01RG4000086493HLY
Device type: disk
Transport protocol: SAS
Local Time is: Thu Apr 12 10:56:38 2007 EST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK

Current Drive Temperature: 24 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 1440004366
  Blocks received from initiator = 12030696
  Blocks read from cache and sent to initiator = 692884891
  Number of read and write commands whose size <= segment size = 129701420
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 3368.72
  number of minutes until next internal SMART test = 20

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:         0        0         0         0          0          0.000           0

Non-medium error count: 0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -    3360                 - [-   -    -]
# 2  Background short  Completed                   -    3335                 - [-   -    -]
# 3  Background short  Completed                   -    3311                 - [-   -    -]
# 4  Background short  Completed                   -    3287                 - [-   -    -]
# 5  Background short  Completed                   -    3263                 - [-   -    -]
# 6  Background long   Completed                   -    3241                 - [-   -    -]
# 7  Background short  Completed                   -    3239                 - [-   -    -]
# 8  Background short  Completed                   -    3215                 - [-   -    -]
# 9  Background short  Completed                   -    3191                 - [-   -    -]
#10  Background short  Completed                   -    3167                 - [-   -    -]
#11  Background short  Completed                   -    3143                 - [-   -    -]
#12  Background short  Completed                   -    3119                 - [-   -    -]
#13  Background short  Completed                   -    3095                 - [-   -    -]
#14  Background long   Completed                   -    3073                 - [-   -    -]
#15  Background short  Completed                   -    3071                 - [-   -    -]
#16  Background short  Completed                   -    3047                 - [-   -    -]
#17  Background short  Completed                   -    3023                 - [-   -    -]
#18  Background short  Completed                   -    2999                 - [-   -    -]
#19  Background short  Completed                   -    2975                 - [-   -    -]
#20  Background short  Completed                   -    2951                 - [-   -    -]

Long (extended) Self Test duration: 2070 seconds [34.5 minutes]

On the other one:

# smartctl -a -d cciss,0 /dev/cciss/c0d0
smartctl version 5.37 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP P600 Version: 1.52
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

# smartctl -a -d cciss,1 /dev/cciss/c0d0
smartctl version 5.37 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP P600 Version: 1.52
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

I've been hunting around for an explanation of this one but can't find one.

> As far as I understood from Open_CISS_Spec.pdf (the only public
> specification there is, apparently), CISS LUN addresses are 8 bytes
> long. Current code is rather ad-hoc and matches 7th of these 8 bytes
> against drive number from command line, and then uses the first matched
> LUN address.
> This breaks in Andrei's case:
>
> ===== [LUN DATA] DATA START (BASE-16) =====
> 000-015: 00 00 00 30 00 00 00 00 00 00 00 c0 00 00 00 01
> 016-031: 00 00 00 c0 00 00 01 01 00 00 00 c0 00 00 0f 01
> 032-047: 00 00 00 c0 00 00 00 02 00 00 00 c0 00 00 01 02
> 048-063: 00 00 00 c0 00 00 0f 02 00 00 00 00 00 00 00 00
>
> Until a better solution is found, I propose to treat 'drive number'
> as index into LUN addresses' array (see attached diff).

Given the problem I'm seeing above, I question whether it's related to devices 2 and 3 at all.

Michael.

> --
> Sergey Svishchev
|
From: Sergey S. <sha...@us...> - 2007-04-14 23:02:41
|
On Thu, Apr 12, 2007 at 11:01:44AM +1000, Michael Mansour wrote:
>
>A few months back I bought two new DL380 G4 SAS servers, identical machines in
>every way. Both have two hardware mirrored SAS drives.
>
>On one of them, I can query smartctl 0 and 1 devices, the other I get the
>error above.

>On the other one:
>
># smartctl -a -d cciss,0 /dev/cciss/c0d0
>smartctl version 5.37 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
>Home page is http://smartmontools.sourceforge.net/
>
>Device: HP P600 Version: 1.52
>>> Terminate command early due to bad response to IEC mode page

Please post '-r ioctl,2' output from this server.

--
Sergey Svishchev
|
From: Michael M. <mi...@np...> - 2007-04-14 23:53:31
|
Hi Sergey,

> On Thu, Apr 12, 2007 at 11:01:44AM +1000, Michael Mansour wrote:
> >
> >A few months back I bought two new DL380 G4 SAS servers, identical machines in
> >every way. Both have two hardware mirrored SAS drives.
> >
> >On one of them, I can query smartctl 0 and 1 devices, the other I get the
> >error above.
>
> >On the other one:
> >
> ># smartctl -a -d cciss,0 /dev/cciss/c0d0
> >smartctl version 5.37 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> >Home page is http://smartmontools.sourceforge.net/
> >
> >Device: HP P600 Version: 1.52
> >>> Terminate command early due to bad response to IEC mode page
>
> Please post '-r ioctl,2' output from this server.

On the one that doesn't work:

# smartctl -i -r ioctl,2 -d cciss,0 /dev/cciss/c0d0
smartctl version 5.37 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP P600 Version: 1.52
scsiModePageOffset: response length too short, resp_len=1 offset=4 bd_len=0
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

On the one that works:

# smartctl -i -r ioctl,2 -d cciss,0 /dev/cciss/c0d0
smartctl version 5.37 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP DG146ABAB4 Version: HPD2
Serial number: 3NM01DML000087049405
Device type: disk
Transport protocol: SAS
Local Time is: Sun Apr 15 09:51:29 2007 EST
Device supports SMART and is Enabled
Temperature Warning Enabled

Thanks for helping troubleshoot this one. If there is anything I can assist with, please let me know.

Michael.

> --
> Sergey Svishchev
|
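The failing traces in this thread all report "resp_len=1 offset=4 bd_len=0". As a rough illustration (this is not smartmontools' actual code), the sketch below shows how those three numbers fall out of a 6-byte MODE SENSE reply header: byte 0 is the mode data length, which does not count itself, and byte 3 is the block descriptor length. The function name and both sample buffers are made up for the example.

```python
IEC_PAGE = 0x1C  # Informational Exceptions Control mode page code


def mode_page_offset_6(resp):
    """Return (resp_len, offset, bd_len, usable) for a MODE SENSE(6) reply."""
    resp_len = resp[0] + 1      # byte 0 excludes itself, so add 1
    bd_len = resp[3]            # block descriptor length
    offset = 4 + bd_len         # first mode page follows header + descriptors
    # The page code and page length bytes must both fit inside the reply.
    usable = offset + 2 <= resp_len and offset + 2 <= len(resp)
    return resp_len, offset, bd_len, usable


# Hypothetical reply in which no mode data came back at all.
EMPTY_REPLY = bytes(64)

# Hypothetical well-formed reply: 4-byte header, no block descriptors,
# then a 12-byte IEC page (page code 0x1c, page length 0x0a).
GOOD_REPLY = bytes([0x0F, 0x00, 0x00, 0x00, IEC_PAGE, 0x0A]) + bytes(10)

for name, reply in (("empty", EMPTY_REPLY), ("good", GOOD_REPLY)):
    resp_len, offset, bd_len, usable = mode_page_offset_6(reply)
    print(f"{name}: resp_len={resp_len} offset={offset} bd_len={bd_len} usable={usable}")
```

An all-zero header is only one input consistent with the trace. It suggests that whatever was addressed returned no usable mode page data, which would fit the request reaching something other than a disk, but the sketch cannot say why.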