From: Bruce A. <ba...@gr...> - 2007-04-13 08:57:41
Tom, FYI.

On Thu, 12 Apr 2007, Douglas Gilbert wrote:

> Bruce Allen wrote:
>> FYI
>
> Bruce,
> Everything looks okay in the dump. All SCSI commands
> succeeded so the "SCSI transport failed" messages
> may not be related to smartmontools.
>
> Doug Gilbert
>
>> ---------- Forwarded message ----------
>> Date: Thu, 12 Apr 2007 13:58:14 -0700
>> From: Tom Wiley <to...@hy...>
>> To: Bruce Allen <ba...@gr...>
>> Subject: RE: [smartmontools-support] smartd error on Solaris 2.6
>>
>> Bruce,
>>
>> Thank you for your reply. I started the daemon manually, so the SCSI
>> bus should be settled, the machine hasn't been rebooted in awhile.
>>
>> I've noticed all SEAGATE ST318203LSUN18G Version: 034AT(Qty 8) and one
>> SEAGATE SX318203LC Version: B90C, report:
>>
>> DATA CHANNEL IMPENDING FAILURE DATA ERROR RATE TOO HIGH [asc=5d,
>> ascq=32]
>>
>> I spoke with Seagate and they don't support firmware upgrades on these
>> drives because they're SUN OEM, so I'm planning to replace ASAP.
>>
>> -Tom
>>
>> Here is the output of "-r ioctl,3":
>>
>>
>> wpt17-root # /usr/sbin/smartctl -a -r ioctl,3 /dev/rdsk/c0t0d0s0
>> smartctl version 5.37 [sparc-sun-solaris2.6] Copyright (C) 2002-6 Bruce
>> Allen
>> Home page is http://smartmontools.sourceforge.net/
>>
>> [inquiry: 12 00 00 00 24 00 ] status=0
>> Incoming data, len=36:
>> 00  00 00 03 12 8b 00 01 3e  53 45 41 47 41 54 45 20
>> 10  53 54 33 33 36 37 30 34  4c 53 55 4e 33 36 47 20
>> 20  30 33 32 36
>>
>> Device: SEAGATE ST336704LSUN36G Version: 0326
>> [mode sense: 1a 00 1c 00 40 00 ] status=0
>> Incoming data, len=64:
>> 00  17 00 10 08 04 3d 67 1f  00 00 02 00 9c 0a 10 06
>> 10  00 00 00 00 00 00 00 01  00 00 00 00 00 00 00 00
>> 20  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>> 30  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>>
>> [mode sense: 1a 00 5c 00 40 00 ] status=0
>> Incoming data, len=64:
>> 00  17 00 10 08 04 3d 67 1f  00 00 02 00 9c 0a 9d 0f
>> 10  ff ff ff ff ff ff ff ff  00 00 00 00 00 00 00 00
>> 20  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>> 30  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>>
>> [inquiry: 12 01 80 00 40 00 ] status=0
>> Incoming data, len=64:
>> 00  00 80 00 14 33 43 44 31  48 30 52 32 30 30 30 30
>> 10  37 31 32 39 42 59 4d 4a  4c 53 55 4e 33 36 47 20
>> 20  30 33 32 36 00 00 00 00  00 00 00 00 00 00 00 00
>> 30  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>>
>> Serial number: 3CD1H0R200007129BYMJ
>> Device type: disk
>> [mode sense: 1a 00 19 00 40 00 ]Local Time is: Thu Apr 12 13:48:01 2007
>> PDT
>> [test unit ready: 00 00 00 00 00 00 ] status=0
>> Incoming data, len=0:
>> Device supports SMART and is Enabled
>> Temperature Warning Enabled
>> [log sense: 4d 00 40 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  00 00 00 0b
>>
>> [log sense: 4d 00 40 00 00 00 00 00 10 00 ] status=0
>> Incoming data, len=16:
>> 00  00 00 00 0b 00 02 03 05  06 0d 10 0e 37 3d 3e 00
>>
>> [request sense: 03 00 00 00 12 00 ] status=0
>> Incoming data, len=18:
>> 00  70 00 00 00 d2 7d a6 0a  00 00 00 00 00 00 00 00
>> 10  00 00
>>
>> [log sense: 4d 00 4d 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  0d 00 00 0c
>>
>> [log sense: 4d 00 4d 00 00 00 00 00 10 00 ] status=0
>> Incoming data, len=16:
>> 00  0d 00 00 0c 00 00 00 02  00 28 00 01 00 02 00 41
>>
>> SMART Health Status: OK
>>
>> [log sense: 4d 00 4d 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  0d 00 00 0c
>>
>> [log sense: 4d 00 4d 00 00 00 00 00 10 00 ] status=0
>> Incoming data, len=16:
>> 00  0d 00 00 0c 00 00 00 02  00 28 00 01 00 02 00 41
>>
>> Current Drive Temperature: 40 C
>> Drive Trip Temperature: 65 C
>> [log sense: 4d 00 4e 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  0e 00 00 24
>>
>> [log sense: 4d 00 4e 00 00 00 00 00 28 00 ] status=0
>> Incoming data, len=40:
>> 00  0e 00 00 24 00 01 01 06  32 30 30 31 30 34 00 02
>> 10  01 06 32 30 20 20 20 20  00 03 00 04 00 00 27 10
>> 20  00 04 00 04 00 00 00 ab
>>
>> Manufactured in week 04 of year 2001
>> Recommended maximum start stop count: 10000 times
>> Current start stop count: 171 times
>> [read defect list(10): 37 00 0c 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  00 0c 00 00
>>
>> Elements in grown defect list: 0
>> [log sense: 4d 00 77 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  37 00 00 28
>>
>> [log sense: 4d 00 77 00 00 00 00 00 2c 00 ] status=0
>> Incoming data, len=44:
>> 00  37 00 00 28 00 00 00 04  b5 58 14 2d 00 01 00 04
>> 10  0d 6f 2c 44 00 02 00 04  00 ac 20 e4 00 03 00 04
>> 20  00 6f 78 5c 00 04 00 04  00 00 00 00
>>
>> Vendor (Seagate) cache information
>> Blocks sent to initiator = 3042448429
>> Blocks received from initiator = 225389636
>> Blocks read from cache and sent to initiator = 11280612
>> Number of read and write commands whose size <= segment size = 7305308
>> Number of read and write commands whose size > segment size = 0
>> [log sense: 4d 00 7e 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  3e 00 00 10
>>
>> [log sense: 4d 00 7e 00 00 00 00 00 14 00 ] status=0
>> Incoming data, len=20:
>> 00  3e 00 00 10 00 00 00 04  00 00 e1 5f 00 08 00 04
>> 10  00 00 00 64
>>
>> Vendor (Seagate/Hitachi) factory information
>> number of hours powered up = 961.58
>> number of minutes until next internal SMART test = 100
>> [log sense: 4d 00 43 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  03 00 00 3c
>>
>> [log sense: 4d 00 43 00 00 00 00 00 40 00 ] status=0
>> Incoming data, len=64:
>> 00  03 00 00 3c 00 00 00 04  00 00 30 50 00 01 00 04
>> 10  00 00 00 00 00 02 00 04  00 00 00 00 00 03 00 04
>> 20  00 00 30 50 00 04 00 04  00 00 30 50 00 05 00 08
>> 30  00 00 00 2c 52 6b 4e 00  00 06 00 04 00 00 00 00
>>
>> [log sense: 4d 00 42 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  02 00 00 34
>>
>> [log sense: 4d 00 42 00 00 00 00 00 38 00 ] status=0
>> Incoming data, len=56:
>> 00  02 00 00 34 00 01 00 04  00 00 00 00 00 02 00 04
>> 10  00 00 00 00 00 03 00 04  00 00 00 00 00 04 00 04
>> 20  00 00 00 00 00 05 00 08  00 00 00 1f 81 cb b8 00
>> 30  00 06 00 04 00 00 00 00
>>
>> [log sense: 4d 00 45 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  05 00 00 3c
>>
>> [log sense: 4d 00 45 00 00 00 00 00 40 00 ] status=0
>> Incoming data, len=64:
>> 00  05 00 00 3c 00 00 00 04  00 00 00 00 00 01 00 04
>> 10  00 00 00 00 00 02 00 04  00 00 00 00 00 03 00 04
>> 20  00 00 00 00 00 04 00 04  00 00 00 00 00 05 00 08
>> 30  00 00 00 00 00 00 00 00  00 06 00 04 00 00 00 00
>>
>>
>> Error counter log:
>>            Errors Corrected by           Total   Correction     Gigabytes    Total
>>                ECC          rereads/    errors   algorithm      processed    uncorrected
>>            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
>> read:      12368        0         0     12368      12368        190.361           0
>> write:         0        0         0         0          0        135.322           0
>> [log sense: 4d 00 46 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  06 00 00 08
>>
>> [log sense: 4d 00 46 00 00 00 00 00 0c 00 ] status=0
>> Incoming data, len=12:
>> 00  06 00 00 08 00 00 00 04  00 00 00 0d
>>
>>
>> Non-medium error count: 13
>> [mode sense: 1a 00 0a 00 40 00 ] status=0
>> Incoming data, len=64:
>> 00  17 00 10 08 04 3d 67 1f  00 00 02 00 8a 0a 00 00
>> 10  00 00 00 00 00 00 05 46  00 00 00 00 00 00 00 00
>> 20  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>> 30  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>>
>> [log sense: 4d 00 50 00 00 00 00 00 04 00 ] status=0
>> Incoming data, len=4:
>> 00  10 00 01 90
>>
>> [log sense: 4d 00 50 00 00 00 00 01 94 00 ] status=0
>> Incoming data, len=404 [only first 256 bytes shown]:
>> 00  10 00 01 90 00 01 03 10  20 00 00 00 ff ff ff ff
>> 10  ff ff ff ff 00 00 00 00  00 02 03 10 20 00 00 00
>> 20  ff ff ff ff ff ff ff ff  00 00 00 00 00 03 03 10
>> 30  20 00 00 00 ff ff ff ff  ff ff ff ff 00 00 00 00
>> 40  00 04 03 10 20 00 00 00  ff ff ff ff ff ff ff ff
>> 50  00 00 00 00 00 05 03 10  20 00 00 00 ff ff ff ff
>> 60  ff ff ff ff 00 00 00 00  00 06 03 10 20 00 00 00
>> 70  ff ff ff ff ff ff ff ff  00 00 00 00 00 07 03 10
>> 80  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>> 90  00 08 03 10 00 00 00 00  00 00 00 00 00 00 00 00
>> a0  00 00 00 00 00 09 03 10  00 00 00 00 00 00 00 00
>> b0  00 00 00 00 00 00 00 00  00 0a 03 10 00 00 00 00
>> c0  00 00 00 00 00 00 00 00  00 00 00 00 00 0b 03 10
>> d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>> e0  00 0c 03 10 00 00 00 00  00 00 00 00 00 00 00 00
>> f0  00 00 00 00 00 0d 03 10  00 00 00 00 00 00 00 00
>>
>>
>> SMART Self-test log
>> Num  Test              Status      segment  LifeTime  LBA_first_err [SK ASC ASQ]
>>      Description                   number   (hours)
>> # 1  Background short  Completed      -        0            -       [-  -   -]
>> # 2  Background short  Completed      -        0            -       [-  -   -]
>> # 3  Background short  Completed      -        0            -       [-  -   -]
>> # 4  Background short  Completed      -        0            -       [-  -   -]
>> # 5  Background short  Completed      -        0            -       [-  -   -]
>> # 6  Background short  Completed      -        0            -       [-  -   -]
>>
>> [mode sense: 1a 00 0a 00 40 00 ] status=0
>> Incoming data, len=64:
>> 00  17 00 10 08 04 3d 67 1f  00 00 02 00 8a 0a 00 00
>> 10  00 00 00 00 00 00 05 46  00 00 00 00 00 00 00 00
>> 20  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>> 30  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>>
>> Long (extended) Self Test duration: 1350 seconds [22.5 minutes]
>>
>> -----Original Message-----
>> From: Bruce Allen [mailto:ba...@gr...]
>> Sent: Thursday, April 12, 2007 12:58 PM
>> To: Smartmontools Mailing List
>> Cc: Tom Wiley
>> Subject: Re: [smartmontools-support] smartd error on Solaris 2.6
>>
>> Tom,
>>
>> See thread below. Try adding '-r ioctl,3' to get verbose output showing
>> data exchange details.
>>
>> Cheers,
>> bruce
>>
>>
>> On Thu, 12 Apr 2007, Douglas Gilbert wrote:
>>
>>> Bruce Allen wrote:
>>>> Doug, any thoughts about this one?
>>>>
>>>> Bruce
>>>>
>>>> ---------- Forwarded message ----------
>>>> Date: Tue, 10 Apr 2007 12:18:42 -0700
>>>> From: Tom Wiley <to...@hy...>
>>>> To: sma...@li...
>>>> Subject: [smartmontools-support] smartd error on Solaris 2.6
>>>>
>>>> Bruce,
>>>>
>>>> I wanted to give you some feedback about running smartmontools on my
>>>> Solaris 2.6 box. "smartd" displays the following error on startup in
>>>> /var/adm/messages:
>>>>
>>>> Apr 10 11:54:09 wpt17 unix: WARNING: /pci@1f,4000/scsi@3/sd@0,0 (sd0):
>>>> Apr 10 11:54:09 wpt17 unix: SCSI transport failed: reason
>>>> 'unexpected_bus_free': giving up
>>>> Apr 10 11:54:09 wpt17 unix:
>>>> Apr 10 11:54:09 wpt17 unix: WARNING: /pci@1f,4000/scsi@3/sd@0,0 (sd0):
>>>> Apr 10 11:54:09 wpt17 unix: SCSI transport failed: reason
>>>> 'unexpected_bus_free': giving up
>>>> Apr 10 11:54:09 wpt17 unix:
>>>>
>>>> It shows it's running, so I don't know if there is anything else I
>>>> should try:
>>>>
>>>> # ps -ef |grep smartd
>>>> root 13416 1 0 12:10:09 ? 0:00 /usr/sbin/smartd -p
>>>> /var/run/smartd.pid
>>>>
>>>> Thank you,
>>>>
>>>> Thomas Wiley
>>>> Test Equipment Engineer
>>>> Hynix Semiconductor Manufacturing America
>>>> (541) 338-5349
>>>
>>> Bruce,
>>> Perhaps:
>>> a) smartd is starting too early after machine boot up
>>> before the SCSI parallel bus had settled, or
>>> b) some SCSI command that smartd has sent has caused
>>> a bus reset (or it was badly formed)
>>>
>>> As always, '-r ioctl,3' may shed some light.
>>>
>>> Doug Gilbert
>>>
>>
>
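For long '-r ioctl,3' dumps like the one quoted in this thread, the per-command status check can be automated. What follows is only a rough Python sketch, not part of smartmontools. It assumes the line format seen in the dump ("[name: cdb bytes ] status=N"), and the names CMD_RE, scan_commands, suspicious and SAMPLE are made up for illustration. It lists every command it finds and flags those whose status is nonzero or was not printed.

```python
import re

# Assumed format, copied from the dump in this thread:
#   [inquiry: 12 00 00 00 24 00 ] status=0
# The status part is optional because one command in the dump is
# followed by other output rather than by a status.
CMD_RE = re.compile(
    r"\[([a-z][a-z0-9() ]*):\s*((?:[0-9a-f]{2}\s+)+)\](?:\s*status=(-?\d+))?"
)


def scan_commands(text):
    """Return (name, cdb_bytes, status) tuples; status is an int or None."""
    commands = []
    for match in CMD_RE.finditer(text):
        name, cdb_hex, status = match.groups()
        cdb = bytes.fromhex("".join(cdb_hex.split()))
        commands.append((name, cdb, None if status is None else int(status)))
    return commands


def suspicious(commands):
    """Commands whose status is nonzero or was not printed at all."""
    return [cmd for cmd in commands if cmd[2] != 0]


# Three lines taken from the dump above, used as a small self-check.
SAMPLE = """\
[inquiry: 12 00 00 00 24 00 ] status=0
[mode sense: 1a 00 19 00 40 00 ]Local Time is: Thu Apr 12 13:48:01 2007
[test unit ready: 00 00 00 00 00 00 ] status=0
"""

found = scan_commands(SAMPLE)
print(len(found), "commands found")
for name, cdb, status in suspicious(found):
    cdb_text = " ".join(f"{byte:02x}" for byte in cdb)
    print(f"check: {name} [{cdb_text}] status={status}")
```

A missing status here only means that none was printed next to the command in the captured text; by itself it does not show that the command failed. To scan a full capture, pass the saved smartctl output to scan_commands() in place of SAMPLE.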