It all started with a single email.


From root@hogehoge.localdomain Sat Mar 26 08:15:55 2011
Envelope-to: root@hogehoge.localdomain
Delivery-date: Sat, 26 Mar 2011 08:15:55 +0900
To: root@hogehoge.localdomain
Subject: SMART error (FailedReadSmartSelfTestLog) detected on host: hogehoge
From: root <root@hogehoge.localdomain>
Date: Sat, 26 Mar 2011 08:13:34 +0900
 
This email was generated by the smartd daemon running on:
 
   host name: hogehoge
  DNS domain: localdomain
  NIS domain: (none)
 
The following warning/error was logged by the smartd daemon:
 
Device: /dev/sda, Read SMART Self-Test Log Failed
 
For details see host's SYSLOG (default: /var/log/syslog).
 
You can also use the smartctl utility for further investigation.
No additional email messages about this problem will be sent.
So I looked at the syslog.

Mar 26 08:13:07 hogehoge smartd[2830]: Device: /dev/sda, not capable of SMART self-check
Mar 26 08:13:07 hogehoge kernel: [5394267.093395] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 26 08:13:07 hogehoge kernel: [5394267.093415] ata1.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0
Mar 26 08:13:07 hogehoge kernel: [5394267.093417]          res 40/00:00:68:fe:ee/24:00:22:00:00/40 Emask 0x4 (timeout)
Mar 26 08:13:07 hogehoge kernel: [5394267.093434] ata1.00: status: { DRDY }
Mar 26 08:13:07 hogehoge kernel: [5394267.093445] ata1: hard resetting link
Mar 26 08:13:07 hogehoge kernel: [5394268.594916] ata1: softreset failed (device not ready)
Mar 26 08:13:07 hogehoge kernel: [5394268.594931] ata1: failed due to HW bug, retry pmp=0
Mar 26 08:13:07 hogehoge kernel: [5394268.762664] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Mar 26 08:13:07 hogehoge kernel: [5394271.097305] ata1.00: SB600 AHCI: limiting to 255 sectors per cmd
Mar 26 08:13:07 hogehoge kernel: [5394271.189184] ata1.00: SB600 AHCI: limiting to 255 sectors per cmd
Mar 26 08:13:07 hogehoge kernel: [5394271.189198] ata1.00: configured for UDMA/133
Mar 26 08:13:07 hogehoge kernel: [5394271.189222] ata1: EH complete
Mar 26 08:13:07 hogehoge kernel: [5394271.189412] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Mar 26 08:13:07 hogehoge kernel: [5394271.189445] sd 0:0:0:0: [sda] Write Protect is off
Mar 26 08:13:07 hogehoge kernel: [5394271.189454] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Mar 26 08:13:07 hogehoge kernel: [5394271.189493] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 26 08:13:14 hogehoge kernel: [5394278.396056] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 26 08:13:14 hogehoge kernel: [5394278.396076] ata1.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
Mar 26 08:13:14 hogehoge kernel: [5394278.396078]          res 40/00:00:68:fe:ee/24:00:22:00:00/40 Emask 0x4 (timeout)
Mar 26 08:13:14 hogehoge kernel: [5394278.398909] ata1.00: status: { DRDY }
Mar 26 08:13:14 hogehoge kernel: [5394278.398920] ata1: hard resetting link
Mar 26 08:13:15 hogehoge kernel: [5394279.647171] ata1: softreset failed (device not ready)
Mar 26 08:13:15 hogehoge kernel: [5394279.647183] ata1: failed due to HW bug, retry pmp=0
Mar 26 08:13:15 hogehoge kernel: [5394279.812178] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Mar 26 08:13:21 hogehoge kernel: [5394286.112247] ata1.00: SB600 AHCI: limiting to 255 sectors per cmd
Mar 26 08:13:21 hogehoge kernel: [5394286.155355] ata1.00: SB600 AHCI: limiting to 255 sectors per cmd
Mar 26 08:13:21 hogehoge kernel: [5394286.155355] ata1.00: configured for UDMA/133
Mar 26 08:13:21 hogehoge kernel: [5394286.155355] ata1: EH complete
Mar 26 08:13:21 hogehoge kernel: [5394286.155355] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Mar 26 08:13:21 hogehoge kernel: [5394286.155355] sd 0:0:0:0: [sda] Write Protect is off
Mar 26 08:13:21 hogehoge kernel: [5394286.155355] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Mar 26 08:13:21 hogehoge kernel: [5394286.155355] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 26 08:13:21 hogehoge smartd[2830]: Device: /dev/sda, failed to read SMART Attribute Data
Mar 26 08:13:28 hogehoge kernel: [5394293.897837] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 26 08:13:28 hogehoge kernel: [5394293.897857] ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
Mar 26 08:13:28 hogehoge kernel: [5394293.897859]          res 40/00:00:68:fe:ee/24:00:22:00:00/40 Emask 0x4 (timeout)
Mar 26 08:13:28 hogehoge kernel: [5394293.897880] ata1.00: status: { DRDY }
Mar 26 08:13:28 hogehoge kernel: [5394293.897891] ata1: hard resetting link
Mar 26 08:13:29 hogehoge kernel: [5394295.389831] ata1: softreset failed (device not ready)
Mar 26 08:13:29 hogehoge kernel: [5394295.389843] ata1: failed due to HW bug, retry pmp=0
Mar 26 08:13:30 hogehoge kernel: [5394295.554677] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Mar 26 08:13:34 hogehoge kernel: [5394300.137817] ata1.00: SB600 AHCI: limiting to 255 sectors per cmd
Mar 26 08:13:34 hogehoge smartd[2830]: Device: /dev/sda, Read SMART Self Test Log Failed
Mar 26 08:13:34 hogehoge smartd[2830]: Sending warning via /usr/share/smartmontools/smartd-runner to root ...
Mar 26 08:13:34 hogehoge kernel: [5394300.194817] ata1.00: SB600 AHCI: limiting to 255 sectors per cmd
Mar 26 08:13:34 hogehoge kernel: [5394300.194817] ata1.00: configured for UDMA/133
Mar 26 08:13:34 hogehoge kernel: [5394300.194817] ata1: EH complete
Mar 26 08:13:34 hogehoge kernel: [5394300.194817] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
Mar 26 08:13:34 hogehoge kernel: [5394300.194817] sd 0:0:0:0: [sda] Write Protect is off
Mar 26 08:13:34 hogehoge kernel: [5394300.194817] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Mar 26 08:13:34 hogehoge kernel: [5394300.194817] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
I have no idea what any of this means...
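
One way to get a foothold is to decode the `cmd` lines. libata prints them as command/feature:nsect:lbal:lbam:lbah, and ATA command 0xb0 is SMART, with the subcommand in the feature register. Below is a minimal Python sketch under those assumptions. The function name `decode_ata_cmd` and the table are my own, not part of the kernel or of smartmontools.

```python
import re

# SMART subcommands carried in the feature register of ATA command 0xB0.
SMART_FEATURES = {
    0xD0: "SMART READ DATA",
    0xD1: "SMART READ ATTRIBUTE THRESHOLDS",
    0xD4: "SMART EXECUTE OFF-LINE IMMEDIATE",
    0xD5: "SMART READ LOG",
    0xD6: "SMART WRITE LOG",
    0xD8: "SMART ENABLE OPERATIONS",
    0xD9: "SMART DISABLE OPERATIONS",
    0xDA: "SMART RETURN STATUS",
}

# libata prints: cmd <command>/<feature>:<nsect>:<lbal>:<lbam>:<lbah>/...
CMD_RE = re.compile(
    r"cmd ([0-9a-f]{2})/([0-9a-f]{2}):([0-9a-f]{2})"
    r":([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2})/"
)


def decode_ata_cmd(line):
    """Return a readable name for the ATA command in a libata 'cmd' line.

    Returns None when the line contains no 'cmd' field.
    """
    m = CMD_RE.search(line)
    if m is None:
        return None
    command, feature, _nsect, lbal, _lbam, _lbah = (
        int(x, 16) for x in m.groups()
    )
    if command != 0xB0:
        return "non-SMART command 0x%02x" % command
    name = SMART_FEATURES.get(feature, "unknown SMART feature 0x%02x" % feature)
    # For SMART READ LOG, the low LBA byte holds the log address.
    if feature == 0xD5 and lbal == 0x06:
        name += " (log address 0x06: self-test log)"
    return name


if __name__ == "__main__":
    for sample in (
        "ata1.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0",
        "ata1.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in",
        "ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in",
    ):
        print(decode_ata_cmd(sample))
```

Read this way, the three timeouts appear to line up with the three smartd messages in the log: the status check, the attribute read, and the self-test log read. Every one of them is a SMART command that the drive did not answer in time.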
There's two possibilities:

(1) disk working normally, but not capable of self-test. In this case
remove the scheduled self-test directives from /etc/smartd.conf

(2) disk failed or failing, so no longer responding correctly to smartd
scheduled self test.

Re: what does "not capable of SMART self-check" mean? - msg#00049 - linux.utilities.smartmontools
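
For possibility (1), the first step is to find which lines in /etc/smartd.conf actually schedule self-tests. In smartd.conf that is the `-s` directive (lowercase; `-S` is attribute autosave). A rough Python sketch follows. The helper name and the sample configuration are hypothetical, and backslash line continuations are not handled.

```python
def lines_with_selftest_schedule(conf_text):
    """Return (line_number, line) pairs whose directives include -s."""
    hits = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        tokens = line.split()
        # tokens[0] is the device (or DEVICESCAN); the rest are directives.
        if "-s" in tokens[1:]:
            hits.append((number, line))
    return hits


# Hypothetical smartd.conf contents, for illustration only.
SAMPLE = """\
# monitor two disks
/dev/sda -a -o on -S on -s (S/../.././02|L/../../6/03) -m root
/dev/sdb -a -m root
"""

if __name__ == "__main__":
    for number, line in lines_with_selftest_schedule(SAMPLE):
        print("line %d schedules self-tests: %s" % (number, line))
```

Against the real file, read /etc/smartd.conf into a string and pass it in. Whether removing the directive is the right fix still depends on which of the two possibilities applies.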

Ref.
-2011/01/31: "ata2.00:exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen" occurred - Notes on debian-etch (later upgraded to lenny)
-2010/04/03: "ata2.00:exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen" occurred - Notes on debian-etch (later upgraded to lenny)

tags: smart linux

Posted by NI-Lab. (@nilab)