I have installed and configured smartd to run a short SMART self-test daily and a long SMART self-test weekly, and to notify me of errors:
$ ls /dev/sd*
/dev/sda
/dev/sda1
/dev/sda2
/dev/sdb
/dev/sdb1
/dev/sdb2
/dev/sdb3
/dev/sdb4
/dev/sdb5
/dev/sdb6
/dev/sdb7
$ cat /etc/smartd.conf
/dev/sda -m root -M diminishing -s (S/../.././23|L/../../1/02)
/dev/sdb -m root -M diminishing -s (S/../.././23|L/../../1/02)
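For reference, this is how I read the -s schedule above. As far as I understand smartd.conf(5), smartd builds a string of the form T/MM/DD/d/HH (test type, month, day of month, day of week with 1 = Monday, hour) and checks it against the regex. The sketch below mimics that check with grep. It assumes the whole string must match, and `matches_schedule` is a helper name I made up for illustration, not part of smartmontools.

```shell
#!/bin/sh
# The -s regex copied from /etc/smartd.conf above.
schedule='(S/../.././23|L/../../1/02)'

# matches_schedule is a hypothetical helper, not part of smartmontools.
# Arguments: test type (S or L), month, day of month,
# day of week (1 = Monday ... 7 = Sunday), hour.
# Assumption: smartd requires the regex to match the whole string (grep -x).
matches_schedule() {
    printf '%s/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4" "$5" | grep -Eqx "$schedule"
}

# Thursday 7 June, 23:00: the short test is due.
matches_schedule S 06 07 4 23 && echo "short test due"
# Monday 4 June, 02:00: the long test is due.
matches_schedule L 06 04 1 02 && echo "long test due"
# Tuesday 5 June, 02:00: no long test on this day.
matches_schedule L 06 05 2 02 || echo "no long test"
```

If that reading is right, the short test runs every day at 23:00 and the long test every Monday at 02:00, which is what I intended.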
# systemctl status smartd
● smartd.service - Self Monitoring and Reporting Technology (SMART) Daemon
Loaded: loaded (/lib/systemd/system/smartd.service; enabled; vendor preset: enab
Active: active (running) since Tue 2018-06-05 22:46:00 CEST; 1 day 19h ago
Docs: man:smartd(8)
man:smartd.conf(5)
Main PID: 691 (smartd)
Tasks: 1 (limit: 4915)
CGroup: /system.slice/smartd.service
`-691 /usr/sbin/smartd -n --interval=7200
Jun 07 00:46:01 inspiron smartd[691]: Sending warning via <mail> to root ...
Jun 07 00:46:01 inspiron smartd[691]: Warning via <mail> to root: successful
Jun 07 00:46:01 inspiron smartd[691]: Device: /dev/sdb [SAT], previous self-test co
Jun 07 02:46:01 inspiron smartd[691]: Device: /dev/sda [SAT], SMART Usage Attribute
Jun 07 04:46:01 inspiron smartd[691]: Device: /dev/sda [SAT], SMART Usage Attribute
Jun 07 06:46:01 inspiron smartd[691]: Device: /dev/sda [SAT], SMART Usage Attribute
Jun 07 08:46:01 inspiron smartd[691]: Device: /dev/sda [SAT], SMART Usage Attribute
Jun 07 10:46:04 inspiron smartd[691]: Device: /dev/sda [SAT], SMART Usage Attribute
Jun 07 13:24:29 inspiron smartd[691]: Device: /dev/sda [SAT], SMART Usage Attribute
Jun 07 14:46:01 inspiron smartd[691]: Device: /dev/sda [SAT], SMART Usage Attribute
However, every few days I now receive emails reporting that the "Self-Test Log error count increased from 0 to" 1, 2, or 3 on /dev/sda.
$ mutt
1 Jun 07 root@inspiron (0.5K) SMART error (SelfTest) detected on host: inspiron
6 Jun 03 root@inspiron (0.5K) SMART error (SelfTest) detected on host: inspiron
7 May 31 root@inspiron (0.5K) SMART error (SelfTest) detected on host: inspiron
16 May 28 root@inspiron (0.5K) SMART error (SelfTest) detected on host: inspiron
---NeoMutt: =localhost [Msgs:8/98 Inc:2 147K]---(reverse-threads/last-date-received)-(50%)---
Date: Thu, 07 Jun 2018 00:46:01 +0200
From: root@inspiron
To: root@inspiron
Subject: SMART error (SelfTest) detected on host: inspiron
X-Mailer: mail (GNU Mailutils 3.1.1)
This message was generated by the smartd daemon running on:
host name: inspiron
DNS domain: [Empty]
The following warning/error was logged by the smartd daemon:
Device: /dev/sda [SAT], Self-Test Log error count increased from 0 to 2
Device info:
Crucial_CT275MX300SSD4, S/N:163613DAF037, WWN:5-00a075-113daf037, FW:M0CR021, 275 GB
For details see host's SYSLOG.
You can also use the smartctl utility for further investigation.
Another message will be sent in 24 hours if the problem persists.
The emails tell me to "see host's SYSLOG", but the syslog contains nothing of interest, and smartctl reports nothing unusual:
# grep smart </var/log/syslog
Jun 7 00:46:01 inspiron smartd[691]: Device: /dev/sda [SAT], previous self-test completed without error
Jun 7 00:46:01 inspiron smartd[691]: Device: /dev/sda [SAT], Self-Test Log error count increased from 0 to 2
Jun 7 00:46:01 inspiron smartd[691]: Sending warning via <mail> to root ...
Jun 7 00:46:01 inspiron smartd[691]: Warning via <mail> to root: successful
Jun 7 00:46:01 inspiron smartd[691]: Device: /dev/sdb [SAT], previous self-test completed without error
Jun 7 02:46:01 inspiron smartd[691]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 56 to 57
# smartctl -a /dev/sda | sed -n '/Self-test execution status/,/been run/p'
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
How can I find out more about what is bothering smartd?