Hello and Happy New Year!
I'm starting the year with a bang! As a greeting card I got an email from smartctl saying that one of my disks has errors.
Here is the output of:
[root@srv1 ~]# smartctl -l error -d ata /dev/sdd
smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 1282 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 1282 occurred at disk power-on lifetime: 6980 hours (290 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 64 4a 67 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 5f 4a 67 40 00 6d+02:38:32.824 [RESERVED FOR SERIAL ATA]
27 00 00 00 00 00 e0 00 6d+02:38:32.824 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 6d+02:38:32.823 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 6d+02:38:32.823 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 6d+02:38:32.822 READ NATIVE MAX ADDRESS EXT
Error 1281 occurred at disk power-on lifetime: 6980 hours (290 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 64 4a 67 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 5f 4a 67 40 00 6d+02:38:30.977 [RESERVED FOR SERIAL ATA]
27 00 00 00 00 00 e0 00 6d+02:38:30.977 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 6d+02:38:30.976 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 6d+02:38:30.976 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 6d+02:38:30.975 READ NATIVE MAX ADDRESS EXT
Error 1280 occurred at disk power-on lifetime: 6980 hours (290 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 64 4a 67 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 5f 4a 67 40 00 6d+02:38:29.093 [RESERVED FOR SERIAL ATA]
27 00 00 00 00 00 e0 00 6d+02:38:29.092 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 6d+02:38:29.092 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 6d+02:38:29.091 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 6d+02:38:29.091 READ NATIVE MAX ADDRESS EXT
Error 1279 occurred at disk power-on lifetime: 6980 hours (290 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 64 4a 67 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 5f 4a 67 40 00 6d+02:38:27.219 [RESERVED FOR SERIAL ATA]
27 00 00 00 00 00 e0 00 6d+02:38:27.218 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 6d+02:38:27.218 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 6d+02:38:27.217 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 6d+02:38:27.217 READ NATIVE MAX ADDRESS EXT
Error 1278 occurred at disk power-on lifetime: 6980 hours (290 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 64 4a 67 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 5f 4a 67 40 00 6d+02:38:25.345 [RESERVED FOR SERIAL ATA]
27 00 00 00 00 00 e0 00 6d+02:38:25.344 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 6d+02:38:25.343 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 6d+02:38:25.343 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 6d+02:38:25.343 READ NATIVE MAX ADDRESS EXT
[root@srv1 ~]#
And here is the output of:
[root@srv1 ~]# smartctl -l selftest -d ata /dev/sdd
smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure  90%        6980             6769252
# 2  Extended offline    Completed: read failure  90%        6975             6769252
# 3  Short offline       Completed: read failure  90%        6975             6769252
# 4  Short offline       Completed: read failure  90%        6973             6769252
# 5  Short offline       Completed: read failure  90%        6961             6769252
# 6  Extended offline    Completed: read failure  90%        6958             6769252
# 7  Short offline       Completed: read failure  90%        6958             6769252
# 8  Short offline       Completed: read failure  90%        6958             6769252
# 9  Short offline       Completed: read failure  90%        6951             6769252
#10  Short offline       Completed: read failure  90%        6950             6769252
#11  Short offline       Completed without error  00%        6570             -
[root@srv1 ~]#
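The two logs can be cross-checked against each other. The sketch below combines the SN, CL, CH and DH values from the error log into an LBA and decodes the ER byte. It assumes the standard ATA 28-bit register layout and the usual meaning of the error-register bits; the function names are made up for illustration, and since the failing command (0x60) is a 48-bit one, it also assumes the upper address bytes that the summary log does not show are zero.

```python
# Register values after the failed command, copied from the error log (hex).
ER, ST, SC, SN, CL, CH, DH = 0x40, 0x51, 0x00, 0x64, 0x4A, 0x67, 0x00


def lba_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    Assumed layout: SN = bits 0-7, CL = bits 8-15, CH = bits 16-23,
    low nibble of DH = bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def decode_error_register(er):
    """Return the names of the error bits that are set (subset of ATA bits)."""
    bits = {
        0x80: "ICRC",  # interface CRC error
        0x40: "UNC",   # uncorrectable data error
        0x10: "IDNF",  # ID not found
        0x04: "ABRT",  # command aborted
    }
    return [name for mask, name in bits.items() if er & mask]


lba = lba_from_registers(SN, CL, CH, DH)
flags = decode_error_register(ER)
print(lba, flags)
```

With these values the result is the same sector that every failed self-test reports in LBA_of_first_error, which suggests a single unreadable sector rather than scattered damage.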
I have been trying to fix the problem for a day and a half now. I found some interesting material that matches my case:
http://lists.linuxcoding.com/rhl/2005/msg38273.html,
http://www.phwinfo.com/forum/linux-debian-user/329395-smart-error-offlineuncorrectablesector-detected-host.html and
http://smartmontools.sourceforge.net/badblockhowto.html. I followed the last one and tried to write zeros to the disk. The write went through, but apparently I did not calculate the location correctly. I also tried formatting the disk in question and recreating the file system on the LVM partition. I am out of ideas.
I would like to know two things:
1. Does this disk need to be replaced (it, like the others, is about 6 months old)?
2. Is there a way to repair it?
The disk is partitioned as Linux LVM, and the logical volume in question is ext3.
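For reference, the location arithmetic from the Bad block HOWTO can be sketched roughly as below. Only the LBA 6769252 comes from the logs above; the partition start, the LVM data offset and the 4096-byte ext3 block size are hypothetical placeholders, and the sketch assumes the logical volume is one contiguous segment starting at the first physical extent. The real values would have to be read from tools such as `fdisk -lu`, `pvs -o +pe_start`, `lvdisplay --maps` and `tune2fs -l`.

```python
SECTOR_SIZE = 512  # bytes per sector, as assumed in the HOWTO


def lba_to_fs_block(lba, data_start_sector, fs_block_size=4096):
    """Map a disk LBA to a file system block number.

    data_start_sector is the disk sector where the file system begins.
    With LVM this is the partition start plus the offset of the first
    physical extent, plus the start of the logical volume inside the PV.
    """
    offset_sectors = lba - data_start_sector
    if offset_sectors < 0:
        raise ValueError("LBA lies before the start of the file system")
    return (offset_sectors * SECTOR_SIZE) // fs_block_size


# HYPOTHETICAL layout, for illustration only:
partition_start = 63   # first sector of the LVM partition
pe_start = 384         # sectors from partition start to the first extent
bad_lba = 6769252      # LBA_of_first_error from the self-test log

block = lba_to_fs_block(bad_lba, partition_start + pe_start)
print(block)
```

If any of the offsets is off by even a few sectors, the zeros land in a neighbouring block and the unreadable sector is never rewritten, which would match the self-tests still failing at the same LBA.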
Thanks!