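This is mainly a venting diary, full of poison, where I blow off steam about the day's events. I am in a pretty bad way and I ramble. Please read on only if you can tolerate that. You are in principle free to link to this diary, but please do not hotlink the photos. I make no guarantees whatsoever about the content.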





2015-08-26 (Wed) [Rain]

[Server][FreeBSD] Woken up by a server beeping non-stop (sob). An HDD had died, so I replaced it

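I woke up in the night needing the toilet, and one of the servers was crying with an incredibly loud beep-beep-beep.
It is no exaggeration to say that the beeping is what actually woke me up.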

I went through the servers one by one to isolate which one was crying, and found it.

Incidentally, the status indicators, that is, the LEDs, showed nothing wrong.

Because of the beeping I first suspected the UPS, but when I checked, it was the HP DL320.
Right, this is the box that, for various reasons, uses an LSI Logic RAID card rather than a SmartArray.

The OS is FreeBSD. When I looked into it, the log showed...

Aug 25 23:53:21 backup1 mfi0: 141334 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 13 da 00 00 26 00
Aug 25 23:53:21 backup1 mfi0: 141335 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 14 00 00 02 76 00
Aug 25 23:53:21 backup1 mfi0: 141336 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 17 44 9e 00 02 16 00
Aug 25 23:53:21 backup1 mfi0: 141337 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 17 3a 3f 00 00 6d 00
Aug 25 23:53:21 backup1 mfi0: 141338 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 28 6b 00 00 a5 00
Aug 25 23:53:21 backup1 mfi0: 141339 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 14 d5 95 00 00 b0 00
Aug 25 23:53:21 backup1 mfi0: 141340 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 6f c0 5d 00 00 c1 00
Aug 25 23:53:21 backup1 mfi0: 141341 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 17 34 9b 00 00 ad 00
Aug 25 23:53:21 backup1 mfi0: 141342 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 6f c9 c6 00 00 a6 00
Aug 25 23:53:21 backup1 mfi0: 141343 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 17 28 96 00 01 00 00
Aug 25 23:53:21 backup1 mfi0: 141344 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 6f d0 ba 00 00 cb 00
Aug 25 23:53:21 backup1 mfi0: 141345 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 6f b8 03 00 03 fd 00
Aug 25 23:53:21 backup1 mfi0: 141346 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 16 b9 eb 00 01 00 00
Aug 25 23:53:21 backup1 mfi0: 141347 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 0a 2b 00 01 90 00
Aug 25 23:53:21 backup1 mfi0: 141348 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 6f d8 01 00 03 ad 00
Aug 25 23:53:22 backup1 mfi0: 141349 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 20 67 00 00 bb 00
Aug 25 23:53:22 backup1 mfi0: 141350 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 6f d6 fe 00 00 c3 00
Aug 25 23:53:22 backup1 mfi0: 141351 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 17 49 60 00 02 a0 00
Aug 25 23:53:22 backup1 mfi0: 141352 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 17 40 d9 00 00 e3 00
Aug 25 23:53:22 backup1 mfi0: 141353 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 04 00 00 03 e5 00
Aug 25 23:53:22 backup1 mfi0: 141354 (493860690s/0x0002/WARN) - PD 00(e0xfc/s0) Path 1221000000000000  reset (Type 03)
Aug 25 23:53:22 backup1 mfi0: 141355 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d5 14 17 74 00 00 60 00
Aug 25 23:53:22 backup1 mfi0: 141356 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 2a 00 d4 17 3c 00 00 04 00 00
Aug 25 23:53:22 backup1 mfi0: 141357 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 2a 00 d4 70 19 91 00 00 de 00
Aug 25 23:53:22 backup1 mfi0: 141358 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 50 00 00 03 e0 00
Aug 25 23:53:22 backup1 mfi0: 141359 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 63 d7 00 00 29 00
Aug 25 23:53:22 backup1 mfi0: 141360 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d3 e6 d2 6e 00 00 d4 00
Aug 25 23:53:22 backup1 mfi0: 141361 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 2a 00 d4 70 1e 2e 00 00 bb 00
Aug 25 23:54:44 backup1 mfi0: 141362 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 2a 00 d4 70 38 1d 00 00 9b 00
Aug 25 23:54:44 backup1 mfi0: 141363 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 64 00 00 00 b0 00
Aug 25 23:54:44 backup1 mfi0: 141364 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d5 14 24 00 00 03 f7 00
Aug 25 23:54:44 backup1 mfi0: 141365 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 48 e5 00 02 d4 00
Aug 25 23:54:44 backup1 mfi0: 141366 (493860691s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 58 00 00 04 00 00
Aug 25 23:54:44 backup1 mfi0: 141367 (493860702s/0x0010/CRIT) - SAS topology error: Unaddressable device
Aug 25 23:54:44 backup1 mfi0: 141368 (493860703s/0x0010/CRIT) - SAS topology error: Unaddressable device
Aug 25 23:54:44 backup1 mfi0: 141369 (493860703s/0x0010/CRIT) - SAS topology error: Unaddressable device
Aug 25 23:54:44 backup1 swap_pager: indefinite wait buffer: bufobj: 0, blkno: 35569, size: 12288
Aug 25 23:54:44 backup1 mfi0: 141370 (493860716s/0x0002/WARN) - PD 110(e0x00/s16) Path 1221000000000000  reset (Type 03)
Aug 25 23:54:44 backup1 swap_pager: indefinite wait buffer: bufobj: 0, blkno: 34801, size: 20480
Aug 25 23:54:44 backup1 mfi0: 141371 (493860730s/0x0002/WARN) - PD 110(e0x00/s16) Path 1221000000000000  reset (Type 03)
Aug 25 23:54:44 backup1 swap_pager: indefinite wait buffer: bufobj: 0, blkno: 35569, size: 12288

It looks like an HDD failure.
In short, the array has gone degraded.
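The `CDB:` field in the timeout lines above is the raw SCSI command that was outstanding when the drive stopped answering. Opcode `28` is READ(10) and `2a` is WRITE(10), both 10-byte commands with a big-endian 4-byte LBA and a 2-byte block count. A minimal sketch for decoding them follows; the function name `decode_cdb10` is made up for illustration.

```python
import struct

# Opcodes that appear in the log; both are standard 10-byte SCSI commands.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_cdb10(hex_bytes: str) -> dict:
    """Decode a 10-byte CDB given as space-separated hex, as printed by mfi(4)."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) != 10:
        raise ValueError("expected a 10-byte CDB")
    # Layout: opcode, flags, 4-byte LBA, group number, 2-byte length, control.
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", raw)
    return {"op": OPCODES.get(opcode, hex(opcode)), "lba": lba, "blocks": blocks}


# Two CDBs copied from the log lines above.
first_read = decode_cdb10("28 00 d4 70 13 da 00 00 26 00")
first_write = decode_cdb10("2a 00 d4 17 3c 00 00 04 00 00")
print(first_read)
print(first_write)
```

Decoded this way, the timeouts are a mix of reads and writes, all at LBAs beginning with 0xd3 to 0xd5, which is consistent with the drive itself hanging rather than one bad sector.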

Aug 25 23:54:44 backup1 swap_pager: indefinite wait buffer: bufobj: 0, blkno: 35569, size: 12288
Aug 25 23:54:44 backup1 mfi0: 141374 (493860772s/0x0002/WARN) - PD 110(e0x00/s16) Path 1221000000000000  reset (Type 03)
Aug 25 23:54:44 backup1 mfi0: 141375 (493860773s/0x0002/WARN) - Removed: PD 00(e0xfc/s0)
Aug 25 23:54:46 backup1 mfi0: 141376 (493860773s/0x0002/info) - Removed: PD 00(e0xfc/s0) Info: enclPd=fc, scsiType=0, portMap=00, sasAddr=1221000000000000,0pass0 at mfi0 bus 0 scbus0 target 0 lun 0
Aug 25 23:54:46 backup1 pass0: <ATA ST2000DM001-1CH1 CC43> s/n             Z1E0ZY4H detached
Aug 25 23:54:46 backup1 000000000000000
Aug 25 23:54:46 backup1 (pass0:mfi0:0:0:0): Periph destroyed
Aug 25 23:54:46 backup1 mfi0: 141377 (493860773s/0x0002/info) - State change on PD 00(e0xfc/s0) from ONLINE(18) to FAILED(11)
Aug 25 23:54:46 backup1 mfi0: 141378 (493860773s/0x0001/info) - State change on VD 00/0 from OPTIMAL(3) to DEGRADED(2)
Aug 25 23:54:46 backup1 mfi0: 141379 (493860773s/0x0001/CRIT) - VD 00/0 is now DEGRADED
Aug 25 23:54:46 backup1 mfi0: 141380 (493860773s/0x0001/info) - State change on VD 01/1 from OPTIMAL(3) to DEGRADED(2)
Aug 25 23:54:46 backup1 mfi0: 141381 (493860773s/0x0001/CRIT) - VD 01/1 is now DEGRADED
Aug 25 23:54:46 backup1 mfi0: 141382 (493860773s/0x0002/info) - State change on PD 00(e0xfc/s0) from FAILED(11) to UNCONFIGURED_BAD(1)

So I swapped in a spare HDD, and that was that.
The timeline up to the replacement looked like this.

Aug 26 00:20:33 backup1 mfi0: 141383 (493862322s/0x0010/CRIT) - SAS topology error: Unaddressable device
Aug 26 00:28:37 backup1 mfi0: 141384 (493862805s/0x0002/info) - Inserted: PD 00(e0xfc/s0)
Aug 26 00:28:38 backup1 mfi0: 141385 (493862805s/0x0002/info) - Inserted: PD 00(e0xfc/s0) Info: enclPd=fc, scsiType=0, portMap=00, sasAddr=1221000000000000,0000000000000000
Aug 26 00:28:38 backup1 mfi0: 141386 (493862806s/0x0002/info) - State change on PD 00(e0xfc/s0) from UNCONFIGURED_BAD(1) to UNCONFIGURED_GOOD(0)
Aug 26 00:28:38 backup1 mfi0: 141387 (493862806s/0x0002/info) - State change on PD 00(e0xfc/s0) from UNCONFIGURED_GOOD(0) to OFFLINE(10)
Aug 26 00:28:38 backup1 mfi0: 141388 (493862806s/0x0002/info) - Rebuild automatically started on PD 00(e0xfc/s0)
Aug 26 00:28:38 backup1 mfi0: 141389 (493862806s/0x0002/info) - State change on PD 00(e0xfc/s0) from OFFLINE(10) to REBUILD(14)
Aug 26 00:28:38 backup1 mfi0: 141390 (493862806s/0x0020/info) - Patrol Read complete
Aug 26 00:28:38 backup1 mfi0: 141391 (493862807s/0x0020/WARN) - Patrol Read can't be started, as PDs are either not ONLINE, or are in a VD with an active process, or are in an excluded VD
Aug 26 00:28:39 backup1 mfi0: 141392 (493862808s/0x0020/WARN) - Patrol Read can't be started, as PDs are either not ONLINE, or are in a VD with an active process, or are in an excluded VD

Within 45 minutes of the failure being detected, the rebuild was under way!
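The response time can also be read straight off the log. The sketch below assumes that the number in parentheses (for example `493860690s`) is a seconds counter kept by the controller, so that the difference between two events is the elapsed time in seconds. The regular expression and the helper name `parse_event` are my own, not part of any mfi tooling.

```python
import re

# seq number, controller seconds counter, event class, severity, message
EVENT_RE = re.compile(
    r"mfi\d+: (?P<seq>\d+) \((?P<secs>\d+)s/0x(?P<cls>[0-9a-f]+)/(?P<level>\w+)\) - (?P<msg>.*)"
)


def parse_event(line):
    """Return the fields of one mfi log line, or None if it is not an mfi event."""
    m = EVENT_RE.search(line)
    if m is None:
        return None
    return {"seq": int(m["seq"]), "secs": int(m["secs"]),
            "level": m["level"], "msg": m["msg"]}


# Three lines copied from the logs above: first timeout, drive failed, rebuild start.
LOG = [
    "Aug 25 23:53:21 backup1 mfi0: 141334 (493860690s/0x0002/WARN) - Command timeout on PD 00(e0xfc/s0) Path 1221000000000000, CDB: 28 00 d4 70 13 da 00 00 26 00",
    "Aug 25 23:54:46 backup1 mfi0: 141377 (493860773s/0x0002/info) - State change on PD 00(e0xfc/s0) from ONLINE(18) to FAILED(11)",
    "Aug 26 00:28:38 backup1 mfi0: 141388 (493862806s/0x0002/info) - Rebuild automatically started on PD 00(e0xfc/s0)",
]

events = [e for e in map(parse_event, LOG) if e]
first_timeout = next(e for e in events if e["msg"].startswith("Command timeout"))
rebuild = next(e for e in events if e["msg"].startswith("Rebuild automatically started"))
elapsed = rebuild["secs"] - first_timeout["secs"]
print(f"{elapsed} s (~{elapsed / 60:.0f} min) from first timeout to rebuild start")
```

By the controller's own counter this comes to 2116 seconds, roughly 35 minutes from the first command timeout to the automatic rebuild.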

[root@backup1 ~]# mfiutil show drives
mfi0 Physical Drives:
 0 ( 1863G) REBUILD <ST2000DL003-9VT1 CC3C serial=6YD1QGH1> SATA E1:S0
 1 ( 1863G) ONLINE  <ST2000DM001-1CH1 CC43 serial=Z1E0ZY3C> SATA E1:S1
 2 ( 1863G) ONLINE  <ST2000DM001-1CH1 CC43 serial=Z1E0ZXAB> SATA E1:S2
 3 ( 1863G) ONLINE  <ST2000DM001-1CH1 CC43 serial=W1E0Z3RJ> SATA E1:S3
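
Rather than waiting for the alarm, the same output could be checked from a script. This is only a sketch: it parses the text format shown above, the regular expression is my own guess at that layout, and `not_online` is a hypothetical helper name.

```python
import re

# id, size, state (may contain spaces), <model firmware serial>, bus type, slot
DRIVE_RE = re.compile(
    r"^\s*(?P<id>\d+)\s+\(\s*(?P<size>[^)\s]+)\)\s+(?P<state>.+?)\s+"
    r"<(?P<model>[^>]*)>\s+(?P<bus>\S+)\s+(?P<slot>\S+)"
)

SAMPLE = """\
mfi0 Physical Drives:
 0 ( 1863G) REBUILD <ST2000DL003-9VT1 CC3C serial=6YD1QGH1> SATA E1:S0
 1 ( 1863G) ONLINE  <ST2000DM001-1CH1 CC43 serial=Z1E0ZY3C> SATA E1:S1
 2 ( 1863G) ONLINE  <ST2000DM001-1CH1 CC43 serial=Z1E0ZXAB> SATA E1:S2
 3 ( 1863G) ONLINE  <ST2000DM001-1CH1 CC43 serial=W1E0Z3RJ> SATA E1:S3
"""


def parse_drives(output):
    """Parse the text of `mfiutil show drives` into a list of dicts."""
    drives = []
    for line in output.splitlines():
        m = DRIVE_RE.match(line)
        if m:  # the header line does not match and is skipped
            drives.append(m.groupdict())
    return drives


def not_online(drives):
    """Return the drives whose state is anything other than ONLINE."""
    return [d for d in drives if d["state"] != "ONLINE"]


drives = parse_drives(SAMPLE)
for d in not_online(drives):
    print(f"drive {d['id']} in slot {d['slot']} is {d['state']}")
```

On the real machine the sample text would be replaced by the output of `mfiutil show drives`, for example captured with `subprocess.run` from a cron job.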

Interim fix done within 45 minutes. For a hardware failure, that beats a 3-hour on-site response! Ahem.

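Still, it is really noisy... It is not going to stop crying until the rebuild finishes, is it.
All that is left is to pray that a second drive does not die during the rebuild...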

  『MegaCli -AdpSetProp AlarmSilence -aALL』
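tomocha  『Is MegaCLI actually provided for FreeBSD? As far as I could find, it is not. There is something that looks like it in ports, but it is empty. It seems to have been folded into mfiutil and to exist on recent systems, but it does not come down from pkg on F9.』
  『You can download it, storcli included, from the Avago site.』
  『It is marked RESTRICTED= Source recipient must acknowledge license. Reproduction or redistribution prohibited. See http://www.lsi.com/lookup/License.aspx so it is not available through pkgng.』
  『This one, maybe. http://docs.avagotech.com/docs/12351587?_ga=1.240819530.873723904.1442043110』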

[Linux][Server] screen sessions sure live a long time

        17031.irc       (2014年09月02日 12時42分29秒)   (Detached)
        27664.diary     (2014年08月20日 11時38分26秒)   (Detached)
        18486.local     (2014年08月18日 02時29分43秒)   (Detached)
        23778.test      (2014年07月23日 17時25分01秒)   (Detached)
        23521.wide      (2014年07月23日 17時06分13秒)   (Detached)
        17832.sv        (2014年07月23日 13時14分06秒)   (Detached)

Far too many of these have been alive for over a year, lol
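
For the curious, the ages can be computed from the listing itself. The sketch below assumes that the timestamp in each line is the session's start time and that "now" is midnight on the date of this entry (2015-08-26); the helper name `session_ages` is made up, and the date pattern matches only the Japanese-locale format shown above.

```python
import re
from datetime import datetime

# pid.name, then a Japanese-locale timestamp in parentheses
SESSION_RE = re.compile(
    r"(?P<pid>\d+)\.(?P<name>\S+)\s+"
    r"\((?P<y>\d{4})年(?P<mo>\d{2})月(?P<d>\d{2})日\s+"
    r"(?P<h>\d{2})時(?P<mi>\d{2})分(?P<s>\d{2})秒\)"
)

LISTING = """\
        17031.irc       (2014年09月02日 12時42分29秒)   (Detached)
        27664.diary     (2014年08月20日 11時38分26秒)   (Detached)
        18486.local     (2014年08月18日 02時29分43秒)   (Detached)
        23778.test      (2014年07月23日 17時25分01秒)   (Detached)
        23521.wide      (2014年07月23日 17時06分13秒)   (Detached)
        17832.sv        (2014年07月23日 13時14分06秒)   (Detached)
"""


def session_ages(listing, now):
    """Map each session name to its age in whole days as of `now`."""
    ages = {}
    for m in SESSION_RE.finditer(listing):
        started = datetime(int(m["y"]), int(m["mo"]), int(m["d"]),
                           int(m["h"]), int(m["mi"]), int(m["s"]))
        ages[m["name"]] = (now - started).days
    return ages


ages = session_ages(LISTING, datetime(2015, 8, 26))  # assumed reference date
over_a_year = sorted(name for name, days in ages.items() if days > 365)
print(ages)
print(over_a_year)
```

Under those assumptions, five of the six sessions are more than 365 days old, and only `irc` is still a little short of its first birthday.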






(c) Copyright 1998-2014 tomocha. All rights reserved.