Checking Logs on an HBA Card Fault and Identifying the PCI Slot for HBA Card Replacement
helperchoi 2014. 3. 29. 21:09
When an HBA card port faults on a Linux host that uses SAN storage, the following checks identify which card to replace.
1. In /var/log/messages, find the FC Link Down message and note the PCI bus address of the affected HBA card
[root@marine1 ~]# cat /var/log/messages | grep "kernel: lpfc" -A7
Mar 26 01:29:43 marine1 kernel: lpfc 0000:0b:00.1: 1:1305 Link Down Event xe received Data: xe x20 x80110 x0 x0
Mar 26 01:30:13 marine1 kernel: rport-3:0-2: blocked FC remote port time out: saving binding
Mar 26 01:30:13 marine1 kernel: Error:Mpx:Path Bus 3 Tgt 0 Lun 141 to 000492600113 is dead.
Mar 26 01:30:13 marine1 kernel: Error:Mpx:Path Bus 3 Tgt 0 Lun 161 to 000492600113 is dead.
Mar 26 01:30:13 marine1 kernel: Error:Mpx:Path Bus 3 Tgt 0 Lun 162 to 000492600113 is dead.
Mar 26 01:30:13 marine1 kernel: Error:Mpx:Path Bus 3 Tgt 0 Lun 163 to 000492600113 is dead.
Mar 26 01:30:13 marine1 kernel: Error:Mpx:Path Bus 3 Tgt 0 Lun 9 to 000492600113 is dead.
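When the log is long, it can help to pull out only the PCI addresses that logged a Link Down event. The following is a minimal sketch; the function name `lpfc_link_down_addrs` is hypothetical, and it assumes the message layout matches the sample line above (`kernel: lpfc <address>: ... Link Down Event`), which may differ between lpfc driver versions.

```shell
#!/bin/sh
# Print the unique PCI addresses of lpfc ports that logged a Link Down event.
# $1 is the syslog file to scan (defaults to /var/log/messages).
# Assumption: lines look like the sample in this post:
#   ... kernel: lpfc 0000:0b:00.1: 1:1305 Link Down Event ...
lpfc_link_down_addrs() {
    logfile="${1:-/var/log/messages}"
    # Keep only the address between "kernel: lpfc " and the following ": ".
    sed -n 's/.*kernel: lpfc \([0-9a-fA-F:.]*\): .*Link Down Event.*/\1/p' "$logfile" | sort -u
}
```

Calling `lpfc_link_down_addrs /var/log/messages` on the log shown above would print the address to use in the next step.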
2. Using the PCI bus address found above, look up the HBA card and the PCI slot it occupies
[root@marine1 ~]# lspci | grep -i "fibre channel" | grep -i "0b:00.1"
0b:00.1 Fibre Channel: Emulex Corporation Zephyr-X LightPulse Fibre Channel Host Adapter (rev 02)
[root@marine1 ~]# dmidecode -t slot | grep "0b:00" -B7
        Designation: PCI-E Slot 9
        Type: x8 PCI Express Gen 2 x16
        Current Usage: In Use
        Length: Long
        Characteristics:
                3.3 V is provided
                PME signal is supported
        Bus Address: 0000:0b:00.0
[root@marine1 ~]# find /sys/class/fc_host/ -type l -name device -exec ls -l {} \; | grep -i "0b:00.1"
lrwxrwxrwx 1 root root 0 3월 26 11:35 /sys/class/fc_host/host3/device -> ../../../devices/pci0000:00/0000:00:07.0/0000:0b:00.1/host3
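Note that the faulted port is function `.1` of the card, while `dmidecode` lists the slot under function `.0`, which is why the `grep` on the slot output drops the function number. The two lookups in this step can be sketched as small helper functions. The names `slot_for_pci_addr` and `fc_host_for_pci_addr` are hypothetical, as is the optional second argument that overrides the sysfs directory; the sketch assumes `dmidecode -t slot` prints `Designation:` and `Bus Address:` lines as shown above.

```shell
#!/bin/sh
# Print the slot designation for the card that owns the given PCI port address.
# Reads `dmidecode -t slot` output on stdin.
slot_for_pci_addr() {
    # Drop the function number (".1"); add the "0000:" domain if it is missing.
    card="${1%.*}"
    case "$card" in
        *:*:*) ;;
        *) card="0000:$card" ;;
    esac
    awk -v card="$card" '
        # Remember the most recent slot name.
        /Designation:/ { sub(/^[ \t]*Designation:[ \t]*/, ""); slot = $0 }
        # Compare the slot bus address, without its function number, to the card.
        /Bus Address:/ {
            addr = $NF
            sub(/\.[0-9a-fA-F]+$/, "", addr)
            if (tolower(addr) == tolower(card)) print slot
        }
    '
}

# Print the fc_host name (for example host3) whose device path contains the
# given PCI address. $2 overrides the sysfs directory (default /sys/class/fc_host).
fc_host_for_pci_addr() {
    addr="$1"
    root="${2:-/sys/class/fc_host}"
    for host in "$root"/host*; do
        [ -e "$host/device" ] || continue
        # Resolve the device link to its full path under /sys/devices.
        target=$(readlink -f "$host/device")
        case "$target" in
            *"$addr"/*) basename "$host" ;;
        esac
    done
}
```

For the host in this post, `dmidecode -t slot | slot_for_pci_addr 0b:00.1` and `fc_host_for_pci_addr 0b:00.1` would be expected to report the same slot and host number as the manual commands above.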
3. Using the PCI bus address and the host number mapped to it, confirm the fault at the storage level (this example uses EMC storage)
[root@marine1 ~]# powermt display
Symmetrix logical device count=163
CLARiiON logical device count=0
Invista logical device count=0
==============================================================================
----- Host Bus Adapters ---------  ------ I/O Paths -----  ------ Stats ------
###  HW Path                       Summary   Total   Dead  IO/Sec Q-IOs Errors
==============================================================================
   2 lpfc                          optimal     163      0       -     0      0
   3 lpfc                          failed      163    163       -     0    163
   4 lpfc                          optimal     163      0       -     0      0
   5 lpfc                          optimal     163      0       -     0      0
   6 lpfc                          optimal     163      0       -     0      0
   7 lpfc                          optimal     163      0       -     0      0
   8 lpfc                          optimal     163      0       -     0      0
   9 lpfc                          optimal     163      0       -     0      0
  10 lpfc                          optimal     163      0       -     0      0
  11 lpfc                          optimal     163      0       -     0      0
  12 lpfc                          optimal     163      0       -     0      0
  13 lpfc                          optimal     163      0       -     0      0
[root@marine1 ~]# powermt display dev=all | head -8 && powermt display dev=all | grep '1$'
Pseudo name=emcpowera
Symmetrix ID=000492600113
Logical device ID=0055
state=alive; policy=SymmOpt; priority=0; queued-IOs=0;
==============================================================================
--------------- Host ---------------   - Stor -   -- I/O Path --  -- Stats ---
###  HW Path               I/O Paths    Interf.   Mode    State   Q-IOs Errors
==============================================================================
   3 lpfc                   sdfj       FA 10fB   active  dead        0      1
   3 lpfc                   sdfk       FA 10fB   active  dead        0      1
   3 lpfc                   sdfl       FA 10fB   active  dead        0      1
   3 lpfc                   sdfm       FA 10fB   active  dead        0      1
   3 lpfc                   sdfn       FA 10fB   active  dead        0      1
   3 lpfc                   sdfo       FA 10fB   active  dead        0      1
   3 lpfc                   sdfp       FA 10fB   active  dead        0      1
. . .
(remaining output omitted)
[root@marine1 ~]# powermt check dev=all
Warning: Symmetrix device path sdfj is currently dead.
Do you want to remove it (y/n/a/q)?
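With many adapters, the failed one can be picked out of the `powermt display` summary in step 3 with a short filter. This is a minimal sketch: the function name `failed_hbas` is hypothetical, and it assumes the column order shown in this post (HBA number, HW path as a single word, summary, total paths, dead paths), which may differ between PowerPath versions.

```shell
#!/bin/sh
# Print "<HBA number> <dead path count>" for every adapter that the
# `powermt display` summary (read on stdin) reports as failed.
# Assumption: columns are ### / HW Path / Summary / Total / Dead / ...
failed_hbas() {
    # Only data rows start with a number; the third column is the summary state.
    awk '$1 ~ /^[0-9]+$/ && $3 == "failed" { print $1, $5 }'
}
```

It would be used as `powermt display | failed_hbas`. In this post the HBA number it reports (3) matches the host3 entry found under /sys/class/fc_host in step 2.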