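When operating a large-scale, DW-oriented analytics DB system, you sometimes see symptoms such as delayed DB responses even though system resources such as CPU and memory are plentiful.
As the system administrator, you check each resource's usage trend and idle state, and at some point you need to look at I/O at the storage level or on a specific disk device. In that case the sar command, used as shown below, reports the I/O arriving at each device and its response time (await) in real time.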
[root@TestDB01 ~]# sar -d 1 | grep "Average" | sort -nrk8 | head -3
Average:  dev66-1696        1.00    512.00      0.00    512.00      0.05     52.00     52.00      5.20
Average:  dev66-1552        2.00   1024.00      0.00    512.00      0.10     47.50     47.50      9.50
Average:  dev132-1536       1.00   1024.00      0.00   1024.00      0.05     47.00     47.00      4.70
[root@TestDB01 ~]#
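The one-liner above can be wrapped in a small function so that the sort column and row count are easy to change. This is only a sketch: the function name `top_by_column` is hypothetical, and it assumes the column layout shown above, where field 8 is await. Other sysstat versions may order the columns differently.

```shell
#!/bin/bash
# Hypothetical helper: read `sar -d` output on standard input and print the
# N "Average:" rows with the highest value in the chosen column.
# Assumption: field 8 is await, as in the output shown in this post.
top_by_column() {
    local column="${1:-8}"   # field number to sort on
    local rows="${2:-3}"     # number of rows to keep

    # Keep only the per-device averages and skip the "Average: DEV ..." header.
    awk '$1 == "Average:" && $2 != "DEV"' \
        | LC_ALL=C sort -k "${column},${column}nr" \
        | head -n "${rows}"
}
```

It would be used as `sar -d 1 5 | top_by_column 8 3`.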
As the output above shows, however, sar -d reports each device under an allocated-device name built from its Linux major and minor numbers (dev<major>-<minor>), which makes it hard to tell which disk device is which.
On Linux, a disk device's major and minor numbers can be looked up in /proc/diskstats. If only one or two disk volumes are involved, a query of the following form is enough.
[root@TestDB01 ~]# cat /proc/diskstats | awk '$1 ~ /^66$/ && $2 ~ /^1696$/ {print}'
  66 1696 sdbhs 3987081 125 2437187827 72148078 186224 0 29668336 2194288 0 66875924 74341945
[root@TestDB01 ~]#
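The same lookup can be sketched as a function that takes the sar-style name directly. The function name `resolve_sar_device` and its optional second argument are assumptions made for illustration. The second argument exists only so that the function can be tried against a saved copy of /proc/diskstats.

```shell
#!/bin/bash
# Hypothetical helper: translate a sar-style name such as "dev66-1696" into
# the kernel device name by matching the major and minor numbers.
resolve_sar_device() {
    local sar_name="$1"
    local stats_file="${2:-/proc/diskstats}"   # override only for testing
    local pair="${sar_name#dev}"               # "66-1696"
    local major="${pair%%-*}"                  # "66"
    local minor="${pair#*-}"                   # "1696"

    # Fields 1, 2 and 3 of /proc/diskstats are major, minor and device name.
    awk -v maj="${major}" -v min="${minor}" \
        '$1 == maj && $2 == min { print $3 }' "${stats_file}"
}
```

With the data shown above, `resolve_sar_device dev66-1696` would print `sdbhs`. Some sysstat versions also accept `sar -d -p`, which prints device names instead of the dev<major>-<minor> form; whether that is available depends on the installed version.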
If dozens of disk devices or more have to be looked up, however, a shell script such as the one below makes the check easy.
1) Example run
[root@TestDB01 ~]# ./diskio.sh 10 8 5

### Collect Disk I/O = 10 sec / Sort Number : 8 / Ouputs Line : 5
### Sort Number 6 - [ avgrq-sz ] : The average size (in sectors) of the requests that were issued to the device.
### Sort Number 7 - [ avgqu-sz ] : The average queue length of the requests that were issued to the device.
### Sort Number 8 - [ I/O await ] : The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
### Raw Data = /root/collect_io.dat

[ System Load INFO - 11:13:27 up 53 days, 21:03, 14 users, load average: 24.23, 20.98, 17.51 ]

11시 13분 16초  DEV          tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:  sdks         0.60    268.53      0.00    448.00      0.02     27.00     27.00      1.62
Average:  sdavc        0.30    153.45      0.00    512.00      0.01     24.33     24.33      0.73
Average:  sdbhf        0.60    300.50      0.00    501.33      0.01     23.33     23.33      1.40
Average:  sdaxl        0.60    306.89      0.00    512.00      0.01     23.33     20.00      1.20
Average:  sdbwz        0.60    300.50      0.00    501.33      0.01     22.67     18.67      1.12
[root@TestDB01 ~]#
[root@TestDB01 ~]# ./diskio.sh

### Default Disk I/O Collect = 5 sec / Sort by : [ I/O await time ] / Outputs Line : 20
### Usage ex) : ./diskio.sh 10 8 20
### Usage Manual : ./diskio.sh [Collect time] [Sort Number] [Outputs Line]
### Sort Number 6 - [ avgrq-sz ] : The average size (in sectors) of the requests that were issued to the device.
### Sort Number 7 - [ avgqu-sz ] : The average queue length of the requests that were issued to the device.
### Sort Number 8 - [ I/O await ] : The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
### Raw Data = /root/collect_io.dat

[ System Load INFO - 11:13:47 up 53 days, 21:03, 14 users, load average: 24.29, 21.19, 17.65 ]

11시 13분 40초  DEV          tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:  sdaxh        0.20    102.40      0.00    512.00      0.01     44.00     44.00      0.88
Average:  sdbws        0.40    204.80      0.00    512.00      0.02     43.00     43.00      1.72
Average:  sdbdi        0.20    102.40      0.00    512.00      0.00     43.00      8.00      0.16
Average:  sdbxi        0.20      0.00      6.40     32.00      0.01     42.00     42.00      0.84
Average:  sdaww        0.40    204.80      0.00    512.00      0.01     37.50     37.50      1.50
Average:  sdra         0.20    102.40      0.00    512.00      0.01     37.00     37.00      0.74
Average:  sduv         0.20    102.40      0.00    512.00      0.01     34.00     34.00      0.68
Average:  sdbdx        0.40    204.80      0.00    512.00      0.00     34.00     12.00      0.48
Average:  sdajt        0.40    204.80      0.00    512.00      0.01     32.50     20.50      0.82
Average:  sdaxe        0.40    204.80      0.00    512.00      0.01     32.00     32.00      1.28
Average:  sdakn        0.20    102.40      0.00    512.00      0.01     32.00     32.00      0.64
Average:  sdxb         0.20    102.40      0.00    512.00      0.01     31.00     31.00      0.62
Average:  sdwz         0.20    102.40      0.00    512.00      0.01     31.00     31.00      0.62
Average:  sdahu        0.40    204.80      0.00    512.00      0.01     30.50     30.50      1.22
Average:  sdaql        0.20    102.40      0.00    512.00      0.01     30.00     30.00      0.60
Average:  sdaka        0.20    102.40      0.00    512.00      0.01     30.00     30.00      0.60
Average:  sdqu         0.20    102.40      0.00    512.00      0.01     30.00     30.00      0.60
Average:  sdaqk        0.60    307.20      0.00    512.00      0.02     29.33     18.67      1.12
Average:  sdbwy        0.20    102.40      0.00    512.00      0.01     29.00     29.00      0.58
Average:  sdbwm        0.20    102.40      0.00    512.00      0.01     29.00     29.00      0.58
[root@TestDB01 ~]#
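The usage text above describes three numeric positional arguments with defaults of 5, 8 and 20. A minimal sketch of that argument handling follows. The function names `is_uint` and `parse_args` are hypothetical, and the digits-only check is one possible way to validate the input, not necessarily the one the script uses.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the argument consists of digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

# Hypothetical helper: set COLLECT_TIME, SORT_BY and OUT_LINE from the
# arguments, or fall back to the defaults when no argument is given.
parse_args() {
    COLLECT_TIME=5
    SORT_BY=8
    OUT_LINE=20

    if [ $# -eq 0 ]; then
        return 0
    fi
    if [ $# -ne 3 ]; then
        echo "### Error - Not correct format" >&2
        return 1
    fi

    local arg
    for arg in "$@"; do
        if ! is_uint "${arg}"; then
            echo "### Error - Non-numeric argument Parameters" >&2
            return 1
        fi
    done

    COLLECT_TIME=$1
    SORT_BY=$2
    OUT_LINE=$3
}
```

A caller could run `parse_args "$@" || exit 1` so that a malformed invocation ends with a non-zero exit status.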
2) Shell script contents
[root@TestDB01 ~]# cat ./diskio.sh
#!/bin/bash

#Develop by helperchoi / helperchoi@gmail.com

SORTNUM6="### Sort Number 6 - [ avgrq-sz ] : The average size (in sectors) of the requests that were issued to the device."
SORTNUM7="### Sort Number 7 - [ avgqu-sz ] : The average queue length of the requests that were issued to the device."
SORTNUM8="### Sort Number 8 - [ I/O await ] : The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them."

# Collected rows are written here; change.sh holds the generated rename commands.
RAW_DATA=$(pwd)/collect_io.dat
rm -f "$(pwd)/change.sh"

if [ $# = 0 ]
then
    # No arguments: use the defaults and print the usage text.
    COLLECT_TIME=5
    SORT_BY=8
    OUT_LINE=20

    echo
    echo "### Default Disk I/O Collect = 5 sec / Sort by : [ I/O await time ] / Outputs Line : 20"
    echo "### Usage ex) : $0 10 8 20"
    echo "### Usage Manual : $0 [Collect time] [Sort Number] [Outputs Line]"
    echo "${SORTNUM6}"
    echo "${SORTNUM7}"
    echo "${SORTNUM8}"
    echo "### Raw Data = ${RAW_DATA}"
    echo
elif [ $# = 3 ]
then
    # expr prints nothing when an argument is not numeric.
    VERIFY_CHECK=$(expr "$1" + "$2" + "$3" + 0 2>/dev/null)
    if [ -n "${VERIFY_CHECK}" ] && [ "${VERIFY_CHECK}" -ge 0 ]
    then
        COLLECT_TIME=$1
        SORT_BY=$2
        OUT_LINE=$3

        echo
        echo "### Collect Disk I/O = $1 sec / Sort Number : $2 / Ouputs Line : $3"
        echo "${SORTNUM6}"
        echo "${SORTNUM7}"
        echo "${SORTNUM8}"
        echo "### Raw Data = ${RAW_DATA}"
        echo
    else
        echo
        echo "### Error - Non-numeric argument Parameters"
        echo "### Usage ex) : $0 10 8 20"
        echo "### Usage Manual : $0 [Collect time] [Sort Number] [Outputs Line]"
        echo "${SORTNUM6}"
        echo "${SORTNUM7}"
        echo "${SORTNUM8}"
        echo
        exit 1
    fi
else
    echo
    echo "### Error - Not correct format"
    echo "### Usage ex) : $0 10 8 20"
    echo "### Usage Manual : $0 [Collect time] [Sort Number] [Outputs Line]"
    echo "${SORTNUM6}"
    echo "${SORTNUM7}"
    echo "${SORTNUM8}"
    echo
    exit 1
fi

# Save the column header (line 3 of the sar output), then append the top rows.
# The explicit count of 1 makes the header capture terminate on sysstat
# versions that report continuously when no count is given.
sar -d 1 1 | sed -n '3p' > "${RAW_DATA}" && sar -d 1 "${COLLECT_TIME}" | grep "Average" | sort -nrk"${SORT_BY}" | head -n "${OUT_LINE}" | grep "dev" >> "${RAW_DATA}"

echo "[ System Load INFO - $(uptime) ]"
echo

# For each dev<major>-<minor> name, look up the kernel device name in
# /proc/diskstats and queue a substitution command in change.sh.
for LIST in $(grep dev "${RAW_DATA}" | awk '{print $2}' | sed 's#dev##g')
do
    MAJOR_NO=$(echo "${LIST}" | cut -d "-" -f 1)
    MINOR_NO=$(echo "${LIST}" | cut -d "-" -f 2)
    DEVICE_NAME=$(awk '$1 ~ /^'"${MAJOR_NO}"'$/ && $2 ~ /^'"${MINOR_NO}"'$/ {print $3}' /proc/diskstats)

    # Leave the name untouched when no matching device is found.
    [ -z "${DEVICE_NAME}" ] && continue

    # \b keeps dev66-16 from also matching the start of dev66-1696.
    echo "perl -pi -e 's#dev${LIST}\b#${DEVICE_NAME}#g' ${RAW_DATA}" >> "$(pwd)/change.sh"
done

# Apply the queued substitutions, if any, and print the result.
[ -f "$(pwd)/change.sh" ] && sh "$(pwd)/change.sh"
cat "${RAW_DATA}"

echo
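The script above performs the renaming by generating one perl command per device. As an alternative sketch, not the author's method, the same mapping can be done in a single awk pass that first loads /proc/diskstats and then rewrites the DEV column. The function name `rename_sar_devices` and its arguments are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print a saved sar report with dev<major>-<minor>
# names replaced by kernel device names.
#   $1 = file holding "Average:" rows from `sar -d`
#   $2 = diskstats file (defaults to /proc/diskstats; override for testing)
rename_sar_devices() {
    local report_file="$1"
    local stats_file="${2:-/proc/diskstats}"

    awk '
        # First file (diskstats): remember "dev<major>-<minor>" -> name.
        NR == FNR { name["dev" $1 "-" $2] = $3; next }

        # Second file (report): field 2 is compared as a whole, so
        # dev66-16 is never confused with dev66-1696.
        ($2 in name) { sub($2, name[$2]) }

        { print }
    ' "${stats_file}" "${report_file}"
}
```

Because the substitution changes the length of the name, column alignment in the report is not preserved. Rows whose device is not listed in the diskstats file are printed unchanged.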
3) With a few modifications to the shell source, the purpose of each device can also be identified, as shown below.
[root@TestDB01 ~]# ./diskio.sh 5 8 20

### Collect Disk I/O = 5 sec / Sort Number : 8 / Ouputs Line : 20
### Sort Number 6 - [ avgrq-sz ] : The average size (in sectors) of the requests that were issued to the device.
### Sort Number 7 - [ avgqu-sz ] : The average queue length of the requests that were issued to the device.
### Sort Number 8 - [ I/O await ] : The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
### Raw Data = /root/collect_io.dat

[ System Load INFO - 20:30:12 up 61 days, 6:20, 10 users, load average: 2.34, 2.41, 2.96 ]

20시 30분 04초 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
Average: sdkx [ /dev/emcpowerjr | LVM-PV ] 0.20 0.00 1.60 8.00 0.00 4.00 4.00 0.08
Average: sdel [ /dev/emcpowerjr | LVM-PV ] 0.20 0.00 1.60 8.00 0.00 3.00 3.00 0.06
Average: emcpowerju [ /dev/emcpowerju | LVM-PV ] 0.60 0.00 152.00 253.33 0.00 2.33 1.00 0.06
Average: emcpowerju [ /dev/emcpowerju | LVM-PV ] 0.60 0.00 152.00 253.33 0.00 2.33 1.00 0.06
Average: dm- [ | null ] 24.00 0.00 192.00 8.00 0.05 2.11 0.05 0.12
Average: sdrj [ /dev/emcpowerjr | LVM-PV ] 0.20 0.00 1.60 8.00 0.00 2.00 2.00 0.04
Average: sdep [ /dev/emcpowerjn | LVM-PV ] 0.20 0.00 1.60 8.00 0.00 2.00 2.00 0.04
Average: emcpowerjb [ /dev/emcpowerjb | LVM-PV ] 0.20 0.00 1.60 8.00 0.00 2.00 2.00 0.04
Average: emcpowerjb [ /dev/emcpowerjb | LVM-PV ] 0.20 0.00 1.60 8.00 0.00 2.00 2.00 0.04
Average: sdarf [ /dev/emcpowerjr | LVM-PV ] 0.20 0.00 20.80 104.00 0.00 1.00 1.00 0.02
Average: sdarc [ /dev/emcpowerju | LVM-PV ] 0.20 0.00 64.00 320.00 0.00 1.00 1.00 0.02
Average: sdwn [ /dev/emcpowerha | /dev/raw/raw103 ] 0.20 0.00 1.60 8.00 0.00 1.00 1.00 0.02
Average: sddg [ /dev/emcpowerhg | /dev/raw/raw106 ] 0.20 0.00 0.20 1.00 0.00 1.00 1.00 0.02
Average: sdafr [ /dev/emcpowerm | /dev/raw/raw5 ] 0.20 6.40 0.00 32.00 0.00 1.00 1.00 0.02
Average: sdmm [ /dev/emcpowerw | /dev/raw/raw10 ] 0.20 0.00 1.60 8.00 0.00 1.00 1.00 0.02
Average: sdmh [ /dev/emcpowerm | /dev/raw/raw5 ] 0.20 6.40 0.00 32.00 0.00 1.00 1.00 0.02
Average: sdafh [ /dev/emcpowerkc | /dev/raw/raw136 ] 0.20 0.00 1.60 8.00 0.00 1.00 1.00 0.02
Average: sdafd [ /dev/emcpoweriv | /dev/raw/raw133 ] 0.20 0.00 1.60 8.00 0.00 1.00 1.00 0.02
Average: sdlx [ /dev/emcpowerkc | /dev/raw/raw136 ] 0.20 0.00 1.60 8.00 0.00 1.00 1.00 0.02
Average: sdbrw [ /dev/emcpoweriw | /dev/raw/raw134 ] 0.20 0.00 0.20 1.00 0.00 1.00 1.00 0.02

[root@TestDB01 ~]#
4) Example of the modified core logic
for LIST in $(awk '$2 ~ /^dev/ {print $2}' "${RAW_DATA}" | sed 's#dev##g')
do
    MAJOR_NO=$(echo "${LIST}" | cut -d "-" -f 1)
    MINOR_NO=$(echo "${LIST}" | cut -d "-" -f 2)
    # Kernel device name with anything after a "/" and all digits removed.
    DEVICE_NAME=$(awk '$1 ~ /^'"${MAJOR_NO}"'$/ && $2 ~ /^'"${MINOR_NO}"'$/ {print $3}' /proc/diskstats | cut -d "/" -f 1 | sed 's#[0-9]##g')
    # Map the kernel device to its EMC device name through the symcli listings.
    EMC_SYM_ID=$(/usr/symcli/bin/sympd list | awk '$1 ~ /^'"\/dev\/${DEVICE_NAME}"'$/ {print $2}')
    EMC_DEVICE_NAME=$(/usr/symcli/bin/symdev list | awk '$1 ~ /^'"${EMC_SYM_ID}"'$/ {print $2}')

    # Classify the device: raw device binding, LVM physical volume, or unknown (null).
    if [ -z "${EMC_DEVICE_NAME}" ]
    then
        RAW_DEVICE_NAME=null
    else
        CHECK_RAW_DEVICE=$(grep "${EMC_DEVICE_NAME}" /etc/sysconfig/rawdevices | wc -l)
        CHECK_LVM=$(pvscan | grep "${EMC_DEVICE_NAME}" | wc -l)
        if [ "${CHECK_RAW_DEVICE}" -gt 0 ]
        then
            RAW_DEVICE_NAME=$(cat /etc/sysconfig/rawdevices | grep -v "#" | grep "${EMC_DEVICE_NAME}" | awk '{print $1}')
        elif [ "${CHECK_LVM}" -ge 1 ]
        then
            RAW_DEVICE_NAME=LVM-PV
        else
            RAW_DEVICE_NAME=null
        fi
    fi

    # \b keeps dev66-16 from also matching the start of dev66-1696.
    echo "perl -pi -e 's#dev${LIST}\b#${DEVICE_NAME} [ ${EMC_DEVICE_NAME} | ${RAW_DEVICE_NAME} ]#g' ${RAW_DATA}" >> "$(pwd)/change.sh"
done

sh "$(pwd)/change.sh"
cat "${RAW_DATA}"
echo
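The classification step in the loop above (raw device binding, LVM physical volume, or null) can be isolated into a function that is easier to try out without EMC tooling. This is a sketch under stated assumptions: the function name `classify_device` and its file arguments are hypothetical, the rawdevices file is assumed to hold "<raw device> <block device>" pairs, and the third argument is assumed to be a saved copy of `pvscan` output. Unlike the substring match used above, this sketch matches the device name exactly.

```shell
#!/bin/bash
# Hypothetical helper: print the raw device bound to a block device,
# "LVM-PV" if the device is an LVM physical volume, or "null" otherwise.
#   $1 = block device path, for example /dev/emcpowerm
#   $2 = rawdevices file (defaults to /etc/sysconfig/rawdevices)
#   $3 = file holding saved `pvscan` output (assumption, for testing)
classify_device() {
    local device="$1"
    local rawdevices_file="${2:-/etc/sysconfig/rawdevices}"
    local pv_list_file="$3"
    local raw_name

    if [ -z "${device}" ]; then
        echo "null"
        return 0
    fi

    # Skip comment lines and match the block device column exactly.
    raw_name=$(awk -v dev="${device}" \
        '$1 !~ /^#/ && $2 == dev { print $1; exit }' \
        "${rawdevices_file}" 2>/dev/null)

    if [ -n "${raw_name}" ]; then
        echo "${raw_name}"
    elif grep -qw -- "${device}" "${pv_list_file}" 2>/dev/null; then
        echo "LVM-PV"
    else
        echo "null"
    fi
}
```

On a live system the third argument could be produced with `pvscan > /tmp/pvscan.out`, which also avoids running pvscan once per device inside the loop.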