Linux Basic Troubleshooting

03-17

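This article collects background knowledge and troubleshooting tools for common problems on cloud ECS Linux systems, as a reference for troubleshooting on your own.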

I. Disk and Partition Operations

Partitioning

fdisk
parted

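File system operations

mount/umount - mount/unmount a file system
mkfs - create a file system
fsck - check and repair a file system
tune2fs - adjust/view file system parameters
resize2fs - resize a file system
debugfs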

Files

/proc/mounts
/proc/partitions
/etc/mtab
/etc/fstab

[root@iXXXXXXX ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Thu Feb 23 07:28:22 2017
#
# Accessible filesystems, by reference, are maintained under /dev/disk
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=3d083579-f5d9-4df5-9347-8d27925805d4 / ext4 defaults 1 1
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
/dev/vdb1 /mnt ext3 defaults 0 0

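Common problems

1. The system boots into Emergency Mode

Symptom:

Remote login fails, and the VNC console shows the Emergency Mode prompt.

In most cases the cause is a bad mount entry in /etc/fstab; less often it is a broken boot entry, a failed disk self-check, or similar.

Solution:

(1) Log in to the system in Emergency Mode and check the logs.

/var/log
journalctl -xb

(2) Enter single-user mode and make the fix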
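
Since a bad /etc/fstab entry is the most common cause, a quick sanity check of the file can narrow things down. The sketch below is illustrative only: check_fstab is a hypothetical helper name, and it flags just two things, entries with fewer than four fields and /dev/ paths that do not exist. It does not validate UUIDs, labels, or mount options.

```shell
# check_fstab FILE: print suspicious entries from an fstab-style file.
# Hypothetical helper for illustration; not part of any distribution.
check_fstab() {
    file="${1:-/etc/fstab}"
    # Drop comments and blank lines, then inspect each remaining entry.
    grep -v '^[[:space:]]*#' "$file" | grep -v '^[[:space:]]*$' |
    while read -r dev mnt fstype opts rest; do
        if [ -z "$opts" ]; then
            echo "too few fields: $dev $mnt $fstype"
        fi
        case "$dev" in
            /dev/*)
                # A device path that does not exist cannot be mounted at boot.
                [ -e "$dev" ] || echo "missing device: $dev (mount point $mnt)"
                ;;
        esac
    done
}
```

A typical call would be check_fstab /etc/fstab. Where util-linux provides it, findmnt --verify performs a more thorough check of the same file.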

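2. File system errors

Symptoms:

Boot fails with a message that the file system has a problem
After logging in, the disk turns out to be read-only
Some commands or programs misbehave, and the error messages may mention file system errors

Solution:

(1) Check the file system state with the following command

tune2fs -l <file system partition>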

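Abnormal example: the Filesystem state field shows a value other than clean, such as not clean or clean with errors.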

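(2) Repair the file system

Data disk: umount it, then run

fsck <file system partition> -y

System disk: make sure a snapshot has been taken, then open a support ticket for after-sales handling.

(3) Confirm with tune2fs that the file system is clean, then reboot the server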
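
The state check in steps (1) and (3) can be scripted by pulling the "Filesystem state" line out of the tune2fs -l output. This is a minimal sketch; fs_state and is_clean are hypothetical helper names, and the parsing assumes the field label is printed exactly as "Filesystem state:".

```shell
# fs_state: read `tune2fs -l` output on stdin and print the file system state.
fs_state() {
    sed -n 's/^Filesystem state:[[:space:]]*//p'
}

# is_clean: succeed only when the reported state is exactly "clean".
is_clean() {
    [ "$(fs_state)" = "clean" ]
}
```

For example, tune2fs -l /dev/vdb1 | is_clean && echo "safe to reboot", using the data partition from the fstab listing earlier.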

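3. Disk partitioning or expansion fails

General approach: take a snapshot first, look up the error message with a search engine, then retry partitioning and expansion with fdisk/parted/resize2fs.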

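Partitions larger than 2 TB: an MBR partition table cannot address more than 2 TiB with 512-byte sectors, so use a GPT partition table, for example with parted.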
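
For large disks, the choice of partition table comes down to a size comparison. The following sketch assumes 512-byte logical sectors, which gives MBR a ceiling of 2^32 sectors, or 2 TiB. needs_gpt and MBR_LIMIT_BYTES are hypothetical names introduced here for illustration.

```shell
# 2 TiB = 2^32 sectors * 512 bytes; assumes 512-byte logical sectors.
MBR_LIMIT_BYTES=2199023255552

# needs_gpt BYTES: succeed when a disk of BYTES is too large for an MBR table.
needs_gpt() {
    [ "$1" -gt "$MBR_LIMIT_BYTES" ]
}
```

One possible use is needs_gpt "$(blockdev --getsize64 /dev/vdb)" && echo "create a GPT label, e.g. parted /dev/vdb mklabel gpt", where /dev/vdb stands in for the disk being partitioned.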

II. System Performance and Resources

CPU

Query CPU information

cat /proc/cpuinfo

Process states

D Uninterruptible sleep (usually IO)
R Running or runnable (on run queue)
S Interruptible sleep (waiting for an event to complete)
T Stopped, either by a job control signal or because it is being traced.
Z Defunct ("zombie") process, terminated but not reaped by its parent
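
A quick way to see how many processes sit in each of these states is to tally the first letter of the STAT column. This is a small sketch; count_states is a hypothetical helper name, and it ignores the modifier characters (such as s, +, or <) that ps appends after the state letter.

```shell
# count_states: read one STAT value per line on stdin and count processes
# by the first letter of the state.
count_states() {
    awk '{ n[substr($1, 1, 1)]++ } END { for (s in n) print s, n[s] }' | sort
}
```

It can be fed with ps -eo stat= | count_states. A large and persistent D count usually points at I/O waits, and a growing Z count at a parent that is not reaping its children.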

Query process state and resource usage

ps auxf
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
ps -eLf
UID PID PPID LWP C NLWP STIME TTY TIME CMD
top
pidstat
vmstat

Query per-thread CPU usage

ps -eT -o%cpu,pid,tid,ppid,comm | sort -n -r | head -20

Find which CPU core a process is running on

ps -eo pid,psr

About load

cat /proc/loadavg
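
The first three fields of /proc/loadavg are the 1-, 5-, and 15-minute load averages. Raw load is hard to judge without the CPU count, so a common rule of thumb is to divide by the number of cores. The sketch below does that for the 1-minute value; load_per_cpu is a hypothetical helper name.

```shell
# load_per_cpu LOADAVG_LINE NCPU: print the 1-minute load divided by the
# number of CPUs, to two decimal places.
load_per_cpu() {
    echo "$1" | awk -v n="$2" '{ printf "%.2f\n", $1 / n }'
}
```

For example, load_per_cpu "$(cat /proc/loadavg)" "$(nproc)". A result that stays well above 1 suggests more runnable or I/O-blocked tasks than the cores can serve.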

Memory

Basic commands

top
free -h
cat /proc/meminfo
atop

You can also cross-check against the default cloud monitoring metrics.

Top 20 processes by memory usage

ps -e -o%mem,pid,tid,ppid,comm | sort -n -r | head -20

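Slab cache

What is slab? The slab allocator is the kernel's cache for frequently used kernel objects such as dentries and inodes.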

Which commands show it?

cat /proc/slabinfo
slabtop #display kernel slab cache information in real time

Which command frees the cache?

echo 3 > /proc/sys/vm/drop_caches

Note that the value is 0 after boot. You can then write 1, 2, or 3; the write only triggers the action, and the stored value itself carries no meaning.

Writing to this will cause the kernel to drop clean caches, dentries and inodes from memory, causing that memory to become free.
To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes:
echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches
As this is a non-destructive operation, and dirty objects are not freeable, the user should run "sync" first in order to make sure all cached objects are freed.
This tunable was added in 2.6.16.
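
Following the kernel note about running sync first, the two steps can be wrapped together. This is only a sketch: drop_caches here is a hypothetical shell function, and its optional second argument exists purely so the function can be tried against an ordinary file. Real use requires root.

```shell
# drop_caches MODE [TARGET]: flush dirty pages, then write MODE (1, 2, or 3).
# TARGET defaults to the real kernel file; the parameter is an assumption
# added for illustration so the function can be exercised safely.
drop_caches() {
    mode="$1"
    target="${2:-/proc/sys/vm/drop_caches}"
    case "$mode" in
        1|2|3) ;;
        *) echo "mode must be 1, 2, or 3" >&2; return 1 ;;
    esac
    sync                      # write dirty pages out so more cache is freeable
    echo "$mode" > "$target"
}
```

Called as drop_caches 3, it frees pagecache, dentries, and inodes. Comparing free -h before and after shows how much memory was held as cache.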

Out Of Memory (OOM)

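Linux's OOM Killer exists because of its memory overcommit behavior.

Overcommit means that the total memory the operating system has promised to processes exceeds the memory actually available. This works because processes usually use less memory than they request. For example, a process may request 1 GB but need that much only for a short period while it loads a large data set, and use about 100 MB for most of its run time. The remaining 900-odd MB would sit idle most of the time and can be given to other processes; overcommit puts that idle memory to use.

This allocation strategy bets that processes will not use everything they ask for. When processes do use more and memory runs short, the kernel triggers the OOM Killer, which selects a suitable process and kills it to free memory.

Linux has three overcommit modes, selected by the kernel parameter vm.overcommit_memory:

0 – Heuristic overcommit handling. This is the default. Overcommit is allowed, but blatant overcommit is refused, for example a single malloc request larger than total system memory. "Heuristic" means the kernel uses an algorithm to estimate whether a request is reasonable and refuses it if it judges that it is not.

1 – Always overcommit. Overcommit is allowed and no memory request is refused.

2 – Don't overcommit. Overcommit is disabled.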
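
To check which of the three modes a machine is using, the current value can be read from /proc/sys/vm/overcommit_memory and mapped to its meaning. A minimal sketch follows; describe_overcommit is a hypothetical helper name.

```shell
# describe_overcommit VALUE: map a vm.overcommit_memory value to its meaning.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit (default)" ;;
        1) echo "always overcommit" ;;
        2) echo "no overcommit" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}
```

For example, describe_overcommit "$(cat /proc/sys/vm/overcommit_memory)". The mode can be changed at runtime with sysctl -w vm.overcommit_memory=<value>.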

How do you keep a process from being killed by the OOM Killer?

echo -17 > /proc/<PID>/oom_adj

[root@iZbp1cvnme86y07zhd1mfiZ /]# cat /usr/include/linux/oom.h
#ifndef __INCLUDE_LINUX_OOM_H
#define __INCLUDE_LINUX_OOM_H

/*
 * /proc/<pid>/oom_score_adj set to OOM_SCORE_ADJ_MIN disables oom killing for
 * pid.
 */
#define OOM_SCORE_ADJ_MIN (-1000)
#define OOM_SCORE_ADJ_MAX 1000

/*
 * /proc/<pid>/oom_adj set to -17 protects from the oom killer for legacy
 * purposes.
 */
#define OOM_DISABLE (-17)
/* inclusive */
#define OOM_ADJUST_MIN (-16)
#define OOM_ADJUST_MAX 15

#endif /* __INCLUDE_LINUX_OOM_H */
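
The header describes oom_adj as a legacy interface and oom_score_adj, with a minimum of -1000, as the way to disable OOM killing for a process. A sketch of the newer form is below. protect_from_oom is a hypothetical helper name, and its optional second argument is an assumption added so the function can be tried on a fake directory tree instead of the real /proc.

```shell
# protect_from_oom PID [PROC_ROOT]: exempt PID from the OOM Killer by writing
# OOM_SCORE_ADJ_MIN (-1000) to its oom_score_adj file. PROC_ROOT defaults to
# /proc. Writing to a real process entry requires root.
protect_from_oom() {
    pid="$1"
    proc_root="${2:-/proc}"
    adj_file="$proc_root/$pid/oom_score_adj"
    if [ ! -e "$adj_file" ]; then
        echo "no such process entry: $adj_file" >&2
        return 1
    fi
    printf '%s\n' -1000 > "$adj_file"
}
```

The setting applies to a running process only and is lost when the process exits, so a service that must stay protected needs it reapplied on every start.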

Manually trigger the OOM Killer

echo f > /proc/sysrq-trigger

Disk

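Tools

iostat

dstat

sar - [-b #Display io statistics]

df - report disk space used by file systems

du - report disk space used by files

fio - disk performance benchmarking

If you have doubts about disk performance, benchmark the disk with fio, and make sure no other workloads are running during the test so they do not skew the results.

Command reference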

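Why do df and du report different usage? A common cause is a file that was deleted while a process still holds it open: df still counts its blocks, but du can no longer see it.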
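
When df reports more usage than du, one thing worth checking is whether some process still holds a deleted file open. In /proc/<PID>/fd, the link target of such a descriptor ends in " (deleted)". The sketch below filters those entries out of an ls -l listing; deleted_open is a hypothetical helper name.

```shell
# deleted_open: read `ls -l /proc/<PID>/fd`-style lines on stdin and print
# the paths of files that were deleted but are still held open.
deleted_open() {
    sed -n 's/^.* -> \(.*\) (deleted)$/\1/p'
}
```

For example, ls -l /proc/[0-9]*/fd 2>/dev/null | deleted_open | sort | uniq -c. Where lsof is installed, lsof +L1 gives similar information along with file sizes. The space is released once the owning process closes the file or exits.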

Network

Common tools:

Network probing:

ping - ICMP echo test, [-s #Size in bytes] [-i #Interval in seconds]

telnet - test port connectivity

nmap - network discovery and scanning tool, basic usage [nmap <ip> <ip> <ip>] [nmap <ip/mask>] [-Pn #Port scan without ping host check]

traceroute - route tracing, [-I #Using ICMP] [-T #Using TCP SYN], UDP by default

mtr - network diagnostics, [-c #Number of probe cycles] [-s #Packet size in bytes] [-T #Using TCP SYN] [-P #Specify port number] [-u #Using UDP]

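Network analysis:

ss - socket statistics, [-s #Summary statistics] [-ant #All TCP sockets, numeric addresses and ports]

iftop - bandwidth usage on an interface by host, [-P #Display ports] [-N #Display ports as numbers instead of service names]

iperf - throughput testing

dstat

sar - system resource usage statistics, [-n #Network statistics] [DEV #Network usage for all interfaces] [EDEV #Network failure/error counts]

Fields in sar -n DEV output:
IFACE network interface name
rxpck/s packets received per second
txpck/s packets transmitted per second
rxbyt/s bytes received per second
txbyt/s bytes transmitted per second
rxcmp/s compressed packets received per second
txcmp/s compressed packets transmitted per second
rxmcst/s multicast packets received per second

netstat - query connections

netstat -s #summary statistics per protocol
netstat -i #per-interface statistics
netstat -nltp #TCP listening sockets with owning process
netstat -nlup #UDP listening sockets with owning process
netstat -ano #all listening and connected sockets, with timers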
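
When the full connection list is too long to read, a per-state tally is often the first thing to look at. The sketch below summarizes ss -ant output; count_tcp_states is a hypothetical helper name, and it assumes the state is the first column and the first line is a header.

```shell
# count_tcp_states: read `ss -ant` output on stdin, skip the header line, and
# print how many sockets are in each TCP state.
count_tcp_states() {
    awk 'NR > 1 { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}
```

Run it as ss -ant | count_tcp_states. Unusually high TIME-WAIT or CLOSE-WAIT counts are a common starting point for further digging.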

tcpdump - packet capture

tcpdump -i <interface # any|eth0|eth1...> host <ip> port <port number>
tcpdump -i any port 53 #capture DNS traffic
tcpdump -i any host 1.1.1.1 port 80 #capture packets involving IP 1.1.1.1 and port 80
nohup tcpdump -i any -C 30 -W 50 -w /tmp/net.pcap & #rotating background capture: up to 50 files of about 30 MB each

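Download tests

wget - download tool, [-O #Download to specified filename] [-P #To specified directory]

*Note: wrap URLs that contain special characters in double quotes

curl - transfer data to or from a URL, [-I #Header info only] [-o #Output to file] [-v #Verbose] [-s #Silent mode]

Name resolution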

dig

nslookup

Firewall

iptables

firewalld

Network configuration

ifconfig

route

Configuration files

/etc/sysconfig/network-scripts

/etc/network/interfaces

/etc/resolv.conf

III. Basic Services

SSH

Default port: 22

/etc/ssh/

/etc/ssh/ssh_config #ssh client configuration file

/etc/ssh/sshd_config #ssh server configuration file
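
When an SSH connection is refused, it helps to confirm which port the server is configured to listen on. The sketch below reads the active Port directives from a server configuration file and falls back to 22 when none is set. sshd_ports is a hypothetical helper name, and the parsing is deliberately simple: it assumes the "Port <number>" form and does not follow Include files.

```shell
# sshd_ports [FILE]: print each port set by an uncommented Port directive,
# or 22 if the file sets none.
sshd_ports() {
    awk 'tolower($1) == "port" { print $2; found = 1 }
         END { if (!found) print 22 }' "${1:-/etc/ssh/sshd_config}"
}
```

The result can be compared with what is actually listening, for example via netstat -nltp. For the fully resolved server configuration, sshd -T (run as root) prints the effective settings.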

Detailed explanation and collected issues

verbose & debug

Enable verbose logging on the client: ssh -vvv

Start the server in debug mode: /usr/sbin/sshd -p <testport> -d

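Scheduled tasks

System MAC times

See the basic introduction

cron & anacron

* /etc/crontab is essentially empty on current distributions; changes made with the crontab command are stored under /var/spool/cron.

* /bin/run-parts runs all the executables in a directory.

* Daily, weekly, and monthly cron jobs are run through anacron. crond runs /etc/cron.hourly/0anacron, which checks whether anacron has already run for the current day/week/month and calls /usr/sbin/anacron only if it has not.

* System logs rotate weekly by default, but the corresponding script lives under /etc/cron.daily

In short, cron schedules tasks at fixed times, while anacron does not depend on the system running continuously, which makes it a convenient way to run periodic tasks asynchronously.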
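
To make the role of run-parts concrete, here is a simplified stand-in written as a shell function. It is not the real implementation: run_dir is a hypothetical name, and the real tool applies additional file-name filtering that this sketch omits.

```shell
# run_dir DIR: run every executable regular file in DIR, in name order.
# A simplified stand-in for run-parts, for illustration only.
run_dir() {
    for f in "$1"/*; do
        if [ -f "$f" ] && [ -x "$f" ]; then
            "$f"
        fi
    done
}
```

This is why dropping an executable script into a directory such as /etc/cron.daily is enough to schedule it, and why a script without the execute bit is silently skipped.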

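crontab -l #list the current user's cron jobs

crontab -e #edit the current user's cron jobs

crontab -r #remove the current user's cron jobs

Logs:

/var/log/cron*

logrotate

Linux provides the logrotate tool for log rotation; by default cron runs it once a day

Detailed introduction to cron and logrotate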

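System time

date

Display or set the system time

hwclock/clock

Query or set the hardware clock (RTC)

Write the system time to the hardware clock: hwclock -w (--systohc)

Set the system time from the hardware clock: hwclock -s (--hctosys)

ntpd

Disciplines the system time and clock frequency against an NTP server. Typically a server first synchronizes once with ntpdate and then runs ntpd for gradual correction.

/etc/ntp.conf #ntp configuration file

/usr/share/zoneinfo/ #time zone data files

/etc/localtime #current time zone file

/etc/timezone #current time zone name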
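
On systems where /etc/localtime is a symbolic link into the zoneinfo directory, the current zone name can be read from the link target. That layout is an assumption: some systems use a plain copy of the zone file instead, in which case this sketch returns a failure status. current_tz is a hypothetical helper name.

```shell
# current_tz [LINK]: print the zone name that LINK (default /etc/localtime)
# points to, assuming it is a symlink into a zoneinfo directory.
current_tz() {
    link="${1:-/etc/localtime}"
    target=$(readlink "$link") || return 1
    echo "${target##*zoneinfo/}"
}
```

If the call fails, the file is not a symlink, and /etc/timezone (where present) or timedatectl is the next place to look.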

ntpdate

Synchronize the system time against an NTP server

timedatectl

Query or set the system date and time
