Image Analysis with OpenCV

04-26

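Background

While working on a research project related to video analysis, we needed OpenCV to preprocess images up front. After using the OpenCV API heavily, we came away with this impression: most people write their APIs with ctrl+c and ctrl+v, whereas many OpenCV APIs each have a paper behind them. Moved by this, Gemfield wrote this article to explain every OpenCV API used during that investigation. Gemfield also welcomes feedback from OpenCV users on APIs they have used and found worthwhile.

The sample code in this article is based on OpenCV 3.1 and Python 2.7.14.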

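The frame variable

Every code sample below uses a variable named frame. frame represents a picture or a single video frame. In C++ it is a Mat structure; in Python it is a multi-dimensional NumPy array. When frame is a picture, it comes from reading an image file: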

import cv2
import numpy as np

frame = cv2.imread("gemfield.jpg")

When frame is a video frame, it comes from one frame of a video:

import cv2
import numpy as np

video_cap = cv2.VideoCapture("gemfield.mp4")
while True:
    rc, frame = video_cap.read()
    if not rc:
        # no more frames, or the read failed
        break

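Common operations

1. cv2.namedWindow() creates a display window.

2. cv2.destroyWindow() destroys a display window.

3. cv2.setMouseCallback("gemfield window 1", callback_function) registers a callback on the "gemfield window 1" window, typically for mouse events. The first parameter of callback_function is event.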

# callback_func1(event, x, y, flags, param) must be defined before it is registered
cv2.namedWindow("gemfield window 1")
cv2.setMouseCallback("gemfield window 1", callback_func1)
cv2.destroyWindow("gemfield window 1")

Color operations

1. cv2.cvtColor()

Converts frame to another color space. There are dozens of flags, such as cv2.COLOR_BGR2GRAY and cv2.COLOR_BGR2HSV.

Image Thresholding

1. cv2.threshold()

Splits the pixels of an image into two classes by a threshold, so every pixel ends up either black or white.

Image smoothing

These all work with a kernel.

1. cv2.filter2D()

2. cv2.blur()

3. cv2.GaussianBlur()

Gaussian blur, the one Gemfield uses most. It convolves the image with a Gaussian kernel.

4. cv2.medianBlur()

5. cv2.bilateralFilter()

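Image gradients and edge detection

1. cv2.Sobel()

2. cv2.Scharr()

3. cv2.Laplacian()

The most useful and easiest API to pick up is Canny edge detection, which combines gradient computation with non-maximum suppression (NMS):

4. cv2.Canny()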

Image Pyramids

Very useful for image blending.

1. cv2.pyrUp()

2. cv2.pyrDown()

Image transforms: Fourier Transform

1. cv2.dft()

2. cv2.idft()

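Template Matching

1. cv2.matchTemplate()

Template matching is quite useful. Compared with neural networks its results are dismal, but for detecting targets whose pattern never changes it still works well. It simply slides a window over the image to hunt for its prey. There are 6 matching modes:

cv2.TM_CCOEFF
cv2.TM_CCOEFF_NORMED
cv2.TM_CCORR
cv2.TM_CCORR_NORMED
cv2.TM_SQDIFF
cv2.TM_SQDIFF_NORMED

2. cv2.minMaxLoc()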

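Hough line and circle transforms

Edge detection is usually done first.

1. cv2.HoughLines()

Detects straight lines in an image. For reasons not yet pinned down, the results were poor.

2. cv2.HoughCircles()

Detects circles in an image. Again for unknown reasons, the results were still poor.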

Image Segmentation

1. cv2.watershed()

Uses the watershed algorithm. Of limited use: like many other APIs, these went into the museum once neural networks arrived.

Foreground Extraction

1. cv2.grabCut()

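Feature Detection and Description

1. cv2.cornerHarris()

2. cv2.cornerSubPix()

Harris corner detection, which is very useful. cv2.cornerSubPix() refines detected corners to sub-pixel accuracy.

3. cv2.goodFeaturesToTrack()

The Shi-Tomasi corner detection algorithm. It is widely used.

4. cv2.SIFT()

5. cv2.SURF()

These 2 APIs have already been moved out of the main module in OpenCV 3.1 (they live in the opencv_contrib xfeatures2d module).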

Background Subtraction

The main goal is to obtain the (moving) foreground. OpenCV implements 3 background subtraction algorithms: BackgroundSubtractorMOG, BackgroundSubtractorMOG2, and BackgroundSubtractorGMG.

1. cv2.createBackgroundSubtractorMOG2()

It uses a Gaussian mixture-based background/foreground segmentation algorithm, based on two papers: "Improved adaptive Gaussian mixture model for background subtraction" (2004) and "Efficient Adaptive Density Estimation per Image Pixel for the Task of Background Subtraction" (2006).

import numpy as np
import cv2

cap = cv2.VideoCapture("gemfield.mp4")
fgbg = cv2.createBackgroundSubtractorMOG2(history=20, detectShadows=False)
while True:
    ret, frame = cap.read()
    if not ret:
        # end of the video, or the read failed
        break
    fgmask = fgbg.apply(frame)  # foreground mask for this frame
    cv2.imshow("frame", fgmask)
    k = cv2.waitKey(30) & 0xff
    if k == 27:  # Esc quits
        break
cap.release()
cv2.destroyAllWindows()

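createBackgroundSubtractorMOG2 takes 3 parameters: detectShadows says whether to detect shadows, history sets how many previous frames influence the current one, and varThreshold sets the decision threshold, where a higher value means more pixels are classified as background.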

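Morphological Transformations

These need a kernel.

1. cv2.erode()

Erosion: bright regions are eroded (how much depends on the kernel).

2. cv2.dilate()

Dilation: with the default kernel, bright regions grow.

3. cv2.morphologyEx()

Depending on the flag, this performs an opening or a closing.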

Histograms

1. cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])

frame = cv2.imread("gemfield.jpg", 0)  # 0 loads the image as grayscale
hist = cv2.calcHist([frame], [0], None, [256], [0, 256])

Parameters:

1. images: the input image, wrapped in square brackets: "[img]".

2. channels: also given in square brackets. It is the index of the channel whose histogram is calculated. For a grayscale image the value is [0]. For a color image, pass [0], [1] or [2] to calculate the histogram of the blue, green or red channel respectively.

3. mask: the mask image. To compute the histogram of the full image, pass None. To compute the histogram of a particular region, create a mask image for that region and pass it here.

4. histSize: the bin count, given in square brackets. For full scale, pass [256]. A bin is a sub-range of pixel values; for example, 10 to 20 could be one bin.

5. ranges: the range of pixel values. Normally it is [0, 256].

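Perspective Transformation

This is very useful.

1. cv2.getPerspectiveTransform()

Computes the perspective transformation matrix.

2. cv2.warpPerspective()

Applies the perspective transformation directly to an image.

3. cv2.perspectiveTransform()

Uses the perspective transformation matrix to map points from the original coordinate system into the new one.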

Contours

1. cv2.findContours()

2. cv2.drawContours()

import numpy as np
import cv2

frame = cv2.imread("gemfield.jpg")
imgray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(imgray, 127, 255, cv2.THRESH_BINARY)
# OpenCV 3.x returns three values from findContours
image, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Draw all contours
frame = cv2.drawContours(frame, contours, -1, (0, 255, 0), 3)
# Draw the 4th contour (index 3)
frame = cv2.drawContours(frame, contours, 3, (0, 255, 0), 3)
# Equivalent: pass that single contour in a list
cnt = contours[3]
frame = cv2.drawContours(frame, [cnt], 0, (0, 255, 0), 3)

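As the code above shows, cv2.findContours() takes 3 parameters: the first is the input image (here a thresholded frame), the second selects the contour retrieval mode (contours can nest, which is why several modes exist), and the third is the contour approximation method. In OpenCV 3.x it returns three values: image, contours, and hierarchy. contours is a Python list of all contours in the image; each element is a NumPy array holding the coordinates of the points on one contour's boundary. hierarchy describes how the contours nest.

So what does the third parameter, the contour approximation method, actually mean? It decides which points are used to represent a contour. For example:

cv2.CHAIN_APPROX_NONE stores every point on the contour line;
cv2.CHAIN_APPROX_SIMPLE does the same but drops redundant points.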

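3. cv2.moments()

Contour moments. The moments are returned in a dictionary M, from which you compute what you need, for example the x and y coordinates of the centroid.

4. cv2.contourArea()

Computes the contour area.

5. cv2.arcLength()

Computes the contour perimeter.

6. Contour Approximation: cv2.approxPolyDP()

Implements the Douglas-Peucker algorithm, which approximates a contour with the simpler shape you are looking for.

7. cv2.convexHull()

This looks a bit like contour approximation, but it is not. The function checks the contour curve for parts that are not convex and, if it finds any, flattens them or bulges them out. The separate function cv2.isContourConvex() only performs the check (it does not return a corrected curve).

8. cv2.boundingRect()

Fits an upright rectangle around the contour. To get the rectangle with the smallest area, use cv2.minAreaRect() and cv2.boxPoints() to fit a rotated rectangle.

9. cv2.minEnclosingCircle()

Fits a circle around the contour.

10. cv2.fitEllipse()

Fits an ellipse to the contour.

11. cv2.fitLine()

Fits a straight line to the contour.

12. cv2.matchShapes()

Compares how similar 2 contours are.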

HOG (Histogram of Oriented Gradients)

1. cv2.HOGDescriptor()

import cv2

camera = cv2.VideoCapture("gemfield.mp4")  # assumed source; the original left camera undefined
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
while True:
    ret, frame = camera.read()
    if not ret:
        break
    found, weights = hog.detectMultiScale(frame, winStride=(8, 8), padding=(32, 32), scale=1.05)
    for x, y, w, h in found:
        # The HOG detector returns rectangles slightly larger than the real objects,
        # so shrink them a little to get a nicer output.
        pad_w, pad_h = int(0.15 * w), int(0.05 * h)
        cv2.rectangle(frame, (x + pad_w, y + pad_h), (x + w - pad_w, y + h - pad_h), (0, 255, 0), 1)
    cv2.imshow("img", frame)
    if cv2.waitKey(1) & 0xff == 27:  # Esc quits
        break

It is usually combined with an SVM classifier to detect pedestrians, an area where it has been fairly successful.

Image Inpainting

1. cv2.inpaint()

It requires a mask.

Face detection

You need to understand what a cascade is.

1. cv2.CascadeClassifier()

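Object tracking in video

1. cv2.meanShift()

Tracks a target using a clustering algorithm. It returns a rectangular track window.

2. cv2.calcOpticalFlowPyrLK()

OpenCV's implementation of the Lucas-Kanade optical flow algorithm. It is hard to get working on videos where the shots change a lot.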

Some useful GitHub projects

1. Shot boundary (scene cut) detection

Breakthrough/PySceneDetect