OpenCV 3 Study Notes

05-19

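1. Overview

OpenCV 3 is an advanced computer vision library for a wide range of image and video processing operations. With OpenCV 3 it is fairly easy to build promising, sophisticated applications such as face recognition or object tracking. Understanding the algorithms and models behind computer vision, as well as the basic concepts behind the OpenCV 3 API, helps when developing real-world applications such as security and surveillance tools.

Help documentation: OpenCV documentation index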

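2. Common file I/O interfaces

Working with still images:

img = cv2.imread(SrcPicPathName, param1)

SrcPicPathName: the path of the image, including the file name

param1: the read mode, one of the flags listed below (usually omitted, in which case a BGR image is returned)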

    IMREAD_UNCHANGED = -1,            //!< If set, return the loaded image as is (with alpha channel, otherwise it gets cropped).
    IMREAD_GRAYSCALE = 0,             //!< If set, always convert image to the single channel grayscale image.
    IMREAD_COLOR = 1,                 //!< If set, always convert image to the 3 channel BGR color image.
    IMREAD_ANYDEPTH = 2,              //!< If set, return 16-bit/32-bit image when the input has the corresponding depth, otherwise convert it to 8-bit.
    IMREAD_ANYCOLOR = 4,              //!< If set, the image is read in any possible color format.
    IMREAD_LOAD_GDAL = 8,             //!< If set, use the gdal driver for loading the image.
    IMREAD_REDUCED_GRAYSCALE_2 = 16,  //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/2.
    IMREAD_REDUCED_COLOR_2 = 17,      //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/2.
    IMREAD_REDUCED_GRAYSCALE_4 = 32,  //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/4.
    IMREAD_REDUCED_COLOR_4 = 33,      //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/4.
    IMREAD_REDUCED_GRAYSCALE_8 = 64,  //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/8.
    IMREAD_REDUCED_COLOR_8 = 65,      //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/8.
    IMREAD_IGNORE_ORIENTATION = 128   //!< If set, do not rotate the image according to EXIF's orientation flag.

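By default cv2.imread returns an image in BGR format. The usual color space order is red-green-blue (RGB), but cv2 stores the channels as BGR, i.e. in the reverse byte order.

In the default read mode (IMREAD_COLOR), cv2.imread discards all alpha-channel (transparency) information; pass IMREAD_UNCHANGED to keep it.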
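The channel order and the alpha channel can be illustrated without loading a file. This is a minimal sketch on a hand-made NumPy array; the pixel values and the variable names (bgr, rgb, bgra, bgr_only) are made up for illustration and are not from the original notes.

```python
import numpy as np

# A hypothetical 1x2 image in OpenCV's BGR layout:
# the first pixel is pure blue, the second is pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps B and R, which gives RGB order
# (useful before handing the array to a library that expects RGB).
rgb = bgr[:, :, ::-1]

# A 4-channel BGRA array, like the one IMREAD_UNCHANGED can return
# for a PNG with transparency (alpha = 128 is an arbitrary value).
alpha = np.full((1, 2), 128, dtype=np.uint8)
bgra = np.dstack([bgr, alpha])

# Keeping only the first three channels drops the alpha information.
bgr_only = bgra[:, :, :3]

print(rgb[0, 0], bgra.shape, bgr_only.shape)
```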

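img is a NumPy array with three useful attributes: img.shape, img.size and img.dtype.

shape: a tuple containing the number of rows (height), the number of columns (width) and, if the image is in color, the number of channels. This is handy when debugging image types; for a monochrome or grayscale image the tuple has no channel entry.

size: the total number of elements in the array (for a color image, pixels multiplied by channels).

dtype: the data type of the image elements, usually an unsigned integer type together with its bit width, for example uint8.

NumPy array indexing is the recommended way to access image data, for example to get a region of interest: my_roi = img[0:100, 100:200]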
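The three attributes and the region-of-interest slice can be checked on a synthetic array. The sketch below uses an all-black array as a stand-in for an image returned by cv2.imread; the 1280x720 size is an assumption chosen for illustration.

```python
import numpy as np

# Stand-in for a 1280x720 BGR image (720 rows, 1280 columns, 3 channels).
img = np.zeros((720, 1280, 3), dtype=np.uint8)
height, width, channels = img.shape

# A grayscale image has no channel entry in its shape.
gray = np.zeros((720, 1280), dtype=np.uint8)

# Region of interest: rows 0..99 and columns 100..199.
my_roi = img[0:100, 100:200]

# The slice is a view, so writing to it changes the original image.
my_roi[:] = 255

print(img.shape, img.size, img.dtype)
print(gray.shape, my_roi.shape)
```

Note that the first index selects rows (y) and the second selects columns (x), which is the opposite of the (x, y) order used by drawing functions such as cv2.rectangle.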

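img.item(x, y, ch):

x: the row index of the pixel

y: the column index of the pixel

ch: the channel of the pixel

img.itemset((x, y, ch), val):

Sets the value at row x, column y, channel ch to val. (ndarray.itemset was removed in NumPy 2.0; img[x, y, ch] = val does the same on any version.)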
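Reading and writing a single pixel value can be sketched as follows. The 4x4 array and the value 200 are arbitrary; plain index assignment is used for the write because it works regardless of the installed NumPy version.

```python
import numpy as np

# A tiny hypothetical BGR image, all black.
img = np.zeros((4, 4, 3), dtype=np.uint8)

# Write: set channel 0 (blue) of the pixel at row 1, column 2.
img[1, 2, 0] = 200

# Read: item() returns a plain Python int rather than a NumPy scalar.
value = img.item(1, 2, 0)

# Whole-channel operations are better done with slicing than pixel loops,
# e.g. zeroing the green channel of every pixel:
img[:, :, 1] = 0

print(value, type(value))
```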

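img.copy():

Returns a copy of the original image.

cv2.imwrite(DstPicNameAndFormat, img):

Writes an image that has been read or processed to the given path. The file extension (.jpg, .png and so on) determines the output format.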

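img = cv2.resize(img, (1280, 720), interpolation=cv2.INTER_AREA):

Resizes the image. The first argument is the source image, the second is the target size as a (width, height) tuple, and the last is the interpolation method. The main options are listed below (the CV_INTER_* names come from the old C API; in Python they are spelled cv2.INTER_*):

cv2.INTER_NEAREST (CV_INTER_NN) --> nearest-neighbor interpolation

cv2.INTER_LINEAR (CV_INTER_LINEAR) --> bilinear interpolation (the default)

cv2.INTER_AREA (CV_INTER_AREA) --> resampling using pixel area relation. When shrinking an image this avoids moiré patterns; when enlarging, it behaves similarly to cv2.INTER_NEAREST.

cv2.INTER_CUBIC (CV_INTER_CUBIC) --> bicubic interpolation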
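To make the idea of nearest-neighbor interpolation concrete, here is a small NumPy sketch. resize_nearest is a hypothetical helper written for these notes, not an OpenCV function, and its floor-based index mapping is an assumption: cv2.resize with cv2.INTER_NEAREST may pick slightly different source pixels.

```python
import numpy as np

def resize_nearest(img, dsize):
    """Nearest-neighbor resize; dsize is (width, height), as in cv2.resize."""
    new_w, new_h = dsize
    old_h, old_w = img.shape[:2]
    # For every target row/column, pick the source row/column it maps to.
    rows = np.arange(new_h) * old_h // new_h
    cols = np.arange(new_w) * old_w // new_w
    # Broadcasting rows (as a column vector) against cols builds the grid.
    return img[rows[:, None], cols]

small = np.array([[1, 2],
                  [3, 4]], dtype=np.uint8)
big = resize_nearest(small, (4, 4))   # each pixel becomes a 2x2 block
wide = resize_nearest(small, (6, 2))  # result has 2 rows and 6 columns
print(big)
print(wide.shape)
```

The second call is a reminder that the size tuple is (width, height) while the resulting shape is (height, width).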

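cv2.imshow("my img", img):

Shows the image img in a window named "my img".

cv2.waitKey():

The argument of waitKey is the time to wait for a keyboard event, in milliseconds. The return value is -1 (no key was pressed) or the ASCII code of the key.

A well-known bug: on some systems waitKey returns a value larger than the ASCII code. The following tip keeps only the low eight bits so that just the ASCII code remains:

    keycode = cv2.waitKey(1)
    if keycode != -1:
        keycode &= 0xFF

OpenCV windows are only updated when waitKey() is called, and waitKey() only captures input while an OpenCV window is the active window.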
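The masking tip can be wrapped in a small function and tried without opening a window. normalize_keycode is a hypothetical helper name, and 0x100071 is a made-up raw value standing in for what an affected system might return when 'q' is pressed.

```python
def normalize_keycode(keycode):
    """Return -1 for 'no key', otherwise only the low 8 bits of the code."""
    if keycode == -1:
        return -1
    return keycode & 0xFF

# Hypothetical raw value with extra high bits set; the low byte 0x71 is 'q'.
raw = 0x100071
key = normalize_keycode(raw)

if key == ord("q"):
    print("q pressed")
```

The -1 check has to come before the mask: masking -1 directly would give 255, which no longer means "no key pressed".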

cv2.destroyAllWindows():

Destroys all windows created by OpenCV.

cv2.destroyWindow("my window"):

Destroys only the OpenCV window named "my window".

cv2.namedWindow("my window"):

Creates a window named "my window"; images can then be shown in it.

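Reading and writing video files (OpenCV provides the VideoCapture and VideoWriter classes, which support a variety of video formats; the AVI format is supported across systems):

    videoCapture = cv2.VideoCapture(SrcVideopathandname)
    fps = videoCapture.get(cv2.CAP_PROP_FPS)  # get the frame rate of the video
    size = (int(videoCapture.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(videoCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)))  # get the frame width and height

SrcVideopathandname: the path and file name of the video, e.g. "MyInputVid.avi"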

Commonly used video properties:

    CAP_PROP_POS_MSEC = 0,       //!< Current position of the video file in milliseconds.
    CAP_PROP_POS_FRAMES = 1,     //!< 0-based index of the frame to be decoded/captured next.
    CAP_PROP_POS_AVI_RATIO = 2,  //!< Relative position of the video file: 0=start of the film, 1=end of the film.
    CAP_PROP_FRAME_WIDTH = 3,    //!< Width of the frames in the video stream.
    CAP_PROP_FRAME_HEIGHT = 4,   //!< Height of the frames in the video stream.
    CAP_PROP_FPS = 5,            //!< Frame rate.
    CAP_PROP_FOURCC = 6,         //!< 4-character code of codec. see VideoWriter::fourcc .
    CAP_PROP_FRAME_COUNT = 7,    //!< Number of frames in the video file.
    CAP_PROP_FORMAT = 8,         //!< Format of the %Mat objects returned by VideoCapture::retrieve().
    CAP_PROP_MODE = 9,           //!< Backend-specific value indicating the current capture mode.
    CAP_PROP_BRIGHTNESS = 10,    //!< Brightness of the image (only for those cameras that support).
    CAP_PROP_CONTRAST = 11,      //!< Contrast of the image (only for cameras).
    CAP_PROP_SATURATION = 12,    //!< Saturation of the image (only for cameras).
    CAP_PROP_HUE = 13,           //!< Hue of the image (only for cameras).
    CAP_PROP_GAIN = 14,          //!< Gain of the image (only for those cameras that support).

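videoWriter = cv2.VideoWriter("MyOutputVid.avi", cv2.VideoWriter_fourcc('I','4','2','0'), fps, size)

Param1: the video file name; required

Param2: the video codec; required. Commonly used codecs are:

'I','4','2','0': uncompressed YUV encoding with 4:2:0 chroma subsampling. It is widely compatible but produces large files; file extension .avi

'P','I','M','1': MPEG-1 encoding; file extension .avi

'X','V','I','D': MPEG-4 encoding, which yields a video of average size; file extension .avi

'T','H','E','O': Ogg Theora (the notes originally said Ogg Vorbis, which is the matching audio codec); file extension .ogv

'F','L','V','1': Flash video; file extension .flv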
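A FOURCC is just four characters packed into one 32-bit integer, lowest byte first. The sketch below shows that conventional packing in pure Python; fourcc_to_int and int_to_fourcc are hypothetical helper names, and the second one is the kind of decoding that is typically applied to the (float) value returned by get(cv2.CAP_PROP_FOURCC).

```python
def fourcc_to_int(c1, c2, c3, c4):
    """Pack four characters into one integer, first character in the low byte."""
    return ord(c1) | (ord(c2) << 8) | (ord(c3) << 16) | (ord(c4) << 24)

def int_to_fourcc(code):
    """Unpack an integer (or float holding an integer) into a 4-character string."""
    code = int(code)
    return "".join(chr((code >> (8 * i)) & 0xFF) for i in range(4))

xvid = fourcc_to_int("X", "V", "I", "D")
print(hex(xvid), int_to_fourcc(xvid))
print(int_to_fourcc(float(fourcc_to_int("I", "4", "2", "0"))))
```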

    # Write the frames grabbed by videoCapture, one by one, to the file behind videoWriter.
    success, frame = videoCapture.read()
    while success:  # Loop until there are no more frames.
        videoWriter.write(frame)
        success, frame = videoCapture.read()

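Capturing frames from a camera (this also uses the VideoCapture class, but a device index is passed instead of a video file name):

    import cv2

    cameraCapture = cv2.VideoCapture(0)  # capture from the camera with index 0
    fps = 30  # an assumption: get() often cannot report a camera's real frame rate and returns 0
    size = (int(cameraCapture.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cameraCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)))  # width and height of the captured frames
    videoWriter = cv2.VideoWriter("MyOutputVid.avi",
                                  cv2.VideoWriter_fourcc('I','4','2','0'),
                                  fps, size)  # output file name, codec, frame rate and frame size
    success, frame = cameraCapture.read()
    numFramesRemaining = 10 * fps - 1  # total number of frames for 10 seconds of video
    while success and numFramesRemaining > 0:
        videoWriter.write(frame)  # write the captured frame to the file
        success, frame = cameraCapture.read()
        numFramesRemaining -= 1
    cameraCapture.release()  # release the camera

When synchronizing a group of cameras or a multi-head camera (such as a stereo camera), read() is no longer suitable. The usual approach is to grab from every camera first and retrieve afterwards:

    success0 = cameraCapture0.grab()
    success1 = cameraCapture1.grab()
    if success0 and success1:
        # retrieve() returns a (success, frame) pair
        _, frame0 = cameraCapture0.retrieve()
        _, frame1 = cameraCapture1.retrieve()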

3. Common detection methods

a. Canny edge detection

    import cv2
    import numpy as np

    img = cv2.imread("../images/statue_small.jpg", 0)  # 0 = read as grayscale
    cv2.imwrite("canny.jpg", cv2.Canny(img, 200, 300))  # low threshold 200, high threshold 300
    cv2.imshow("canny", cv2.imread("canny.jpg"))
    cv2.waitKey()
    cv2.destroyAllWindows()

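The five steps of the Canny edge detection algorithm:

i. Denoise the image with a Gaussian filter

ii. Compute the gradients

iii. Apply non-maximum suppression (NMS) on the edges

iv. Apply a double threshold to the detected edges to remove false positives

v. Analyze all the edges and the connections between them, keeping the real edges and discarding the weak ones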
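Steps iv and v can be sketched in NumPy as a double threshold followed by edge tracking. This is an illustrative sketch, not OpenCV's implementation: hysteresis is a hypothetical helper, it works on an already computed gradient-magnitude array, and the exact comparison rules (>= versus >) inside cv2.Canny may differ.

```python
import numpy as np

def hysteresis(magnitude, low, high):
    """Keep strong pixels, plus weak pixels connected to them (8-neighborhood)."""
    strong = magnitude >= high
    weak = (magnitude >= low) & ~strong
    edges = strong.copy()
    h, w = edges.shape
    while True:
        # Mark every pixel that has at least one edge pixel in its 3x3 window.
        padded = np.pad(edges, 1, mode="constant", constant_values=False)
        near_edge = np.zeros_like(edges)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                near_edge |= padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        # Promote weak pixels that touch an existing edge.
        grown = edges | (weak & near_edge)
        if np.array_equal(grown, edges):
            return edges  # nothing changed: tracking has converged
        edges = grown

# Made-up gradient magnitudes: one strong pixel (250), a chain of weak
# pixels next to it, and an isolated weak pixel at the end.
magnitude = np.array([[0, 60, 60, 250, 0, 60]])
result = hysteresis(magnitude, low=50, high=200)
print(result.astype(int))
```

The weak pixels chained to the strong one survive, while the isolated weak pixel is discarded as a likely false positive.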

b. Contour detection

    import cv2
    import numpy as np

    img = np.zeros((200, 200), dtype=np.uint8)  # an all-black 200x200 image
    img[50:150, 50:150] = 255  # set a square region to white
    ret, thresh = cv2.threshold(img, 127, 255, 0)  # threshold the image
    # cv2.findContours takes three arguments: the input image, the hierarchy type
    # and the contour approximation method. OpenCV 3 returns three values here.
    image, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    color = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)  # convert grayscale to color
    img = cv2.drawContours(color, contours, -1, (0, 255, 0), 2)  # draw the contours
    cv2.imshow("contours", color)  # show the image
    cv2.waitKey()
    cv2.destroyAllWindows()

c. Bounding box, minimum-area rectangle and minimum enclosing circle of a contour

    import cv2
    import numpy as np

    img = cv2.pyrDown(cv2.imread("hammer.jpg", cv2.IMREAD_UNCHANGED))
    ret, thresh = cv2.threshold(cv2.cvtColor(img.copy(), cv2.COLOR_BGR2GRAY),
                                127, 255, cv2.THRESH_BINARY)
    # OpenCV 3 returns three values from findContours
    image, contours, hier = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                                             cv2.CHAIN_APPROX_SIMPLE)

    for c in contours:
        # find bounding box coordinates: first compute a simple bounding box
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)  # draw the box

        # find the minimum-area (rotated) rectangle
        rect = cv2.minAreaRect(c)
        # calculate coordinates of the minimum area rectangle
        box = cv2.boxPoints(rect)
        # normalize coordinates to integers
        box = np.int32(box)
        # draw the rectangle as a contour
        cv2.drawContours(img, [box], 0, (0, 0, 255), 3)

        # calculate center and radius of the minimum enclosing circle
        (x, y), radius = cv2.minEnclosingCircle(c)
        # cast to integers
        center = (int(x), int(y))
        radius = int(radius)
        # draw the circle
        img = cv2.circle(img, center, radius, (0, 255, 0), 2)

    cv2.drawContours(img, contours, -1, (255, 0, 0), 1)
    cv2.imshow("contours", img)
    cv2.waitKey()  # without this call the window is never refreshed
    cv2.destroyAllWindows()

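d. Line detection

    import cv2
    import numpy as np

    img = cv2.imread("lines.jpg")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 120)
    minLineLength = 20
    maxLineGap = 5
    # Call cv2.HoughLinesP to detect line segments. minLineLength and maxLineGap
    # must be passed by keyword: the fifth positional parameter is the output array.
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 100,
                            minLineLength=minLineLength, maxLineGap=maxLineGap)
    if lines is not None:  # None is returned when no line is found
        for line in lines:  # in OpenCV 3 each entry has shape (1, 4)
            x1, y1, x2, y2 = line[0]
            cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("edges", edges)
    cv2.imshow("lines", img)
    cv2.waitKey()
    cv2.destroyAllWindows()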
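The arguments 1 and np.pi/180 in the call above are the rho and theta resolutions of the Hough accumulator. The voting idea behind them can be sketched in NumPy as shown below. This illustrates the standard Hough transform only; cv2.HoughLinesP is a probabilistic variant that also extracts segment end points. hough_peak and the test points are made up for the illustration.

```python
import numpy as np

def hough_peak(points, max_rho):
    """Vote in (rho, theta) space at 1 pixel / 1 degree resolution.

    Returns (rho, theta in degrees, votes) of the strongest line, where a
    line is described by rho = x*cos(theta) + y*sin(theta).
    """
    degrees = np.arange(180)
    thetas = np.deg2rad(degrees)
    accumulator = np.zeros((2 * max_rho + 1, 180), dtype=np.int32)
    for x, y in points:
        # Every point votes once per angle, for the rho it implies.
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + max_rho, degrees] += 1
    rho_idx, theta_idx = np.unravel_index(np.argmax(accumulator),
                                          accumulator.shape)
    return rho_idx - max_rho, theta_idx, accumulator[rho_idx, theta_idx]

# Edge points of a horizontal line y = 20, as Canny might produce.
points = [(x, 20) for x in range(50)]
rho, theta, votes = hough_peak(points, max_rho=100)
print(rho, theta, votes)
```

A horizontal line has its normal pointing straight down the y axis, so the peak is expected at theta = 90 degrees with rho equal to the line's y coordinate. The fourth argument of cv2.HoughLinesP (100 above) is the minimum number of such votes a candidate must collect.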

e. Circle detection

    import cv2
    import numpy as np

    planets = cv2.imread("planet_glow.jpg")
    gray_img = cv2.cvtColor(planets, cv2.COLOR_BGR2GRAY)
    img = cv2.medianBlur(gray_img, 5)  # blur to reduce false detections
    cimg = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    # 1 = accumulator resolution ratio, 120 = minimum distance between circle centers
    circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, 1, 120,
                               param1=100, param2=30, minRadius=0, maxRadius=0)
    circles = np.uint16(np.around(circles))
    for i in circles[0, :]:
        # draw the outer circle
        cv2.circle(planets, (i[0], i[1]), i[2], (0, 255, 0), 2)
        # draw the center of the circle
        cv2.circle(planets, (i[0], i[1]), 2, (0, 0, 255), 3)
    cv2.imwrite("planets_circles.jpg", planets)
    cv2.imshow("HoughCircles", planets)
    cv2.waitKey()
    cv2.destroyAllWindows()

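4. Summary

I finished reading 《OpenCV 3計算機視覺（python語言實現第2版）》 (OpenCV 3 Computer Vision, Python implementation, 2nd edition) a long time ago, but things that are not used regularly are easy to forget. As the saying goes, the palest ink is better than the best memory, so I jotted down these notes to use as a quick reference whenever I need them.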