Caffe2 Tutorial -- 2. Image Loading and Preprocessing

tags: [Deep Learning]

Caffe2 Tutorial

This tutorial comes from the official Caffe2 website.

Caffe2 website: caffe2.ai/

Caffe2 github: github.com/caffe2/caffe


Translated and compiled by: 張天亮

Email: tianliangjay@gmail.com

Blog:xingkongliang.github.io


This tutorial has 6 parts:

  1. Common Caffe2 functions (workspace, operators, nets)
  2. Image loading and preprocessing
  3. Loading a pretrained model
  4. Python Op tutorial
  5. A simple regression model
  6. A LeNet network on the MNIST dataset

2. Image Loading and Preprocessing

```python
%matplotlib inline
import skimage
import skimage.io as io
import skimage.transform
import sys
import numpy as np
import math
from matplotlib import pyplot
import matplotlib.image as mpimg
print("Required modules imported.")
```

Caffe Uses BGR Order

Because of OpenCV's legacy support in Caffe, and because OpenCV handles images in blue-green-red (BGR) order rather than the more common red-green-blue (RGB) order, Caffe2 also expects BGR order.

```python
# You can load either local IMAGE_FILE or remote URL
# For Round 1 of this tutorial, try a local image.
IMAGE_LOCATION = "images/cat.jpg"
img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32)

# test color reading
# show the original image
pyplot.figure()
pyplot.subplot(1, 2, 1)
pyplot.imshow(img)
pyplot.axis('on')
pyplot.title('Original image = RGB')

# show the image in BGR - just doing RGB->BGR temporarily for display
imgBGR = img[:, :, (2, 1, 0)]
# pyplot.figure()
pyplot.subplot(1, 2, 2)
pyplot.imshow(imgBGR)
pyplot.axis('on')
pyplot.title('OpenCV, Caffe2 = BGR')
```

Caffe Prefers CHW Order

  • H: Height
  • W: Width
  • C: Channel (as in color)
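Since the HWC-to-CHW conversion only appears much later in this tutorial, here is a minimal NumPy sketch (not from the original) of what the axis reordering does:

```python
import numpy as np

# A tiny hypothetical 2x3 "image" with 3 color channels, in HWC order.
img_hwc = np.arange(2 * 3 * 3).reshape(2, 3, 3)
print("HWC shape:", img_hwc.shape)  # (2, 3, 3) = (H, W, C)

# Move the channel axis to the front: HWC -> CHW.
img_chw = np.transpose(img_hwc, (2, 0, 1))
print("CHW shape:", img_chw.shape)  # (3, 2, 3) = (C, H, W)

# Each pixel keeps its value; only the arrangement of axes changes.
assert img_chw[1, 0, 0] == img_hwc[0, 0, 1]
```

`np.transpose` with an axis permutation is equivalent to the chained `swapaxes` calls used later in this tutorial.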

```python
# Image came in sideways - it should be a portrait image!
# How you detect this depends on the platform
# Could be a flag from the camera object
# Could be in the EXIF data
# ROTATED_IMAGE = "https://upload.wikimedia.org/wikipedia/commons/8/87/Cell_Phone_Tower_in_Ladakh_India_with_Buddhist_Prayer_Flags.jpg"
ROTATED_IMAGE = "images/cell-tower.jpg"
imgRotated = skimage.img_as_float(skimage.io.imread(ROTATED_IMAGE)).astype(np.float32)
pyplot.figure()
pyplot.imshow(imgRotated)
pyplot.axis('on')
pyplot.title('Rotated image')

# Image came in flipped or mirrored - text is backwards!
# Again detection depends on the platform
# This one is intended to be read by drivers in their rear-view mirror
# MIRROR_IMAGE = "https://upload.wikimedia.org/wikipedia/commons/2/27/Mirror_image_sign_to_be_read_by_drivers_who_are_backing_up_-b.JPG"
MIRROR_IMAGE = "images/mirror-image.jpg"
imgMirror = skimage.img_as_float(skimage.io.imread(MIRROR_IMAGE)).astype(np.float32)
pyplot.figure()
pyplot.imshow(imgMirror)
pyplot.axis('on')
pyplot.title('Mirror image')
```

Image Manipulation Operations

Mirroring

```python
# Run me to flip the image back and forth
imgMirror = np.fliplr(imgMirror)
pyplot.figure()
pyplot.imshow(imgMirror)
pyplot.axis('off')
pyplot.title('Mirror image')
```

Rotation

```python
# Run me to rotate the image 90 degrees
imgRotated = np.rot90(imgRotated)
pyplot.figure()
pyplot.imshow(imgRotated)
pyplot.axis('off')
pyplot.title('Rotated image')
```

Resizing

```python
# Model is expecting 224 x 224, so resize/crop needed.
# Here are the steps we use to preprocess the image.
# (1) Resize the image to 256*256, and crop out the center.
input_height, input_width = 224, 224
print("Model's input shape is %dx%d" % (input_height, input_width))
# print("Original image is %dx%d") % (skimage.)
img256 = skimage.transform.resize(img, (256, 256))
pyplot.figure()
pyplot.imshow(img256)
pyplot.axis('on')
pyplot.title('Resized image to 256x256')
print("New image shape:" + str(img256.shape))
```

Rescaling

```python
print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!")
print("Model's input shape is %dx%d" % (input_height, input_width))
aspect = img.shape[1] / float(img.shape[0])
print("Original aspect ratio: " + str(aspect))
if aspect > 1:
    # landscape orientation - wide image
    res = int(aspect * input_height)
    imgScaled = skimage.transform.resize(img, (input_height, res))
if aspect < 1:
    # portrait orientation - tall image
    res = int(input_width / aspect)
    imgScaled = skimage.transform.resize(img, (res, input_width))
if aspect == 1:
    imgScaled = skimage.transform.resize(img, (input_height, input_width))
pyplot.figure()
pyplot.imshow(imgScaled)
pyplot.axis('on')
pyplot.title('Rescaled image')
print("New image shape:" + str(imgScaled.shape) + " in HWC")
```

Output:

```
Original image shape:(360, 480, 3) and remember it should be in H, W, C!
Model's input shape is 224x224
Original aspect ratio: 1.33333333333
New image shape:(224, 298, 3) in HWC
```

Cropping

```python
# Compare the images and cropping strategies
# Try a center crop on the original for giggles
print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!")

def crop_center(img, cropx, cropy):
    y, x, c = img.shape
    startx = x // 2 - (cropx // 2)
    starty = y // 2 - (cropy // 2)
    return img[starty:starty + cropy, startx:startx + cropx]

# yes, the function above should match resize and take a tuple...
pyplot.figure()

# Original image
imgCenter = crop_center(img, 224, 224)
pyplot.subplot(1, 3, 1)
pyplot.imshow(imgCenter)
pyplot.axis('on')
pyplot.title('Original')

# Now let's see what this does on the distorted image
img256Center = crop_center(img256, 224, 224)
pyplot.subplot(1, 3, 2)
pyplot.imshow(img256Center)
pyplot.axis('on')
pyplot.title('Squeezed')

# Scaled image
imgScaledCenter = crop_center(imgScaled, 224, 224)
pyplot.subplot(1, 3, 3)
pyplot.imshow(imgScaledCenter)
pyplot.axis('on')
pyplot.title('Scaled')
```

Upscaling

```python
imgTiny = "images/Cellsx128.png"
imgTiny = skimage.img_as_float(skimage.io.imread(imgTiny)).astype(np.float32)
print("Original image shape: ", imgTiny.shape)
imgTiny224 = skimage.transform.resize(imgTiny, (224, 224))
print("Upscaled image shape: ", imgTiny224.shape)
```

Output:

```
Original image shape: (128, 128, 4)
Upscaled image shape: (224, 224, 4)
```

PNG images have 4 channels (RGBA).
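If the model expects 3-channel input, the alpha channel has to be dropped before the rest of the pipeline. A minimal sketch (this slicing step is an assumption, not shown in the tutorial; the array here is synthetic rather than a loaded PNG):

```python
import numpy as np

# Hypothetical RGBA image (H, W, 4), e.g. as loaded from a PNG by skimage.
imgRGBA = np.random.rand(224, 224, 4).astype(np.float32)

# Keep only the first three channels (R, G, B), discarding alpha.
imgRGB = imgRGBA[:, :, :3]
print("Shape after dropping alpha:", imgRGB.shape)  # (224, 224, 3)
```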

Final Processing and Batching

We first switch the image's channel order to BGR, then reorder the axes for GPU processing (HWC -> CHW), and finally add a fourth dimension (N) representing the number of images. The final order is: N, C, H, W.

```python
# this next line helps with being able to rerun this section
# if you want to try the outputs of the different crop strategies above
# swap out imgScaled with img (original) or img256 (squeezed)
imgCropped = crop_center(imgScaled, 224, 224)
print("Image shape before HWC --> CHW conversion: ", imgCropped.shape)

# (1) Since Caffe expects CHW order and the current image is HWC,
# we will need to change the order.
imgCropped = imgCropped.swapaxes(1, 2).swapaxes(0, 1)
print("Image shape after HWC --> CHW conversion: ", imgCropped.shape)

pyplot.figure()
for i in range(3):
    # For some reason, pyplot subplot follows Matlab's indexing
    # convention (starting with 1). Well, we'll just follow it...
    pyplot.subplot(1, 3, i + 1)
    pyplot.imshow(imgCropped[i])
    pyplot.axis('off')
    pyplot.title('RGB channel %d' % (i + 1))

# (2) Caffe uses a BGR order due to legacy OpenCV issues, so we
# will change RGB to BGR.
imgCropped = imgCropped[(2, 1, 0), :, :]
print("Image shape after BGR conversion: ", imgCropped.shape)

# for discussion later - not helpful at this point
# (3) We will subtract the mean image. Note that skimage loads
# image in the [0, 1] range so we multiply the pixel values
# first to get them into [0, 255].
# mean_file = os.path.join(CAFFE_ROOT, 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
# mean = np.load(mean_file).mean(1).mean(1)
# img = img * 255 - mean[:, np.newaxis, np.newaxis]

pyplot.figure()
for i in range(3):
    pyplot.subplot(1, 3, i + 1)
    pyplot.imshow(imgCropped[i])
    pyplot.axis('off')
    pyplot.title('BGR channel %d' % (i + 1))

# (4) finally, since caffe2 expects the input to have a batch term
# so we can feed in multiple images, we will simply prepend a
# batch dimension of size 1. Also, we will make sure image is
# of type np.float32.
imgCropped = imgCropped[np.newaxis, :, :, :].astype(np.float32)
print("Final input shape is:", imgCropped.shape)
```

Key Points

```python
# HWC to CHW
imgCropped = imgCropped.swapaxes(1, 2).swapaxes(0, 1)
# RGB to BGR
imgCropped = imgCropped[(2, 1, 0), :, :]
# prepend a batch dimension of size 1
imgCropped = imgCropped[np.newaxis, :, :, :].astype(np.float32)
```
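The whole pipeline above (rescale, center crop, HWC -> CHW, RGB -> BGR, add batch dimension) can be sketched as one helper function. This is a sketch under the tutorial's assumptions (skimage, 224x224 input, image already loaded as an HWC RGB float array), not code from the original:

```python
import numpy as np
import skimage
import skimage.transform

def prepare_image(img, input_height=224, input_width=224):
    """Preprocess an HWC RGB float image into an NCHW, BGR, float32 batch."""
    # Rescale so the short side matches the target, preserving aspect ratio.
    aspect = img.shape[1] / float(img.shape[0])
    if aspect > 1:  # landscape orientation - wide image
        img = skimage.transform.resize(img, (input_height, int(aspect * input_height)))
    else:           # portrait orientation or square
        img = skimage.transform.resize(img, (int(input_width / aspect), input_width))
    # Center crop to the exact input size.
    y, x = img.shape[0], img.shape[1]
    startx = x // 2 - input_width // 2
    starty = y // 2 - input_height // 2
    img = img[starty:starty + input_height, startx:startx + input_width]
    # HWC -> CHW, then RGB -> BGR, then prepend the batch dimension (N).
    img = img.swapaxes(1, 2).swapaxes(0, 1)
    img = img[(2, 1, 0), :, :]
    return img[np.newaxis, :, :, :].astype(np.float32)

# Example with a synthetic landscape "image" (no file I/O needed).
batch = prepare_image(np.random.rand(360, 480, 3).astype(np.float32))
print("Final input shape is:", batch.shape)  # (1, 3, 224, 224)
```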
