TensorFlow Tutorial #06 - CIFAR-10

01-24

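This article demonstrates image recognition on the CIFAR-10 data-set.
Large parts of the text and code are repeated from the earlier tutorials, so readers who have seen those can skim through them.
01 - Simple Linear Model / 02 - Convolutional Neural Network / 03 - PrettyTensor / 04 - Save & Restore / 05 - Ensemble Learning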

by Magnus Erik Hvass Pedersen / GitHub / Videos on YouTube

Chinese translation by thrillerist / GitHub

If you repost this article, please include a link to it.

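Introduction

This tutorial shows how to make a Convolutional Neural Network for classifying images in the CIFAR-10 data-set. It also shows how to use different networks during training and testing.

This builds on the previous tutorials, so you should have a basic understanding of TensorFlow and the add-on package Pretty Tensor. Much of the source-code and text is similar to the previous tutorials and may be read quickly if you have recently read them.

Flowchart

The following chart shows roughly how the data flows in the Convolutional Neural Network that is implemented below. First the network has a pre-processing layer which distorts the input images so as to artificially inflate the training-set. Then the network has two convolutional layers, two fully-connected layers and finally a softmax classification layer. The weights and outputs of the convolutional layers are shown in larger plots further below, and Tutorial #02 goes into more detail on how convolution works.

In this case the image is mis-classified. The image actually shows a dog, but the neural network is unsure whether it is a dog or a cat and thinks the image is most likely a cat.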

from IPython.display import Image
Image("images/06_network_flowchart.png")

Imports

%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from sklearn.metrics import confusion_matrix
import time
from datetime import timedelta
import math
import os

# Use PrettyTensor to simplify Neural Network construction.
import prettytensor as pt

This was developed using Python 3.5.2 (Anaconda) and TensorFlow version:

tf.__version__

"0.12.0-rc0"

PrettyTensor version:

pt.__version__

"0.7.1"

Load Data

import cifar10

Set the path for storing the data-set on your computer.

# cifar10.data_path = "data/CIFAR-10/"

The CIFAR-10 data-set is about 163 MB and will be downloaded automatically if it is not found in the given path.

cifar10.maybe_download_and_extract()

Data has apparently already been downloaded and unpacked.

Load the class-names.

class_names = cifar10.load_class_names()
class_names

Loading data: data/CIFAR-10/cifar-10-batches-py/batches.meta
["airplane",
"automobile",
"bird",
"cat",
"deer",
"dog",
"frog",
"horse",
"ship",
"truck"]

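Load the training-set. This returns the images, the class-numbers as integers, and the class-numbers as One-Hot encoded arrays, which are called labels.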

images_train, cls_train, labels_train = cifar10.load_training_data()

Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_1
Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_2
Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_3
Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_4
Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_5
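
To make the relationship between the integer class-numbers and the One-Hot encoded labels concrete, here is a minimal NumPy sketch. The helper name one_hot_encoded and the sample class-numbers are made up for illustration, and this is not necessarily how the cifar10 module builds its labels.

```python
import numpy as np

def one_hot_encoded(class_numbers, num_classes):
    # Row i of the identity matrix is the One-Hot vector for class i,
    # so indexing with the class-numbers picks one row per image.
    return np.eye(num_classes, dtype=float)[class_numbers]

# Hypothetical class-numbers for three images (cat, ship, airplane).
cls = np.array([3, 8, 0])
labels = one_hot_encoded(cls, num_classes=10)

print(labels.shape)
print(labels[0])

# Taking argmax along each row recovers the integer class-numbers.
print(np.argmax(labels, axis=1))
```

This also shows why argmax is used later in the tutorial to turn One-Hot labels back into class-numbers.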

Load the test-set.

images_test, cls_test, labels_test = cifar10.load_test_data()

Loading data: data/CIFAR-10/cifar-10-batches-py/test_batch

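The CIFAR-10 data-set has now been loaded and consists of 60,000 images and associated labels (i.e. classifications of the images). The data-set is split into 2 mutually exclusive sub-sets, the training-set and the test-set.

print("Size of:")
print("- Training-set: {}".format(len(images_train)))
print("- Test-set: {}".format(len(images_test)))

Size of:
- Training-set: 50000
- Test-set: 10000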

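Data Dimensions

The data dimensions are used in several places in the source-code below. They have already been defined in the cifar10 module, so we just need to import them.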

from cifar10 import img_size, num_channels, num_classes

The images are 32 x 32 pixels, but we will crop the images to 24 x 24 pixels.

img_size_cropped = 24

Helper-function for plotting images

This function plots 9 images in a 3x3 grid and writes the true and predicted classes below each image.

def plot_images(images, cls_true, cls_pred=None, smooth=True):
    assert len(images) == len(cls_true) == 9

    # Create figure with sub-plots.
    fig, axes = plt.subplots(3, 3)

    # Adjust vertical spacing to leave room for the longer label
    # when the predicted classes are shown as well.
    if cls_pred is None:
        hspace = 0.3
    else:
        hspace = 0.6
    fig.subplots_adjust(hspace=hspace, wspace=0.3)

    for i, ax in enumerate(axes.flat):
        # Interpolation type.
        if smooth:
            interpolation = "spline16"
        else:
            interpolation = "nearest"

        # Plot image.
        ax.imshow(images[i, :, :, :], interpolation=interpolation)

        # Name of the true class.
        cls_true_name = class_names[cls_true[i]]

        # Show true and predicted classes.
        if cls_pred is None:
            xlabel = "True: {0}".format(cls_true_name)
        else:
            # Name of the predicted class.
            cls_pred_name = class_names[cls_pred[i]]
            xlabel = "True: {0} Pred: {1}".format(cls_true_name, cls_pred_name)

        # Show the classes as the label on the x-axis.
        ax.set_xlabel(xlabel)

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

Plot a few images to see if the data is correct

# Get the first images from the test-set.
images = images_test[0:9]

# Get the true classes for those images.
cls_true = cls_test[0:9]

# Plot the images and labels using our helper-function above.
plot_images(images=images, cls_true=cls_true, smooth=False)

The pixelated images above are what the neural network gets as input. The images may be easier for the human eye to recognize if we smooth the pixels.

plot_images(images=images, cls_true=cls_true, smooth=True)

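TensorFlow Graph

The entire purpose of TensorFlow is to have a so-called computational graph that can be executed much more efficiently than if the same calculations were performed directly in Python. TensorFlow can be more efficient than NumPy because TensorFlow knows the entire computational graph that must be executed, while NumPy only knows the computation of a single mathematical operation at a time.

TensorFlow can also automatically calculate the gradients that are needed to optimize the variables of the graph so as to make the model perform better. This is because the graph is a combination of simple mathematical expressions, so the gradient of the entire graph can be derived using the chain-rule for derivatives.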
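
As a rough illustration of the chain-rule idea, the sketch below differentiates a tiny two-step computation by hand and checks the result against a finite-difference estimate. The function and its numbers are made up for this example; TensorFlow carries out this kind of derivation automatically over the whole graph.

```python
def forward(w, x=3.0, t=5.0):
    # Two simple operations chained together, like nodes in a graph.
    u = w * x            # first node
    cost = (u - t) ** 2  # second node
    return u, cost

def gradient(w, x=3.0, t=5.0):
    # Chain rule: dcost/dw = dcost/du * du/dw.
    u, _ = forward(w, x, t)
    dcost_du = 2.0 * (u - t)
    du_dw = x
    return dcost_du * du_dw

w = 2.0
analytic = gradient(w)

# Finite-difference estimate of the same derivative, as a sanity check.
eps = 1e-6
numeric = (forward(w + eps)[1] - forward(w - eps)[1]) / (2 * eps)

print(analytic, numeric)
```

The two numbers should agree closely, which is the property an optimizer relies on when it follows the gradient.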

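TensorFlow can also take advantage of multi-core CPUs as well as GPUs. Google has even built special chips for TensorFlow called TPUs (Tensor Processing Units), which are claimed to be even faster than GPUs.

A TensorFlow graph consists of the following parts, which are detailed below:

Placeholder variables used for inputting data to the graph.
Model variables that are going to be optimized so as to make the model perform better.
The model, which is essentially a mathematical function that calculates some output from the input in the placeholder variables and the model variables.
A cost measure that can be used to guide the optimization of the variables.
An optimization method which updates the variables of the model.

In addition, the TensorFlow graph may also contain various debugging statements, e.g. for logging data to be displayed using TensorBoard, which is not covered in this tutorial.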

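Placeholder variables

Placeholder variables serve as the input to the graph, and we may change them each time we execute the graph. This is called feeding the placeholder variables and is demonstrated further below.

First we define the placeholder variable for the input images. This allows us to change the images that are input to the TensorFlow graph. This is a so-called tensor, which just means that it is a multi-dimensional array. The data-type is set to float32 and the shape is set to [None, img_size, img_size, num_channels], where None means that the tensor may hold an arbitrary number of images, each image being img_size pixels high and img_size pixels wide, with num_channels colour channels.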

x = tf.placeholder(tf.float32, shape=[None, img_size, img_size, num_channels], name="x")

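Next we define the placeholder variable for the true labels associated with the images that are input in the placeholder variable x. The shape of this placeholder variable is [None, num_classes], which means it may hold an arbitrary number of labels, and each label is a vector of length num_classes, which is 10 in this case.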

y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name="y_true")

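We could also have a placeholder variable for the class-number, but we will instead calculate it using argmax. Note that this is a TensorFlow operator, so nothing is calculated at this point.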

y_true_cls = tf.argmax(y_true, dimension=1)

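Helper-functions for creating Pre-Processing

The following helper-functions create the part of the TensorFlow computational graph that pre-processes the input images. Nothing is actually calculated at this point; the functions merely add nodes to the computational graph for TensorFlow.

The pre-processing is different for the training and testing of the neural network:

For training, the input images are randomly cropped and randomly flipped horizontally, and the hue, contrast, brightness and saturation are adjusted with random values. This artificially inflates the size of the training-set by creating random variations of the original input images. Examples of distorted images are shown further below.
For testing, the input images are cropped around the centre and nothing else is adjusted.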

def pre_process_image(image, training):
    # This function takes a single image as input,
    # and a boolean whether to build the training or testing graph.

    if training:
        # For training, add the following to the TensorFlow graph.

        # Randomly crop the input image.
        image = tf.random_crop(image, size=[img_size_cropped, img_size_cropped, num_channels])

        # Randomly flip the image horizontally.
        image = tf.image.random_flip_left_right(image)

        # Randomly adjust hue, contrast, brightness and saturation.
        image = tf.image.random_hue(image, max_delta=0.05)
        image = tf.image.random_contrast(image, lower=0.3, upper=1.0)
        image = tf.image.random_brightness(image, max_delta=0.2)
        image = tf.image.random_saturation(image, lower=0.0, upper=2.0)

        # Some of these functions may overflow and result in pixel
        # values beyond the [0, 1] range. It is unclear from the
        # documentation of TensorFlow 0.10.0rc0 whether this is
        # intended. A simple solution is to limit the range.

        # Limit the image pixels between [0, 1] in case of overflow.
        image = tf.minimum(image, 1.0)
        image = tf.maximum(image, 0.0)
    else:
        # For testing, add the following to the TensorFlow graph.

        # Crop the input image around the centre so it is the same
        # size as images that are randomly cropped during training.
        image = tf.image.resize_image_with_crop_or_pad(image,
                                                       target_height=img_size_cropped,
                                                       target_width=img_size_cropped)

    return image

The following function calls the function above for each image in the input batch.

def pre_process(images, training):
    # Use TensorFlow to loop over all the input images and call
    # the function above which takes a single image as input.
    images = tf.map_fn(lambda image: pre_process_image(image, training), images)

    return images
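
For readers who want to see what the cropping and flipping amount to, here is a minimal NumPy sketch of the same idea outside of TensorFlow. It is a stand-in under stated assumptions, not what the TensorFlow functions do internally: the helper names are hypothetical, a random array stands in for a CIFAR-10 image, and the colour adjustments are left out.

```python
import numpy as np

img_size = 32          # CIFAR-10 images are 32 x 32 pixels.
img_size_cropped = 24  # Size after cropping, as in the tutorial.

def random_crop_flip(image, rng):
    # Pick a random top-left corner so the crop stays inside the image.
    max_offset = img_size - img_size_cropped
    top = rng.integers(0, max_offset + 1)
    left = rng.integers(0, max_offset + 1)
    crop = image[top:top + img_size_cropped,
                 left:left + img_size_cropped, :]
    # Flip horizontally with probability 0.5.
    if rng.random() < 0.5:
        crop = crop[:, ::-1, :]
    return crop

def centre_crop(image):
    # Deterministic crop around the centre, as used for testing.
    offset = (img_size - img_size_cropped) // 2
    return image[offset:offset + img_size_cropped,
                 offset:offset + img_size_cropped, :]

rng = np.random.default_rng(0)
image = rng.random((img_size, img_size, 3))  # Stand-in for a real image.

train_image = random_crop_flip(image, rng)
test_image = centre_crop(image)
print(train_image.shape, test_image.shape)
```

Both variants produce 24 x 24 images, which is why the same network can be used for training and testing even though the pre-processing differs.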

In order to plot the distorted images, we create the pre-processing graph for TensorFlow so that we can execute it later.

distorted_images = pre_process(images=x, training=True)

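Helper-function for creating Main Processing

The following helper-function creates the main part of the convolutional neural network. It uses Pretty Tensor, which was described in the previous tutorials.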

def main_network(images, training):
    # Wrap the input images as a Pretty Tensor object.
    x_pretty = pt.wrap(images)

    # Pretty Tensor uses special numbers to distinguish between
    # the training and testing phases.
    if training:
        phase = pt.Phase.train
    else:
        phase = pt.Phase.infer

    # Create the convolutional neural network using Pretty Tensor.
    # It is very similar to the previous tutorials, except
    # the use of so-called batch-normalization in the first layer.
    with pt.defaults_scope(activation_fn=tf.nn.relu, phase=phase):
        y_pred, loss = x_pretty.\
            conv2d(kernel=5, depth=64, name="layer_conv1", batch_normalize=True).\
            max_pool(kernel=2, stride=2).\
            conv2d(kernel=5, depth=64, name="layer_conv2").\
            max_pool(kernel=2, stride=2).\
            flatten().\
            fully_connected(size=256, name="layer_fc1").\
            fully_connected(size=128, name="layer_fc2").\
            softmax_classifier(num_classes=num_classes, labels=y_true)

    return y_pred, loss
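
The final softmax classification layer converts the raw outputs of the network into class probabilities and computes a loss from the true labels. The NumPy sketch below shows the standard softmax and cross-entropy formulas on made-up numbers. It illustrates the general technique only and is not Pretty Tensor's actual implementation.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability.
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

def cross_entropy(y_pred, y_true):
    # Mean over the batch of -sum(true * log(predicted)).
    eps = 1e-12  # Avoid log(0).
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1))

# Hypothetical raw outputs for 2 images and 3 classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

y_pred = softmax(logits)
loss = cross_entropy(y_pred, y_true)

print(y_pred.sum(axis=1))  # Each row of probabilities sums to one.
print(loss)
```

The loss shrinks as the probability assigned to the true class grows, which is what the optimizer exploits during training.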

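Helper-function for creating Neural Network

The following helper-function creates the full neural network, which consists of the pre-processing and main-processing defined above.

Note that the neural network is enclosed in the variable-scope named "network". This is because we are actually creating two neural networks in the TensorFlow graph. By assigning a variable-scope like this, we can re-use the variables for the two neural networks, so the variables that are optimized for the training-network are re-used for the network that is used for testing.

def create_network(training):
    # Wrap the neural network in the scope named "network".
    # Create new variables during training, and re-use during testing.
    with tf.variable_scope("network", reuse=not training):
        # Just rename the input placeholder variable for convenience.
        images = x

        # Create TensorFlow graph for pre-processing.
        images = pre_process(images=images, training=training)

        # Create TensorFlow graph for the main processing.
        y_pred, loss = main_network(images=images, training=training)

    return y_pred, loss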

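Create Neural Network for Training Phase

First create a TensorFlow variable that keeps track of the number of optimization iterations performed so far. In the previous tutorials this was a Python variable, but in this tutorial we want to save it in the checkpoints together with all the other TensorFlow variables.

Note that trainable=False means that TensorFlow will not try to optimize this variable.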

global_step = tf.Variable(initial_value=0, name="global_step", trainable=False)

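Create the neural network to be used for training. The create_network() function returns both y_pred and loss, but we only need the loss-function during training.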

_, loss = create_network(training=True)

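Create an optimizer which will minimize the loss-function. Also pass the global_step variable to the optimizer so that it is increased by one after each iteration.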

optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss, global_step=global_step)

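Create Neural Network for Test Phase

Now create the neural network for the test-phase. Once again the create_network() function returns the predicted labels y_pred for the input images, as well as the loss-function that is used during optimization. During testing we only need y_pred.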

y_pred, _ = create_network(training=False)

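We then calculate the predicted class-number as an integer. The output of the network y_pred is an array with 10 elements. The class-number is the index of the largest element in the array.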

y_pred_cls = tf.argmax(y_pred, dimension=1)

Then we create a vector of booleans telling us whether the predicted class equals the true class of each image.

correct_prediction = tf.equal(y_pred_cls, y_true_cls)

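The classification accuracy is calculated by first type-casting the vector of booleans to floats, so that False becomes 0 and True becomes 1, and then taking the average of these numbers.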

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
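
The same three steps (argmax, comparison, mean of booleans cast to floats) can be tried in plain NumPy. The predicted probabilities and true classes below are invented purely for illustration.

```python
import numpy as np

# Hypothetical softmax outputs for 4 images and 3 classes.
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.3, 0.4],
                   [0.6, 0.3, 0.1]])
y_true_cls = np.array([0, 1, 1, 0])

# The class-number is the index of the largest element in each row.
y_pred_cls = np.argmax(y_pred, axis=1)

# Boolean vector: is each prediction equal to the true class?
correct_prediction = (y_pred_cls == y_true_cls)

# Cast False/True to 0.0/1.0 and average to get the accuracy.
accuracy = np.mean(correct_prediction.astype(np.float32))

print(y_pred_cls, correct_prediction, accuracy)
```

Here three of the four predictions match, so the accuracy is 0.75.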

Saver

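In order to save the variables of the neural network, so they can be reloaded without having to train the network again, we create a so-called Saver-object which is used for storing and retrieving all the variables of the TensorFlow graph. Nothing is actually saved at this point; that is done further below in the optimize() function.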

saver = tf.train.Saver()

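Getting the Weights

Further below, we want to plot the weights of the neural network. When the network is constructed using Pretty Tensor, all the variables of the layers are created indirectly by Pretty Tensor. We therefore have to retrieve the variables from TensorFlow.

We used the names layer_conv1 and layer_conv2 for the two convolutional layers. These are also called variable scopes (not to be confused with defaults_scope as described above). Pretty Tensor automatically gives names to the variables it creates for each layer, so we can retrieve the weights for a layer using the layer's scope-name and the variable-name.

The implementation is somewhat awkward because we have to use the TensorFlow function get_variable(), which was designed for another purpose: either creating a new variable or re-using an existing one. The easiest thing is to make the following helper-function.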

def get_weights_variable(layer_name):
    # Retrieve an existing variable named "weights" in the scope
    # with the given layer_name.
    # This is awkward because the TensorFlow function was
    # really intended for another purpose.

    with tf.variable_scope("network/" + layer_name, reuse=True):
        variable = tf.get_variable("weights")

    return variable

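Using this helper-function we can retrieve the variables. These are TensorFlow objects. In order to get the contents of the variables, you must do something like contents = session.run(weights_conv1), as demonstrated further below.

weights_conv1 = get_weights_variable(layer_name="layer_conv1")
weights_conv2 = get_weights_variable(layer_name="layer_conv2")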

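Getting the Layer Outputs

Similarly, we also need to retrieve the outputs of the convolutional layers. The function for doing this is slightly different from the function above for getting the weights. Here we instead retrieve the last tensor that is output by the convolutional layer.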

def get_layer_output(layer_name):
    # The name of the last operation of the convolutional layer.
    # This assumes you are using Relu as the activation-function.
    tensor_name = "network/" + layer_name + "/Relu:0"

    # Get the tensor with this name.
    tensor = tf.get_default_graph().get_tensor_by_name(tensor_name)

    return tensor

Get the outputs of the convolutional layers so we can plot them later.

output_conv1 = get_layer_output(layer_name="layer_conv1")
output_conv2 = get_layer_output(layer_name="layer_conv2")

TensorFlow Run

Create TensorFlow session

Once the TensorFlow graph has been created, we have to create a TensorFlow session which is used to execute the graph.

session = tf.Session()

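Restore or initialize variables

Training this neural network may take a long time, especially if you do not have a GPU. We therefore save checkpoints during training so we can continue training at another time (e.g. during the night), and also perform the analysis later without having to train the neural network every time we want to use it.

If you want to restart the training of the neural network, you have to delete the checkpoints first.

This is the directory used for the checkpoints.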

save_dir = "checkpoints/"

Create the directory if it does not exist.

if not os.path.exists(save_dir):
    os.makedirs(save_dir)

This is the base-filename for the checkpoints; TensorFlow will append the iteration number, etc.

save_path = os.path.join(save_dir, "cifar10_cnn")

First try to restore the latest checkpoint. This may fail and raise an exception, e.g. if no such checkpoint exists or if you have changed the TensorFlow graph.

try:
    print("Trying to restore last checkpoint ...")

    # Use TensorFlow to find the latest checkpoint - if any.
    last_chk_path = tf.train.latest_checkpoint(checkpoint_dir=save_dir)

    # Try and load the data in the checkpoint.
    saver.restore(session, save_path=last_chk_path)

    # If we get to this point, the checkpoint was successfully loaded.
    print("Restored checkpoint from:", last_chk_path)
except:
    # If the above failed for some reason, simply
    # initialize all the variables for the TensorFlow graph.
    print("Failed to restore checkpoint. Initializing variables instead.")
    session.run(tf.global_variables_initializer())

Trying to restore last checkpoint ...

Restored checkpoint from: checkpoints/cifar10_cnn-150000

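Helper-function to get a random training-batch

There are 50,000 images in the training-set. It takes a long time to calculate the gradient of the model using all these images. We therefore only use a small batch of images in each iteration of the optimizer.

If your computer crashes or becomes very slow because you run out of RAM, you may try to lower this number, but you may then need to perform more optimization iterations.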

train_batch_size = 64

Function for selecting a random batch of images from the training-set.

def random_batch():
    # Number of images in the training-set.
    num_images = len(images_train)

    # Create a random index.
    idx = np.random.choice(num_images,
                           size=train_batch_size,
                           replace=False)

    # Use the random index to select random images and labels.
    x_batch = images_train[idx, :, :, :]
    y_batch = labels_train[idx, :]

    return x_batch, y_batch

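Helper-function to perform optimization

This function performs a number of optimization iterations so as to gradually improve the variables of the network layers. In each iteration, a new batch of data is selected from the training-set and TensorFlow then executes the optimizer using those training samples. The progress is printed every 100 iterations. A checkpoint is saved every 1000 iterations and also after the last iteration.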

def optimize(num_iterations):
    # Start-time used for printing time-usage below.
    start_time = time.time()

    for i in range(num_iterations):
        # Get a batch of training examples.
        # x_batch now holds a batch of images and
        # y_true_batch are the true labels for those images.
        x_batch, y_true_batch = random_batch()

        # Put the batch into a dict with the proper names
        # for placeholder variables in the TensorFlow graph.
        feed_dict_train = {x: x_batch,
                           y_true: y_true_batch}

        # Run the optimizer using this batch of training data.
        # TensorFlow assigns the variables in feed_dict_train
        # to the placeholder variables and then runs the optimizer.
        # We also want to retrieve the global_step counter.
        i_global, _ = session.run([global_step, optimizer],
                                  feed_dict=feed_dict_train)

        # Print status to screen every 100 iterations (and last).
        if (i_global % 100 == 0) or (i == num_iterations - 1):
            # Calculate the accuracy on the training-batch.
            batch_acc = session.run(accuracy,
                                    feed_dict=feed_dict_train)

            # Print status.
            msg = "Global Step: {0:>6}, Training Batch Accuracy: {1:>6.1%}"
            print(msg.format(i_global, batch_acc))

        # Save a checkpoint to disk every 1000 iterations (and last).
        if (i_global % 1000 == 0) or (i == num_iterations - 1):
            # Save all variables of the TensorFlow graph to a
            # checkpoint. Append the global_step counter
            # to the filename so we save the last several checkpoints.
            saver.save(session,
                       save_path=save_path,
                       global_step=global_step)

            print("Saved checkpoint.")

    # Ending time.
    end_time = time.time()

    # Difference between start and end-times.
    time_dif = end_time - start_time

    # Print the time-usage.
    print("Time usage: " + str(timedelta(seconds=int(round(time_dif)))))

Helper-function to plot example errors

Function for plotting examples of images from the test-set that have been mis-classified.

def plot_example_errors(cls_pred, correct):
    # This function is called from print_test_accuracy() below.

    # cls_pred is an array of the predicted class-number for
    # all images in the test-set.

    # correct is a boolean array whether the predicted class
    # is equal to the true class for each image in the test-set.

    # Negate the boolean array.
    incorrect = (correct == False)

    # Get the images from the test-set that have been
    # incorrectly classified.
    images = images_test[incorrect]

    # Get the predicted classes for those images.
    cls_pred = cls_pred[incorrect]

    # Get the true classes for those images.
    cls_true = cls_test[incorrect]

    # Plot the first 9 images.
    plot_images(images=images[0:9],
                cls_true=cls_true[0:9],
                cls_pred=cls_pred[0:9])

Helper-function to plot confusion matrix

def plot_confusion_matrix(cls_pred):
    # This is called from print_test_accuracy() below.

    # cls_pred is an array of the predicted class-number for
    # all images in the test-set.

    # Get the confusion matrix using sklearn.
    cm = confusion_matrix(y_true=cls_test,  # True class for test-set.
                          y_pred=cls_pred)  # Predicted class.

    # Print the confusion matrix as text.
    for i in range(num_classes):
        # Append the class-name to each line.
        class_name = "({}) {}".format(i, class_names[i])
        print(cm[i, :], class_name)

    # Print the class-numbers for easy reference.
    class_numbers = [" ({0})".format(i) for i in range(num_classes)]
    print("".join(class_numbers))

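Helper-functions for calculating classifications

This function calculates the predicted classes of images and also returns a boolean array indicating whether the classification of each image is correct.

The calculation is done in batches because it might otherwise use too much RAM. If your computer crashes, you can try to lower the batch-size.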

# Split the data-set in batches of this size to limit RAM usage.
batch_size = 256

def predict_cls(images, labels, cls_true):
    # Number of images.
    num_images = len(images)

    # Allocate an array for the predicted classes which
    # will be calculated in batches and filled into this array.
    # The built-in int is used because the np.int alias has been
    # removed from recent NumPy versions.
    cls_pred = np.zeros(shape=num_images, dtype=int)

    # Now calculate the predicted classes for the batches.
    # We will just iterate through all the batches.
    # There might be a more clever and Pythonic way of doing this.

    # The starting index for the next batch is denoted i.
    i = 0

    while i < num_images:
        # The ending index for the next batch is denoted j.
        j = min(i + batch_size, num_images)

        # Create a feed-dict with the images and labels
        # between index i and j.
        feed_dict = {x: images[i:j, :],
                     y_true: labels[i:j, :]}

        # Calculate the predicted class using TensorFlow.
        cls_pred[i:j] = session.run(y_pred_cls, feed_dict=feed_dict)

        # Set the start-index for the next batch to the
        # end-index of the current batch.
        i = j

    # Create a boolean array whether each image is correctly classified.
    correct = (cls_true == cls_pred)

    return correct, cls_pred

def predict_cls_test():
    return predict_cls(images=images_test,
                       labels=labels_test,
                       cls_true=cls_test)
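
The comments in the batching loop mention that there may be a more Pythonic way to iterate over the batches. One possibility, offered only as a sketch, is a small generator that yields index ranges. The name batch_ranges is hypothetical and not part of the tutorial's code.

```python
def batch_ranges(num_items, batch_size):
    # Yield (start, end) index pairs covering range(num_items)
    # in consecutive chunks of at most batch_size items.
    for start in range(0, num_items, batch_size):
        yield start, min(start + batch_size, num_items)

# Example: 10,000 test images in batches of 256.
ranges = list(batch_ranges(num_items=10000, batch_size=256))

print(len(ranges))
print(ranges[0], ranges[-1])
```

With such a helper, the while-loop in predict_cls() could be written as a for-loop over batch_ranges(num_images, batch_size) with the same loop body.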

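Helper-function for the classification accuracy

This function calculates the classification accuracy given a boolean array indicating whether each image was correctly classified. For example, classification_accuracy([True, True, False, False, False]) gives 2/5 = 0.4. The function also returns the number of correct classifications.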

def classification_accuracy(correct):
    # When averaging a boolean array, False means 0 and True means 1.
    # So we are calculating: number of True / len(correct) which is
    # the same as the classification accuracy.

    # Return the classification accuracy
    # and the number of correct classifications.
    return correct.mean(), correct.sum()

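Helper-function for showing the performance

Function for printing the classification accuracy on the test-set.

It takes a while to compute the classification for all the images in the test-set, so the helper-functions above are called directly from this function and share one set of results, rather than each of them recalculating the classifications.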

def print_test_accuracy(show_example_errors=False,
                        show_confusion_matrix=False):
    # For all the images in the test-set,
    # calculate the predicted classes and whether they are correct.
    correct, cls_pred = predict_cls_test()

    # Classification accuracy and the number of correct classifications.
    acc, num_correct = classification_accuracy(correct)

    # Number of images being classified.
    num_images = len(correct)

    # Print the accuracy.
    msg = "Accuracy on Test-Set: {0:.1%} ({1} / {2})"
    print(msg.format(acc, num_correct, num_images))

    # Plot some examples of mis-classifications, if desired.
    if show_example_errors:
        print("Example errors:")
        plot_example_errors(cls_pred=cls_pred, correct=correct)

    # Plot the confusion matrix, if desired.
    if show_confusion_matrix:
        print("Confusion Matrix:")
        plot_confusion_matrix(cls_pred=cls_pred)

Helper-function for plotting convolutional weights

def plot_conv_weights(weights, input_channel=0):
    # Assume weights are TensorFlow ops for 4-dim variables
    # e.g. weights_conv1 or weights_conv2.

    # Retrieve the values of the weight-variables from TensorFlow.
    # A feed-dict is not necessary because nothing is calculated.
    w = session.run(weights)

    # Print statistics for the weights.
    print("Min: {0:.5f}, Max: {1:.5f}".format(w.min(), w.max()))
    print("Mean: {0:.5f}, Stdev: {1:.5f}".format(w.mean(), w.std()))

    # Get the lowest and highest values for the weights.
    # This is used to correct the colour intensity across
    # the images so they can be compared with each other.
    w_min = np.min(w)
    w_max = np.max(w)
    abs_max = max(abs(w_min), abs(w_max))

    # Number of filters used in the conv. layer.
    num_filters = w.shape[3]

    # Number of grids to plot.
    # Rounded-up, square-root of the number of filters.
    num_grids = math.ceil(math.sqrt(num_filters))

    # Create figure with a grid of sub-plots.
    fig, axes = plt.subplots(num_grids, num_grids)

    # Plot all the filter-weights.
    for i, ax in enumerate(axes.flat):
        # Only plot the valid filter-weights.
        if i < num_filters:
            # Get the weights for the i'th filter of the input channel.
            # The format of this 4-dim tensor is determined by the
            # TensorFlow API. See Tutorial #02 for more details.
            img = w[:, :, input_channel, i]

            # Plot image.
            ax.imshow(img, vmin=-abs_max, vmax=abs_max,
                      interpolation="nearest", cmap="seismic")

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()
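
The grid-size calculation in the plotting function is easy to check by hand. This short sketch evaluates it for a few filter counts; 64 matches the depth of the convolutional layers in this tutorial, while the other values are arbitrary and the helper name grid_size is made up for the example.

```python
import math

def grid_size(num_filters):
    # Side length of the smallest square grid with room for all filters.
    return math.ceil(math.sqrt(num_filters))

for n in [64, 36, 10]:
    g = grid_size(n)
    print("{0} filters -> {1} x {1} grid, {2} unused cells".format(
        n, g, g * g - n))
```

When the number of filters is not a perfect square, some grid cells stay empty, which is why the plotting loop checks the index against the number of filters before drawing.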

Helper-function for plotting the output of convolutional layers

def plot_layer_output(layer_output, image):
    # Assume layer_output is a 4-dim tensor
    # e.g. output_conv1 or output_conv2.

    # Create a feed-dict which holds the single input image.
    # Note that TensorFlow needs a list of images,
    # so we just create a list with this one image.
    feed_dict = {x: [image]}

    # Retrieve the output of the layer after inputting this image.
    values = session.run(layer_output, feed_dict=feed_dict)

    # Get the lowest and highest values.
    # This is used to correct the colour intensity across
    # the images so they can be compared with each other.
    values_min = np.min(values)
    values_max = np.max(values)

    # Number of image channels output by the conv. layer.
    num_images = values.shape[3]

    # Number of grid-cells to plot.
    # Rounded-up, square-root of the number of image channels.
    num_grids = math.ceil(math.sqrt(num_images))

    # Create figure with a grid of sub-plots.
    fig, axes = plt.subplots(num_grids, num_grids)

    # Plot all the output channels.
    for i, ax in enumerate(axes.flat):
        # Only plot the valid image-channels.
        if i < num_images:
            # Get the images for the i'th output channel.
            img = values[0, :, :, i]

            # Plot image.
            ax.imshow(img, vmin=values_min, vmax=values_max,
                      interpolation="nearest", cmap="binary")

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

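Examples of distorted input images

In order to artificially inflate the number of images available for training, the neural network uses pre-processing with random distortions of the input images. This should hopefully make the neural network more flexible at recognizing and classifying images.

This is a helper-function for plotting distorted input images.

def plot_distorted_image(image, cls_true):
    # Repeat the input image 9 times.
    image_duplicates = np.repeat(image[np.newaxis, :, :, :], 9, axis=0)

    # Create a feed-dict for TensorFlow.
    feed_dict = {x: image_duplicates}

    # Calculate only the pre-processing of the TensorFlow graph
    # which distorts the images in the feed-dict.
    result = session.run(distorted_images, feed_dict=feed_dict)

    # Plot the images.
    plot_images(images=result, cls_true=np.repeat(cls_true, 9))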

Helper-function for getting an image and its class-number from the test-set.

def get_test_image(i):
    return images_test[i, :, :, :], cls_test[i]

Get an image and its true class from the test-set.

img, cls = get_test_image(16)

Plot 9 random distortions of the image. If you re-run this code you will get slightly different results.

plot_distorted_image(img, cls)

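Perform optimization

My laptop computer has a quad-core CPU with 2 GHz per core. It has a GPU, but it is not fast enough for TensorFlow, so only the CPU is used. It takes about 1 hour to perform 10,000 optimization iterations on this computer. For this tutorial I performed 150,000 optimization iterations, which took about 15 hours. I let it run during the night and at various points during the day.

Because we save checkpoints during optimization, and the latest checkpoint is restored when the code is restarted, we can stop and continue the optimization later.

if False:
    optimize(num_iterations=1000)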

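Results

After 150,000 optimization iterations, the classification accuracy is about 79-80% on the test-set. Examples of mis-classifications are plotted below. Some of these are difficult to recognize even for humans, and others are reasonable mistakes, e.g. between an automobile and a truck or between a cat and a dog, while other mistakes seem a bit strange.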

print_test_accuracy(show_example_errors=True, show_confusion_matrix=True)

Accuracy on Test-Set: 79.3% (7932 / 10000)
Example errors:

Confusion Matrix:
[775 20 71 8 14 4 18 10 44 36] (0) airplane
[ 7 914 5 0 3 7 9 3 14 38] (1) automobile
[ 32 2 724 28 42 44 94 17 9 8] (2) bird
[ 18 7 48 508 56 209 99 29 7 19] (3) cat
[ 4 2 45 25 769 29 75 43 3 5] (4) deer
[ 8 6 34 89 35 748 38 32 1 9] (5) dog
[ 4 2 18 9 14 14 930 4 2 3] (6) frog
[ 6 2 23 18 31 55 17 833 0 15] (7) horse
[ 31 29 15 11 8 7 15 0 856 28] (8) ship
[ 13 67 4 5 0 4 7 7 18 875] (9) truck
(0) (1) (2) (3) (4) (5) (6) (7) (8) (9)
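
The confusion matrix printed above can be analysed further with NumPy. The sketch below copies the matrix from that output, recomputes the overall accuracy from its diagonal, and finds the most frequent confusion. The variable names are chosen for this example only.

```python
import numpy as np

class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

# Confusion matrix copied from the output above.
# Rows are true classes, columns are predicted classes.
cm = np.array([
    [775,  20,  71,   8,  14,   4,  18,  10,  44,  36],
    [  7, 914,   5,   0,   3,   7,   9,   3,  14,  38],
    [ 32,   2, 724,  28,  42,  44,  94,  17,   9,   8],
    [ 18,   7,  48, 508,  56, 209,  99,  29,   7,  19],
    [  4,   2,  45,  25, 769,  29,  75,  43,   3,   5],
    [  8,   6,  34,  89,  35, 748,  38,  32,   1,   9],
    [  4,   2,  18,   9,  14,  14, 930,   4,   2,   3],
    [  6,   2,  23,  18,  31,  55,  17, 833,   0,  15],
    [ 31,  29,  15,  11,   8,   7,  15,   0, 856,  28],
    [ 13,  67,   4,   5,   0,   4,   7,   7,  18, 875],
])

# Correct classifications are on the diagonal.
accuracy = np.trace(cm) / cm.sum()

# Per-class accuracy: diagonal divided by the row totals.
per_class = np.diag(cm) / cm.sum(axis=1)

# Zero the diagonal to find the largest off-diagonal entry.
errors = cm.copy()
np.fill_diagonal(errors, 0)
i, j = np.unravel_index(np.argmax(errors), errors.shape)

print("Accuracy: {0:.1%}".format(accuracy))
print("Hardest class:", class_names[int(np.argmin(per_class))])
print("Most common error: {0} predicted as {1} ({2} images)".format(
    class_names[i], class_names[j], errors[i, j]))
```

The diagonal sums to the 7932 correct classifications reported above, and the largest off-diagonal entry is cats predicted as dogs.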

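Convolutional Weights

The following shows some of the weights (or filters) of the first convolutional layer. There are 3 input channels, so there are 3 such sets of weights, which you can plot by changing input_channel.

Positive weights are red and negative weights are blue.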

plot_conv_weights(weights=weights_conv1, input_channel=0)

Min: -0.61643, Max: 0.63949
Mean: -0.00177, Stdev: 0.16469

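The following shows some of the weights (or filters) of the second convolutional layer. These are closer to zero than the weights of the first convolutional layer, as can be seen from the lower standard deviation.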

plot_conv_weights(weights=weights_conv2, input_channel=1)

Min: -0.73326, Max: 0.25344

Mean: -0.00394, Stdev: 0.05466

Output of convolutional layers

Helper-function for plotting an image.

def plot_image(image):
    # Create figure with sub-plots.
    fig, axes = plt.subplots(1, 2)

    # References to the sub-plots.
    ax0 = axes.flat[0]
    ax1 = axes.flat[1]

    # Show raw and smoothened images in sub-plots.
    ax0.imshow(image, interpolation="nearest")
    ax1.imshow(image, interpolation="spline16")

    # Set labels.
    ax0.set_xlabel("Raw")
    ax1.set_xlabel("Smooth")

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

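Plot an image from the test-set. The raw pixelated image is what the neural network gets as input.

img, cls = get_test_image(16)
plot_image(img)

Use the raw image as input to the neural network and plot the output of the first convolutional layer.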

plot_layer_output(output_conv1, image=img)

Using the same image as input to the neural network, now plot the output of the second convolutional layer.

plot_layer_output(output_conv2, image=img)

Predicted class-labels

Get the predicted class-label and class-number for this image.

label_pred, cls_pred = session.run([y_pred, y_pred_cls], feed_dict={x: [img]})

Print the predicted class-label.

# Set the rounding options for numpy.
np.set_printoptions(precision=3, suppress=True)

# Print the predicted label.
print(label_pred[0])

[ 0. 0. 0. 0.493 0. 0.49 0.006 0.01 0. 0. ]

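The predicted class-label is an array of length 10, with each element indicating how confident the neural network is that the image belongs to the given class.

In this case the element with index 3 has a value of 0.493 and the element with index 5 has a value of 0.490. This means the neural network believes the image shows either class 3 or class 5, that is, a cat or a dog.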
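
This reading of the predicted label can be reproduced with NumPy by sorting the array. The values are copied from the printed output above, and the list of class-names is repeated here so that the sketch is self-contained.

```python
import numpy as np

class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

# Predicted label copied from the printed output above.
label_pred = np.array([0., 0., 0., 0.493, 0., 0.49, 0.006, 0.01, 0., 0.])

# Indices of the two largest elements, highest first.
top2 = np.argsort(label_pred)[::-1][:2]

for idx in top2:
    print("{0} ({1}): {2:.3f}".format(idx, class_names[idx], label_pred[idx]))
```

Looking at the two largest values rather than only the argmax makes it clear how close the decision between the two classes was.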

class_names[3]

"cat"

class_names[5]

"dog"

Close TensorFlow Session

We are now done using TensorFlow, so we close the session to release its resources.

# This has been commented out in case you want to modify and experiment
# with the Notebook without having to restart it.
# session.close()

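Conclusion

This tutorial showed how to make a Convolutional Neural Network for classifying images in the CIFAR-10 data-set. The classification accuracy was about 79-80% on the test-set.

The outputs of the convolutional layers were also plotted, but it was difficult to see how the neural network recognizes and classifies the input images. Better visualization techniques are needed.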

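Exercises

These are a few suggestions for exercises that may help improve your skills with TensorFlow. It is important to get hands-on experience with TensorFlow in order to learn how to use it properly.

You may want to back up this Notebook before making any changes.

Run 10,000 optimization iterations and see what the classification accuracy is. This will create a checkpoint that saves all the variables of the TensorFlow graph.
Run another 100,000 optimization iterations and see if the classification accuracy has improved. Then run another 100,000 iterations. Does the accuracy improve, and do you think it is worth the extra computational time?
Try changing the image distortions in the pre-processing.
Try changing the structure of the neural network. You can make the network larger or smaller. How does this affect the training time and the classification accuracy? Note that the checkpoints cannot be reloaded once you change the structure of the neural network.
Try using batch-normalization on the 2nd convolutional layer as well. Also try removing it from both layers.
Research some of the better neural networks for CIFAR-10 and try to implement them.
Explain to a friend how the program works.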