TensorFlow Study Notes, Part 5: Source Code Analysis of the Nearest Neighbor Algorithm

05-07

import numpy as np

import tensorflow as tf

# Import MNIST data

import input_data

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

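# This step loads the data. input_data.py downloads the dataset into the /tmp/data/ directory. Before running it, check in a browser that the page

# "MNIST handwritten digit database, Yann LeCun, Corinna Cortes and Chris Burges" can be opened; if it cannot, the download step will fail. If a download is interrupted, delete the partially downloaded files and

# download them again, otherwise a CRC check error occurs. read_data_sets is a function in input_data.py; its main job is to decompress the data and put it in the right place.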

# In this example, we limit mnist data

Xtr, Ytr = mnist.train.next_batch(5000) #5000 for training (nn candidates)

Xte, Yte = mnist.test.next_batch(200) #200 for testing

# In mnist.train.next_batch, train is a data member and next_batch is a function, both defined in input_data.py. This call simply fetches the requested number of samples.

# Reshape images to 1D

Xtr = np.reshape(Xtr, newshape=(-1, 28*28))

Xte = np.reshape(Xte, newshape=(-1, 28*28))

# Flatten each 28x28 image into a 1-D vector of 784 values, which makes the element-wise arithmetic below straightforward.

# tf Graph Input

xtr = tf.placeholder("float", [None, 784])

xte = tf.placeholder("float", [784])

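# Define two placeholders. They hold no concrete data yet; they exist so that the graph can be built on top of them, and the real values are supplied later through feed_dict.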

# Nearest Neighbor calculation using L1 Distance

# Calculate L1 Distance

distance = tf.reduce_sum(tf.abs(tf.add(xtr, tf.neg(xte))), reduction_indices=1)

# Predict: Get min distance index (Nearest neighbor)

pred = tf.arg_min(distance, 0)

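# Nearest neighbor: compute the L1 distance to every candidate and take the index of the closest one. At this point only the graph has been built on the placeholders; no actual computation has run.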

accuracy = 0.

# Initializing the variables

init = tf.initialize_all_variables()

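# Creates an op that initializes all variables. Placeholders are not initialized here; they receive their values through feed_dict. This graph defines no tf.Variable, so the op does nothing in this example, but any program that uses variables must run it first, otherwise reading an uninitialized variable raises an error.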

# Launch the graph

with tf.Session() as sess:

    sess.run(init)

    # loop over test data

    for i in range(len(Xte)):

        # Get nearest neighbor

        nn_index = sess.run(pred, feed_dict={xtr: Xtr, xte: Xte[i,:]})

        # Get nearest neighbor class label and compare it to its true label

        print "Test", i, "Prediction:", np.argmax(Ytr[nn_index]), "True Class:", np.argmax(Yte[i])

        # Calculate accuracy

        if np.argmax(Ytr[nn_index]) == np.argmax(Yte[i]):

            accuracy += 1./len(Xte)

    print "Done!"

    print "Accuracy:", accuracy

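# The for loop computes the prediction for each test sample, compares it with the true label, and accumulates the accuracy. A classic property of this algorithm is that it needs no training phase; classification happens directly at test time.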
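To see the same computation without a TensorFlow session, here is a minimal NumPy sketch of the L1 nearest neighbor step and the accuracy loop. It is not the original script: the function names `l1_nearest_neighbor` and `nn_accuracy` and the toy arrays are hypothetical, chosen only for illustration, and the labels are one-hot as in the MNIST loader above.

```python
import numpy as np


def l1_nearest_neighbor(train_x, test_vec):
    """Return the index of the training row closest to test_vec in L1 distance."""
    # Broadcasting subtracts test_vec from every row: shape (n_train, n_features).
    diffs = np.abs(train_x - test_vec)
    # Sum over the feature axis, mirroring reduce_sum(..., reduction_indices=1).
    distances = diffs.sum(axis=1)
    # Index of the smallest distance, mirroring arg_min(distance, 0).
    return int(np.argmin(distances))


def nn_accuracy(train_x, train_y, test_x, test_y):
    """Fraction of test rows whose nearest training row carries the same label."""
    correct = 0
    for i in range(len(test_x)):
        nn_index = l1_nearest_neighbor(train_x, test_x[i])
        # Labels are one-hot, so argmax recovers the class id.
        if np.argmax(train_y[nn_index]) == np.argmax(test_y[i]):
            correct += 1
    return correct / float(len(test_x))


# Toy data (hypothetical, not MNIST): 3 training points, 2 classes.
toy_train_x = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
toy_train_y = np.array([[1, 0], [1, 0], [0, 1]])
toy_test_x = np.array([[0.9, 1.2], [4.0, 6.0]])
toy_test_y = np.array([[1, 0], [0, 1]])

print(l1_nearest_neighbor(toy_train_x, toy_test_x[0]))
print(nn_accuracy(toy_train_x, toy_train_y, toy_test_x, toy_test_y))
```

The sketch counts correct predictions and divides once at the end, whereas the script adds 1./len(Xte) per hit; the two give the same result up to floating-point rounding.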

Source code: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/2%20-%20Basic%20Classifiers/nearest_neighbor.py

Related API:

tf.reduce_sum(input_tensor, reduction_indices=None, keep_dims=False, name=None)

Computes the sum of elements across dimensions of a tensor.

Reduces input_tensor along the dimensions given in reduction_indices. Unless keep_dims is true, the rank of the tensor is reduced by 1 for each entry in reduction_indices. If keep_dims is true, the reduced dimensions are retained with length 1.

If reduction_indices has no entries, all dimensions are reduced, and a tensor with a single element is returned.

For example:

# x is [[1, 1, 1]

# [1, 1, 1]]

tf.reduce_sum(x) ==> 6

tf.reduce_sum(x, 0) ==> [2, 2, 2]

tf.reduce_sum(x, 1) ==> [3, 3]

tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]

tf.reduce_sum(x, [0, 1]) ==> 6

Args:

input_tensor: The tensor to reduce. Should have numeric type.

reduction_indices: The dimensions to reduce. If None (the default), reduces all dimensions.

keep_dims: If true, retains reduced dimensions with length 1.

name: A name for the operation (optional).

Returns:

The reduced tensor.

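Comment: this API is mainly used to reduce dimensionality. In this example it takes the 2-D matrix of absolute differences between the test image and every training image (shape [5000, 784]) and reduces it to a 1-D vector holding a single distance per training image (shape [5000]).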
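The reduction behaviour documented above can be tried with NumPy, whose `np.sum` takes `axis` and `keepdims` arguments that play the role of `reduction_indices` and `keep_dims`. This is an analogy rather than the TensorFlow call itself, and the small `candidates` and `query` arrays are hypothetical stand-ins for the 5000 x 784 training matrix and the 784-value test image.

```python
import numpy as np

# Same x as in the API example: a 2 x 3 matrix of ones.
x = np.array([[1, 1, 1],
              [1, 1, 1]])

total = np.sum(x)                                 # every dimension reduced
per_column = np.sum(x, axis=0)                    # rows collapsed, one sum per column
per_row = np.sum(x, axis=1)                       # columns collapsed, one sum per row
per_row_kept = np.sum(x, axis=1, keepdims=True)   # reduced axis kept with length 1

# Shape of the distance computation, at a smaller hypothetical size:
# 4 candidate images of 6 pixels each instead of 5000 x 784.
candidates = np.arange(24, dtype=float).reshape(4, 6)
query = np.ones(6)
distances = np.sum(np.abs(candidates - query), axis=1)

print(total, per_column, per_row, per_row_kept.shape, distances.shape)
```

Reducing along axis 1 is what turns the per-pixel differences into one number per candidate, which is why the script passes reduction_indices=1.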

May 23, 2016