Understanding tf.train.slice_input_producer() and tf.train.batch()

02-05

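TensorFlow really is not that easy to get started with, and I feel I should write things down as I learn. This time I am again looking at how TensorFlow reads data, mainly the two functions tf.train.slice_input_producer() and tf.train.batch().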

【1】tf.train.slice_input_producer()

self.queue = tf.train.slice_input_producer([self.images, self.labels],
                                           shuffle=input_size is not None)

First, the documentation:

Input: tensor_list: A list of Tensor objects. Every Tensor in tensor_list must have the same size in the first dimension.

This is because the first dimension is the number of samples (which is also the number of labels).

Returns: A list of tensors, one for each element of tensor_list. If the tensor in tensor_list has shape [N, a, b, ..., z], then the corresponding output tensor will have shape [a, b, ..., z].

The returned result is as follows:

It is a list.
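The shape rule quoted above can be illustrated without TensorFlow. The following is only a plain-Python sketch of the documented behaviour, not the library's implementation: `slice_pairs` is a hypothetical helper, the label file names are made up, and ordinary lists stand in for tensors.

```python
import random


def slice_pairs(tensor_list, shuffle=True, seed=None):
    """Yield one slice per input, all taken at the same index along dimension 0."""
    sizes = {len(t) for t in tensor_list}
    if len(sizes) != 1:
        raise ValueError("all inputs must have the same size in the first dimension")
    indices = list(range(sizes.pop()))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for i in indices:
        # Each yielded element drops the leading dimension N: [N, ...] -> [...]
        yield [t[i] for t in tensor_list]


images = ["00000-00000.png", "00000-00001.png", "00000-00002.png"]
labels = ["label-0.png", "label-1.png", "label-2.png"]  # hypothetical label names

slices = list(slice_pairs([images, labels], shuffle=True, seed=0))
```

Every element of `slices` is a two-item list, and the image and label in it always come from the same position, which is why the inputs must agree in their first dimension.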

First, look at the state of the graph before running tf.train.slice_input_producer():

Then look at the state of the graph after running tf.train.slice_input_producer():

node { name: "create_inputs/Const" op: "Const" attr { key: "dtype" value { type: DT_STRING } } attr { key: "value" value { tensor { dtype: DT_STRING tensor_shape { dim { size: 5598 } } string_val: "00000-00000.png" string_val: "00000-00001.png" string_val: "00000-00002.png" } } } }
node { name: "create_inputs/Const_1" op: "Const" attr { key: "dtype" value { type: DT_STRING } } attr { key: "value" value { tensor { dtype: DT_STRING tensor_shape { dim { size: 5598 } } string_val: "00000-00000.png" string_val: "00000-00001.png" string_val: "00000-00002.jpg" } } } }
node { name: "create_inputs/input_producer/Shape" op: "Const" attr { key: "dtype" value { type: DT_INT32 } } attr { key: "value" value { tensor { dtype: DT_INT32 tensor_shape { dim { size: 1 } } int_val: 5598 } } } }
node { name: "create_inputs/input_producer/strided_slice/stack" op: "Const" attr { key: "dtype" value { type: DT_INT32 } } attr { key: "value" value { tensor { dtype: DT_INT32 tensor_shape { dim { size: 1 } } int_val: 0 } } } }
node { name: "create_inputs/input_producer/strided_slice/stack_1" op: "Const" attr { key: "dtype" value { type: DT_INT32 } } attr { key: "value" value { tensor { dtype: DT_INT32 tensor_shape { dim { size: 1 } } int_val: 1 } } } }
node { name: "create_inputs/input_producer/strided_slice/stack_2" op: "Const" attr { key: "dtype" value { type: DT_INT32 } } attr { key: "value" value { tensor { dtype: DT_INT32 tensor_shape { dim { size: 1 } } int_val: 1 } } } }
node { name: "create_inputs/input_producer/strided_slice" op: "StridedSlice" input: "create_inputs/input_producer/Shape" input: "create_inputs/input_producer/strided_slice/stack" input: "create_inputs/input_producer/strided_slice/stack_1" input: "create_inputs/input_producer/strided_slice/stack_2" attr { key: "Index" value { type: DT_INT32 } } attr { key: "T" value { type: DT_INT32 } } attr { key: "begin_mask" value { i: 0 } } attr { key: "ellipsis_mask" value { i: 0 } } attr { key: "end_mask" value { i: 0 } } attr { key: "new_axis_mask" value { i: 0 } } attr { key: "shrink_axis_mask" value { i: 1 } } }
node { name: "create_inputs/input_producer/input_producer/range/start" op: "Const" attr { key: "dtype" value { type: DT_INT32 } } attr { key: "value" value { tensor { dtype: DT_INT32 tensor_shape { } int_val: 0 } } } }
node { name: "create_inputs/input_producer/input_producer/range/delta" op: "Const" attr { key: "dtype" value { type: DT_INT32 } } attr { key: "value" value { tensor { dtype: DT_INT32 tensor_shape { } int_val: 1 } } } }
node { name: "create_inputs/input_producer/input_producer/range" op: "Range" input: "create_inputs/input_producer/input_producer/range/start" input: "create_inputs/input_producer/strided_slice" input: "create_inputs/input_producer/input_producer/range/delta" attr { key: "Tidx" value { type: DT_INT32 } } }
node { name: "create_inputs/input_producer/input_producer/RandomShuffle" op: "RandomShuffle" input: "create_inputs/input_producer/input_producer/range" attr { key: "T" value { type: DT_INT32 } } attr { key: "seed" value { i: 1234 } } attr { key: "seed2" value { i: 10 } } }
node { name: "create_inputs/input_producer/input_producer" op: "FIFOQueueV2" attr { key: "capacity" value { i: 32 } } attr { key: "component_types" value { list { type: DT_INT32 } } } attr { key: "container" value { s: "" } } attr { key: "shapes" value { list { shape { } } } } attr { key: "shared_name" value { s: "" } } }
node { name: "create_inputs/input_producer/input_producer/input_producer_EnqueueMany" op: "QueueEnqueueManyV2" input: "create_inputs/input_producer/input_producer" input: "create_inputs/input_producer/input_producer/RandomShuffle" attr { key: "Tcomponents" value { list { type: DT_INT32 } } } attr { key: "timeout_ms" value { i: -1 } } }
node { name: "create_inputs/input_producer/input_producer/input_producer_Close" op: "QueueCloseV2" input: "create_inputs/input_producer/input_producer" attr { key: "cancel_pending_enqueues" value { b: false } } }
node { name: "create_inputs/input_producer/input_producer/input_producer_Close_1" op: "QueueCloseV2" input: "create_inputs/input_producer/input_producer" attr { key: "cancel_pending_enqueues" value { b: true } } }
node { name: "create_inputs/input_producer/input_producer/input_producer_Size" op: "QueueSizeV2" input: "create_inputs/input_producer/input_producer" }
node { name: "create_inputs/input_producer/input_producer/ToFloat" op: "Cast" input: "create_inputs/input_producer/input_producer/input_producer_Size" attr { key: "DstT" value { type: DT_FLOAT } } attr { key: "SrcT" value { type: DT_INT32 } } }
node { name: "create_inputs/input_producer/input_producer/mul/y" op: "Const" attr { key: "dtype" value { type: DT_FLOAT } } attr { key: "value" value { tensor { dtype: DT_FLOAT tensor_shape { } float_val: 0.03125 } } } }
node { name: "create_inputs/input_producer/input_producer/mul" op: "Mul" input: "create_inputs/input_producer/input_producer/ToFloat" input: "create_inputs/input_producer/input_producer/mul/y" attr { key: "T" value { type: DT_FLOAT } } }
node { name: "create_inputs/input_producer/input_producer/fraction_of_32_full/tags" op: "Const" attr { key: "dtype" value { type: DT_STRING } } attr { key: "value" value { tensor { dtype: DT_STRING tensor_shape { } string_val: "create_inputs/input_producer/input_producer/fraction_of_32_full" } } } }
node { name: "create_inputs/input_producer/input_producer/fraction_of_32_full" op: "ScalarSummary" input: "create_inputs/input_producer/input_producer/fraction_of_32_full/tags" input: "create_inputs/input_producer/input_producer/mul" attr { key: "T" value { type: DT_FLOAT } } }
node { name: "create_inputs/input_producer/input_producer_Dequeue" op: "QueueDequeueV2" input: "create_inputs/input_producer/input_producer" attr { key: "component_types" value { list { type: DT_INT32 } } } attr { key: "timeout_ms" value { i: -1 } } }
node { name: "create_inputs/input_producer/Gather" op: "Gather" input: "create_inputs/Const" input: "create_inputs/input_producer/input_producer_Dequeue" attr { key: "Tindices" value { type: DT_INT32 } } attr { key: "Tparams" value { type: DT_STRING } } attr { key: "validate_indices" value { b: true } } }
node { name: "create_inputs/input_producer/Gather_1" op: "Gather" input: "create_inputs/Const_1" input: "create_inputs/input_producer/input_producer_Dequeue" attr { key: "Tindices" value { type: DT_INT32 } } attr { key: "Tparams" value { type: DT_STRING } } attr { key: "validate_indices" value { b: true } } }
versions { producer: 24 }

Use TensorBoard to view the current state of the Graph:

Inside input_producer:

Of course, what we care about most is how the two input Const nodes connect to the pile of ops generated by tf.train.slice_input_producer():

These Gather and Gather_1 ops should be the [Tensor, Tensor] returned by tf.train.slice_input_producer():
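Reading the node names in the graph dump, the data flow appears to be Range, then RandomShuffle, then a FIFOQueue of capacity 32 that holds integer indices, then one Dequeue whose output feeds both Gather ops. The sketch below imitates that flow in plain Python. `IndexProducer` and `gather` are hypothetical stand-ins, the file names are invented, and the refill logic is only an assumption about what the queue runner does, not TensorFlow's actual threading.

```python
import random
from collections import deque


class IndexProducer:
    """Toy stand-in for the Range -> RandomShuffle -> FIFOQueue -> Dequeue chain."""

    def __init__(self, n, shuffle, capacity=32, seed=1234):
        self.n = n
        self.shuffle = shuffle
        self.capacity = capacity
        self.rng = random.Random(seed)
        self.queue = deque()
        self.pending = []  # indices of the current epoch not yet enqueued

    def _refill(self):
        # Plays the role of the queue runner: top the queue up to capacity.
        while len(self.queue) < self.capacity:
            if not self.pending:
                self.pending = list(range(self.n))  # Range
                if self.shuffle:
                    self.rng.shuffle(self.pending)  # RandomShuffle
            self.queue.append(self.pending.pop(0))  # enqueue one index

    def dequeue(self):
        self._refill()
        return self.queue.popleft()  # Dequeue


def gather(params, index):
    """Pick one element by index, like the Gather and Gather_1 ops."""
    return params[index]


images = ["img-%d.png" % i for i in range(5)]  # hypothetical file names
labels = ["lbl-%d.png" % i for i in range(5)]

producer = IndexProducer(len(images), shuffle=True)
first_epoch = []
for _ in range(len(images)):
    i = producer.dequeue()  # one index feeds both gathers
    first_epoch.append((gather(images, i), gather(labels, i)))
```

Because a single dequeued index is shared by both gathers, the image name and the label name stay paired even when the order is shuffled.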

Next, let's see what happens after tf.train.slice_input_producer() has added the two Gather ops and produced [Tensor, Tensor]:

self.image_list, self.label_list = read_labeled_image_list(self.data_dir, self.data_list)
self.images = tf.convert_to_tensor(self.image_list, dtype=tf.string)
self.labels = tf.convert_to_tensor(self.label_list, dtype=tf.string)
self.queue = tf.train.slice_input_producer([self.images, self.labels],
                                           shuffle=input_size is not None)  # not shuffling if it is val
self.image, self.label = read_images_from_disk(self.queue, self.input_size, random_scale, random_mirror, ignore_label, img_mean)

The code I am studying implements its own read_images_from_disk() function:

def read_images_from_disk(input_queue, input_size, random_scale, random_mirror, ignore_label, img_mean):  # optional pre-processing arguments
    img_contents = tf.read_file(input_queue[0])
    label_contents = tf.read_file(input_queue[1])
    img = tf.image.decode_png(img_contents, channels=3)
    img_r, img_g, img_b = tf.split(axis=2, num_or_size_splits=3, value=img)
    img = tf.cast(tf.concat(axis=2, values=[img_b, img_g, img_r]), dtype=tf.float32)
    # Extract mean.
    img -= img_mean
    # img = tf.

    label = tf.image.decode_png(label_contents, channels=1)

    if input_size is not None:
        h, w = input_size

        if random_scale:
            img, label = image_scaling(img, label)

        if random_mirror:
            img, label = image_mirroring(img, label)

        img, label = random_crop_and_pad_image_and_labels(img, label, h, w, ignore_label)

    return img, label

Note that the input to tf.read_file() is a Tensor of type string. input_queue[0] and input_queue[1] are each a Tensor containing a single string.

read_file(
    filename,
    name=None
)

Args:

filename: A Tensor of type string.
name: A name for the operation (optional).

Returns:

A Tensor of type string.

Let's see how the Graph changes after the two tf.read_file() calls are executed:

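As this shows, when TensorFlow "reads" data here, it does not actually read anything yet (the so-called "reading" is not really "reading"; it is merely called "reading"...). Instead, the list of image names becomes a Constant Tensor, and tf.train.slice_input_producer() then takes one [image, label] pair at a time and hands it to the ReadFile op.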
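The difference between adding a ReadFile op and actually reading a file can be shown with a toy deferred-execution graph. This is a plain-Python illustration only: `Op` and `fake_read_file` are hypothetical names, and nothing here touches the disk or uses TensorFlow.

```python
class Op:
    """A node in a toy deferred-execution graph."""

    def __init__(self, name, fn, *inputs):
        self.name = name
        self.fn = fn
        self.inputs = inputs

    def run(self):
        # Evaluate the inputs first, then this node.
        return self.fn(*(op.run() for op in self.inputs))


reads = []  # records every time the toy "ReadFile" op actually executes


def fake_read_file(path):
    reads.append(path)
    return "<bytes of %s>" % path


# Building the graph: nothing is read at this point.
filename = Op("Gather", lambda: "00000-00000.png")
contents = Op("ReadFile", fake_read_file, filename)
built_without_reading = (reads == [])

# Running the graph: only now does the read happen.
result = contents.run()
```

After construction `reads` is still empty; it gains an entry only when `contents.run()` is called, which mirrors the way the real ops do their work when the session runs them.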

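The code that follows processes the data. Note that the "-" operator is overloaded here, so img -= img_mean adds a subtraction op to the Graph rather than computing a value immediately.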

img_contents = tf.read_file(input_queue[0])
label_contents = tf.read_file(input_queue[1])
# Yuxuan: Change to png
img = tf.image.decode_png(img_contents, channels=3)
img_r, img_g, img_b = tf.split(axis=2, num_or_size_splits=3, value=img)
img = tf.cast(tf.concat(axis=2, values=[img_b, img_g, img_r]), dtype=tf.float32)
# Extract mean.
img -= img_mean
# img = tf.

label = tf.image.decode_png(label_contents, channels=1)
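To make the arithmetic of the split, concat, cast and subtract steps concrete, here is a small eager sketch on nested lists. It assumes the mean is given in B, G, R order; `to_bgr_minus_mean` and all pixel and mean values are hypothetical and chosen only for illustration.

```python
def to_bgr_minus_mean(image_rgb, mean_bgr):
    """Swap RGB -> BGR, cast to float, and subtract a per-channel mean."""
    out = []
    for row in image_rgb:
        out_row = []
        for r, g, b in row:
            bgr = (float(b), float(g), float(r))
            out_row.append([c - m for c, m in zip(bgr, mean_bgr)])
        out.append(out_row)
    return out


image = [[[10, 20, 30], [40, 50, 60]]]  # 1 x 2 RGB image, hypothetical values
mean = (1.0, 2.0, 3.0)                  # hypothetical B, G, R means
result = to_bgr_minus_mean(image, mean)
# result == [[[29.0, 18.0, 7.0], [59.0, 48.0, 37.0]]]
```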

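Once it is clear that "each tf statement executed adds an op to the Graph", the if statements below are easy to understand: if input_size, random_scale and random_mirror are set, a series of preprocessing ops is added to the Graph; otherwise, those ops are not added.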
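Put differently, the Python `if` runs once, while the graph is being built, and decides which ops exist at all. A minimal sketch of that idea follows, with op names borrowed from the function above and a hypothetical `build_preprocessing` helper that just records names instead of creating real ops.

```python
def build_preprocessing(input_size, random_scale, random_mirror):
    """Return the names of the ops that would be added for these settings."""
    ops = ["ReadFile", "DecodePng", "SubtractMean"]
    if input_size is not None:
        if random_scale:
            ops.append("image_scaling")
        if random_mirror:
            ops.append("image_mirroring")
        ops.append("random_crop_and_pad_image_and_labels")
    return ops


train_ops = build_preprocessing((713, 713), random_scale=True, random_mirror=True)
val_ops = build_preprocessing(None, random_scale=False, random_mirror=False)
```

Here `val_ops` contains only the three base ops, so a graph built with `input_size=None` simply has no scaling, mirroring or cropping nodes in it.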

【2】tf.train.batch()

First, see where tf.train.batch() is called:

coord = tf.train.Coordinator()

with tf.name_scope("create_inputs"):
    reader = ImageReader(
        args.data_dir,
        args.data_list,
        input_size,
        args.random_scale,
        args.random_mirror,
        args.ignore_label,
        IMG_MEAN,
        coord)
    image_batch, label_batch = reader.dequeue(args.batch_size)

tf.train.batch() is used in the dequeue() function of the ImageReader class:

def dequeue(self, num_elements):
    image_batch, label_batch = tf.train.batch([self.image, self.label],
                                              num_elements)
    return image_batch, label_batch

Here is the documentation:

tf.train.batch

batch(
    tensors,
    batch_size,
    num_threads=1,
    capacity=32,
    enqueue_many=False,
    shapes=None,
    dynamic_pad=False,
    allow_smaller_final_batch=False,
    shared_name=None,
    name=None
)

Creates batches of tensors in tensors.

The argument tensors can be a list or a dictionary of tensors.

Here, that is [self.image, self.label].

The value returned by the function will be of the same type as tensors.

image_batch and label_batch have the same types as self.image and self.label, but their shapes differ:

Note that self.image is (713, 713, 3) while image_batch is (1, 713, 713, 3), where 1 is the batch size.

This function is implemented using a queue. A QueueRunner for the queue is added to the current Graph's QUEUE_RUNNER collection.

On Queue and QueueRunner:

Queue and QueueRunner in TensorFlow

If enqueue_many is False, tensors is assumed to represent a single example 【note: this controls how the input tensors are interpreted】. An input tensor with shape [x, y, z] will be output as a tensor with shape [batch_size, x, y, z].

If enqueue_many is True, tensors is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of tensors should have the same size in the first dimension. 【the input is already a batch】 If an input tensor has shape [*, x, y, z], the output will have shape [batch_size, x, y, z]. 【the batch is regrouped to batch_size】 The capacity argument controls how long the prefetching is allowed to grow the queues.
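The two enqueue_many cases can be compared with a toy batcher built on a plain deque. This is a sketch of the documented shape behaviour only: `ToyBatcher` and `shape` are hypothetical helpers, nested lists stand in for tensors, and there are no threads or capacity limits.

```python
from collections import deque


def shape(x):
    """Return the nested-list shape of x, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0] if x else None
    return tuple(dims)


class ToyBatcher:
    def __init__(self, batch_size, enqueue_many=False):
        self.batch_size = batch_size
        self.enqueue_many = enqueue_many
        self.queue = deque()

    def enqueue(self, tensor):
        if self.enqueue_many:
            self.queue.extend(tensor)  # first dimension indexes examples
        else:
            self.queue.append(tensor)  # the whole tensor is one example

    def dequeue(self):
        if len(self.queue) < self.batch_size:
            raise IndexError("not enough examples queued")
        return [self.queue.popleft() for _ in range(self.batch_size)]


example = [[0, 0, 0], [0, 0, 0]]  # one example of shape (2, 3)

single = ToyBatcher(batch_size=4, enqueue_many=False)
for _ in range(4):
    single.enqueue(example)
batch_from_singles = single.dequeue()  # shape (4, 2, 3)

many = ToyBatcher(batch_size=2, enqueue_many=True)
many.enqueue([example] * 5)            # input shape (5, 2, 3)
batch_from_many = many.dequeue()       # shape (2, 2, 3)
```

In both cases the output's leading dimension is batch_size; what changes is whether the input's leading dimension is treated as part of one example or as a list of examples.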

The returned operation is a dequeue operation and will throw tf.errors.OutOfRangeError if the input queue is exhausted. If this operation is feeding another input queue, its queue runner will catch this exception, 【in this case the queue runner takes care of it】 however, if this operation is used in your main thread you are responsible for catching this yourself. 【main thread: you handle it yourself】
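The "catch it yourself in the main thread" pattern usually looks like a loop wrapped in try/except. The sketch below shows that control flow in plain Python. `OutOfRange` is a hypothetical stand-in for tf.errors.OutOfRangeError, and `make_dequeue` is an invented helper that raises once its input runs out.

```python
class OutOfRange(Exception):
    """Hypothetical stand-in for tf.errors.OutOfRangeError."""


def make_dequeue(examples, batch_size):
    it = iter(examples)

    def dequeue():
        batch = []
        for _ in range(batch_size):
            try:
                batch.append(next(it))
            except StopIteration:
                # The input is exhausted before a full batch could be formed.
                raise OutOfRange("input queue is exhausted") from None
        return batch

    return dequeue


dequeue = make_dequeue(range(5), batch_size=2)

batches = []
try:
    while True:                 # main-thread loop
        batches.append(dequeue())
except OutOfRange:
    pass                        # normal end of data; stop cleanly
```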

N.B.: If dynamic_pad is False, you must ensure that either (i) the shapes argument is passed, or (ii) all of the tensors in tensors must have fully-defined shapes. ValueError will be raised if neither of these conditions holds.

If dynamic_pad is True, it is sufficient that the rank of the tensors is known 【the "rank" of a tensor is its number of dimensions】, but individual dimensions may have shape None. In this case, for each enqueue the dimensions with value None may have a variable length; upon dequeue, the output tensors will be padded on the right to the maximum shape of the tensors in the current minibatch. For numbers, this padding takes value 0. For strings, this padding is the empty string. See PaddingFIFOQueue for more info.
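The padding rule described above (pad on the right, up to the longest example in the current minibatch, with 0 for numbers and the empty string for strings) can be sketched for one-dimensional examples. `pad_batch` is a hypothetical helper written for this illustration and only handles rank-1 inputs.

```python
def pad_batch(examples, pad_value=0):
    """Right-pad 1-D examples to the longest length in this minibatch."""
    longest = max(len(e) for e in examples)
    return [list(e) + [pad_value] * (longest - len(e)) for e in examples]


numbers = pad_batch([[1, 2, 3], [4], [5, 6]])
# numbers == [[1, 2, 3], [4, 0, 0], [5, 6, 0]]

strings = pad_batch([["a", "b"], ["c"]], pad_value="")
# strings == [["a", "b"], ["c", ""]]
```

The target length is computed per minibatch, so two different batches can end up with different padded lengths.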

If allow_smaller_final_batch is True, a smaller batch value than batch_size is returned when the queue is closed and there are not enough elements to fill the batch, otherwise the pending elements are discarded. 【how the last batch in the Queue is handled: discarded or kept】 In addition, all output tensors' static shapes, as accessed via the shape property will have a first Dimension value of None, and operations that depend on fixed batch_size would fail. 【given this, better not to use it...】
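The effect of allow_smaller_final_batch on the leftover elements can be seen with a small generator. This is a plain-Python sketch of the documented behaviour, with a hypothetical `batches` helper; it says nothing about static shapes, which are a TensorFlow-specific concern.

```python
def batches(examples, batch_size, allow_smaller_final_batch=False):
    """Group examples into batches; optionally keep a short final batch."""
    batch = []
    for e in examples:
        batch.append(e)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch and allow_smaller_final_batch:
        yield batch  # leftover elements kept as a smaller batch
    # otherwise the leftover elements are dropped


dropped = list(batches(range(5), 2))
# dropped == [[0, 1], [2, 3]]

kept = list(batches(range(5), 2, allow_smaller_final_batch=True))
# kept == [[0, 1], [2, 3], [4]]
```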

Args:

tensors: The list or dictionary of tensors to enqueue. 【the tensors being enqueued】
batch_size: The new batch size pulled from the queue.
num_threads: The number of threads enqueuing tensors. The batching will be nondeterministic if num_threads > 1.
capacity: An integer. The maximum number of elements in the queue.
enqueue_many: Whether each tensor in tensors is a single example.
shapes: (Optional) The shapes for each example. Defaults to the inferred shapes for tensors.
dynamic_pad: Boolean. Allow variable dimensions in input shapes. The given dimensions are padded upon dequeue so that tensors within a batch have the same shapes.
allow_smaller_final_batch: (Optional) Boolean. If True, allow the final batch to be smaller if there are insufficient items left in the queue.
shared_name: (Optional). If set, this queue will be shared under the given name across multiple sessions.
name: (Optional) A name for the operations.

Returns:

A list or dictionary of tensors with the same types as tensors (except if the input is a list of one element, then it returns a tensor, not a list).

Raises:

ValueError: If the shapes are not specified, and cannot be inferred from the elements of tensors.

This is the Graph before tf.train.batch() is executed:

This is the Graph after tf.train.batch() is executed:

Take a look at the newly added batch node:

You can see the various operations of the internal fifo_queue.
