Learning Notes TF058: Face Recognition

Face recognition is a biometric technology that identifies a person from facial features. A camera captures face images or a video stream, automatically detects and tracks the faces in it, and applies face-related processing: face detection, facial landmark detection, face verification, and so on. In MIT Technology Review's 2017 list of ten breakthrough technologies, Alipay's "Paying with Your Face" made the list.

Advantages of face recognition: non-mandatory (capture is hard to notice, and face images can be acquired without active cooperation), contactless (the user never touches the device), and concurrent (multiple faces can be detected, tracked, and recognized at once). Before deep learning, face recognition took two steps: extracting high-dimensional hand-crafted features, then reducing dimensionality. Traditional face recognition works on visible-light images. Deep learning plus big data (large labeled face datasets) is now the mainstream approach: a neural network is trained on a large number of sample images, features are learned during training rather than hand-picked, and recognition accuracy can reach 99%.

The face recognition pipeline.

Face image capture and detection. Capture: a camera collects face images, static or dynamic, at different positions and with different expressions; once the user is within the capture range, the device automatically searches for and photographs the face. Face detection is a form of object detection: statistics over the target's appearance yield its characteristic features, a detection model is built from them, the model is matched against the input image, and matching regions are returned. Face detection is the preprocessing step of face recognition: it accurately localizes the position and size of the face in the image. Face images carry rich pattern features: histogram features, color features, template features, structural features, and Haar-like features. Face detection picks out the useful information and uses these features to find faces. Detection algorithms include template-matching models and the Adaboost model; Adaboost offers the best combined speed and accuracy: slow to train but fast at detection, fast enough for real-time detection on video streams. An OpenCV example follows.
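As a concrete illustration of feature-based detection, here is a minimal Haar-cascade sketch using OpenCV (my own example, not part of this note's TensorFlow code; 'photo.jpg' is a hypothetical input and opencv-python is assumed to be installed):

import cv2

img = cv2.imread('photo.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Load the pretrained frontal-face Haar cascade shipped with OpenCV
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
# Returns a list of (x, y, w, h) face boxes
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('detected.jpg', img)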

Face image preprocessing. Based on the detection result, the image is processed to serve feature extraction. Because capture is subject to all kinds of conditions and random interference, the raw image needs preprocessing: scaling, rotation, stretching, light compensation, gray-level transformation, histogram equalization, normalization, geometric correction, filtering, sharpening, and so on.

Face image feature extraction. The face image information is digitized: the image is turned into a string of numbers (a feature vector). For example, from landmarks such as the left edge of an eye, the right edge of the lips, the nose, and the chin, feature components are extracted as Euclidean distances, curvatures, and angles between the points, then concatenated into a long feature vector.

Face image matching and recognition. The extracted feature data is searched and matched against face feature templates stored in a database, and identity is judged by similarity: set a threshold, and when the similarity exceeds it, output the match. Verification is a one-to-one (1:1) comparison, proving "you are you"; it is used for identity checks in finance and information security. Identification is a one-to-many (1:N) match, "finding you among N people"; with a video stream, recognition completes as soon as a person walks into range, which suits security applications. A minimal sketch of the thresholding rule follows.
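A minimal sketch of the 1:1 decision rule (my own illustration; the 128-d embeddings and the threshold value are assumptions, and distance is used here so that smaller means more similar):

import numpy as np

def is_same_person(emb1, emb2, threshold=1.1):
    # Accept the match when the Euclidean distance between the
    # two face feature vectors falls below the threshold.
    return np.linalg.norm(emb1 - emb2) < threshold

emb_a = np.random.rand(128)  # placeholder 128-d face embeddings
emb_b = np.random.rand(128)
print(is_same_person(emb_a, emb_b))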

Categories of face recognition tasks.

Face detection. Detect and localize faces in a picture, returning high-precision face bounding-box coordinates. This is the first step of any face analysis or processing. The classic approach is the "sliding window": take a rectangular region of the image as the window, extract features from the window to describe that region, decide from the feature description whether the window contains a face, and keep traversing all windows that need to be examined. A sketch of the idea follows.
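A minimal sketch of the sliding-window traversal (my own illustration; score_window is a hypothetical stand-in for a real face/non-face classifier):

import numpy as np

def sliding_windows(image, win=64, stride=16):
    h, w = image.shape[:2]
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            yield x, y, image[y:y + win, x:x + win]

def score_window(crop):
    # Stand-in score; a real detector would use learned features here
    return crop.mean() / 255.0

image = np.zeros((256, 256), dtype=np.uint8)  # placeholder grayscale image
boxes = [(x, y) for x, y, crop in sliding_windows(image) if score_window(crop) > 0.5]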

Facial landmark detection. Locate and return the coordinates of keypoints on the facial features and contour: the face outline and the contours of the eyes, eyebrows, lips, and nose. Face++ offers up to 106 keypoints. A common landmark localization technique is cascaded shape regression (CSR). For recognition, this note uses the DeepID network structure. DeepID resembles an ordinary convolutional neural network, except that the next-to-last layer, the DeepID layer, is connected to both convolutional layer 4 and max-pooling layer 3; since higher convolutional layers have larger receptive fields, the network considers both local and global features. Layer sizes: input 31x39x1; conv1 28x36x20 (4x4x1 kernel); max-pool1 14x18x20 (2x2 filter); conv2 12x16x40 (3x3x20 kernel); max-pool2 6x8x40 (2x2 filter); conv3 4x6x60 (3x3x40 kernel); max-pool3 2x3x60 (2x2 filter); conv4 1x2x80 (2x2x60 kernel); DeepID layer 1x160; fully connected softmax layer. See "Deep Learning Face Representation from Predicting 10,000 Classes", mmlab.ie.cuhk.edu.hk/pd . A sketch of this structure follows.
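A minimal sketch of the DeepID-style layer sizes described above (my own reconstruction in the TF 1.x layers API, not the original DeepID code; the 10,000 identity classes follow the paper title):

import tensorflow as tf

def deepid_sketch(images, num_classes=10000):
    # images: [batch, 31, 39, 1] grayscale face crops
    conv1 = tf.layers.conv2d(images, 20, 4, padding='valid', activation=tf.nn.relu)  # 28x36x20
    pool1 = tf.layers.max_pooling2d(conv1, 2, 2)                                     # 14x18x20
    conv2 = tf.layers.conv2d(pool1, 40, 3, padding='valid', activation=tf.nn.relu)   # 12x16x40
    pool2 = tf.layers.max_pooling2d(conv2, 2, 2)                                     # 6x8x40
    conv3 = tf.layers.conv2d(pool2, 60, 3, padding='valid', activation=tf.nn.relu)   # 4x6x60
    pool3 = tf.layers.max_pooling2d(conv3, 2, 2)                                     # 2x3x60
    conv4 = tf.layers.conv2d(pool3, 80, 2, padding='valid', activation=tf.nn.relu)   # 1x2x80
    # The DeepID layer sees both pool3 (more local) and conv4 (more global) features
    concat = tf.concat([tf.layers.flatten(pool3), tf.layers.flatten(conv4)], axis=1)
    deepid = tf.layers.dense(concat, 160, activation=tf.nn.relu)  # 160-d face feature
    logits = tf.layers.dense(deepid, num_classes)                 # softmax identity classifier
    return deepid, logits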

Face verification. Analyze how likely it is that two faces belong to the same person. Given two face images, produce a confidence score and compare it against a threshold to assess similarity.

Face attribute detection: face attribute recognition and facial emotion analysis. Betaface | Advanced face recognition offers an online test: given a face, it estimates age, whether there is a beard, emotion (happy, neutral, angry, furious), gender, whether glasses are worn, and skin tone.

Applications of face recognition: beautification in Meitu (美圖秀秀), checking the "facial compatibility" of potential partners on Jiayuan (世紀佳緣), "pay with your face" in payments, and "face authentication" in security. Face++ and SenseTime (商湯科技) provide face recognition SDKs.

Face recognition with FaceNet. davidsandberg/facenet .

Florian Schroff, Dmitry Kalenichenko, and James Philbin, "FaceNet: A Unified Embedding for Face Recognition and Clustering". Implementation: davidsandberg/facenet .

The LFW (Labeled Faces in the Wild) dataset. LFW Face Database: Main . Compiled by the computer vision lab at the University of Massachusetts Amherst. 13,233 images of 5,749 people; 4,069 people have only one image and 1,680 have more than one. Each image is 250x250. Face images are stored under a folder named after each person.

Data preprocessing. Alignment code: github.com/davidsandber

Align the evaluation dataset to the same size as the dataset used by the pretrained model.

Set the environment variable:

export PYTHONPATH=[...]/facenet/src

Alignment command (runs four alignment processes in parallel, each limited to 25% of GPU memory):

for N in {1..4}; do python src/align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done

Pretrained model: 20170216-091149.zip drive.google.com/file/d

Training set: the MS-Celeb-1M dataset (MS-Celeb-1M: Challenge of Recognizing One Million Celebrities in the Real World - Microsoft Research), Microsoft's face recognition database. It takes the top one million names from a celebrity ranking and collects about 100 face images per celebrity via a search engine. The pretrained model's accuracy is 0.993±0.004.

Evaluation: python src/validate_on_lfw.py datasets/lfw/lfw_mtcnnpy_160 models

The benchmark comparison uses facenet/data/pairs.txt, officially generated random data listing the names and image numbers of matched and mismatched pairs (a matched line gives one person with two image numbers; a mismatched line gives two people with one image number each).

Ten-fold cross validation (10-fold cross validation) is the accuracy testing method: split the dataset into 10 parts, take turns using 9 parts as the training set and 1 part as the test set, and estimate the algorithm's accuracy as the mean of the 10 results. Usually several rounds of 10-fold cross validation are run and averaged. A short example follows.
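A minimal sketch of 10-fold cross validation with scikit-learn (my own illustration on dummy data, separate from the facenet code):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X = np.random.rand(100, 8)        # 100 samples, 8 features (dummy data)
y = np.random.randint(0, 2, 100)  # dummy binary labels

accs = []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    # Train on 9 folds, evaluate on the held-out fold
    clf = LogisticRegression().fit(X[train_idx], y[train_idx])
    accs.append(clf.score(X[test_idx], y[test_idx]))
print('accuracy: %.3f +- %.3f' % (np.mean(accs), np.std(accs)))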

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
import argparse
import facenet
import lfw
import os
import sys
import math
from sklearn import metrics
from scipy.optimize import brentq
from scipy import interpolate

def main(args):
    with tf.Graph().as_default():
        with tf.Session() as sess:
            # Read the file containing the pairs used for testing
            # 1. Read the pairs.txt file; entries look like ['Abel_Pacheco', '1', '4']
            pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
            # Get the paths for the corresponding images and the match/mismatch labels
            paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs, args.lfw_file_ext)
            # 2. Load the model
            facenet.load_model(args.model)
            # Get input and output tensors
            images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
            #image_size = images_placeholder.get_shape()[1] # For some reason this doesn't work for frozen graphs
            image_size = args.image_size
            embedding_size = embeddings.get_shape()[1]
            # 3. Run forward pass to calculate embeddings
            print('Running forward pass on LFW images')
            batch_size = args.lfw_batch_size
            nrof_images = len(paths)
            nrof_batches = int(math.ceil(1.0*nrof_images / batch_size))  # total number of batches
            emb_array = np.zeros((nrof_images, embedding_size))
            for i in range(nrof_batches):
                start_index = i*batch_size
                end_index = min((i+1)*batch_size, nrof_images)
                paths_batch = paths[start_index:end_index]
                images = facenet.load_data(paths_batch, False, False, image_size)
                feed_dict = { images_placeholder:images, phase_train_placeholder:False }
                emb_array[start_index:end_index,:] = sess.run(embeddings, feed_dict=feed_dict)
            # 4. Compute accuracy and validation rate with ten-fold cross validation
            tpr, fpr, accuracy, val, val_std, far = lfw.evaluate(emb_array,
                actual_issame, nrof_folds=args.lfw_nrof_folds)
            print('Accuracy: %1.3f+-%1.3f' % (np.mean(accuracy), np.std(accuracy)))
            print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
            # Area under the ROC curve
            auc = metrics.auc(fpr, tpr)
            print('Area Under Curve (AUC): %1.3f' % auc)
            # Equal error rate: the point where FPR equals 1-TPR on the ROC curve
            eer = brentq(lambda x: 1. - x - interpolate.interp1d(fpr, tpr)(x), 0., 1.)
            print('Equal Error Rate (EER): %1.3f' % eer)

def parse_arguments(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('lfw_dir', type=str,
        help='Path to the data directory containing aligned LFW face patches.')
    parser.add_argument('--lfw_batch_size', type=int,
        help='Number of images to process in a batch in the LFW test set.', default=100)
    parser.add_argument('model', type=str,
        help='Could be either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file')
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--lfw_pairs', type=str,
        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
    parser.add_argument('--lfw_file_ext', type=str,
        help='The file extension for the LFW dataset.', default='png', choices=['jpg', 'png'])
    parser.add_argument('--lfw_nrof_folds', type=int,
        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
    return parser.parse_args(argv)

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))

Gender and age recognition. dpressel/rude-carnie .

The Adience dataset. Face Image Project - Data . 26,580 images of 2,284 subjects, with age labeled in 8 ranges (0-2, 4-6, 8-13, 15-20, 25-32, 38-43, 48-53, 60+), including noise and variation in pose and lighting. aligned # cropped and aligned data; faces # raw data. fold_0_data.txt through fold_4_data.txt: labels for all the data. fold_frontal_0_data.txt through fold_frontal_4_data.txt: labels using only roughly frontal poses. Record fields: user_id (the subject's Flickr account ID), original_image (image file name), face_id (person identifier), age, gender, x, y, dx, dy (face bounding box), tilt_ang (tilt angle), fiducial_yaw_angle (yaw angle), fiducial_score (fiducial score). flickr.com/ . A sketch of reading a fold file follows.
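A minimal sketch of reading one of the fold label files (my own illustration; it assumes fold_0_data.txt is a tab-separated file whose header contains the fields listed above):

import csv

with open('fold_0_data.txt') as f:
    for row in csv.DictReader(f, delimiter='\t'):
        print(row['user_id'], row['original_image'], row['age'], row['gender'])
        break  # just show the first record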

Data preprocessing. A script converts the data into TFRecords format: github.com/dpressel/rud . The https://github.com/GilLevi/AgeGenderDeepLearning/tree/master/Folds folder already provides the train/test split and labels. gender_train.txt and gender_val.txt list the images used to turn the Adience data into TFRecords files. Images are processed into 256x256 JPEG-encoded RGB images and written with tf.python_io.TFRecordWriter to the output file output_file. A sketch of the TFRecords writing step follows.
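A minimal sketch of writing one image into a TFRecords file with tf.python_io.TFRecordWriter (my own illustration; the feature keys and file names are assumptions, not necessarily those used by the rude-carnie script):

import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

with open('face.jpg', 'rb') as f:      # hypothetical 256x256 RGB JPEG
    image_data = f.read()
output_file = 'train-00000-of-00001'   # hypothetical TFRecords output path
with tf.python_io.TFRecordWriter(output_file) as writer:
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': _bytes_feature(image_data),
        'image/class/label': _int64_feature(1),  # e.g. a gender label
    }))
    writer.write(example.SerializeToString())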

Building the model. The age and gender model follows Gil Levi and Tal Hassner's paper "Age and Gender Classification Using Convolutional Neural Networks". Model code: github.com/dpressel/rud , built on tensorflow.contrib.slim.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
import re
from tensorflow.contrib.layers import *
from tensorflow.contrib.slim.python.slim.nets.inception_v3 import inception_v3_base

TOWER_NAME = 'tower'

def select_model(name):
    if name.startswith('inception'):
        print('selected (fine-tuning) inception model')
        return inception_v3
    elif name == 'bn':
        print('selected batch norm model')
        return levi_hassner_bn
    print('selected default model')
    return levi_hassner

def get_checkpoint(checkpoint_path, requested_step=None, basename='checkpoint'):
    if requested_step is not None:
        model_checkpoint_path = '%s/%s-%s' % (checkpoint_path, basename, requested_step)
        if not os.path.exists(model_checkpoint_path):
            print('No checkpoint file found at [%s]' % checkpoint_path)
            exit(-1)
        print(model_checkpoint_path)
        return model_checkpoint_path, requested_step

    ckpt = tf.train.get_checkpoint_state(checkpoint_path)
    if ckpt and ckpt.model_checkpoint_path:
        # Restore checkpoint as described in top of this program
        print(ckpt.model_checkpoint_path)
        global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
        return ckpt.model_checkpoint_path, global_step
    else:
        print('No checkpoint file found at [%s]' % checkpoint_path)
        exit(-1)

def _activation_summary(x):
    tensor_name = re.sub('%s_[0-9]*/' % TOWER_NAME, '', x.op.name)
    tf.summary.histogram(tensor_name + '/activations', x)
    tf.summary.scalar(tensor_name + '/sparsity', tf.nn.zero_fraction(x))

def inception_v3(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.00004
    stddev = 0.1
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("InceptionV3", "InceptionV3", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [tf.contrib.slim.conv2d, tf.contrib.slim.fully_connected],
                weights_regularizer=weights_regularizer,
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [tf.contrib.slim.conv2d],
                    weights_initializer=tf.truncated_normal_initializer(stddev=stddev),
                    activation_fn=tf.nn.relu,
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                net, end_points = inception_v3_base(images, scope=scope)
                with tf.variable_scope("logits"):
                    shape = net.get_shape()
                    net = avg_pool2d(net, shape[1:3], padding="VALID", scope="pool")
                    net = tf.nn.dropout(net, pkeep, name='droplast')
                    net = flatten(net, scope="flatten")
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.truncated_normal([2048, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(net, weights), biases, name=scope.name)
        _activation_summary(output)
    return output

def levi_hassner_bn(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassnerBN", "LeviHassnerBN", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01),
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                conv2 = convolution2d(pool1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                conv3 = convolution2d(pool2, 384, [3, 3], [1, 1], padding='SAME', biases_initializer=tf.constant_initializer(0.), scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                # can use tf.contrib.layers.flatten
                flat = tf.reshape(pool3, [-1, 384*6*6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output

def levi_hassner(nlabels, images, pkeep, is_training):
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassner", "LeviHassner", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01)):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                norm1 = tf.nn.local_response_normalization(pool1, 5, alpha=0.0001, beta=0.75, name='norm1')
                conv2 = convolution2d(norm1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                norm2 = tf.nn.local_response_normalization(pool2, 5, alpha=0.0001, beta=0.75, name='norm2')
                conv3 = convolution2d(norm2, 384, [3, 3], [1, 1], biases_initializer=tf.constant_initializer(0.), padding='SAME', scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                flat = tf.reshape(pool3, [-1, 384*6*6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output

Training the model. github.com/dpressel/rud

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from six.moves import xrange
from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
from model import select_model
import json
import re

LAMBDA = 0.01
MOM = 0.9
tf.app.flags.DEFINE_string('pre_checkpoint_path', '',
                           """If specified, restore this pretrained model """
                           """before beginning any training.""")
tf.app.flags.DEFINE_string('train_dir', '/home/dpressel/dev/work/AgeGenderDeepLearning/Folds/tf/test_fold_is_0',
                           'Training directory')
tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            """Whether to log device placement.""")
tf.app.flags.DEFINE_integer('num_preprocess_threads', 4,
                            'Number of preprocessing threads')
tf.app.flags.DEFINE_string('optim', 'Momentum',
                           'Optimizer')
tf.app.flags.DEFINE_integer('image_size', 227,
                            'Image size')
tf.app.flags.DEFINE_float('eta', 0.01,
                          'Learning rate')
tf.app.flags.DEFINE_float('pdrop', 0.,
                          'Dropout probability')
tf.app.flags.DEFINE_integer('max_steps', 40000,
                            'Number of iterations')
tf.app.flags.DEFINE_integer('steps_per_decay', 10000,
                            'Number of steps before learning rate decay')
tf.app.flags.DEFINE_float('eta_decay_rate', 0.1,
                          'Learning rate decay')
tf.app.flags.DEFINE_integer('epochs', -1,
                            'Number of epochs')
tf.app.flags.DEFINE_integer('batch_size', 128,
                            'Batch size')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                           'Checkpoint name')
tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')
tf.app.flags.DEFINE_string('pre_model',
                           '',  # './inception_v3.ckpt',
                           'checkpoint file')
FLAGS = tf.app.flags.FLAGS

# Staircase decay: multiply the learning rate by decay_rate every at_step steps
def exponential_staircase_decay(at_step=10000, decay_rate=0.1):
    print('decay [%f] every [%d] steps' % (decay_rate, at_step))
    def _decay(lr, global_step):
        return tf.train.exponential_decay(lr, global_step,
                                          at_step, decay_rate, staircase=True)
    return _decay

def optimizer(optim, eta, loss_fn, at_step, decay_rate):
    global_step = tf.Variable(0, trainable=False)
    optz = optim
    if optim == 'Adadelta':
        optz = lambda lr: tf.train.AdadeltaOptimizer(lr, 0.95, 1e-6)
        lr_decay_fn = None
    elif optim == 'Momentum':
        optz = lambda lr: tf.train.MomentumOptimizer(lr, MOM)
        lr_decay_fn = exponential_staircase_decay(at_step, decay_rate)
    return tf.contrib.layers.optimize_loss(loss_fn, global_step, eta, optz, clip_gradients=4., learning_rate_decay_fn=lr_decay_fn)

def loss(logits, labels):
    labels = tf.cast(labels, tf.int32)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels, name='cross_entropy_per_example')
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)
    losses = tf.get_collection('losses')
    regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    total_loss = cross_entropy_mean + LAMBDA * sum(regularization_losses)
    tf.summary.scalar('tl (raw)', total_loss)
    #total_loss = tf.add_n(losses + regularization_losses, name='total_loss')
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    loss_averages_op = loss_averages.apply(losses + [total_loss])
    for l in losses + [total_loss]:
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))
    with tf.control_dependencies([loss_averages_op]):
        total_loss = tf.identity(total_loss)
    return total_loss

def main(argv=None):
    with tf.Graph().as_default():
        model_fn = select_model(FLAGS.model_type)
        # Open the metadata file (md.json, written during preprocessing)
        # and figure out nlabels and the size of an epoch
        input_file = os.path.join(FLAGS.train_dir, 'md.json')
        print(input_file)
        with open(input_file, 'r') as f:
            md = json.load(f)
        images, labels, _ = distorted_inputs(FLAGS.train_dir, FLAGS.batch_size, FLAGS.image_size, FLAGS.num_preprocess_threads)
        logits = model_fn(md['nlabels'], images, 1-FLAGS.pdrop, True)
        total_loss = loss(logits, labels)
        train_op = optimizer(FLAGS.optim, FLAGS.eta, total_loss, FLAGS.steps_per_decay, FLAGS.eta_decay_rate)
        saver = tf.train.Saver(tf.global_variables())
        summary_op = tf.summary.merge_all()
        sess = tf.Session(config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement))
        tf.global_variables_initializer().run(session=sess)
        # This is total hackland, it only works to fine-tune iv3
        # (a pretrained Inception V3 checkpoint can be fed in via --pre_model)
        if FLAGS.pre_model:
            inception_variables = tf.get_collection(
                tf.GraphKeys.VARIABLES, scope="InceptionV3")
            restorer = tf.train.Saver(inception_variables)
            restorer.restore(sess, FLAGS.pre_model)
        if FLAGS.pre_checkpoint_path:
            if tf.gfile.Exists(FLAGS.pre_checkpoint_path) is True:
                print('Trying to restore checkpoint from %s' % FLAGS.pre_checkpoint_path)
                restorer = tf.train.Saver()
                tf.train.latest_checkpoint(FLAGS.pre_checkpoint_path)
                print('%s: Pre-trained model restored from %s' %
                      (datetime.now(), FLAGS.pre_checkpoint_path))
        # Checkpoint files are stored in a run-(pid) directory
        run_dir = '%s/run-%d' % (FLAGS.train_dir, os.getpid())
        checkpoint_path = '%s/%s' % (run_dir, FLAGS.checkpoint)
        if tf.gfile.Exists(run_dir) is False:
            print('Creating %s' % run_dir)
            tf.gfile.MakeDirs(run_dir)
        tf.train.write_graph(sess.graph_def, run_dir, 'model.pb', as_text=True)
        tf.train.start_queue_runners(sess=sess)
        summary_writer = tf.summary.FileWriter(run_dir, sess.graph)
        steps_per_train_epoch = int(md['train_counts'] / FLAGS.batch_size)
        num_steps = FLAGS.max_steps if FLAGS.epochs < 1 else FLAGS.epochs * steps_per_train_epoch
        print('Requested number of steps [%d]' % num_steps)

        for step in xrange(num_steps):
            start_time = time.time()
            _, loss_value = sess.run([train_op, total_loss])
            duration = time.time() - start_time
            assert not np.isnan(loss_value), 'Model diverged with loss = NaN'
            # Print progress every 10 steps
            if step % 10 == 0:
                num_examples_per_step = FLAGS.batch_size
                examples_per_sec = num_examples_per_step / duration
                sec_per_batch = float(duration)
                format_str = ('%s: step %d, loss = %.3f (%.1f examples/sec; %.3f sec/batch)')
                print(format_str % (datetime.now(), step, loss_value,
                                    examples_per_sec, sec_per_batch))
            # Write a summary every 100 steps
            if step % 100 == 0:
                summary_str = sess.run(summary_op)
                summary_writer.add_summary(summary_str, step)
            # Save a checkpoint every 1000 steps and at the end of training
            if step % 1000 == 0 or (step + 1) == num_steps:
                saver.save(sess, checkpoint_path, global_step=step)

if __name__ == '__main__':
    tf.app.run()
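A typical training invocation using the flags defined above (the fold directory is the flag's default; adjust it to your own preprocessing output):

python train.py --train_dir /home/dpressel/dev/work/AgeGenderDeepLearning/Folds/tf/test_fold_is_0 --model_type default --max_steps 40000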

Validating the model. github.com/dpressel/rud

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import math
import time
from data import inputs
import numpy as np
import tensorflow as tf
from model import select_model, get_checkpoint
from utils import *
import os
import json
import csv

RESIZE_FINAL = 227
GENDER_LIST = ['M', 'F']
AGE_LIST = ['(0, 2)', '(4, 6)', '(8, 12)', '(15, 20)', '(25, 32)', '(38, 43)', '(48, 53)', '(60, 100)']
MAX_BATCH_SZ = 128

tf.app.flags.DEFINE_string('model_dir', '',
                           'Model directory (where training data lives)')
tf.app.flags.DEFINE_string('class_type', 'age',
                           'Classification type (age|gender)')
tf.app.flags.DEFINE_string('device_id', '/cpu:0',
                           'What processing unit to execute inference on')
tf.app.flags.DEFINE_string('filename', '',
                           'File (Image) or File list (Text/No header TSV) to process')
tf.app.flags.DEFINE_string('target', '',
                           'CSV file containing the filename processed along with best guess and score')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                           'Checkpoint basename')
tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')
tf.app.flags.DEFINE_string('requested_step', '', 'Within the model directory, a requested step to restore e.g., 9000')
tf.app.flags.DEFINE_boolean('single_look', False, 'single look at the image or multiple crops')
tf.app.flags.DEFINE_string('face_detection_model', '', 'Do frontal face detection with model specified')
tf.app.flags.DEFINE_string('face_detection_type', 'cascade', 'Face detection model type (yolo_tiny|cascade)')
FLAGS = tf.app.flags.FLAGS

def one_of(fname, types):
    return any([fname.endswith('.' + ty) for ty in types])

def resolve_file(fname):
    if os.path.exists(fname): return fname
    for suffix in ('.jpg', '.png', '.JPG', '.PNG', '.jpeg'):
        cand = fname + suffix
        if os.path.exists(cand):
            return cand
    return None

def classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer):
    try:
        num_batches = math.ceil(len(image_files) / MAX_BATCH_SZ)
        pg = ProgressBar(num_batches)
        for j in range(num_batches):
            start_offset = j * MAX_BATCH_SZ
            end_offset = min((j + 1) * MAX_BATCH_SZ, len(image_files))
            batch_image_files = image_files[start_offset:end_offset]
            print(start_offset, end_offset, len(batch_image_files))
            image_batch = make_multi_image_batch(batch_image_files, coder)
            batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
            batch_sz = batch_results.shape[0]
            for i in range(batch_sz):
                output_i = batch_results[i]
                best_i = np.argmax(output_i)
                best_choice = (label_list[best_i], output_i[best_i])
                print('Guess @ 1 %s, prob = %.2f' % best_choice)
                if writer is not None:
                    f = batch_image_files[i]
                    writer.writerow((f, best_choice[0], '%.2f' % best_choice[1]))
            pg.update()
        pg.done()
    except Exception as e:
        print(e)
        print('Failed to run all images')

def classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer):
    try:
        print('Running file %s' % image_file)
        image_batch = make_multi_crop_batch(image_file, coder)
        batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
        output = batch_results[0]
        batch_sz = batch_results.shape[0]
        # Average the softmax outputs over all crops
        for i in range(1, batch_sz):
            output = output + batch_results[i]
        output /= batch_sz
        best = np.argmax(output)  # index of the most likely class
        best_choice = (label_list[best], output[best])
        print('Guess @ 1 %s, prob = %.2f' % best_choice)
        nlabels = len(label_list)
        if nlabels > 2:
            output[best] = 0
            second_best = np.argmax(output)
            print('Guess @ 2 %s, prob = %.2f' % (label_list[second_best], output[second_best]))
        if writer is not None:
            writer.writerow((image_file, best_choice[0], '%.2f' % best_choice[1]))
    except Exception as e:
        print(e)
        print('Failed to run image %s' % image_file)

def list_images(srcfile):
    with open(srcfile, 'r') as csvfile:
        delim = ',' if srcfile.endswith('.csv') else '\t'
        reader = csv.reader(csvfile, delimiter=delim)
        if srcfile.endswith('.csv') or srcfile.endswith('.tsv'):
            print('skipping header')
            _ = next(reader)
        return [row[0] for row in reader]

def main(argv=None):  # pylint: disable=unused-argument
    files = []
    if FLAGS.face_detection_model:
        print('Using face detector (%s) %s' % (FLAGS.face_detection_type, FLAGS.face_detection_model))
        face_detect = face_detection_model(FLAGS.face_detection_type, FLAGS.face_detection_model)
        face_files, rectangles = face_detect.run(FLAGS.filename)
        print(face_files)
        files += face_files

    config = tf.ConfigProto(allow_soft_placement=True)
    with tf.Session(config=config) as sess:
        label_list = AGE_LIST if FLAGS.class_type == 'age' else GENDER_LIST
        nlabels = len(label_list)
        print('Executing on %s' % FLAGS.device_id)
        model_fn = select_model(FLAGS.model_type)
        with tf.device(FLAGS.device_id):
            images = tf.placeholder(tf.float32, [None, RESIZE_FINAL, RESIZE_FINAL, 3])
            logits = model_fn(nlabels, images, 1, False)
            init = tf.global_variables_initializer()
            requested_step = FLAGS.requested_step if FLAGS.requested_step else None
            checkpoint_path = '%s' % (FLAGS.model_dir)
            model_checkpoint_path, global_step = get_checkpoint(checkpoint_path, requested_step, FLAGS.checkpoint)
            saver = tf.train.Saver()
            saver.restore(sess, model_checkpoint_path)
            softmax_output = tf.nn.softmax(logits)
            coder = ImageCoder()
            # Support a batch mode if no face detection model
            if len(files) == 0:
                if (os.path.isdir(FLAGS.filename)):
                    for relpath in os.listdir(FLAGS.filename):
                        abspath = os.path.join(FLAGS.filename, relpath)
                        if os.path.isfile(abspath) and any([abspath.endswith('.' + ty) for ty in ('jpg', 'png', 'JPG', 'PNG', 'jpeg')]):
                            print(abspath)
                            files.append(abspath)
                else:
                    files.append(FLAGS.filename)
                    # If it happens to be a list file, read the list and clobber the files
                    if any([FLAGS.filename.endswith('.' + ty) for ty in ('csv', 'tsv', 'txt')]):
                        files = list_images(FLAGS.filename)
            writer = None
            output = None
            if FLAGS.target:
                print('Creating output file %s' % FLAGS.target)
                output = open(FLAGS.target, 'w')
                writer = csv.writer(output)
                writer.writerow(('file', 'label', 'score'))
            image_files = list(filter(lambda x: x is not None, [resolve_file(f) for f in files]))
            print(image_files)
            if FLAGS.single_look:
                classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer)
            else:
                for image_file in image_files:
                    classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer)
            if output is not None:
                output.close()

if __name__ == '__main__':
    tf.app.run()
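A typical inference invocation using the flags defined above (the checkpoint directory run-1234 and the image path are illustrative):

python guess.py --class_type age --model_type default --model_dir /home/dpressel/dev/work/AgeGenderDeepLearning/Folds/tf/test_fold_is_0/run-1234 --filename face.jpg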

Microsoft's face-photo gender and age site: how-old.net/ . It guesses age and gender from a photo, and can also search for photos by query.

References:

《TensorFlow技術解析與實戰》 (TensorFlow: Analysis and Practice)

Recommendations for machine learning job opportunities in Shanghai are welcome; my WeChat: qingxingfengzi

