ICCV 2017 Papers on Person Re-ID

04-19

Cross-view Asymmetric Metric Learning for Unsupervised Re-id 【code】
Deeply-Learned Part-Aligned Representations for Person Re-Identification 【github】
Group Re-Id via Unsupervised Transfer of Sparse Features Encoding
In Defense of the Triplet Loss for Person Re-Identification 【github】
Jointly Attentive Spatial-Temporal Pooling Networks for Video-based Person Re-Identification 【github】
Pose-driven Deep Convolutional Model for Person Re-identification
RGB-Infrared Cross-Modality Person Re-Identification
SVDNet for Pedestrian Retrieval 【github】
Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro 【github】

Download all the papers in one archive: 【download link】

A few ideas from these papers that I found valuable:

1. Cross-view Asymmetric Metric Learning for Unsupervised Re-id

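Unsupervised methods can go a long way toward solving the labeling problem (manually annotating cross-camera data is very expensive).

The paper's main contribution is an unsupervised, asymmetric metric learning method.

To handle the variation caused by changes in camera view, the authors propose a clustering-based asymmetric metric learning method (CAMEL): features are mapped into a shared space to reduce the discrepancy between views (view-specific bias), which can be thought of as the feature warping introduced by the camera viewpoint.

Without labels, how are similar and dissimilar targets told apart? The answer is clustering.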
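To make the "shared space plus clustering" idea concrete, here is a minimal sketch and not the paper's algorithm. In CAMEL the view-specific projections are learned together with the clustering; in this sketch they are hand-set, hypothetical matrices (`U_cam_a`, `U_cam_b`), and the clustering is plain k-means with a naive initialization. The features and the "camera B stretches the first dimension" distortion are made up for illustration.

```python
import math

def project(x, U):
    """Map feature vector x into the shared space with a view-specific matrix U."""
    return [sum(u * xi for u, xi in zip(row, x)) for row in U]

def kmeans(points, k, iters=20):
    """Plain k-means; returns a cluster index (pseudo-label) for each point."""
    centers = [list(p) for p in points[:k]]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Move every center to the mean of its members
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical view-specific projections (learned in CAMEL, hand-set here)
U_cam_a = [[1.0, 0.0], [0.0, 1.0]]
U_cam_b = [[0.5, 0.0], [0.0, 1.0]]  # undoes the made-up stretch of camera B

# Two people, each seen once by camera A and once by camera B
feats_a = [[1.0, 0.0], [0.0, 1.0]]
feats_b = [[2.0, 0.1], [0.1, 1.0]]

shared = [project(x, U_cam_a) for x in feats_a] + [project(x, U_cam_b) for x in feats_b]
pseudo_labels = kmeans(shared, k=2)  # cluster indices stand in for identity labels
```

After projection, the two views of the same person fall into the same cluster, so the cluster index can play the role of the missing identity label.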

Personally, I think unsupervised methods are a promising direction, but they are not yet mature.

2. Deeply-Learned Part-Aligned Representations for Person Re-Identification

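The main problem the paper addresses is part alignment: aligning local regions effectively. Consider this figure:

When images are not aligned, targets that are actually similar fail to match because of position differences (4-5); likewise, pairs can be matched wrongly merely because their backgrounds are similar (2-3, 5-6).

The idea is fairly simple: decompose the human body into different part regions with a simple method, compute a representation for each region, and combine the per-region results to obtain the score.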
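The decomposition described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's network: the part masks are hand-set here (in the paper they are predicted by the model), the helper names `part_features` and `score` are hypothetical, and the score is taken to be the Euclidean distance between L2-normalized concatenated part descriptors.

```python
import math

def part_features(feature_map, part_masks):
    """Weighted-pool a feature map once per part mask and concatenate the results.

    feature_map: list of per-location feature vectors.
    part_masks:  list of masks, each holding one weight per location.
    """
    dim = len(feature_map[0])
    descriptor = []
    for mask in part_masks:
        pooled = [sum(w * f[d] for w, f in zip(mask, feature_map)) for d in range(dim)]
        descriptor.extend(pooled)
    norm = math.sqrt(sum(v * v for v in descriptor)) or 1.0
    return [v / norm for v in descriptor]

def score(desc_a, desc_b):
    """Smaller means more similar."""
    return math.dist(desc_a, desc_b)

HEAD, LEGS, BG = [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]  # toy features

# Same person, but shifted: locations 0-1 in image A, locations 2-3 in image B
image_a = [HEAD, LEGS, BG, BG]
image_b = [BG, BG, HEAD, LEGS]

# Masks that follow the body in each image (hand-set stand-ins for predicted masks)
masks_a = [[1, 0, 0, 0], [0, 1, 0, 0]]
masks_b = [[0, 0, 1, 0], [0, 0, 0, 1]]

aligned = score(part_features(image_a, masks_a), part_features(image_b, masks_b))
# Reusing image A's masks on image B mimics a fixed spatial grid with no alignment
unaligned = score(part_features(image_a, masks_a), part_features(image_b, masks_a))
```

With masks that follow the body, the two descriptors coincide even though the person sits at different locations; with a fixed grid, background leaks into the descriptor and the distance grows.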

Results of the learned parts:

Extracting effective part regions with an FCN is a valuable idea.

4. In Defense of the Triplet Loss for Person Re-Identification

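Another strong Triplet-based paper. The authors first compare Triplet loss with surrogate classification approaches and point out clear drawbacks of the two loss types, Classification and Verification:

Classification: when the number of identities is large, the network needs more parameters, many of which are discarded once training ends;

Verification: it judges the similarity of two images through one-to-one comparison, which is inefficient;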
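A quick back-of-the-envelope calculation shows the scale of both drawbacks. All numbers below (embedding size, identity count, query and gallery sizes) are hypothetical, and the comparison against an embedding model that encodes each image once is my own framing, not something stated in the paper summary above.

```python
def classifier_head_params(embed_dim, num_identities):
    """Weights plus biases of the final identity-classification layer."""
    return embed_dim * num_identities + num_identities

def verification_forward_passes(num_queries, num_gallery):
    """A pairwise verification network must process every (query, gallery) pair."""
    return num_queries * num_gallery

def embedding_forward_passes(num_queries, num_gallery):
    """An embedding network encodes each image once; matching is then a distance lookup."""
    return num_queries + num_gallery

# Hypothetical sizes, chosen only for illustration
head = classifier_head_params(embed_dim=128, num_identities=1000)           # 129000
pairwise = verification_forward_passes(num_queries=100, num_gallery=10000)  # 1000000
embed = embedding_forward_passes(num_queries=100, num_gallery=10000)        # 10100
```

The classifier head grows linearly with the number of identities and is thrown away at test time, while pairwise verification grows with the product of query and gallery sizes.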

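By contrast, the advantage of Triplet Loss is that it learns features that are directly useful for comparison, which enables end-to-end training.

Its drawbacks are:

Hard examples are needed to mine similarity features effectively;
Examples that are too hard make training oscillate and fail to converge;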
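One common way to balance these two points is to mine hard examples inside each mini-batch rather than over the whole dataset, so the triplets are hard but not extreme. The sketch below shows such "batch hard" mining; to my knowledge this is the kind of variant the paper is associated with, but treat that mapping as an assumption. The function name and the margin value are illustrative only.

```python
import math

def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """For each anchor, pair its farthest positive with its closest negative in the batch."""
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        pos = [math.dist(anchor, e)
               for j, (e, l) in enumerate(zip(embeddings, labels))
               if l == label and j != i]
        neg = [math.dist(anchor, e)
               for e, l in zip(embeddings, labels)
               if l != label]
        if not pos or not neg:
            continue  # this anchor cannot form a triplet in this batch
        # Hinge on (hardest positive distance) - (hardest negative distance) + margin
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0

# Well-separated identities: every triplet already satisfies the margin
separated = batch_hard_triplet_loss(
    [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]], [0, 0, 1, 1], margin=0.5)

# Interleaved identities: every anchor has a negative closer than its positive
interleaved = batch_hard_triplet_loss(
    [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [3.0, 0.0]], [0, 0, 1, 1], margin=0.5)
```

In the first batch the loss is zero; in the second every anchor contributes the same hinge value of 1.5, so the mean is 1.5.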

The authors propose an improved Triplet variant and compare it with several other Triplet variants; see the comparison table:

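5. Jointly Attentive Spatial-Temporal Pooling Networks for Video-based Person Re-Identification

The paper proposes a video-based method built on jointly attentive spatial-temporal pooling (ASTPN). It compares sequences pair-wise and uses an attention model to extract correlated features.

As the figure shows, part alignment takes place when people are compared. Attention models are widely used in NLP; here attention can be understood as region weighting.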
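The "attention as weighting" reading can be sketched for the temporal case, where each frame of one sequence is weighted by how well it matches the other sequence. This is a simplified illustration of my understanding, not the paper's implementation: a learned matrix would normally sit between the two frame features (replaced by the identity here), and the tanh, max, and softmax steps as well as the function names are assumptions.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(frames, weights):
    """Attention-weighted average of per-frame feature vectors."""
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]

def attentive_temporal_pool(probe, gallery):
    """probe, gallery: lists of per-frame feature vectors from two sequences."""
    # Pairwise frame affinity (identity assumed in place of a learned matrix)
    A = [[math.tanh(dot(p, g)) for g in gallery] for p in probe]
    # A frame matters as much as its best match in the other sequence
    w_probe = softmax([max(row) for row in A])
    w_gallery = softmax([max(col) for col in zip(*A)])
    return weighted_sum(probe, w_probe), weighted_sum(gallery, w_gallery), w_probe, w_gallery

# Toy sequences: only the first probe frame resembles the gallery frames
probe_frames = [[1.0, 0.0], [0.0, 1.0]]
gallery_frames = [[1.0, 0.0], [1.0, 0.0]]
v_probe, v_gallery, w_probe, w_gallery = attentive_temporal_pool(probe_frames, gallery_frames)
```

The probe frame that matches the gallery receives the larger weight, so the pooled probe vector is pulled toward the frames that matter for this particular pair.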

Attentive temporal pooling architecture

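The overall framework is shown below:

There are also:

> Pose-driven: a method based on pose estimation

> RGB-Infrared: a method based on infrared imagery

> Unlabeled Samples Generated by GAN: a method that uses a generative adversarial network to produce unlabeled samples

If you are interested, read through them yourself; I will not cover them one by one.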
