The OTB-2015 database and the OpenCV 3.2.0 tracking API

First, an introduction to OTB2015 and my take on it; then an overview of the tracking algorithms newly added to the latest OpenCV release, with test results, for your reference.

About OTB2015: the Visual Tracker Benchmark is an extension of OTB2013, published in TPAMI 2015. It contains 100 video sequences; the paper also calls it TB-100, OTB100, or OTB2015 — all names refer to the same database:

  • Wu Y, Lim J, Yang M-H. Object Tracking Benchmark. TPAMI, 2015.

  1. Although billed as 100 video sequences, this actually means 100 annotated tracking targets; the download contains 98 sequences, with Skating2 and Jogging each being one sequence with two annotated targets (I suggest keeping a separate copy of each for convenience when running the benchmark). Note that 26 of the 100 sequences are grayscale and 74 are color, so color-based methods are at a slight disadvantage;
  2. There are 36 body and 26 face/head videos; the whole database has 58,897 frames, with sequence lengths ranging from a few dozen frames to 3000+, covering both short-term and long-term tracking;

  3. Eleven tracking challenges: Illumination Variation, Scale Variation, Occlusion, Deformation, Motion Blur, Fast Motion, In-Plane Rotation, Out-of-Plane Rotation, Out-of-View (the target disappears completely), Background Clutters (distracting similar objects in the background), and Low Resolution. (Note: VOT has no out-of-view cases — there is occlusion but no full disappearance, and the annotation is kept accurate at all times — whereas OTB does contain sequences where the target vanishes completely; the ground-truth box there is only a guessed theoretical position, which can be called neither right nor wrong. Also, VOT sequences are all fairly short, i.e. short-term, while OTB includes long-term tests.)
  4. Two evaluation metrics: the Precision plot, based on center location error — this says nothing about whether the tracked scale and size are accurate, so it is rarely used and I basically ignore it; and the Success plot, based on the overlap score — this one is reliable (VOT's accuracy is the same measure), most papers report only this, and it is the only one you need to look at.

  5. Three evaluation protocols: one-pass evaluation (OPE), the traditional way — initialize with the ground truth in the first frame and, for deterministic algorithms, run just once; temporal robustness evaluation (TRE), which adds temporal perturbation by starting the tracker from arbitrary frames; and spatial robustness evaluation (SRE), which perturbs the first frame's ground truth before initialization. I usually look only at OPE, which is largely consistent with TRE and SRE. (The 2015 TPAMI version added Restart: on tracking failure, the tracker is reinitialized in the very next frame. This resembles VOT's reinitialization, though VOT's policy of reinitializing 5 frames later is more scientifically sound. The number of restarts reflects robustness — the fewer the restarts and the higher the average overlap score, the better the algorithm.)
  6. Edge cases: imagine a truly terrible algorithm that loses the target every other frame — it would be reinitialized constantly, half of its output would be ground truth, and you would get low robustness but fairly high accuracy. Now imagine an algorithm biased toward large bounding boxes — in the extreme, if the box always covers the whole image, the overlap can never be 0, it never restarts, and you get low accuracy but high robustness. A normal algorithm is reasonably balanced between accuracy and robustness; be suspicious whenever one is high and the other low.

  7. How to use it: the homepage provides the benchmark code. To test a new algorithm, say ECO-HC, put it under the trackers folder in a directory named ECO-HC, which must contain a standard interface function run_ECO-HC (write it by following the wrapping of the other trackers); it must return the tracking results res and the frame rate fps. Then configure the sequences in configSeqs.m and the trackers in configTrackers.m, run main_running.m to evaluate the algorithm on all sequences, and finally run perfPlot.m to draw the plots.
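The two metrics in item 4 are straightforward to compute per frame. A minimal sketch in plain C++ (boxes as top-left corner plus size; the struct and function names are mine, not the benchmark's):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Box { double x, y, w, h; };  // top-left corner + width/height

// Overlap score (intersection over union) used by the success plot.
double overlapScore(const Box& a, const Box& b) {
    double ix = std::max(0.0, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
    double iy = std::max(0.0, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
    double inter = ix * iy;
    double uni = a.w * a.h + b.w * b.h - inter;
    return inter / uni;
}

// Center location error (Euclidean distance of box centers) used by the precision plot.
double centerError(const Box& a, const Box& b) {
    double dx = (a.x + a.w / 2) - (b.x + b.w / 2);
    double dy = (a.y + a.h / 2) - (b.y + b.h / 2);
    return std::sqrt(dx * dx + dy * dy);
}
```

This also makes the difference between the two metrics concrete: a tracker whose box drifts in size but stays centered keeps a low center error while its overlap score collapses, which is exactly why the success plot is the one to trust.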

Analysis of OTB2015: OTB2013 and OTB2015 are without doubt the most widely used and authoritative benchmarks today (see the previous article) — every tracking paper at this year's CVPR ran OTB2015, yet I saw only two that ran VOT2016, because OTB2015 is comparatively easy. This week I had an in-depth exchange of views on OTB2015 with @Qiang Wang (well, really it was me asking him for advice). We both agree that OTB2015 has contributed enormously to the progress of visual tracking, but its outsized influence also means many people develop and tune algorithms against this one database; such benchmark-chasing betrays Prof. Yi Wu's original intent in releasing a public test set, and the generalization ability of those algorithms is worrying. Qiang Wang also noted that OTB2015 is easier than VOT2016 or VOT2017, and CNN-trained methods easily overfit it — the seemingly sky-high performance carries a lot of water weight and shrinks badly on VOT2016 and VOT2017. Finally, we both look forward to how these CNN-trained methods fare at VOT2017, where the competition restricts the training data: it will be obvious at a glance which algorithms fall back to earth, making it easy to tell which CNN approaches are genuinely viable. (Winsty favors SINT, Qiang Wang favors SiamFC, and as for me... well, no comment.)

I largely agree, though I have limited exposure to CNN-based methods. Speaking for the correlation-filter family: leaving convolutional features aside, a correlation-filter tracker usually exposes only a handful of parameters, so overfitting should not be severe; if you exclude CNN-based methods from the comparison, I consider the reported performance quite trustworthy — after all, there are 100 sequences, and even parameter tuning is exhausting. The arrival of KCF lifted the whole field a big step, and the boundary-effect problem has been effectively addressed by SRDCF (and its successors C-COT, ECO, DCCO) and CFLB (and its successors BACF, CSR-DCF, CF+CA). But correlation-filter methods still struggle with fast deformation and in-plane rotation; the 10 sequences newly added in VOT2017 mostly target exactly these two challenges, while OTB2015 contains too few such sequences to be hard enough, so OTB2015 may fail to reveal an algorithm's gains on these two problems. I therefore recommend reading OTB2015 and VOT2016 (and 2017) results together.

To sum up my view of OTB2015: correlation-filter results on OTB2015 remain reliable and trustworthy, but guard against overfitting and cross-check with VOT2016 (and 2017). A high OTB2015 score does not by itself prove an algorithm is good — it only shows potential, and a further check on VOT makes it credible; conversely, if an algorithm does poorly on OTB2015, it is unlikely to be good anywhere. Are you really going to tell me your algorithm is amazing, yet you cannot keep up on these 100 (easier-than-VOT) sequences???

Because of the differences in color, sequence length, and occlusion/disappearance, color-based methods (e.g. ASMS, DAT, Staple, CSR-DCF) do slightly better on VOT, while long-term methods (e.g. TLD, LCT, LMCF, ECO-HC) do slightly better on OTB.

/*********************************** divider *********************************/

The OpenCV 3.2.0 tracking API (OpenCV: Tracking API): for many people OpenCV is the most convenient and simplest toolbox, and the latest OpenCV 3.2.0 happens to include 6 tracking algorithms, KCF among them, so I put together a brief overview and ran some tests for reference.

  1. Installation: since the tracking API lives in contrib, you need to cmake contrib together with OpenCV 3.2.0. I used VS2015 + OpenCV 3.2.0 + contrib; note that the OpenCV and contrib versions must match. Build walkthroughs abound online — a few recommendations: "opencv3.2+opencv_contrib+cmake" (cosmispower, CSDN), "玩轉OpenCV3——contrib庫", and "OpenCV學習筆記(08): opencv3.2+cmake3.8+VS2013, 編譯opencv_contrib" (CV_Jason, CSDN).
  2. Usage: this part could not be simpler — create, init, update, done:

    #include <iostream>
    #include <fstream>
    #include <string>

    #include "opencv2/opencv_modules.hpp"
    #include "opencv2/highgui/highgui.hpp"
    #include "opencv2/core/core.hpp"
    #include "opencv2/opencv.hpp"

    #include "opencv2/tracking.hpp"

    using namespace std;
    using namespace cv;

    int main(int argc, char** argv)
    {
        // choose from BOOSTING, MIL, KCF, TLD, MEDIANFLOW, or GOTURN
        Ptr<Tracker> tracker = Tracker::create("KCF");

        VideoCapture video(0);
        if (!video.isOpened()) {
            cout << "cannot read video!" << endl;
            return -1;
        }

        Mat frame;
        video.read(frame);
        Rect2d box(270, 120, 180, 260);

        // initialize the tracker on the first frame
        tracker->init(frame, box);

        while (video.read(frame))
        {
            // track the target and update the model
            tracker->update(frame, box);

            rectangle(frame, box, Scalar(255, 0, 0), 2, 1);
            imshow("Tracking", frame);
            int k = waitKey(1);
            if (k == 27)  // Esc to quit
                break;
        }
        return 0;
    }

Available algorithms:

BOOSTING: corresponds to OAB in OTB-2015 — a classic algorithm with decent accuracy and speed


MIL: an old but classic tracking-by-detection method

The MIL algorithm trains a classifier in an online manner to separate the object from the background. Multiple Instance Learning avoids the drift problem for robust tracking. The implementation is based on [7]. Original code can be found here: vision.ucsd.edu/~bbaben

MEDIANFLOW: by the same author as TLD; serves as TLD's tracking component

Median Flow tracker implementation, based on a paper [81]. The tracker is suitable for very smooth and predictable movements when the object is visible throughout the whole sequence. It's quite fast and accurate for this type of problem (in particular, it was shown by the authors to outperform MIL). During the implementation period, the code of Arthur Amarra was used for reference, courtesy of the author.

TLD: the classic long-term method — among traditional algorithms, a standout in both idea and performance

TLD is a novel tracking framework that explicitly decomposes the long-term tracking task into tracking, learning and detection. The tracker follows the object from frame to frame. The detector localizes all appearances that have been observed so far and corrects the tracker if necessary. The learning estimates the detector's errors and updates it to avoid these errors in the future. The implementation is based on [82]. The Median Flow algorithm (see cv::TrackerMedianFlow) was chosen as the tracking component in this implementation, following the authors. The tracker is supposed to be able to handle rapid motions, partial occlusions, object absence etc.

KCF: an old friend — it is really CSK extended with CN features, and currently supports only GRAY and CN features (you call it KCF and it doesn't support HOG features? ARE YOU KIDDING ME?)

KCF is a novel tracking framework that utilizes properties of the circulant matrix to enhance the processing speed. This tracking method is an implementation of [71], extended to KCF with color-names features ([34]). The original paper of KCF, as well as the Matlab implementation, is available on the author's page. For more information about KCF with color-names features, please refer to Coloring Visual Tracking.
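The circulant-matrix property the docs mention is the whole trick behind KCF's speed: the matrix whose rows are all cyclic shifts of the base sample diagonalizes under the DFT, so applying it to a filter is just an element-wise product in the Fourier domain. A tiny numerical check of that equivalence, in plain C++ with a naive DFT (all names are my own; a real tracker would use an FFT):

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

using cd = std::complex<double>;
const double kPi = std::acos(-1.0);

// Naive DFT: X[k] = sum_m x[m] * exp(-2*pi*i*k*m/N); inverse divides by N.
std::vector<cd> dft(const std::vector<cd>& x, bool inverse = false) {
    size_t n = x.size();
    double sign = inverse ? 1.0 : -1.0;
    std::vector<cd> out(n);
    for (size_t k = 0; k < n; ++k) {
        cd sum = 0;
        for (size_t m = 0; m < n; ++m)
            sum += x[m] * std::polar(1.0, sign * 2.0 * kPi * k * m / n);
        out[k] = inverse ? sum / double(n) : sum;
    }
    return out;
}

// Dense view: row i of the circulant matrix C(x) is x cyclically shifted by i,
// so C(x) * w evaluates the filter w on every cyclic shift of x -- O(n^2).
std::vector<double> circulantTimes(const std::vector<double>& x,
                                   const std::vector<double>& w) {
    size_t n = x.size();
    std::vector<double> out(n, 0.0);
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            out[i] += x[(i + j) % n] * w[j];
    return out;
}

// Fast view: the same responses via the correlation theorem,
// r = IDFT( DFT(x) .* conj(DFT(w)) ) -- O(n log n) with an FFT.
std::vector<double> circulantTimesFourier(const std::vector<double>& x,
                                          const std::vector<double>& w) {
    size_t n = x.size();
    std::vector<cd> X(x.begin(), x.end()), W(w.begin(), w.end());
    X = dft(X);
    W = dft(W);
    std::vector<cd> prod(n);
    for (size_t k = 0; k < n; ++k) prod[k] = X[k] * std::conj(W[k]);
    std::vector<cd> r = dft(prod, /*inverse=*/true);
    std::vector<double> out(n);
    for (size_t k = 0; k < n; ++k) out[k] = r[k].real();
    return out;
}
```

KCF's training step exploits the same diagonalization, which is why its ridge regression over all shifted samples costs only element-wise divisions in the Fourier domain.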

GOTURN: seems to require GPU support? I have no GPU and no interest, so I haven't run it — but OpenCV can now run CNNs, so feel free to play with it

GOTURN ([70]) is a kind of tracker based on Convolutional Neural Networks (CNNs). While taking all the advantages of CNN trackers, GOTURN is much faster due to its offline training without online fine-tuning. The GOTURN tracker addresses the problem of single-target tracking: given a bounding-box label of an object in the first frame of the video, we track that object through the rest of the video. NOTE: the current GOTURN does not handle occlusions; however, it is fairly robust to viewpoint changes, lighting changes, and deformations. The inputs of GOTURN are two RGB patches representing the Target and Search patches, resized to 227x227. The outputs of GOTURN are the predicted bounding-box coordinates, relative to the Search patch coordinate system, in the format X1,Y1,X2,Y2. The original paper is here: davheld.github.io/GOTUR, along with the original authors' implementation: davheld/GOTURN. The implementation of the training algorithm is placed separately due to 3rd-party dependencies: Auron-X/GOTURN_Training_Toolkit. The GOTURN architecture goturn.prototxt and trained model goturn.caffemodel are available in the opencv_extra GitHub repository.
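Since the docs above say the output is X1,Y1,X2,Y2 relative to the Search patch, here is a sketch of mapping such a prediction back into image coordinates, assuming the prediction is in the resized 227x227 patch's pixel coordinates (the struct, the function, and the parameter names are my own illustrations, not the OpenCV implementation):

```cpp
#include <cassert>

struct BoxXYXY { double x1, y1, x2, y2; };  // two-corner box format

// Map a prediction given in the resized 227x227 search-patch coordinate
// system back into image coordinates. (px, py) is the patch's top-left
// corner in the image; (pw, ph) is the patch size before resizing.
BoxXYXY patchToImage(const BoxXYXY& p, double px, double py,
                     double pw, double ph) {
    double sx = pw / 227.0, sy = ph / 227.0;  // undo the resize to 227x227
    return { px + p.x1 * sx, py + p.y1 * sy,
             px + p.x2 * sx, py + p.y2 * sy };
}
```

In the OpenCV tracker this mapping happens internally; the sketch is only to make the coordinate convention in the docs concrete.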

/*********************************** divider *********************************/

Test results: I evaluated the first 5 algorithms of the tracking API on OTB2015, using OPE with all default parameters, a single run, and success plots only — take this as reference only.
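A success plot is simply the fraction of frames whose overlap score exceeds a threshold, swept over [0, 1], and trackers are usually ranked by the area under that curve (AUC). A minimal sketch of that bookkeeping in plain C++ (function names are mine, not the benchmark's):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Fraction of frames whose per-frame overlap score exceeds the threshold.
double successRate(const std::vector<double>& overlaps, double threshold) {
    int hits = 0;
    for (double o : overlaps)
        if (o > threshold) ++hits;
    return overlaps.empty() ? 0.0 : double(hits) / overlaps.size();
}

// Area under the success curve, sampled at (bins + 1) evenly spaced
// thresholds in [0, 1] -- the usual single-number ranking score.
double successAUC(const std::vector<double>& overlaps, int bins = 20) {
    double sum = 0.0;
    for (int i = 0; i <= bins; ++i)
        sum += successRate(overlaps, double(i) / bins);
    return sum / (bins + 1);
}
```

The single scores quoted below (0.46 for Struck, 0.38 for CSK, etc.) are of this AUC kind, aggregated over all sequences.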

First, the results from the OTB2015 paper as a reference for algorithm level: Struck at 0.46, CSK at 0.38, and TLD a touch better than CSK:

Then my results for the tracking API, with the original Matlab KCF (HOG + Gaussian kernel) added for comparison:

Oh no, oh my, please forgive me — it wasn't me, I have no idea what happened!

Everything ran with the default configuration; maybe something is set up wrong somewhere — if you have used it, pointers are welcome.

This KCF performs at CSK level, a clear notch below the Matlab KCF — evidently it won't do without HOG features. BOOSTING basically matches the OAB result in OTB2015, with acceptable speed. MIL is both slower and worse than BOOSTING — newly added to OpenCV, but the algorithm itself is old. MEDIANFLOW is super fast, though its accuracy is a bit low. The most puzzling is TLD: what on earth happened — I got only 0.281 & 18.5 fps, versus 0.4061 & 28.1 fps in the OTB2015 paper, and it even comes out slightly worse than its own base tracker MEDIANFLOW. Fine, I must have messed something up — someone please help me find out where. (For reference only; corrections welcome, forgive me!)

The results above may have disappointed some of you. By way of apology — and since some readers find MATLAB code inconvenient for engineering projects — here are a few high-quality C++ implementations of correlation-filter algorithms, for study and research:

CSK: foolwood/CSK — implemented by Qiang Wang

I just want to build a C++ project for CSK.

It looks like MATLAB.

Simple GUI (I will change it to the KCF version 2 via Trackbar). It's quite difficult to draw a GUI like MATLAB, but the Trackbar function is quite useful! (if you have tried the KCF MATLAB code)

KCF: joaofaro/KCFcpp (popular pick) — provided by the original author

"KCFC++", command: ./KCF 原版HOG特徵的KCF演算法

Description: KCF on HOG features, ported to C++ OpenCV. The original Matlab tracker placed 3rd in VOT 2014.

"KCFLabC++", command: ./KCF lab 擴展Lab顏色特徵的KCF演算法

Description: KCF on HOG and Lab features, ported to C++ OpenCV. The Lab features are computed by quantizing CIE-Lab colors into 15 centroids, obtained from natural images by k-means.

The CSK tracker [2] is also implemented as a bonus, simply by using raw grayscale as features (the filter becomes single-channel) — so calling it with grayscale features gives you CSK.

KCF: foolwood/KCF — Qiang Wang again; a faithful port of the MATLAB code, good for side-by-side study

It depends on OpenCV, so you have to install OpenCV first.

I changed fhog from computeHOG32D (in the new opencv_contrib).

Now I use fhog from Piotr's Computer Vision Matlab Toolbox, with the wrapper by Tomas Vojir.

This algorithm belongs to the authors of KCF: João F. Henriques, Rui Caseiro, Pedro Martins, Jorge Batista.

CN: mostafaizz/ColorTracker — implemented following the MATLAB code

This C++ code is an implementation of the visual tracking method proposed in [1]. The implementation is based on the Matlab code provided by the authors of the paper.

The implementation in C++ using openCV was done by Mostafa Izz.

DSST: liliumao/KCF-DSST — builds on the original author's C++ KCF code and adds a scale filter to implement DSST; unlike the DSST MATLAB code, it keeps the Gaussian kernel

This package includes a C++ class with several tracking methods based on the Kernelized Correlation Filter (KCF) [1, 2] for translation changes and the Discriminative Scale Space Tracker (DSST) [3].

DSST scaling changing part is added to the original kcftracker.cpp/hpp file. Original recttools.hpp and ffttools.hpp are also modified. Debug mode is added to the Cmakelists.txt.
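DSST's scale filter scores the target over a small pyramid of candidate sizes around the current one. A sketch of generating those candidate scale factors (S = 33 scales with step a = 1.02 are the values from the DSST paper; the C++ port above may use different defaults, and the function name is mine):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// DSST-style candidate scale factors: a^n for n = -(S-1)/2 .. (S-1)/2.
// The 1-D scale filter then evaluates features sampled at
// (w * f[i]) x (h * f[i]) around the current target size.
std::vector<double> scaleFactors(int S = 33, double a = 1.02) {
    std::vector<double> f(S);
    for (int i = 0; i < S; ++i)
        f[i] = std::pow(a, i - (S - 1) / 2);
    return f;
}
```

The key design point of DSST is that this scale search is a separate 1-D correlation filter, so translation and scale are estimated independently and cheaply.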

fDSST: TuringKi/fDSST_cpp — haven't tried it

C++ re-implementation of fast Discriminative Scale Space Tracking.

SAMF: vojirt/kcf (popular pick) — nominally a KCF implementation, but it adds CN features plus a 7-step scale detection; if that isn't SAMF, what is?

This is a C++ reimplementation of the algorithm presented in the "High-Speed Tracking with Kernelized Correlation Filters" paper. For more info and implementations in other languages, visit the author's webpage.

It is extended by scale estimation (it uses 7 different scale steps) and by RGB (channels) and Color Names [2] features. Data for the Color Names features were obtained from the SAMF tracker.

It is free for research use. If you find it useful or use it in your research, please acknowledge my git repository and cite the original paper [1].

The code depends on the OpenCV 2.4+ library and is built via the cmake toolchain.

Other resources are welcome — I'll keep this list updated.
