
CVPR 2018 Video Analysis Papers to Watch

Updated 4/14.

This post focuses on papers related to video analysis, with emphasis on: Spatial-Temporal feature, Temporal Reasoning, Relation Network, Representation for Spatial-Temporal feature. Additions are welcome.

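Notably, this year saw many discussion and reflection papers on video. For Action Localization, the biggest problem is still ambiguous annotation: temporal boundaries are hard to tell apart, which means the datasets themselves are not very good. Related work will probably follow.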

Video Tracking:

  1. End-to-end Flow Correlation Tracking with Spatial-temporal Attention
  2. A Twofold Siamese Network for Real-Time Object Tracking
  3. Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking

Video Captioning:

  1. Reconstruction Network for Video Captioning

Relation Network:

  1. Learning to Compare: Relation Network for Few-Shot Learning
  2. Relation Network for Object Detection
  3. Recurrent Residual Module for Fast Inference in Videos
  4. Iterative Visual Reasoning Beyond Convolutions (Fei-Fei Li's group)
  5. Referring Relationships (Fei-Fei Li's group)

Video Understanding:

  1. What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets (spotlight, Fei-Fei Li's group)
  2. What have we learned from deep representations for action recognition?
  3. A Closer Look at Spatiotemporal Convolutions for Action Recognition
  4. Rethinking Spatiotemporal Feature Learning For Video Understanding
  5. On the Integration of Optical Flow and Action Recognition (highly recommended; @林天威 has written paper notes on it)
  6. End-to-End Learning of Motion Representation for Video Understanding (Tencent AI Lab)
  7. Guess Where? Actor-Supervision for Spatiotemporal Action Localization
  8. A Unifying Contrast Maximization Framework for Event Cameras, with Applications to Motion, Depth, and Optical Flow Estimation
  9. Video Representation Learning Using Discriminative Pooling
  10. Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
  11. Fast End-to-End Trainable Guided Filter
  12. Density-aware Single Image De-raining using a Multi-stream Dense Network

Video Classification/Action Recognition:

  1. Non-local Neural Networks
  2. Appearance-and-Relation Networks for Video Classification
  3. Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
  4. Learning to Localize Sound Source in Visual Scenes
  5. Towards Universal Representation for Unseen Action Recognition
  6. Non-Linear Temporal Subspace Representations for Activity Recognition
  7. Fine-grained Activity Recognition in Baseball Videos (workshop)
  8. Learning Latent Super-Events to Detect Multiple Activities in Videos

Video Segmentation:

  1. Actor and Action Video Segmentation from a Sentence
  2. Dynamic Video Segmentation Network
  3. Low-Latency Video Semantic Segmentation (spotlight, Dahua Lin's group)
  4. CNN in MRF: Video Object Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF
  5. Efficient Video Object Segmentation via Network Modulation

Video Question Answering:

  1. Motion-Appearance Co-Memory Networks for Video Question Answering (the author, @高繼揚, has done a lot of work on video and is worth following if you are interested)

