Learning a Hierarchical Latent-Variable Model of 3D Shapes: Notes
From the column V.DeepLearning. Paper: https://arxiv.org/pdf/1705.05994.pdf
Stylized processing of 2D images has already found its way into everyday applications. This post covers VSL (the variational shape learner), a latent-variable model for unsupervised learning on multi-object 3D voxel data, trained and evaluated on ModelNet40 (a 3D CAD benchmark).
Earlier neural approaches mostly learn simple latent representations, e.g. deep belief networks, deep auto-encoders, and 3D CNNs. All of these rely on a single vector representation of the 3D shape, so the generative model is limited to a single layer of latent variables.
1. Principle & Architecture
VSL instead builds on a multi-level latent structure: low-level latent variables (e.g. the shape or placement of edges) are composed to abstractly describe the properties captured by the higher-level latents, and a variational Bayesian bound serves as the loss function. Local layers close to the input carry mostly low-level features, each local layer describing one level of feature abstraction; skip-connections link the local layers in a top-down direction, so local layers far from the input carry more higher-level feature information. The global layer aggregates the features of every local layer together with their relative positions (a minimal code sketch follows the list below).
This structure brings two benefits:
1) a straightforward parametrization of the generative model;
2) reduced overfitting.
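As a concrete reference, here is a minimal PyTorch sketch of such a latent hierarchy: each local latent is inferred from the shared encoder feature plus the local latent above it (the top-down skip connections), and the global latent summarizes all local codes. All dimensions and layer choices are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class HierarchicalLatents(nn.Module):
    """Sketch of a VSL-style latent hierarchy. Each local latent is
    inferred from the shared encoder feature plus the local latent above
    it (top-down skip connections); the global latent summarizes all
    local codes. All sizes here are illustrative, not the paper's."""

    def __init__(self, feat_dim=100, local_dim=20, n_local=3, global_dim=20):
        super().__init__()
        # Topmost local latent sees only the encoder feature; deeper ones
        # also receive the previous local latent via a skip connection.
        self.local_heads = nn.ModuleList(
            [nn.Linear(feat_dim, 2 * local_dim)]
            + [nn.Linear(feat_dim + local_dim, 2 * local_dim)
               for _ in range(n_local - 1)])
        # Global latent is inferred from the concatenated local codes.
        self.global_head = nn.Linear(n_local * local_dim, 2 * global_dim)

    @staticmethod
    def sample(stats):
        # Reparametrization trick: stats holds (mu, log-variance).
        mu, logvar = stats.chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

    def forward(self, feat):
        local_codes, prev = [], None
        for head in self.local_heads:
            inp = feat if prev is None else torch.cat([feat, prev], dim=-1)
            prev = self.sample(head(inp))
            local_codes.append(prev)
        z_global = self.sample(self.global_head(torch.cat(local_codes, dim=-1)))
        return z_global, local_codes
```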
1.1 Latent Loss
The loss is the variational Bayesian bound mentioned in section 1, with one KL term for the global latent and one for each local latent.
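The formula itself is missing from the post; as a reference, a generic VAE-style bound with one KL term per latent level (the paper's exact factorization may differ) reads:

```latex
\mathcal{L}(\theta,\phi)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
  - \mathrm{KL}\!\left(q_\phi(z_g \mid x)\,\|\,p(z_g)\right)
  - \sum_{i=1}^{n} \mathrm{KL}\!\left(q_\phi(z_i \mid x, z_{i-1})\,\|\,p(z_i)\right)
```

where z_g is the global latent, z_1, ..., z_n are the local latents (with no z_{i-1} conditioning for i = 1), and the priors are taken to be standard normals.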
1.2 Encoder: 3D-Conv + Skip-Connections
Three 3D convolution layers with kernel sizes {6,5,4}, strides {2,2,1} and channels {32,64,128}, followed by two fully-connected layers of 100 neurons each.
(Figure: approximate network structure for a single voxel input.)
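A minimal PyTorch sketch of this encoder, assuming a 30x30x30 binary voxel grid (the post does not state the input resolution) and ELU activations (also an assumption):

```python
import torch.nn as nn

class VoxelEncoder(nn.Module):
    """Sketch of the 3D-conv encoder: kernels {6,5,4}, strides {2,2,1},
    channels {32,64,128}, then two 100-unit fully-connected layers.
    The 30x30x30 input and the ELU activations are assumptions."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=6, stride=2), nn.ELU(),
            nn.Conv3d(32, 64, kernel_size=5, stride=2), nn.ELU(),
            nn.Conv3d(64, 128, kernel_size=4, stride=1), nn.ELU(),
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 2 * 2 * 2, 100), nn.ELU(),  # 30^3 input -> 2^3 maps
            nn.Linear(100, 100),
        )

    def forward(self, x):  # x: (batch, 1, 30, 30, 30)
        return self.fc(self.conv(x))
```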
1.3 Decoder: 3D-DeConvNet
Like the single latent vector marked by the blue dashed line on the right of the figure, the 3D-DeConvNet here mirrors the encoder, and the output layer uses an element-wise logistic sigmoid.
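A matching sketch of the decoder, mirroring the encoder above; the latent size, activations and 30^3 output resolution are the same assumptions as before:

```python
import torch.nn as nn

class VoxelDecoder(nn.Module):
    """Sketch of the 3D-DeConvNet decoder, symmetric to the encoder and
    ending in an element-wise logistic sigmoid. Latent size and
    activations are assumptions."""

    def __init__(self, latent_dim=100):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 128 * 2 * 2 * 2)
        self.deconv = nn.Sequential(
            nn.ConvTranspose3d(128, 64, kernel_size=4, stride=1), nn.ELU(),
            nn.ConvTranspose3d(64, 32, kernel_size=5, stride=2), nn.ELU(),
            nn.ConvTranspose3d(32, 1, kernel_size=6, stride=2),
            nn.Sigmoid(),  # per-voxel occupancy probability
        )

    def forward(self, z):
        h = self.fc(z).view(-1, 128, 2, 2, 2)
        return self.deconv(h)  # (batch, 1, 30, 30, 30)
```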
1.4 Image Regressor: 2D-ConvNet
Four fully-convolutional layers with kernels {32,15,5,3}, strides {2,2,2,1} and channels {16,32,64,128}; the last conv layer is special in that it is flattened and fed into two fully-connected layers with 200 and 100 neurons each. Dropout is applied before the last fully-connected layer.
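A sketch of this regressor, assuming 128x128 single-channel input images (the post does not give the image size) and ELU activations:

```python
import torch.nn as nn

class ImageRegressor(nn.Module):
    """Sketch of the 2D-ConvNet image regressor: four conv layers with
    kernels {32,15,5,3}, strides {2,2,2,1}, channels {16,32,64,128},
    flattened into FC layers of 200 and 100 units, with dropout before
    the last FC layer. The 128x128 grayscale input is an assumption."""

    def __init__(self, latent_dim=100):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=32, stride=2), nn.ELU(),
            nn.Conv2d(16, 32, kernel_size=15, stride=2), nn.ELU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=1), nn.ELU(),
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 5 * 5, 200), nn.ELU(),  # 128x128 input -> 5x5 maps
            nn.Dropout(0.5),                        # before the last FC layer
            nn.Linear(200, latent_dim),             # regress to the shape latent
        )

    def forward(self, img):  # img: (batch, 1, 128, 128)
        return self.fc(self.conv(img))
```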
2. Experimental Results
2.1 Shape Classification
2.2 Single Image 3D Model Retrieval
2.3 Shape Arithmetic
3. Summary
1. Shape arithmetic: there is no need to match against actual 3D shapes from the original dataset.
2. In image reconstruction, a warming-up schedule improves performance; the loss approaches 10^-1.
3. The model produces smooth transitions between two objects (see the sketch below).
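Points 1 and 3 both come down to simple vector operations in the learned latent space; a tiny illustration with stand-in latent codes (random here, normally produced by the encoder):

```python
import torch

latent_dim = 120  # stand-in size for the concatenated global+local code
z_a, z_b, z_c = (torch.randn(1, latent_dim) for _ in range(3))

# Shape arithmetic: combine codes directly; no lookup of actual 3D
# shapes from the original dataset is needed.
z_new = z_a - z_b + z_c

# Smooth transition: linearly interpolate between two codes; each step
# would be fed to the decoder sketched in section 1.3 to get a voxel grid.
steps = [(1 - t) * z_a + t * z_b for t in torch.linspace(0, 1, 8)]
```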