[Experiment] Adversarial Video Generation

02-12

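This project implements a generative adversarial network that predicts future video frames, as described in "Deep Multi-Scale Video Prediction Beyond Mean Square Error" by Mathieu, Couprie & LeCun. The paper's official code is written in Torch.
This project is the TensorFlow version.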

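The project uses a generative adversarial network (a generator and a discriminator) to improve the sharpness of the generated images. Given the past four frames of video, the generator learns to produce an accurate prediction of the next frame. Given either a generated or a real image, the discriminator learns to classify it correctly as generated or real. The two networks "compete": the generator tries to fool the discriminator into classifying its output as real. This forces the generator to produce frames that closely resemble real ones.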
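To make the "competition" concrete, here is a minimal sketch using binary cross-entropy, a common choice of GAN objective. The function names and the example probabilities are illustrative assumptions and are not taken from the project's code.

```python
import math

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy between a predicted probability and a 0/1 label."""
    p = min(max(pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants real frames scored 1 and generated frames scored 0."""
    return bce(d_real, 1) + bce(d_fake, 0)

def generator_adv_loss(d_fake):
    """The generator wants its generated frames scored 1, i.e. to fool the discriminator."""
    return bce(d_fake, 1)

# Hypothetical discriminator outputs: the probability that a frame is real.
d_real, d_fake = 0.9, 0.2
print(discriminator_loss(d_real, d_fake))
print(generator_adv_loss(d_fake))
```

The two losses pull in opposite directions on `d_fake`: lowering it helps the discriminator, raising it helps the generator.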

In the author's words:

Results and Comparison

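I trained and tested this network on a dataset of frame sequences from Ms. Pac-Man. To compare adversarial with non-adversarial training, I trained an adversarial network for 500,000 steps on both the generator and the discriminator, and a non-adversarial network for 1,000,000 steps (the non-adversarial network runs about twice as fast). Each network took around 24 hours to train on a GTX 980TI GPU.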

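In the examples below, the networks were run recursively for 64 frames. (That is, the input for generating the first frame was [input1, input2, input3, input4], the input for generating the second frame was [input2, input3, input4, generated1], and so on.) Because the networks are not fed the actions from the original game, they cannot predict much of the true motion (such as which direction Ms. Pac-Man will turn). The goal is therefore not to match the ground-truth images exactly, but to maintain a sharp and plausible representation of the world.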
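The sliding-window recursion described above can be sketched as follows. This is only an illustration: `rollout` and `toy_predict` are hypothetical names, and the toy predictor stands in for the trained generator.

```python
def rollout(initial_frames, predict_next, num_steps, history_len=4):
    """Recursively predict num_steps frames, feeding predictions back as input."""
    if len(initial_frames) < history_len:
        raise ValueError("need at least history_len input frames")
    window = list(initial_frames[-history_len:])
    generated = []
    for _ in range(num_steps):
        frame = predict_next(window)
        generated.append(frame)
        window = window[1:] + [frame]  # drop the oldest frame, append the new prediction
    return generated

# Toy stand-in for the generator: "frames" are integers and the prediction is last + 1.
def toy_predict(window):
    return window[-1] + 1

print(rollout([1, 2, 3, 4], toy_predict, num_steps=3))  # prints [5, 6, 7]
```

After four steps the window contains only generated frames, which is why errors such as blur can compound over a 64-frame rollout.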

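The following example shows how quickly the non-adversarial network becomes blurry. The adversarial network also blurs to some extent, but it is much better at keeping sharp representations throughout the sequence:

This example shows how the adversarial network keeps a sharp representation of Ms. Pac-Man around multiple turns, while the non-adversarial network fails to do so:

Although the adversarial network is clearly superior in sharpness and consistency over time, the non-adversarial network does produce some fun/spectacular failures: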

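The error measurements outlined in the paper (Peak Signal to Noise Ratio and Sharp Difference) did not show a significant difference between adversarial and non-adversarial training. I believe this is because consecutive frames in the Ms. Pac-Man dataset have no motion in the majority of pixels, whereas the original paper was trained on real-world video, where much of the frame is in motion. Even so, adversarial training clearly produces a qualitative improvement in the sharpness of the generated frames, especially over long time spans. You can view the loss and error statistics by running tensorboard --logdir=./Results/Summaries/ from the root of this project.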
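For reference, the standard definition of Peak Signal to Noise Ratio can be sketched in a few lines. This is a dependency-free illustration on flattened pixel lists, assuming pixel values in [0, 1]; it is not the project's own implementation, and the sample values are made up.

```python
import math

def psnr(true_pixels, pred_pixels, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(true_pixels) != len(pred_pixels) or not true_pixels:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((t - p) ** 2 for t, p in zip(true_pixels, pred_pixels)) / len(true_pixels)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

truth = [0.0, 0.5, 1.0, 0.5]
blurry = [0.1, 0.4, 0.9, 0.6]
print(psnr(truth, blurry))  # approximately 20 dB
```

Because the mean squared error is averaged over all pixels, a frame in which most pixels are static and correct scores well even when the few moving sprites are blurred, which is consistent with the explanation given above.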

Usage

1. Clone or download this repository.
2. Prepare your data:

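If you want to reproduce the author's results, download the Ms. Pac-Man dataset that the author provides. For the default behavior, put the folder in a directory named Data/ in the root of this project. Otherwise, you will need to specify the data location using the options listed in parts 3 and 4.
If you want to train on your own videos, preprocess them into a directory of frame sequences structured as shown below. (Neither the names nor the image extensions matter, only the structure.)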

- Test
  - Video 1
    - frame1.png
    - frame2.png
    - frame ...
    - frameN.png
  - Video ...
  - Video N
    - ...
- Train
  - Video 1
    - frame ...
  - Video ...
  - Video N
    - frame ...
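Before starting a long processing run, it can help to confirm that a dataset really follows this layout. The sketch below is one way to do that; `summarize_dataset` is a hypothetical helper, not part of the project, and the demo builds a throwaway tree with made-up names.

```python
import tempfile
from pathlib import Path

def summarize_dataset(data_dir):
    """Return {split: {video_name: frame_count}} for a Train/Test frame layout."""
    summary = {}
    for split in ("Train", "Test"):
        split_dir = Path(data_dir) / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        videos = {}
        for video_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            # Any file counts as a frame; names and extensions do not matter.
            videos[video_dir.name] = sum(1 for f in video_dir.iterdir() if f.is_file())
        summary[split] = videos
    return summary

# Demo on a temporary tree; the video and frame names here are made up.
with tempfile.TemporaryDirectory() as root:
    for split, n_frames in (("Train", 3), ("Test", 2)):
        video_dir = Path(root) / split / "Video 1"
        video_dir.mkdir(parents=True)
        for i in range(n_frames):
            (video_dir / f"frame{i + 1}.png").touch()
    demo_summary = summarize_dataset(root)

print(demo_summary)  # prints {'Train': {'Video 1': 3}, 'Test': {'Video 1': 2}}
```

On a real dataset you would call `summarize_dataset` with the path to the folder that contains Train and Test.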

3. Process training data:

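The network trains on random 32x32-pixel crops of the input images, filtered so that most clips contain some movement. To process your input data into this form, run the following script from the Code/ directory with the options below: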

python process_data.py

-n/--num_clips= <# clips to process for training> (Default = 5000000)
-t/--train_dir= <Directory of full training frames>
-c/--clips_dir= <Save directory for processed clips> (I suggest making this a hidden dir so the filesystem doesn't freeze with so many files. DON'T `ls` THIS DIRECTORY!)
-o/--overwrite (Overwrites the previous data in clips_dir)
-H/--help (prints usage)

This can take a few hours to complete, depending on the number of clips you want.
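To illustrate what "random 32x32 crops filtered for movement" could look like, here is a dependency-free sketch on frames stored as nested lists. The motion measure (sum of absolute differences), the `min_motion` threshold, and all function names are assumptions for illustration; the actual logic in process_data.py may differ.

```python
import random

def crop(frame, top, left, size):
    """Cut a size x size window out of a 2-D frame (a list of rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def motion(clip):
    """Sum of absolute pixel differences between consecutive cropped frames."""
    total = 0.0
    for prev, curr in zip(clip, clip[1:]):
        for prev_row, curr_row in zip(prev, curr):
            total += sum(abs(a - b) for a, b in zip(prev_row, curr_row))
    return total

def sample_clip(frames, size=32, min_motion=1.0, max_tries=100, rng=random):
    """Crop the same random window from every frame; retry until it shows movement."""
    height, width = len(frames[0]), len(frames[0][0])
    if size > height or size > width:
        raise ValueError("crop size exceeds frame size")
    for _ in range(max_tries):
        top = rng.randrange(height - size + 1)
        left = rng.randrange(width - size + 1)
        clip = [crop(f, top, left, size) for f in frames]
        if motion(clip) >= min_motion:
            return clip
    return None  # no moving crop found within max_tries

def blank(height, width):
    return [[0.0] * width for _ in range(height)]

# Two 40x40 frames in which a single bright pixel moves one column to the right.
first, second = blank(40, 40), blank(40, 40)
first[20][20] = 1.0
second[20][21] = 1.0
clip = sample_clip([first, second], rng=random.Random(0))
print(len(clip), len(clip[0]), len(clip[0][0]))  # prints 2 32 32
```

Rejecting static crops matters for a dataset like Ms. Pac-Man, where most of the screen does not change from frame to frame.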

4. Train/Test:

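If you want to plug and play with the Ms. Pac-Man dataset, you can download the author's trained models. Load them with the -l option (e.g. python avg_runner.py -l ./Models/Adversarial/model.ckpt-500000).
Train and test the network by running python avg_runner.py from the Code/ directory with the following options: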

-l/--load_path= <Relative/path/to/saved/model>
-t/--test_dir= <Directory of test images>
-r/--recursions= <# recursive predictions to make on test>
-a/--adversarial= <{t/f}> (Whether to use adversarial training. Default=True)
-n/--name= <Subdirectory of ../Data/Save/*/ in which to save output of this run>
-O/--overwrite (Overwrites all previous data for the model with this save name)
-T/--test_only (Only runs a test step -- no training)
-H/--help (Prints usage)
--stats_freq= <How often to print loss/train error stats, in # steps>
--summary_freq= <How often to save loss/error summaries, in # steps>
--img_save_freq= <How often to save generated images, in # steps>
--test_freq= <How often to test the model on test data, in # steps>
--model_save_freq= <How often to save the model, in # steps>

Pitfalls and Results

The author used a TensorFlow version older than 1.0, so running the pretrained model for testing as-is produces a long list of errors:

  1. AttributeError: 'module' object has no attribute 'SummaryWriter'
     >> tf.train.SummaryWriter -> tf.summary.FileWriter
  2. File "/root/Documents/Adversarial_Video_Generation-master/Code/loss_functions.py", line 118, in adv_loss
       return tf.reduce_mean(tf.pack(scale_losses))
     AttributeError: 'module' object has no attribute 'pack'
     >> tf.pack -> tf.stack
  3. tf.scalar_summary(batch_loss, loss)
     AttributeError: 'module' object has no attribute 'scalar_summary'
     >> tf.scalar_summary(batch_loss, loss) -> tf.summary.scalar(batch_loss, loss)
  4. AttributeError: 'module' object has no attribute 'merge_summary'
     >> tf.merge_summary -> tf.summary.merge
  5. TypeError: Expected int32, got list containing Tensors of type '_Message' instead.
     >> tf.concat(2,[fw,bw]) -> tf.concat([fw,bw],2)
  6. The checkpoint is missing something and cannot be loaded.
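The first five fixes are mechanical renames, so they can be applied with a small text rewrite instead of by hand. The sketch below is an illustration only: `migrate_source` is a hypothetical helper, the rename table contains just the names from the list above, and the tf.concat pattern only handles the simple form with a literal integer axis and a flat list.

```python
import re

# Renames taken from the error list above (pre-1.0 name -> TensorFlow 1.x name).
RENAMES = {
    "tf.train.SummaryWriter": "tf.summary.FileWriter",
    "tf.pack": "tf.stack",
    "tf.scalar_summary": "tf.summary.scalar",
    "tf.merge_summary": "tf.summary.merge",
}

# Matches the simple old-style call tf.concat(<int axis>, [<flat list>]).
OLD_CONCAT = re.compile(r"tf\.concat\(\s*(\d+)\s*,\s*(\[[^\[\]]*\])\s*\)")

def migrate_source(text):
    """Apply the renames and swap tf.concat's arguments to (values, axis)."""
    for old, new in RENAMES.items():
        # Lookbehind and \b keep us from touching longer names that merely contain `old`.
        text = re.sub(r"(?<![\w.])" + re.escape(old) + r"\b", new, text)
    return OLD_CONCAT.sub(r"tf.concat(\2, \1)", text)

print(migrate_source("return tf.reduce_mean(tf.pack(scale_losses))"))
# prints return tf.reduce_mean(tf.stack(scale_losses))
print(migrate_source("tf.concat(2,[fw,bw])"))
# prints tf.concat([fw,bw], 2)
```

Anything the pattern does not match (for example a tf.concat call whose axis is a variable) still has to be fixed manually, and this does nothing for the checkpoint problem in item 6.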

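2. Fixing the first five API errors still did not make the pretrained model usable, so I trained the network myself (the dataset was already downloaded).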

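At first I did not understand step 3 of Usage (Process training data). I looked through the dataset directory, saw that the images were in the right format, and went straight to step 4 (Train), which failed with this error: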

path = c.TRAIN_DIR_CLIPS + str(np.random.choice(c.NUM_CLIPS)) + '.npz'
File "mtrand.pyx", line 1115, in mtrand.RandomState.choice
ValueError: a must be greater than 0

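3. This turned out to be a data problem: step 3 (Process training data) is mandatory, presumably because it packs the image crops into arrays. I ran it with all the default options. Just as the author warned, it took a very long time. When it finished I went looking for the processed data, searched every directory, and could not find it...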
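The -c option description suggests keeping the clips in a hidden directory, and the traceback shows that clips are loaded as .npz files, so one way to locate them is to walk the tree and count .npz files per directory, hidden ones included. This is a sketch under those assumptions: `find_npz_dirs` is a hypothetical helper and the directory name in the demo is made up.

```python
import os
import tempfile

def find_npz_dirs(root):
    """Map each directory under root (hidden ones included) to its .npz file count."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        n = sum(1 for name in filenames if name.endswith(".npz"))
        if n:
            counts[dirpath] = n
    return counts

# Demo: three empty .npz files inside a dot-prefixed directory (hypothetical name).
with tempfile.TemporaryDirectory() as root:
    hidden = os.path.join(root, ".hidden_clips")
    os.makedirs(hidden)
    for i in range(3):
        open(os.path.join(hidden, f"{i}.npz"), "w").close()
    demo_counts = find_npz_dirs(root)

print(list(demo_counts.values()))  # prints [3]
```

An empty result before processing would also be consistent with the earlier ValueError, on the assumption that NUM_CLIPS is derived from the number of clip files found.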

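4. Training (this time without errors). The default is 500,000 steps, but a whole afternoon only got through 50,000, so I stopped there and tested the 50,000-step result.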

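Input: four frames

Output (configurable): 10 frames in total, including the input frames

The input and output images, turned into GIFs below:

Input

Ground truth

Generated

After only 50,000 steps the loss is already quite low, only slightly higher than at 500,000 steps.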