Paper Sharing Two - DRL with Model Learning and MCTS in Minecraft
Name
Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft (The 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) 2017)
Attachment
- Original: arXiv [1]
- Video: YouTube
Detail
Problem or Challenge: Visual-input tasks (a block-placing task in Minecraft).
Q: Is it feasible to apply planning algorithms to a learned model of an environment that is only partially observable, such as in a Minecraft building task?
The challenge lies in the partial observability of the environment, with already-placed blocks further obscuring the agent's view. It is equally important to place blocks systematically so as not to obstruct the agent's pathway.
Assumptions or hypotheses:
- The evaluation is restricted to a simple block-placing task.
Methods or Solutions:
- Model-based reinforcement learning.
- Prediction model (a minimal sketch follows this list):
  Input = the last four frames + the current action
  Output = the next frame + the next reward
- Monte Carlo tree search (MCTS) uses the learned model to plan the best sequence of actions.
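A minimal sketch of such a prediction model in PyTorch is shown below. The layer sizes, the 64x64 grayscale frames, and the multiplicative action fusion are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class PredictionModel(nn.Module):
    """Predicts the next frame and the next reward from the last
    four frames plus the current action (hypothetical sizes)."""

    def __init__(self, num_actions, frame_size=64):
        super().__init__()
        # Encode a stack of 4 grayscale frames.
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=4, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(),
        )
        self.action_embed = nn.Embedding(num_actions, 64)
        # Decode back to one predicted next frame.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1),
        )
        self.reward_head = nn.Linear(64, 1)

    def forward(self, frames, action):
        # frames: (B, 4, H, W); action: (B,) integer action ids
        h = self.encoder(frames)                # (B, 64, 16, 16)
        a = self.action_embed(action)           # (B, 64)
        h = h * a[:, :, None, None]             # fuse action multiplicatively
        next_frame = self.decoder(h)            # (B, 1, H, W)
        pooled = h.mean(dim=(2, 3))             # (B, 64)
        next_reward = self.reward_head(pooled)  # (B, 1)
        return next_frame, next_reward

# Usage with dummy inputs: a batch of one 4-frame stack and one action.
model = PredictionModel(num_actions=6)
next_frame, next_reward = model(torch.zeros(1, 4, 64, 64), torch.tensor([0]))
```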
Experiment or Result:
- Goal: cover all the colored tiles with blocks within a maximum of 30 actions.
- Performance is comparable to DQN, but the approach learns faster (it is more sample-efficient).
- The approach quickly learns a meaningful model that achieves good results with MCTS (a simplified planning sketch follows).
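As a rough illustration of how planning over a learned model works, here is a simplified Monte Carlo search that uses plain random rollouts rather than the paper's full MCTS; `model.predict(frames, action)` is a hypothetical wrapper that returns the predicted next frame stack and a scalar reward:

```python
import random

def plan_action(model, frames, num_actions, rollouts=50, depth=5):
    """Pick the first action whose random rollouts score best,
    simulating entirely inside the learned model."""
    best_action, best_return = 0, float("-inf")
    for first_action in range(num_actions):
        total = 0.0
        for _ in range(rollouts):
            sim_frames, ret = frames, 0.0
            action = first_action
            for _ in range(depth):
                # Hypothetical wrapper around the prediction model:
                # returns (next 4-frame stack, predicted scalar reward).
                sim_frames, reward = model.predict(sim_frames, action)
                ret += reward
                action = random.randrange(num_actions)
            total += ret
        avg = total / rollouts
        if avg > best_return:
            best_action, best_return = first_action, avg
    return best_action
```

A full MCTS would additionally keep per-node visit statistics and use a UCT-style selection rule instead of uniformly random rollouts.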
Limitation or Weakness:
- Performing one action takes longer than with DQN, since MCTS must simulate many action sequences through the model before acting.
- The transition model suffers from the limited information in the last four input frames.
Summary
- The authors' tests on a block-placing task in Minecraft show that learning a meaningful transition model requires considerably less training data than learning Q-values that reach comparable scores with DQN.
- Future work: using a recurrent neural network so the model can retain information beyond the last four frames (sketched below).
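As a hedged sketch of that future-work direction (the sizes are illustrative assumptions, and the frame decoder is omitted for brevity), an LSTM can carry a hidden state across the whole episode instead of relying on only four frames:

```python
import torch
import torch.nn as nn

class RecurrentTransitionModel(nn.Module):
    """Keeps a hidden state summarizing all past frames, mitigating
    the four-frame history limit; predicts the next reward only."""

    def __init__(self, num_actions, feat_dim=256):
        super().__init__()
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
        )
        self.action_embed = nn.Embedding(num_actions, 32)
        # Input size assumes 64x64 frames: 64 * 16 * 16 + 32.
        self.rnn = nn.LSTMCell(64 * 16 * 16 + 32, feat_dim)
        self.reward_head = nn.Linear(feat_dim, 1)

    def forward(self, frame, action, state=None):
        z = self.frame_encoder(frame)                      # (B, 64*16*16)
        a = self.action_embed(action)                      # (B, 32)
        h, c = self.rnn(torch.cat([z, a], dim=1), state)   # carry memory forward
        return self.reward_head(h), (h, c)
```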
Reference
[1] Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft. arXiv:1803.08456.
[2] Header image source: Minecraft Boss Helen Chiang on Her New Role, Breaking Records, and What's in Store For 2018.
TAG: Artificial Intelligence | Reinforcement Learning | Deep Learning