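Is it possible to write a machine-learning-based gomoku program? If so, which books and references should I read? Any pointers would be appreciated.

As in the title.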


This answer may not be reposted.

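The question is too broad. You need to briefly describe your background (how well you know machine learning, and how well you know computer game playing), what playing strength you want your gomoku program to reach, and how large a role machine learning should play in it (for example, whether the framework is still the alpha-beta framework with only a few parameters tuned by a simple machine-learning idea, or whether you want to abandon the traditional search framework altogether). Since this information is missing, the answer below can only be a rough one under some assumptions: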

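Suppose you do not need high playing strength, only an average level, such as the gomoku AI in some Flash mini-games, 歡樂五子棋 (Happy Gomoku), or the lowest level (豬八戒, Zhu Bajie) of fiver6; and suppose the only requirement is that machine learning is used somewhere. Then it is not hard. You only need to understand and implement minimax search (easy to find online), write a simple evaluation function based on a weighted sum of pattern scores (also easy to find online), and finally train the pattern weights with a random-adjustment method (for example, the one described on chessprogramming).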
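The recipe above (minimax plus a weighted-sum evaluation) can be sketched roughly as follows. This is only a minimal illustration under my own assumptions: the "patterns" are reduced to straight runs of stones, the values in `WEIGHTS` are made up, and all names are hypothetical. A real engine would use richer patterns and alpha-beta pruning.

```python
from typing import Iterator, List, Optional, Tuple

EMPTY, BLACK, WHITE = 0, 1, 2
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]
# Hypothetical pattern weights: score of a straight run of n stones.
# These are the numbers a tuning step would adjust.
WEIGHTS = {1: 1, 2: 10, 3: 100, 4: 1000, 5: 100000}

Board = List[List[int]]
Move = Tuple[int, int]


def other(player: int) -> int:
    return WHITE if player == BLACK else BLACK


def run_lengths(board: Board, player: int) -> Iterator[int]:
    """Yield the length of every maximal straight run of `player` stones."""
    n = len(board)
    for r in range(n):
        for c in range(n):
            if board[r][c] != player:
                continue
            for dr, dc in DIRECTIONS:
                pr, pc = r - dr, c - dc
                if 0 <= pr < n and 0 <= pc < n and board[pr][pc] == player:
                    continue  # (r, c) is not the first stone of this run
                length, rr, cc = 0, r, c
                while 0 <= rr < n and 0 <= cc < n and board[rr][cc] == player:
                    length, rr, cc = length + 1, rr + dr, cc + dc
                yield length


def evaluate(board: Board, me: int) -> int:
    """Weighted sum of run scores for `me` minus the same sum for the opponent."""
    mine = sum(WEIGHTS[min(k, 5)] for k in run_lengths(board, me))
    theirs = sum(WEIGHTS[min(k, 5)] for k in run_lengths(board, other(me)))
    return mine - theirs


def has_five(board: Board, player: int) -> bool:
    return any(k >= 5 for k in run_lengths(board, player))


def minimax(board: Board, depth: int, to_move: int, me: int) -> Tuple[float, Optional[Move]]:
    """Plain depth-limited minimax; returns (score for `me`, best move)."""
    n = len(board)
    moves = [(r, c) for r in range(n) for c in range(n) if board[r][c] == EMPTY]
    if depth == 0 or not moves or has_five(board, BLACK) or has_five(board, WHITE):
        return evaluate(board, me), None
    maximizing = to_move == me
    best_score = float("-inf") if maximizing else float("inf")
    best_move = None
    for r, c in moves:
        board[r][c] = to_move          # try the move
        score, _ = minimax(board, depth - 1, other(to_move), me)
        board[r][c] = EMPTY            # undo it
        if (score > best_score) if maximizing else (score < best_score):
            best_score, best_move = score, (r, c)
    return best_score, best_move
```

On a full 15x15 board this brute-force version is far too slow beyond depth 2, which is why practical engines restrict candidate moves to cells near existing stones and add alpha-beta pruning.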

====1.26 addendum====

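Then I remembered that in class he once said that board-game programs nowadays start out very weak and become very strong after playing against people, but I have no idea how that is implemented.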

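The kind of learning you mention here, improving through human-versus-computer play, gives only limited gains for board games. Within my knowledge I am not aware of any very successful case, so either you misremembered or your lecturer was mistaken.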

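There are many cases of improving playing strength through computer-versus-computer play. The earliest goes back to Samuel's checkers program in the 1950s (http://www.cs.virginia.edu/~evans/greatworks/samuel1959.pdf), followed in the 1990s by TD-Gammon, which successfully used the reinforcement learning method TD-lambda and was the first program to beat the strongest human backgammon players (TD-Gammon). The random-adjustment algorithm I mentioned earlier (chessprogramming) is the learning strategy used by Stockfish (Home - Stockfish), currently the strongest chess engine in the world; although simple, it helped Stockfish gain 40-70 Elo points. It is worth noting that MCTS (Monte-Carlo Tree Search; the UCT mentioned by @王潛升 is one kind of MCTS) is also a class of reinforcement learning methods that works well for board games, but it differs from the computer-versus-computer learning in the examples above: the knowledge learned by pure MCTS can only be used within a single game, whereas the knowledge learned in the three examples above can be accumulated and used in future games. So MCTS probably does not meet your goal of continually improving playing strength through play.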
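To give a feel for what "random adjustment" of evaluation weights can look like, here is a minimal sketch. It is a generic random-perturbation hill climb, not a reproduction of Stockfish's actual procedure. The function name `tune_weights`, the weight names, and the `toy_fitness` stand-in are all assumptions of mine; in a real engine the fitness would be something like the win rate of the candidate weights in computer-versus-computer matches against the current best.

```python
import random
from typing import Callable, Dict


def tune_weights(
    weights: Dict[str, float],
    fitness: Callable[[Dict[str, float]], float],
    rounds: int = 200,
    step: float = 0.2,
    seed: int = 0,
) -> Dict[str, float]:
    """Randomly perturb the weights and keep a candidate only if it scores better."""
    rng = random.Random(seed)
    best = dict(weights)              # copy, so the caller's dict is untouched
    best_fitness = fitness(best)
    for _ in range(rounds):
        # Scale every weight by a random factor in [1 - step, 1 + step].
        candidate = {k: v * (1.0 + rng.uniform(-step, step)) for k, v in best.items()}
        candidate_fitness = fitness(candidate)
        if candidate_fitness > best_fitness:
            best, best_fitness = candidate, candidate_fitness
    return best


# Toy stand-in for "match results": closeness to a hidden target vector.
TARGET = {"two": 10.0, "three": 100.0, "four": 1000.0}


def toy_fitness(w: Dict[str, float]) -> float:
    return -sum((w[k] - TARGET[k]) ** 2 for k in TARGET)


start = {"two": 5.0, "three": 50.0, "four": 500.0}
tuned = tune_weights(start, toy_fitness)
```

Because match results are noisy, a real tuner needs many games per comparison; the toy fitness here is deterministic only to keep the example small.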

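Another approach is to learn from game records of strong human-versus-human play. There were some early explorations in chess, Go, and other games, such as NeuroChess (NeuroChess), but so far this approach has only achieved good results in Go, using deep convolutional neural networks (latest result: http://arxiv.org/pdf/1412.6564v1.pdf).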

There is a book called Reinforcement Learning: State-of-the-Art; its Chapter 17 surveys applications of reinforcement learning to various games.

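I am a fourth-year undergraduate and have just taken the graduate entrance exam. I previously took Andrew Ng's Machine Learning course on Coursera and earned the certificate, and I have done some basic things. This is my graduation project, and I would like to make it a bit more intelligent.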

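If you care more about playing strength, I suggest sticking with the traditional techniques for gomoku: the alpha-beta framework, proof-number search, and dependency-based search, together with a Stockfish-like parameter tuning strategy to learn some knowledge. I listed resources for these traditional techniques in an earlier answer (關於象棋五子棋的人工智慧? - 知乎用戶的回答); these techniques can help you achieve the best results currently available (c++ - Gomoku state-of-the-art tech).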

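If you care more about novelty, you can try applying deep convolutional neural networks; no results on this have been published so far. For the implementation, you can start by directly using the method from the Go paper mentioned above. The results may not be good, but it will look more impressive.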

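@Tianyi Hao, author of 蝸牛連珠 (Slowrenju), has set up a QQ group for discussing gomoku AI; the group number is 293355594. If the asker has further questions, they can discuss them with the people in the group.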


First, let me confirm: are you aware that gomoku/renju without an opening swap rule has already been solved?

(Reference: Does the first player always win in gomoku? What algorithmic principle explains this?)

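On how to build a gomoku AI, Kai Sun (孫凱), author of the top gomoku AI engine Yixin (弈心), has compiled a very detailed list of resources, which anyone who wants to build a gomoku AI should consult.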

From http://www.aiexp.info/gomoku-renju-resources-an-overview.html

I often receive emails that ask for gomoku/renju resources, especially for AI design, so I wrote this article to summarize the good resources I know. This page will be maintained and updated in the future. If you find any mistake in this page such as broken links, please contact me.

Reading Materials for AI Design

  • Searching for Solutions in Games and Artificial Intelligence by Louis Victor Allis. (Recommend)

  • Chess Programming Wiki is a website which provides good reference for every aspect of chess programming. Although it mainly talks about chess, some basic techniques and ideas of chess AI design are similar to those of gomoku/renju.

  • Solving Renju by Janos Wagner, Istvan Virag.

  • Go-Moku and Threat-Space Search by Louis Victor Allis, Hendrik Jacob Herik, and M.P.H. Huntjens.

  • Go-Moku Solved by New Search Techniques by Louis Victor Allis, Hendrik Jacob Herik, and M.P.H. Huntjens.

  • Proof-number Search by Louis Victor Allis, Maarten van der Meulen, and H. Jaap Van Den Herik.

  • (In Chinese) XQ Base is a website which provides basic articles on chess programming.

  • (In Chinese) Introduction to XL by Chengtao Chen.

  • (In Chinese) Summary of Pn-search and Db-search by Kai Sun.

Competitive Open-source AIs

  • Pela (with piskvork) by Petr Lastovicka, Czech Republic. (Recommend)

  • Niren (XL) (Original version, or Modified version which supports Gomocup protocol) by Chengtao Chen, China.

  • GM2 (with part of its documents in Chinese) by Feng Liu, China.

  • KalScope by Aean, China.

  • Qingyue Renju by Cong Zhang, China.

Open-source GUIs

  • Piskvork by Petr Lastovicka, Czech Republic. It is a GUI that supports Gomocup protocol. (Recommend)

  • Renlib by Frank Arkbo, Sweden. Renlib is one of the best programs which can help you to build a library of renju openings, analysis and played games. (Recommend)

  • Yixin Board by Kai Sun, China. It is a specially designed GUI for Yixin, supporting Yixin protocol. (Recommend)

Protocols for Computer Gomoku/Renju

  • Gomocup Protocol (via files or via stdin/stdout) by Petr Lastovicka, Czech Republic. The protocol is used in Gomocup, and tens of AIs support it.

  • Yixin Protocol by Kai Sun, China. The protocol is derived from Gomocup protocol. Compared with Gomocup protocol, Yixin protocol introduces more commands enabling Yixin to have some new features such as renju rule support.

  • (In Chinese) Botzone Protocol by AI LAB, Peking University. Botzone is an online platform for AI competitions. It used to support many games including gomoku. However, since it was updated in 2014, it seems that the platform has lost all its old data, so gomoku as well as many other games is no longer supported.
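To show what talking to a manager over stdin/stdout looks like, here is a minimal sketch of an engine loop for the Gomocup protocol. The command names (`START`, `BEGIN`, `TURN`, `INFO`, `END`) and the `OK`, `x,y`, and `UNKNOWN` replies follow my reading of the protocol and should be checked against the official specification; the `Brain` class and its first-empty-cell policy are purely hypothetical placeholders, and commands such as `BOARD` are left out.

```python
import sys
from typing import List, Optional, Tuple

EMPTY, MINE, THEIRS = 0, 1, 2


class Brain:
    """Hypothetical engine skeleton; the move policy is a placeholder."""

    def __init__(self) -> None:
        self.size = 0
        self.board: List[List[int]] = []

    def choose_move(self) -> Tuple[int, int]:
        # Placeholder policy: first empty cell. Replace with a real search.
        for y in range(self.size):
            for x in range(self.size):
                if self.board[y][x] == EMPTY:
                    return x, y
        raise RuntimeError("board is full")

    def reply_with_move(self) -> str:
        x, y = self.choose_move()
        self.board[y][x] = MINE
        return f"{x},{y}"

    def handle(self, line: str) -> Optional[str]:
        """Process one command line and return the reply, or None for no reply."""
        parts = line.strip().split(None, 1)
        if not parts:
            return None
        command = parts[0].upper()
        arg = parts[1] if len(parts) > 1 else ""
        if command == "START":            # START <size>
            self.size = int(arg)
            self.board = [[EMPTY] * self.size for _ in range(self.size)]
            return "OK"
        if command == "BEGIN":            # we make the first move
            return self.reply_with_move()
        if command == "TURN":             # TURN <x>,<y>: the opponent's move
            x, y = (int(v) for v in arg.split(","))
            self.board[y][x] = THEIRS
            return self.reply_with_move()
        if command in ("INFO", "END"):    # no reply expected
            return None
        # Assumed from the spec: unsupported commands are answered with UNKNOWN.
        return "UNKNOWN command not implemented in this sketch"


def main() -> None:
    brain = Brain()
    for line in sys.stdin:
        if line.strip().upper() == "END":
            break
        reply = brain.handle(line)
        if reply is not None:
            print(reply, flush=True)      # the manager reads replies line by line
```

Calling `main()` starts the loop; it is not invoked automatically here so that the `Brain` class can also be driven directly from tests.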

AI Competition and Online Platform

  • Gomocup (2000 - Now) (Recommend)

  • WAI (2012 - Now)

  • Computer Olympiad (1989 - 1992)

  • Renju Computer World Championship (1991, 1998, 2000, 2004(link1,link2))

  • Hungarian Computer Go-Moku Open Tournament (2005 (1st, 2nd))

  • Botzone (2010 - 2013)

  • AI vs. Human tournament (2006, 2011(en,cz))

Famous, Competitive, and Interesting AI List

  • Amoeba by Galli Zoltan, Hungary. It uses Monte-Carlo tree search (MCTS) rather than commonly used algorithms such as alpha-beta search. It can be downloaded at Gomocup.org.

  • Blackstone by Victor Barykin, Russia. It is a commercial software for renju. It is the winner in tournament of the 2nd and the 3rd Renju Computer World Championship (1998, 2000), and the winner in solving problems of the 2nd Renju Computer World Championship (1998).

  • Fiver by Meng Liu, China. A famous classic gomoku engine. It can be downloaded at Nosovsky Japanese Games Home Page.

  • Goro by Victor Barykin, Russia. It is a commercial software for gomoku, the winner of the 6th, the 7th, the 10th, and the 11th Gomocup (2005, 2006, 2009, 2010). It took part in both the first and the second AI vs. Human tournament, playing against one of the best Czech gomoku players in 2006 and 2011. Goro was ranked the 7th in the 15th Gomocup (2014). It can be downloaded at Gomocup.org.

  • Hector for Gomoku by Csaba Jergler, Hungary. It is a general game playing search core module (Hector) along with a compile time connected game specific gomoku module. It took part in Gomocup from 2008 to 2010 and was ranked the 9th in the 11th Gomocup (2010). It has been excluded from Gomocup since 2011 because the old version of Hector stopped working and the author did not send the new version to Gomocup. It is not published, so there is no download available.

  • Hewer by Tomas Kubes, Czech Republic. Hewer was ranked the 3rd in the 15th Gomocup (2014). It can be downloaded at Gomocup.org.

  • Hgarden by Bingqing Han, China. It took part in the first AI vs. Human tournament, playing against one of the best Czech gomoku players in 2006. Hgarden was ranked the 6th in the 15th Gomocup (2014). It can be downloaded at Gomocup.org.

  • Meijin by Oleg Stepanov, Russia. It played against human players in Moscow Open Tournament, 2000, making it the first program to play against human players in public competitions.

  • Pacifist by Shuai Han, China. Winner of gomoku AI competition on Botzone hosted by AI LAB, Peking University in December 2010. It is not published, so there is no download available.

  • Pela by Petr Lastovicka, Czech Republic. It is the strongest open-source gomoku engine. Pela was ranked the 8th in the 15th Gomocup (2014). It can be downloaded at Gomocup.org.

  • Pisq by Martin Petricek, Czech Republic. It is the winner of the 1st and the 2nd Gomocup (2000, 2001). It can be downloaded at Gomocup.org.

  • Onix by Janos Wagner and Istvan Virag, Hungary. It is the winner of the 1st Hungarian Computer Go-Moku Open Tournament, 2005. It took part in Gomocup from 2007 to 2011 and was ranked the 5th in the 12th Gomocup (2011). It has been excluded from Gomocup since 2012 due to its instability -- It was reported crashing randomly very often in the 13th Gomocup. Onix can be downloaded at Gomocup.org.

  • Renjusolver by Xiangdong Wen. It is a commercial software for both gomoku and renju. It is the winner in solving problems of the 4th Renju Computer World Championship, 2004. It took part in the second AI vs. Human tournament, playing against one of the best Czech gomoku players in 2011. Renjusolver was ranked the 2nd in the 15th Gomocup (2014). It can be downloaded at Gomocup.org.

  • Super by Tongxiang Zhang, China. It is the winner in solving problems of the 3rd Renju Computer World Championship (2000).

  • Swine by Jirka Fontan, Czech Republic. It is the winner of the 4th and the 5th Gomocup (2003, 2004). It took part in the second AI vs. Human tournament, playing against one of the best Czech gomoku players in 2011. Swine was ranked the 5th in the 15th Gomocup (2014). It can be downloaded at Gomocup.org.

  • Tito by Andrej Tokarjev, Hungary. It is the winner of the 8th, the 9th, and the 12th Gomocup (2007, 2008, 2011). It took part in both the first and the second AI vs. Human tournament, playing against one of the best Czech gomoku players in 2006 and 2011. Tito was ranked the 4th in the 15th Gomocup (2014). It can be downloaded at Gomocup.org.

  • Trunkat by Jiri Trunkat. It is the winner of the 3rd Gomocup (2002). Trunkat can be downloaded at Gomocup.org.

  • Tyson by Gabor Takacs, Hungary. It is the winner of the 2nd Hungarian Computer Go-Moku Open Tournament, 2005. It is not published, so there is no download available.

  • Vertex by Artyom Shaposhnikov and Alexander Nosovsky, Russia. It is the winner of the 1st Renju Computer World Championship, 1991. There is no download available.

  • Victoria by V. Allis and L. Schoenmaker, Netherlands. It is the first program which is bound to win if it moves first for both freestyle and standard gomoku without modern opening rules. It is the winner of gomoku in the 4th Computer Olympiad. Victoria is not published, so there is no download available. Refer to Allis's thesis for more information.

  • Yixin by Kai Sun, China. It is a free software for both gomoku and renju. It is the winner of the 13th, 14th, and the 15th Gomocup (2012, 2013, 2014).

Other Useful Software

  • RenArtist by Yusuke Okuno, Japan. It gives a good solution for making databases and publishing them directly on the web.

  • Gomoku Terminator by Shanshan Liu, China. It is a free software which is bound to win if it moves first for freestyle gomoku without modern opening rules.

Rules and Variations

  • Prepared Balanced Opening is the most popular opening rule in computer gomoku. It is used by both Hungarian Computer Go-Moku Open Tournament and Gomocup.

(3 prepared balanced openings used in the 15th Gomocup (provided by Alexander Bogatirev, manager of Team Russia online, member of Gomoku Committee RIF, 2014))

  • Gomoku swap2 is an opening rule for gomoku. The rule is as follows: (1) The first player puts 2 black and 1 white stones anywhere on the board; (2) The second player has 3 options: a. stay with white; b. swap; c. put 2 more stones and let the opponent choose the colour.

  • RIF opening rule is an opening rule for renju adopted by Renju International Federation in 1996.

  • Yamaguchi opening rule is an opening rule for renju developed by Japanese player Yusui Yamaguchi.

  • Swap after 1st move is an opening rule for gomoku. The rule is as follows. (1) The first player puts 1 black stone anywhere on the board; (2) The second player has 2 options: a. stay with white; b. swap.

  • Pente is a strategy board game for two or more players similar to gomoku/renju.

  • Connect 6 is a two-player strategy game similar to gomoku/renju.

Other Useful Links

  • The homepage of Renju International Federation

  • Gomoku World

  • Renju Offline

  • PlayOK

  • Little Golem

  • (In Czech) Piskvorky.cz

  • (In Polish) Gomoku.pl (Forum)

  • (In Chinese) http://Iwzq.com (Forum)

  • (In Estonian) Renju.ee


The UCT algorithm and its variants, which count as reinforcement learning.
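For readers who have not met UCT: its core is the UCB1 rule used to pick which child node to explore next, balancing the average result so far against how rarely a child has been visited. A minimal sketch, assuming each child is summarized by a hypothetical `(wins, visits)` pair and using the common exploration constant sqrt(2):

```python
import math
from typing import List, Tuple


def uct_select(children: List[Tuple[float, int]], exploration: float = math.sqrt(2)) -> int:
    """Return the index of the child with the highest UCB1 score.

    Each child is a (wins, visits) pair; unvisited children are tried first.
    """
    total_visits = sum(visits for _, visits in children)

    def ucb1(stats: Tuple[float, int]) -> float:
        wins, visits = stats
        if visits == 0:
            return math.inf  # always explore a child that has never been tried
        mean = wins / visits
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        return mean + bonus

    return max(range(len(children)), key=lambda i: ucb1(children[i]))
```

For example, `uct_select([(6, 10), (1, 2)])` picks the second child: its average is lower, but it has been visited so rarely that the exploration bonus dominates.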


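I once took part in the national university computer games championship (大學生計算機博弈錦標賽). Gomoku was not one of the events, so I did Connect6 and won a first prize. My main references at the time were: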

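(1) 《PC遊戲編程》 (PC Game Programming), which explains the basic algorithms very clearly (but note that the programs in the book contain errors);

(2) 《對弈程序基本技術》 (Basic Techniques of Game-Playing Programs), which I followed for implementing the opening book;

(3) I added VCF. In fact I think VCF deserves more attention; it was the key to our wins at the time.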
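For readers unfamiliar with the term, VCF (victory by continuous fours) is a search in which the attacker only plays moves that make a four, so each defender reply is forced. The sketch below shows the idea for gomoku under my own simplifying assumptions: freestyle rules (overlines are not treated specially), no move ordering or hashing, and hypothetical names throughout. It is not the code we used in the competition.

```python
from typing import Iterator, List, Optional, Set, Tuple

EMPTY, BLACK, WHITE = 0, 1, 2
Board = List[List[int]]
Cell = Tuple[int, int]


def windows(n: int) -> Iterator[List[Cell]]:
    """Every straight segment of five cells on an n x n board."""
    for r in range(n):
        for c in range(n):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                er, ec = r + 4 * dr, c + 4 * dc
                if 0 <= er < n and 0 <= ec < n:
                    yield [(r + i * dr, c + i * dc) for i in range(5)]


def winning_cells(board: Board, player: int) -> Set[Cell]:
    """Empty cells where `player` would complete five in a row."""
    cells: Set[Cell] = set()
    for window in windows(len(board)):
        values = [board[r][c] for r, c in window]
        if values.count(player) == 4 and values.count(EMPTY) == 1:
            cells.add(window[values.index(EMPTY)])
    return cells


def vcf(board: Board, attacker: int, depth: int) -> Optional[Cell]:
    """Return the first move of a forced win by fours for `attacker`, or None."""
    defender = WHITE if attacker == BLACK else BLACK
    wins = winning_cells(board, attacker)
    if wins:
        return min(wins)  # five is available right now
    if depth == 0:
        return None
    n = len(board)
    for r in range(n):
        for c in range(n):
            if board[r][c] != EMPTY:
                continue
            board[r][c] = attacker
            threats = winning_cells(board, attacker)
            found = False
            # The move must make a four, and the defender must have no five of its own.
            if threats and not winning_cells(board, defender):
                if len(threats) >= 2:
                    found = True  # two winning cells cannot both be blocked
                else:
                    br, bc = next(iter(threats))
                    board[br][bc] = defender  # the only reply that avoids losing
                    found = vcf(board, attacker, depth - 1) is not None
                    board[br][bc] = EMPTY
            board[r][c] = EMPTY
            if found:
                return (r, c)
    return None
```

Because the defender's replies are forced, the branching factor stays small and the search can go much deeper than ordinary minimax, which is why it is so effective at finding wins.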

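The references above are all fairly old. The papers I found on Google Scholar cover the newer directions. A graduation project will certainly require English-language literature. Since I was only entering a domestic competition, I did not dig deeper, but the asker will certainly need to read such papers as part of the literature survey for the project, and they should help with ideas.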

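Also, from the literature I read back then, I found that many graduation projects have been done on computer game playing. 東北大學 (Northeastern University) started fairly early, so its work is worth a look to broaden your thinking. As for "improving playing strength through human-versus-computer play" mentioned above, I may have seen it in the literature; the keyword should be "neural network". If the asker means a training and learning process, that should be it. I did not look into how well it works, because it was too hard for me at the time.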

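I also think this leans toward AI. As a fellow fourth-year student I cannot consider everything as thoroughly as the experts, and I only hope this helps the asker a little.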


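Why is game playing a machine learning problem? Isn't it an artificial intelligence problem?

If you want to implement a simple AI, reading the first few chapters of AIMA (Artificial Intelligence: A Modern Approach) is enough. You could even come up with some simple implementations on your own.

If you find reading too slow, watch the 台灣大學 (National Taiwan University) artificial intelligence videos on Coursera. They are good too.

As for whether machine learning can be used in it, it probably can, but only to learn some weights. The underlying problem is still search, so overall it is more of an AI problem.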


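I agree with the answers above. In my view, for a game of this modest complexity the fundamental problem is how to optimize the search, and machine learning hardly gets to show its importance.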

