When Will AI Surpass Humans Across the Board?

01-24

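Brief comment: AlphaGo has already surpassed humans at Go. AI scholars from Oxford and Yale have written a paper that surveyed 352 researchers working in AI to explore when AI will surpass humans across the board.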

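Advances in artificial intelligence (AI) will transform modern life by reshaping transportation, health, science, finance, and the military [1,2,3]. We need to better anticipate these advances [4,5].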

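This report forecasts when AI will outperform humans at various tasks, such as translating languages (2024), writing high-school essays (2026), driving (2027), working in retail (2031), writing a bestselling book (2049), and working as a surgeon (2053).
The researchers believe there is a 50% chance that AI will outperform humans at all tasks within 45 years, and that all human jobs will be automated within 120 years.
Asian respondents expect these dates sooner than North American respondents.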

These results should help researchers and policy-makers discuss trends in AI and how to manage them.

Introduction

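Advances in artificial intelligence (AI) will have massive social consequences. Self-driving technology might replace millions of driving jobs over the coming decade. Beyond possible unemployment, the transition will bring new challenges, such as rebuilding infrastructure, protecting vehicle cybersecurity, and adapting laws and regulations [5]. AI researchers and policy-makers need to think along many dimensions, including business, military, technology, ethics, markets, and marketing [6]. To address these challenges, accurate forecasts of transformative AI would be invaluable.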

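Several sources provide objective evidence about future AI progress: trends in computing hardware [7], task performance [8], and the automation of labor [9]. The predictions of AI experts provide crucial additional information. We survey a larger and more representative sample of AI experts than any study to date [10,11]. Our questions cover the timing of AI advances as well as the social and ethical impacts of AI.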

Survey method

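Our survey population was researchers who published at the 2015 NIPS and ICML conferences. A total of 352 researchers responded to our survey invitation (21% of the 1634 people we contacted). Our questions concerned the timing of specific AI capabilities (e.g., folding laundry, language translation), superiority at specific occupations (e.g., truck driver, surgeon), superiority over humans at all tasks, and the social impacts of advanced AI. For details, see the survey content in the supplementary material.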

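When machines will outperform humans

There would be profound social consequences if all tasks could be carried out effectively by machines. We use the following definition: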

"High-level machine intelligence" (HLMI) is achieved when unaided machines can accomplish every task better and more cheaply than human workers.
In short, HLMI refers to machines that can do the work on their own and at lower cost.

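Respondents estimated when HLMI would arrive. Averaged across respondents, the forecast gives a 50% chance of HLMI arriving within 45 years and a 10% chance of it arriving within 9 years.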

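Figure 1 shows the probability forecasts of a random subset of individuals, along with the mean forecast.

(Figure 1: Predicted probability of HLMI replacing human work, over time)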
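
As a rough illustration of how a mean forecast like the one in Figure 1 can be formed, the sketch below averages individual cumulative-probability curves point by point and reads off when the averaged curve reaches 50%. It assumes each respondent gives probabilities at the same fixed horizons. The numbers and the names `YEARS`, `INDIVIDUAL_FORECASTS`, `mean_forecast`, and `years_until` are hypothetical, and this is not necessarily the procedure the paper's authors used.

```python
# Hypothetical forecasts, invented for illustration (not survey data).
# Each row is one respondent's cumulative probability that HLMI has
# arrived within 10, 25, 50 and 100 years.
YEARS = [10, 25, 50, 100]
INDIVIDUAL_FORECASTS = [
    [0.10, 0.40, 0.80, 0.95],
    [0.01, 0.05, 0.20, 0.50],
    [0.05, 0.30, 0.60, 0.90],
]


def mean_forecast(forecasts):
    """Average the cumulative probabilities horizon by horizon."""
    return [sum(column) / len(column) for column in zip(*forecasts)]


def years_until(probability, years, cdf):
    """Linearly interpolate when the curve first reaches `probability`.

    Returns the first horizon if the curve already starts at or above the
    target, and None if the curve never reaches it within the horizons given.
    """
    if cdf[0] >= probability:
        return years[0]
    for i in range(1, len(cdf)):
        if cdf[i] >= probability:
            fraction = (probability - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
            return years[i - 1] + fraction * (years[i] - years[i - 1])
    return None


aggregate = mean_forecast(INDIVIDUAL_FORECASTS)
median_year = years_until(0.5, YEARS, aggregate)
print(f"Aggregate forecast reaches 50% after about {median_year:.0f} years")
```

Averaging the curves, rather than averaging each respondent's single best-guess year, keeps the full spread of opinion visible in the aggregate.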

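Each participant was also asked when they expect human labor to be fully replaced by HLMI. Figure 2 shows a 50% chance that this happens within 122 years and a 10% chance that it happens within 20 years.

(Figure 2: Predicted timing of HLMI replacing humans at specific tasks)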

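Intelligence explosion, outcomes, safety

We need to consider some important questions that bear on the future of AI.

Once AI research and development can itself be automated, will AI progress accelerate explosively?
How will high-level machine intelligence (HLMI) affect economic growth?
How likely are extreme outcomes (positive or negative)?
What should be done to help ensure that AI development is beneficial?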

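Table S4 shows the results for the questions on these topics. Some key findings follow:

Researchers believe that AI progress has accelerated in recent years. Respondents had worked in AI for 6 years on average, and 67% of them said progress was faster in the second half of their career than in the first.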
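Some researchers believe that once HLMI is achieved, AI systems will quickly become vastly superior to humans at all tasks [3,12]. This acceleration is called the "intelligence explosion." Respondents put the chance at 10% that AI will perform far better than humans at all tasks two years after HLMI is achieved.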
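HLMI is seen as likely to have positive outcomes, but catastrophic risks are also possible. Respondents were asked whether HLMI would have a positive or negative impact on humanity over the long run. The average probability of a "good" outcome was 25%, and of an "extremely good" outcome 20%. By contrast, the probability of a "bad" outcome was 10%, and of an outcome described as "extremely bad (e.g., human extinction)" 5%.
Society should prioritize research aimed at minimizing the potential risks of AI. Among respondents, 88% believe that more research should be done on the risks posed by AI.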

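(Figure 3: Predicted timing of HLMI's arrival by region, in years from 2016)

Asians expect HLMI 44 years earlier than North Americans

Figure 3 shows the differences between individual respondents in when they predict HLMI will arrive. Respondents from different regions, however, differed strikingly in their HLMI predictions. (See Figure S1 and Table S2.)

Participants from China predicted that HLMI will arrive in 28 years, while those from the United States predicted 78 years. The average prediction among Asian respondents was 30 years. (See Table S2. Note: most of the Chinese participants are based overseas.)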

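Representativeness of the sample

No survey can entirely avoid bias, and we cannot rule out that researchers with strong views were more likely to fill out this survey. We tried to reduce this effect by keeping the questions short and the survey confidential. To investigate possible non-response bias, we collected demographic data for both our respondents (n = 406) and a random sample (n = 399) of NIPS / ICML researchers who did not respond. The results are shown in Table S3.

Differences in citation count, seniority, gender, and country of origin were small. While we cannot rule out non-response bias due to unmeasured variables, we can rule out large bias due to the demographic variables we measured.

Our demographic data show that our respondents included many highly cited researchers (mostly in machine learning, but also in statistics, computer science theory, and neuroscience) and came from 43 countries. Most were in academia (82%), while 21% worked in industry.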

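Discussion

Why think AI experts have any ability to foresee AI progress? In political science, a long-term study found that expert predictions often turned out poorly [13]. AI progress, which relies on scientific breakthroughs, may appear intrinsically harder to predict. Yet there are reasons for optimism. Long-term progress in R&D for many domains (including computer hardware, genomics, and solar energy) has been very regular [14]. Trends in AI performance in SAT problem solving, game playing, and computer vision [8] display the same regularity, and AI experts can exploit these trends in their predictions.

Finally, aggregating individual predictions can greatly improve on the predictions of a random individual [15]. Further work could use our data to make optimized forecasts. Moreover, many of the AI milestones (Figure 2) are forecast to be achieved in the next decade, which will provide ground-truth evidence about the reliability of individual experts.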

References:

[1] Peter Stone, Rodney Brooks, Erik Brynjolfsson, Ryan Calo, Oren Etzioni, Greg Hager, Julia Hirschberg, Shivaram Kalyanakrishnan, Ece Kamar, Sarit Kraus, et al. One hundred year study on artificial intelligence: Report of the 2015-2016 study panel. Technical report, Stanford University, 2016.
[2] Pedro Domingos. The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. Basic Books, New York, NY, 2015.
[3] Nick Bostrom. Superintelligence: Paths, Dangers, Strategies. Oxford University Press, Oxford, UK, 2014.
[4] Erik Brynjolfsson and Andrew McAfee. The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies. WW Norton & Company, New York, 2014.
[5] Ryan Calo. Robotics and the lessons of cyberlaw. California Law Review, 103:513, 2015.
[6] Tao Jiang, Srdjan Petrovic, Uma Ayyer, Anand Tolani, and Sajid Husain. Self-driving cars: Disruptive or incremental. Applied Innovation Review, 1:3–22, 2015.
[7] William D. Nordhaus. Two centuries of productivity growth in computing. The Journal of Economic History, 67(01):128–159, 2007.
[8] Katja Grace. Algorithmic progress in six domains. Technical report, Machine Intelligence Research Institute, 2013.
[9] Erik Brynjolfsson and Andrew McAfee. Race Against the Machine: How the Digital Revolution Is Accelerating Innovation, Driving Productivity, and Irreversibly Transforming Employment and the Economy. Digital Frontier Press, Lexington, MA, 2012.
[10] Seth D. Baum, Ben Goertzel, and Ted G. Goertzel. How long until human-level ai? results from an expert assessment. Technological Forecasting and Social Change, 78(1):185–195, 2011.
[11] Vincent C. Müller and Nick Bostrom. Future progress in artificial intelligence: A survey of expert opinion. In Vincent C Müller, editor, Fundamental issues of artificial intelligence, chapter part. 5, chap. 4, pages 553–570. Springer, 2016.
[12] Irving John Good. Speculations concerning the first ultraintelligent machine. Advances in computers, 6:31–88, 1966.
[13] Philip Tetlock. Expert political judgment: How good is it? How can we know? Princeton University Press, Princeton, NJ, 2005.
[14] J Doyne Farmer and François Lafond. How predictable is technological progress? Research Policy, 45(3):647–665, 2016.
[15] Lyle Ungar, Barb Mellors, Ville Satopää, Jon Baron, Phil Tetlock, Jaime Ramos, and Sam Swift. The good judgment project: A large scale test. Technical report, Association for the Advancement of Artificial Intelligence Technical Report, 2012.
[16] Joe W. Tidwell, Thomas S. Wallsten, and Don A. Moore. Eliciting and modeling probability forecasts of continuous quantities. Paper presented at the 27th Annual Conference of Society for Judgement and Decision Making, Boston, MA, 19 November 2016., 2013.
[17] Thomas S. Wallsten, Yaron Shlomi, Colette Nataf, and Tracy Tomlinson. Efficiently encoding and modeling subjective probability distributions for quantitative variables. Decision, 3(3):169, 2016.

Supplementary material

Survey content

Three sets of questions eliciting HLMI predictions by different framings: asking directly about HLMI, asking about the automatability of all human occupations, and asking about recent progress in AI from which we might extrapolate.
Three questions about the probability of an "intelligence explosion".
One question about the welfare implications of HLMI.
A set of questions about the effect of different inputs on the rate of AI research (e.g., hardware progress).
Two questions about sources of disagreement about AI timelines and "AI Safety."
Thirty-two questions about when AI will achieve narrow "milestones".
Two sets of questions on AI Safety research: one about AI systems with non-aligned goals, and one on the prioritization of Safety research in general.
A set of demographic questions, including ones about how much thought respondents have given to these topics in the past.

The questions were asked via an online Qualtrics survey. (The Qualtrics file will be shared to enable replication.) Participants were invited by email and were offered a financial reward for completing the survey. Questions were asked in roughly the order above and respondents received a randomized subset of questions. Surveys were completed between May 3rd 2016 and June 28th 2016.