Toward an Integration of Deep Learning and Neuroscience
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5021692/#fn0037
Abstract
Neuroscience has focused on the detailed implementation of computation, studying neural codes, dynamics and circuits. In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively uniform initial architectures.
Both neuroscience and artificial neural networks are concerned with how (artificial) neurons encode information, how they are active, and how they are wired into networks.
Two recent developments have emerged within machine learning that create an opportunity to connect these seemingly divergent perspectives.
The two perspectives are now beginning to converge.
First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage.
Structure is appearing at the architectural level: attention models, LSTMs.
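To make this concrete, below is a minimal sketch (our illustration, not from the paper; the function and variable names are assumptions) of one such structured component, scaled dot-product attention, in which each query reads out a similarity-weighted mixture of stored values:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention: each query retrieves a weighted
    mixture of values, with weights set by query-key similarity."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n_q, n_k) similarity matrix
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ values                  # (n_q, d_v) read-out

# Toy usage: 2 queries attend over 4 stored items.
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(2, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 3))
print(attention(q, k, v).shape)  # (2, 3)
```

LSTMs play an analogous role for short- and long-term memory storage by gating what is written to and read from a persistent cell state.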
Second, cost functions and training procedures have become more complex and are varied across layers and over time. Here we think about the brain in terms of these ideas.
Cost functions and training procedures have become more complex.
We hypothesize that (1) the brain optimizes cost functions, (2) the cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior.
Hypotheses: (1) the brain also optimizes cost functions; (2) the cost functions differ across brain regions and over development; (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior.
In support of these hypotheses, we argue that a range of implementations of credit assignment through multiple layers of neurons are compatible with our current knowledge of neural circuitry, and that the brain's specialized systems can be interpreted as enabling efficient optimization for specific problem classes.
The brain can be interpreted as an efficient optimizer for specific problem domains.
Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism. We suggest directions by which neuroscience could seek to refine and test these hypotheses.
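To ground hypotheses (1) and (2) in something concrete, here is a minimal sketch (our illustration, not the paper's model; the data, dimensions, and variable names are assumptions) in which two parts of a toy system are optimized against different cost functions: an early stage minimizes an unsupervised reconstruction cost, and a downstream readout minimizes a separate supervised cross-entropy cost.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 10 features; labels depend on a hidden rule.
X = rng.normal(size=(200, 10))
y = (X[:, :3].sum(axis=1) > 0).astype(float).reshape(-1, 1)

# "Area 1": shaped only by an unsupervised reconstruction cost (linear autoencoder).
W1 = rng.normal(scale=0.1, size=(10, 5))   # encoder
D1 = rng.normal(scale=0.1, size=(5, 10))   # decoder
for _ in range(500):
    H = X @ W1                              # 5-dimensional code
    err = H @ D1 - X                        # reconstruction error
    grad_D1 = H.T @ err / len(X)            # gradients of 0.5 * mean squared error
    grad_W1 = X.T @ (err @ D1.T) / len(X)
    W1 -= 0.05 * grad_W1
    D1 -= 0.05 * grad_D1

# "Area 2": shaped by a different, supervised cross-entropy cost on the frozen code.
W2 = rng.normal(scale=0.1, size=(5, 1))
b2 = 0.0
H = X @ W1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))   # sigmoid readout
    W2 -= 0.5 * H.T @ (p - y) / len(X)         # gradient of mean cross-entropy
    b2 -= 0.5 * float((p - y).mean())

print("training accuracy of the readout:", float(((p > 0.5) == y).mean()))
```

The point is not these particular losses but that heterogeneous, locally applied cost functions can shape different parts of a single system.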
The table of contents of the paper is as follows:
1. Introduction
Figure caption: Putative differences between conventional and brain-like neural network designs. (A) In conventional deep learning, supervised training is based on externally-supplied, labeled data. (B) In the brain, supervised training of networks can still occur via ...
1.1. Hypothesis 1 – the brain optimizes cost functions
1.2. Hypothesis 2 – cost functions are diverse across areas and change over development
1.3. Hypothesis 3 – specialized systems allow efficient solution of key computational problems
2. The brain can optimize cost functions
2.1. Local self-organization and optimization without multi-layer credit assignment
2.2. Biological implementation of optimization
2.2.1. The need for efficient gradient descent in multi-layer networks
2.2.2. Biologically plausible approximations of gradient descent
2.2.2.1. Temporal credit assignment
2.2.2.2. Spiking networks
2.3. Other principles for biological learning
2.3.1. Exploiting biological neural mechanisms
2.3.2. Learning in the cortical sheet
2.3.3. One-shot learning
Human learning is often one-shot
2.3.4. Active learning
Human learning is often active and deliberate
2.4. Differing biological requirements for supervised and reinforcement learning
3. The cost functions are diverse across brain areas and time
3.1. How cost functions may be represented and applied
3.2. Cost functions for unsupervised learning
3.2.1. Matching the statistics of the input data using generative models
3.2.2. Cost functions that approximate properties (conceptual attributes) of the world
Signals from different sensory organs can supervise one another, enabling a form of supervised learning.
3.3. Cost functions for supervised learning
3.4. Repurposing reinforcement learning for diverse internal cost functions
3.4.1. Cost functions for bootstrapping learning in the human environment
3.4.2. Cost functions for learning by imitation and through social feedback
3.4.3. Cost functions for story generation and understanding
4. Optimization occurs in the context of specialized structures
4.1. Structured forms of memory
4.1.1. Content addressable memories
4.1.2. Working memory buffers
4.1.3. Storing state in association with saliency
4.2. Structured routing systems
4.2.1. Attention
4.2.2. Buffers
4.2.3. Discrete gating of information flow between buffers
4.3. Structured state representations to enable efficient algorithms
4.3.1. Continuous predictive control
4.3.2. Hierarchical control
Importantly, many of the control problems we appear to be solving are hierarchical.
4.3.3. Spatial planning
Spatial planning requires solving shortest-path problems subject to constraints (a minimal sketch follows at the end of this outline).
4.3.4. Variable binding
Language and reasoning appear to present a problem for neural networks (Minsky, 1991; Marcus, 2001; Hadley, 2009):
The binding of words in language to underlying physical and other cognitive representations.
4.3.5. Hierarchical syntax
Fixed, static hierarchies (e.g., the hierarchical organization of cortical areas; Felleman and Van Essen, 1991) only take us so far:
4.3.6. Mental programs and imagination
Humans excel at stitching together sub-actions to form larger actions (Verwey, 1996; Acuna et al., 2014; Sejnowski and Poizner, 2014). Structured, serial, hierarchical probabilistic programs
4.4. Other specialized structures
4.5. Relationships with other cognitive frameworks involving specialized systems
5. Machine learning inspired neuroscience
5.1. Hypothesis 1 – existence of cost functions
5.2. Hypothesis 2 – biological fine-structure of cost functions
5.3. Hypothesis 3 – embedding within a pre-structured architecture
6. Neuroscience inspired machine learning
6.1. Hypothesis 1 – existence of cost functions
6.2. Hypothesis 2 – biological fine-structure of cost functions
6.3. Hypothesis 3 – embedding within a pre-structured architecture
7. Did evolution separate cost functions from optimization algorithms?
8. Conclusions
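The spatial-planning sketch referenced under section 4.3.3 above: a hedged illustration (our own, not the paper's model; the maze, names, and grid connectivity are assumptions) of a shortest-path problem subject to obstacle constraints, solved here with breadth-first search on a 4-connected grid.

```python
from collections import deque

def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected grid.
    grid[r][c] == 1 marks an obstacle (a constraint on the path)."""
    rows, cols = len(grid), len(grid[0])
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        cur = frontier.popleft()
        if cur == goal:
            break
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from):
                came_from[(nr, nc)] = cur
                frontier.append((nr, nc))
    if goal not in came_from:
        return None                        # no path satisfies the constraints
    path, node = [], goal
    while node is not None:                # walk back from goal to start
        path.append(node)
        node = came_from[node]
    return path[::-1]

# Toy maze: 0 = free cell, 1 = wall.
maze = [[0, 0, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]
print(shortest_path(maze, (0, 0), (3, 3)))
```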