Deep Learning Paper: An Analog VLSI Deep Machine Learning Implementation (Doctoral Dissertation)

01-28

Title: An Analog VLSI Deep Machine Learning Implementation

Author: Junjie Lu

Affiliation: University of Tennessee - Knoxville

Abstract

Machine learning systems provide automated data processing and see a wide range of applications. Direct processing of raw high-dimensional data such as images and videos by machine learning systems is impractical both due to prohibitive power consumption and the "curse of dimensionality," which makes learning tasks exponentially more difficult as dimension increases. Deep machine learning (DML) mimics the hierarchical presentation of information in the human brain to achieve robust automated feature extraction, reducing the dimension of such data. However, the computational complexity of DML systems limits large-scale implementations in standard digital computers. Custom analog signal processing (ASP) can yield much higher energy efficiency than digital signal processing (DSP), presenting a means of overcoming these limitations.

The purpose of this work is to develop an analog implementation of a DML system.

First, an analog memory is proposed as an essential component of the learning systems. It uses the charge trapped on the floating gate to store an analog value in a non-volatile way. The memory is compatible with a standard digital CMOS process and allows random-accessible bidirectional updates without the need for an on-chip charge pump or high-voltage switch.

Second, architecture and circuits are developed to realize an online k-means clustering algorithm in analog signal processing. It achieves automatic recognition of underlying data pattern and online extraction of data statistical parameters. This unsupervised learning system constitutes the computation node in the deep machine learning hierarchy.
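As a software illustration of the online k-means idea mentioned above, here is a minimal sketch. It is not the thesis's analog circuit: the function names, the squared Euclidean distance, the fixed learning rate `lr`, and the synthetic two-cluster data are all assumptions made for this example.

```python
import random

def online_kmeans_step(centroids, x, lr=0.05):
    """Update the centroid nearest to sample x in place and return its index.

    Assumption: squared Euclidean distance and a fixed learning rate `lr`.
    """
    dists = [sum((c_i - x_i) ** 2 for c_i, x_i in zip(c, x)) for c in centroids]
    k = dists.index(min(dists))
    # Move only the winning centroid a fraction `lr` of the way toward x.
    centroids[k] = [c_i + lr * (x_i - c_i) for c_i, x_i in zip(centroids[k], x)]
    return k

def demo(seed=0, n=2000):
    """Stream synthetic 2-D samples drawn from two clusters, one at a time."""
    rng = random.Random(seed)
    centroids = [[0.4, 0.4], [0.6, 0.6]]      # arbitrary starting guesses
    true_means = [(0.2, 0.2), (0.8, 0.8)]     # hypothetical cluster centers
    for _ in range(n):
        mx, my = rng.choice(true_means)
        x = [rng.gauss(mx, 0.05), rng.gauss(my, 0.05)]
        online_kmeans_step(centroids, x)
    return centroids

if __name__ == "__main__":
    print(demo())
```

Each sample is seen once and then discarded, so the centroids act as running estimates of the cluster means. That is the sense in which the clustering is "online".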

Third, a 3-layer, 7-node analog deep machine learning engine is designed featuring online unsupervised trainability and non-volatile floating-gate analog storage. It utilizes a massively parallel reconfigurable current-mode analog architecture to realize efficient computation. Algorithm-level feedback is leveraged to provide robustness to circuit imperfections in analog signal processing. At a processing speed of 8300 input vectors per second, it achieves a peak energy efficiency of 1×10^12 operations per second per watt.
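To put the peak efficiency figure in more familiar terms: operations per second per watt is the same as operations per joule, so its reciprocal is the energy per operation. The helper name below is hypothetical, and the calculation is only a unit conversion, not a measurement taken from the thesis.

```python
def energy_per_operation_joules(ops_per_second_per_watt):
    """Convert an efficiency in ops/s/W (i.e. ops/J) to joules per operation."""
    return 1.0 / ops_per_second_per_watt

if __name__ == "__main__":
    e = energy_per_operation_joules(1e12)
    print(f"{e * 1e12:.1f} pJ per operation")
```

At 10^12 operations per second per watt, this works out to about one picojoule per operation at peak efficiency.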

In addition, an ultra-low-power tunable bump circuit is presented to provide similarity measures in analog signal processing. It incorporates a novel wide-input-range tunable pseudodifferential transconductor. The circuit demonstrates tunability of bump center, width and height with a power consumption significantly lower than previous works.
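For readers unfamiliar with bump circuits, the sketch below shows what a tunable center, width and height mean for a similarity measure. The Gaussian shape is an assumption chosen for illustration; it is not the transfer function of the circuit described in the thesis, and the function and parameter names are hypothetical.

```python
import math

def bump(x, center=0.0, width=1.0, height=1.0):
    """Bell-shaped similarity of input x to `center`.

    Assumed Gaussian form: the output equals `height` at x == center and
    falls off symmetrically, with `width` setting how quickly it decays.
    """
    return height * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(x, round(bump(x, center=1.0, width=0.5, height=2.0), 4))
```

An input close to the stored center yields a large output and a distant input yields a small one, which is how such a function can serve as a similarity measure.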

Link: http://pan.baidu.com/s/1slUHvY1 Password: 441z
