Haiping Huang
Sun Yat-sen University, Guangzhou, China
ISBN 978-981-16-7569-0 e-ISBN 978-981-16-7570-6
https://doi.org/10.1007/978-981-16-7570-6
Jointly published with Higher Education Press
The print edition is not for sale in China (Mainland). Customers from China (Mainland) please order the print book from: Higher Education Press.
© Higher Education Press 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
Neural networks have become a powerful tool in various domains of scientific research and industrial applications. However, the inner workings of this tool remain largely unknown, which prevents a deep understanding and a further principled design of more powerful network architectures and optimization algorithms. To crack the black box, different disciplines, including physics, statistics, information theory, non-convex optimization, and so on, must be integrated, which may also bridge the gap between artificial neural networks and the brain. However, in this highly interdisciplinary field, there are few monographs providing a systematic introduction to the theoretical-physics basics for understanding neural networks, especially ones covering recent cutting-edge topics.
In this book, we provide a physics perspective on the theory of neural networks, and even on neural computation in models of the brain. The book covers the basics of statistical mechanics, statistical inference, and neural networks, with particular emphasis on classic and recent mean-field analyses of neural networks of different types. These mathematically beautiful examples of the statistical mechanics analysis of neural networks are expected to inspire further techniques that provide an analytic theory for more complex networks. Important future directions along the lines of scientific machine learning and theoretical models of brain computation are also reviewed.
We remark that this book is not a complete review of either artificial neural networks or the mean-field theory of neural networks; rather, it offers an admittedly biased viewpoint of statistical physics methods toward understanding the black box of deep learning, intended especially for beginner-level students and researchers who are interested in the mean-field theory of learning in neural networks.
This book stemmed from a series of lectures on the interplay between statistical mechanics and neural networks, given by the author to his PMI (physics, machine and intelligence) group from 2018 to 2020. The book is organized into two parts: the basics of statistical mechanics related to the theory of neural networks, and theoretical studies of neural networks, including cortical models.
The first part is further divided into nine chapters. Among them, one chapter introduces the concepts of replica symmetry and replica symmetry breaking in the spin glass theory of disordered systems.
The second part is also divided into nine chapters. One chapter introduces how statistical mechanics techniques can be applied to compute the asymptotic behavior of the spectral density of Hermitian and non-Hermitian random matrices. Finally, perspectives on a statistical mechanical theory toward deep learning, and even other interesting aspects of intelligence, are provided, hopefully inspiring future developments in the interdisciplinary field spanning physics, machine learning, theoretical neuroscience, and other related disciplines.
I am grateful for the students' efforts in drafting the lecture notes, including preparing figures; here, I list their contributions to the associated chapters. These students in my PMI group include Zhenye Huang (Chaps. ). I also thank the other PMI members, Ziming Chen, Yiming Jiao, Junbin Qiu, Mingshan Xie, Xianbo Xu, and Yang Zhao, for their feedback on the draft. I would also like to thank Haijun Zhou, K. Y. Michael Wong, Yoshiyuki Kabashima, and Taro Toyoizumi for their encouragement and support during my Ph.D. and postdoctoral research career. Finally, I acknowledge financial support from the National Natural Science Foundation of China (Grant Nos. 11805284 and 12122515).
Haiping Huang
Guangzhou, China
December 2021
Acronyms
AMP
Approximate message passing
AT
de Almeida–Thouless
BA
Bethe approximation
BM
Boltzmann machine
BP
Belief propagation
BPTT
Backpropagation through time
CD
Contrastive divergence
CLT
Central limit theorem
CNN
Convolutional neural network
DG
Dichotomized Gaussian
EA
Edwards–Anderson
EM
Expectation-maximization
gBP
Generalized backpropagation
KL
Kullback–Leibler
LIF
Leaky integrate-and-fire
LSTM
Long short-term memory
MCMC
Markov chain Monte Carlo
MCS
Monte Carlo sweep
MFA
Mean-field approximation
MPM
Maximizer of the posterior marginal
PS
Permutation symmetry
PSB
Permutation symmetry breaking
RAP
Random active path
RBM
Restricted Boltzmann machine
RF
Receptive field
RG
Random guess
RNN
Recurrent neural network
RS
Replica symmetry
RSB
Replica symmetry breaking
SaS
Spike and slab
SDE
Saddle-point equation
SG
Spin glass
SK
Sherrington–Kirkpatrick
sMP
Simplified message passing
SSB
Spontaneous symmetry breaking
TAP
Thouless–Anderson–Palmer
About the Author
Dr. Haiping Huang
received his Ph.D. degree in theoretical physics from the Institute of Theoretical Physics, Chinese Academy of Sciences. He works as an associate professor at the School of Physics, Sun Yat-sen University, China. His research interests include the origin of the computational hardness of the binary perceptron model, the theory of dimension reduction in deep neural networks, and inherent symmetry breaking in unsupervised learning. In 2021, he was awarded the Excellent Young Scientists Fund by the National Natural Science Foundation of China.