Tutorial: Capabilities of Shallow and Deep Networks
Monday, 11th of September 2017, 9h30-12h00
by Vĕra Kůrková
Although biologically inspired neural networks were originally introduced as multilayer computational models, shallow (one-hidden-layer) architectures later became dominant in applications. Recently, interest in architectures with several hidden layers has been renewed by the successes of deep convolutional networks. This experimental evidence has motivated theoretical research aiming to characterize tasks for which deep networks are more suitable than shallow ones. The tutorial will review recent theoretical results comparing the capabilities of shallow and deep networks, focusing in particular on the complexity requirements of shallow and deep networks performing high-dimensional tasks.
- Universality and tractability of representations of multivariable mappings by shallow networks
- Sparse representations of functions by shallow networks and output-weight regularization
- Limitations of computation of highly-varying functions by shallow networks
- Probabilistic lower bounds on model complexity of shallow and deep networks
- Constructive lower bounds on model complexity of shallow perceptron networks
- Examples of functions that can be represented compactly by deep architectures but cannot be represented by a compact shallow architecture
- Connections to the No Free Lunch Theorem, pseudo-noise sequences, and the central paradox of coding theory
- Open problems concerning deep and shallow architectures.
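As a small illustration of the compactness gap listed above (a minimal sketch, not material from the tutorial itself): composing a two-unit ReLU "tent" map with itself n times yields a sawtooth with 2^n linear pieces using only O(n) units in a deep network, whereas a one-hidden-layer ReLU network with k units is piecewise linear with at most k+1 pieces, so it would need on the order of 2^n units to match.

```python
import numpy as np

def tent(x):
    # Tent map on [0, 1], exactly realizable by one ReLU layer with 2 units:
    # tent(x) = 2*relu(x) - 4*relu(x - 0.5)
    return 2 * np.maximum(x, 0.0) - 4 * np.maximum(x - 0.5, 0.0)

def deep_sawtooth(x, depth):
    # Composing the tent map `depth` times gives a sawtooth with 2**depth
    # linear pieces on [0, 1], using only 2*depth ReLU units in total.
    for _ in range(depth):
        x = tent(x)
    return x

depth = 5
xs = np.linspace(0.0, 1.0, 2**12 + 1)
ys = deep_sawtooth(xs, depth)

# The composed function reaches its maximum 1.0 at 2**(depth-1) = 16 peaks,
# so it has 2**depth = 32 linear pieces; a shallow ReLU network would need
# at least 2**depth - 1 hidden units to produce that many pieces.
n_peaks = int(np.sum(np.isclose(ys, 1.0)))
print(n_peaks)  # 16
```

The unit counts (two ReLU units per tent layer, at most k+1 linear pieces for a shallow net with k ReLU units) follow from standard piecewise-linear counting arguments; the function names here are illustrative.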
This tutorial is self-contained and is suitable for researchers who already use multilayer neural networks as a tool and wish to understand their mathematical foundations, capabilities, and limitations. It does not require a sophisticated mathematical background.
Vĕra Kůrková received her Ph.D. in mathematics from Charles University, Prague, and her DrSc. (Prof.) in theoretical computer science from the Czech Academy of Sciences. Since 1990 she has been affiliated as a scientist with the Institute of Computer Science, Prague, where she headed the Department of Theoretical Computer Science from 2002 to 2009. She is an internationally respected scientist in the mathematical theory of neurocomputing and learning. She is a member of the editorial boards of the journals Neural Networks and Neural Processing Letters; in the past she also served as an associate editor of IEEE Transactions on Neural Networks and edited special issues of Neural Networks and Neurocomputing. She was the general chair of the conferences ICANN 2008 and ICANNGA 2001, and is the current president of the European Neural Network Society (ENNS), elected for the 2017-2019 term.