Hardcover | $32.00 X | £26.95 | 136 pp. | 6 x 9 in | 25 b&w illus. | September 2016 | ISBN: 9780262034722
eBook | $22.00 X | September 2016 | ISBN: 9780262336703

Visual Cortex and Deep Networks

Learning Invariant Representations


The ventral visual stream is believed to underlie object recognition in primates. Over the past fifty years, researchers have developed a series of quantitative models that are increasingly faithful to the biological architecture. Recently, deep convolutional networks, which do not reflect several important features of the ventral stream architecture and physiology, have been trained with extremely large datasets, resulting in model neurons that mimic object recognition but do not explain the nature of the computations carried out in the ventral stream. This book develops a mathematical framework that describes the learning of invariant representations in the ventral stream and is particularly relevant to deep convolutional learning networks.

The authors propose a theory based on the hypothesis that the main computational goal of the ventral stream is to compute neural representations of images that are invariant to transformations commonly encountered in the visual environment and are learned from unsupervised experience. They describe a general theoretical framework of a computational theory of invariance (with details and proofs offered in appendixes) and then review the application of the theory to the feedforward path of the ventral stream in the primate visual cortex.

About the Authors

Tomaso A. Poggio is Eugene McDermott Professor in the Department of Brain and Cognitive Sciences at MIT, where he is also Director of the Center for Brains, Minds, and Machines and Codirector of the Center for Biological and Computational Learning. He is coeditor of Perceptual Learning (MIT Press).

Fabio Anselmi is a Postdoctoral Fellow in the Istituto Italiano di Tecnologia Laboratory for Computational and Statistical Learning at MIT and part of the Center for Brains, Minds, and Machines.


“Visual Cortex and Deep Networks proposes intriguing parallels between a hugely successful technique in artificial vision and a fascinating brain region. The ventral visual cortex comprises a set of areas that process images in increasingly more abstract ways, allowing us to learn, recognize, and categorize three-dimensional objects from arbitrary two-dimensional views. The book offers a mathematical theory for how this brain region may achieve this feat, arguing that it operates in much the same way as artificial deep networks.”
Matteo Carandini, Professor of Visual Neuroscience, University College London
“Poggio and Anselmi present a scholarly and rigorous theoretical framework supported by experimental findings and computational simulations of how to build robust and invariant representations. Visual Cortex and Deep Networks features recent theoretical developments, which enable us to formalize the notion of how deep hierarchical structures can provide flexible image representations. Highly recommended.”
Gabriel Kreiman, Associate Professor, Children's Hospital, Harvard Medical School