Yoshua Bengio

Yoshua Bengio is Professor of Computer Science at the Université de Montréal.

  • Deep Learning

    Deep Learning

    Ian Goodfellow, Yoshua Bengio, and Aaron Courville

    An introduction to a broad range of topics in deep learning, covering mathematical and conceptual background, deep learning techniques used in industry, and research perspectives.

    “Written by three experts in the field, Deep Learning is the only comprehensive book on the subject.” —Elon Musk, cochair of OpenAI; cofounder and CEO of Tesla and SpaceX

    Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning.

    The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.

    Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.


  • Optimization for Machine Learning

    Optimization for Machine Learning

    Suvrit Sra, Sebastian Nowozin, and Stephen J. Wright

    An up-to-date account of the interplay between optimization and machine learning, accessible to students and researchers in both communities.

    The interplay between optimization and machine learning is one of the most important developments in modern computational science. Optimization formulations and methods are proving to be vital in designing algorithms to extract essential knowledge from huge volumes of data. Machine learning, however, is not simply a consumer of optimization technology but a rapidly evolving field that is itself generating new optimization ideas. This book captures the state of the art of the interaction between optimization and machine learning in a way that is accessible to researchers in both fields.Optimization approaches have enjoyed prominence in machine learning because of their wide applicability and attractive theoretical properties. The increasing complexity, size, and variety of today's machine learning models call for the reassessment of existing assumptions. This book starts the process of reassessment. It describes the resurgence in novel contexts of established frameworks such as first-order methods, stochastic approximations, convex relaxations, interior-point methods, and proximal methods. It also devotes attention to newer themes such as regularized optimization, robust optimization, gradient and subgradient methods, splitting techniques, and second-order methods. Many of these techniques draw inspiration from other fields, including operations research, theoretical computer science, and subfields of optimization. The book will enrich the ongoing cross-fertilization between the machine learning community and these other fields, and within the broader optimization community.

  • Large-Scale Kernel Machines

    Large-Scale Kernel Machines

    Léon Bottou, Olivier Chapelle, Dennis DeCoste, and Jason Weston

    Solutions for learning from large scale datasets, including kernel learning algorithms that scale linearly with the volume of the data and experiments carried out on realistically large datasets.

    Pervasive and networked computers have dramatically reduced the cost of collecting and distributing large datasets. In this context, machine learning algorithms that scale poorly could simply become irrelevant. We need learning algorithms that scale linearly with the volume of the data while maintaining enough statistical efficiency to outperform algorithms that simply process a random subset of the data. This volume offers researchers and engineers practical solutions for learning from large scale datasets, with detailed descriptions of algorithms and experiments carried out on realistically large datasets. At the same time it offers researchers information that can address the relative lack of theoretical grounding for many useful algorithms. After a detailed description of state-of-the-art support vector machine technology, an introduction of the essential concepts discussed in the volume, and a comparison of primal and dual optimization techniques, the book progresses from well-understood techniques to more novel and controversial approaches. Many contributors have made their code and data available online for further experimentation. Topics covered include fast implementations of known algorithms, approximations that are amenable to theoretical guarantees, and algorithms that perform well in practice but are difficult to analyze theoretically.

    Contributors Léon Bottou, Yoshua Bengio, Stéphane Canu, Eric Cosatto, Olivier Chapelle, Ronan Collobert, Dennis DeCoste, Ramani Duraiswami, Igor Durdanovic, Hans-Peter Graf, Arthur Gretton, Patrick Haffner, Stefanie Jegelka, Stephan Kanthak, S. Sathiya Keerthi, Yann LeCun, Chih-Jen Lin, Gaëlle Loosli, Joaquin Quiñonero-Candela, Carl Edward Rasmussen, Gunnar Rätsch, Vikas Chandrakant Raykar, Konrad Rieck, Vikas Sindhwani, Fabian Sinz, Sören Sonnenburg, Jason Weston, Christopher K. I. Williams, Elad Yom-Tov