Most tasks require a person or an automated system to reason--to reach conclusions based on available information. The framework of probabilistic graphical models, presented in this book, provides a general approach for this task. The approach is model-based, allowing interpretable models to be constructed and then manipulated by reasoning algorithms. These models can also be learned automatically from data, allowing the approach to be used in cases where manually constructing a model is difficult or even impossible.
Dataset shift is a common problem in predictive modeling that occurs when the joint distribution of inputs and outputs differs between training and test stages. Covariate shift, a particular case of dataset shift, occurs when only the input distribution changes. Dataset shift is present in most practical applications, for reasons ranging from the bias introduced by experimental design to the irreproducibility of the testing conditions at training time.
The Internet gives us access to a wealth of information in languages we don't understand. The investigation of automated or semi-automated approaches to translation has become a thriving research field with enormous commercial potential. This volume investigates how machine learning techniques can improve statistical machine translation, currently at the forefront of research in the field.
Handling inherent uncertainty and exploiting compositional structure are fundamental to understanding and designing large-scale systems. Statistical relational learning builds on ideas from probability theory and statistics to address uncertainty while incorporating tools from logic, databases, and programming languages to represent structure.
Pervasive and networked computers have dramatically reduced the cost of collecting and distributing large datasets. In this context, machine learning algorithms that scale poorly could simply become irrelevant. We need learning algorithms that scale linearly with the volume of the data while maintaining enough statistical efficiency to outperform algorithms that simply process a random subset of the data.
Machine learning develops intelligent computer systems that are able to generalize from previously seen examples. A new domain of machine learning, in which the prediction must satisfy the additional constraints found in structured data, poses one of machine learning’s greatest challenges: learning functional dependencies between arbitrary input and output domains. This volume presents and analyzes the state of the art in machine learning algorithms and theory in this novel field.
Online decision making under uncertainty and time constraints represents one of the most challenging problems for robust intelligent agents. In an increasingly dynamic, interconnected, and real-time world, intelligent systems must adapt dynamically to uncertainties, update existing plans to accommodate new requests and events, and produce high-quality decisions under severe time constraints.
In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground, between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no label data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics.