1.1 Classification Problems and Machine Learning
1.3 Resistance to Overfitting and the Margins Theory
1.4 Foundations and Algorithms
2 Foundations of Machine Learning
2.1 A Direct Approach to Machine Learning
2.2 General Methods of Analysis
2.3 A Foundation for the Study of Boosting Algorithms
3 Using AdaBoost to Minimize Training Error
3.1 A Bound on AdaBoost’s Training Error
3.2 A Sufficient Condition for Weak Learnability
3.3 Relation to Chernoff Bounds
3.4 Using and Designing Base Learning Algorithms
4 Direct Bounds on the Generalization Error
4.1 Using VC Theory to Bound the Generalization Error
4.3 The Equivalence of Strong and Weak Learnability
5 The Margins Explanation for Boosting’s Effectiveness
5.1 Margin as a Measure of Confidence
5.2 A Margins-Based Analysis of the Generalization Error
5.3 Analysis Based on Rademacher Complexity
5.4 The Effect of Boosting on Margin Distributions
5.5 Bias, Variance, and Stability
5.6 Relation to Support-Vector Machines
5.7 Practical Applications of Margins
6 Game Theory, Online Learning, and Boosting
6.2 Learning in Repeated Game Playing
6.5 Application to a “Mind-Reading” Game
7 Loss Minimization and Generalizations of Boosting
7.3 Loss Minimization Cannot Explain Generalization
7.4 Functional Gradient Descent
7.5 Logistic Regression and Conditional Probabilities
7.7 Applications to Data-Limited Learning
8 Boosting, Convex Optimization, and Information Geometry
8.1 Iterative Projection Algorithms
8.2 Proving the Convergence of AdaBoost
8.3 Unification with Logistic Regression
8.4 Application to Species Distribution Modeling
9 Using Confidence-Rated Weak Predictions
9.2 General Methods for Algorithm Design
9.4 Alternating Decision Trees
10 Multiclass Classification Problems
10.1 A Direct Extension to the Multiclass Case
10.2 The One-against-All Reduction and Multi-label Classification
10.3 Application to Semantic Classification
10.4 General Reductions Using Output Codes
11.1 A Formal Framework for Ranking Problems
11.2 A Boosting Algorithm for the Ranking Task
11.3 Methods for Improving Efficiency
11.4 Multiclass, Multi-label Classification
12 Attaining the Best Possible Accuracy
12.1 Optimality in Classification and Risk Minimization
12.2 Approaching the Optimal Risk
12.3 How Minimizing Risk Can Lead to Poor Accuracy
13 Optimally Efficient Boosting
13.1 The Boost-by-Majority Algorithm
13.2 Optimal Generalization Error
14 Boosting in Continuous Time
14.1 Adaptiveness in the Limit of Continuous Time
14.3 AdaBoost as a Special Case of BrownBoost
14.4 Experiments with Noisy Data
Appendix: Some Notation, Definitions, and Mathematical Background
A.3 Maxima, Minima, Suprema, and Infima
A.5 Continuity, Closed Sets, and Compactness
A.6 Derivatives, Gradients, and Taylor’s Theorem
A.8 The Method of Lagrange Multipliers
A.9 Some Distributions and the Central Limit Theorem