Stochastic Approximation and NonLinear Regression
This monograph addresses the problem of "real-time" curve fitting in the presence of noise, from the computational and statistical viewpoints. It examines the problem of nonlinear regression, where observations are made on a time series whose mean-value function is known except for a vector parameter. In contrast to the traditional formulation, data are imagined to arrive in temporal succession. The estimation is carried out in real time so that, at each instant, the parameter estimate fully reflects all available data. Specifically, the monograph focuses on estimator sequences of the so-called differential correction type. The term "differential correction" refers to the fact that the difference between the components of the updated and previous estimators is proportional to the difference between the current observation and the value that would be predicted by the regression function if the previous estimate were in fact the true value of the unknown vector parameter. The vector of proportionality factors (which is generally time varying and can depend upon previous estimates) is called the "gain" or "smoothing" vector. The main purpose of this research is to relate the large-sample statistical behavior of such estimates (consistency, rate of convergence, large-sample distribution theory, asymptotic efficiency) to the properties of the regression function and the choice of smoothing vectors. Furthermore, consideration is given to the tradeoff that can be effected between computational simplicity and statistical efficiency through the choice of gains.
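As a rough illustration of the differential correction recursion described above, the following sketch updates each component of the estimate by a gain times the difference between the current observation and the value the regression function predicts from the previous estimate. The names `differential_correction`, `regression`, `gain`, and `theta0` are hypothetical and are not taken from the monograph. The constant-mean regression with gain 1/n is chosen only because it makes the recursion reduce to the running sample mean, which is easy to check; it is not one of the book's examples.

```python
import random


def differential_correction(observations, regression, gain, theta0):
    """Apply theta_{n+1} = theta_n + a_n * (y_n - F_n(theta_n)) to a data stream.

    regression(n, theta) returns the predicted mean value F_n(theta);
    gain(n, theta) returns the gain ("smoothing") vector a_n.
    Both callables are assumptions made for this sketch.
    """
    theta = list(theta0)
    for n, y in enumerate(observations, start=1):
        # Difference between the observation and the prediction made
        # as if the previous estimate were the true parameter value.
        residual = y - regression(n, theta)
        a = gain(n, theta)
        # Each component moves in proportion to the same scalar residual.
        theta = [t + a_i * residual for t, a_i in zip(theta, a)]
    return theta


# Illustrative data: constant mean 2.0 observed in unit-variance Gaussian noise.
random.seed(0)
true_theta = 2.0
ys = [true_theta + random.gauss(0.0, 1.0) for _ in range(1000)]

estimate = differential_correction(
    ys,
    regression=lambda n, theta: theta[0],  # F_n(theta) = theta (scalar case)
    gain=lambda n, theta: [1.0 / n],       # a_n = 1/n gives the running mean
    theta0=[0.0],
)
print(estimate[0])
```

With this choice of gain the estimate after n observations equals the sample mean of those observations, so at each instant it reflects all data received so far. Other gain sequences trade statistical efficiency against computational simplicity, which is the tradeoff the monograph studies.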
Part I deals with the special case of an unknown scalar parameter, discussing probability-one and mean-square convergence, rates of mean-square convergence, and asymptotic distribution theory of the estimators for various choices of the smoothing sequence. Part II examines the probability-one and mean-square convergence of the estimators in the vector case for various choices of smoothing vectors. Examples are liberally sprinkled throughout the book. Indeed, the last chapter is devoted entirely to the discussion of examples at varying levels of generality. If one views the stochastic approximation literature as a study in the asymptotic behavior of solutions to a certain class of nonlinear first-order difference equations with stochastic driving terms, then the results of this monograph also serve to extend and complement many of the results in that literature, which accounts for the authors' choice of title.
The book is written at the first-year graduate level, although this level of maturity is not required uniformly. Certainly the reader should understand the concept of a limit both in the deterministic and probabilistic senses (i.e., almost sure and quadratic mean convergence). This much will assure a comfortable journey through the first fourth of the book. Chapters 4 and 5 require an acquaintance with a few selected central limit theorems. A familiarity with the standard techniques of large-sample theory will also prove useful but is not essential. Part II, Chapters 6 through 9, is couched in the language of matrix algebra, but none of the "classical" results used are deep. The reader who appreciates the elementary properties of eigenvalues, eigenvectors, and matrix norms will feel at home.
MIT Press Research Monograph No. 42