With contributions from Tomás Lozano-Pérez and Daniel P. Huttenlocher.
An intelligent system must know what the objects are and where they are in its environment. Examples of this ubiquitous problem in computer vision arise in tasks involving hand-eye coordination (such as assembling or sorting), inspection tasks, gauging operations, and in navigation and localization of mobile robots. This book describes an extended series of experiments into the role of geometry in the critical area of object recognition. It provides precise definitions of the recognition and localization problems, describes the methods used to address them, analyzes the solutions to these problems, and addresses the implications of this analysis.
Solutions to problems of object recognition are of fundamental importance in many real applications, and versions of the techniques described here are already being used in industrial settings. Although a number of questions remain to be answered, the authors provide a valuable framework for understanding both the strengths and limitations of using object shape to guide recognition.
W. Eric L. Grimson is Matsushita Associate Professor in the Department of Electrical Engineering and Computer Science at MIT.
Contents: Introduction. Recognition as a Search Problem. Searching for Correspondences. Two-Dimensional Constraints. Three-Dimensional Constraints. Verifying Hypotheses. Controlling the Search Explosion. Selecting Subspaces of the Search Space. Empirical Testing. The Combinatorics of the Matching Process. The Combinatorics of Hough Transforms. The Combinatorics of Verification. The Combinatorics of Indexing. Evaluating the Methods. Recognition from Libraries. Parameterized Objects. The Role of Grouping. Sensing Strategies. Applications. The Next Steps.
The projection of light rays onto the retina of the eye forms a two-dimensional image, but by combining the stereoscopic aspect of vision with other optical cues through some remarkably effective image-processing procedures, the viewer is able to perceive three-dimensional representations of scenes.
From Images to Surfaces proposes and examines a specific image-processing procedure to account for this remarkable effect: a computational approach that provides a framework for understanding the transformation of a set of images into a representation of the shapes of surfaces visible in a scene. Although much of the analysis is applicable to any visual information processing system, biological or artificial, Grimson constrains his final choice of computational algorithms to those that are biologically feasible and consistent with what is known about the human visual system.
In order to clarify the analysis, the approach distinguishes three independent levels: the computational theory itself, the algorithms employed, and the underlying implementation of the computation, in this case through the human neural mechanisms. This separation into levels facilitates the generation of specific models from general concepts.
This research effort had its origin in a theory of human stereo vision recently developed by David Marr and Tomaso Poggio. Grimson presents a computer implementation of this theory that serves to test its adequacy and provide feedback for the identification of unsuspected problems embedded in it. The author then proceeds to apply and extend the theory in his analysis of surface interpolation through the computational methodology.
This methodology allows the activity of the human early visual system to be followed through several stages: the Primal Sketch, in which intensity changes at isolated points on a surface are noted; the Raw 2.5-D Sketch, in which surface values at these points are computed; and the Full 2.5-D Sketch, in which these values—including stereo and motion perception—are interpolated over the entire surface. These stages lead to the final 3-D Model, in which the three-dimensional shapes of objects, in object-centered coordinates, are made explicit.
AI in the 1980s and Beyond provides an inside report on current applications, trends, and future opportunities in one of the world's major centers of artificial intelligence research. The topics covered include a general perspective on AI, knowledge-based systems, expert systems tools and techniques, system building in medical diagnosis, AI and software engineering, intelligent natural language processing, automatic speech recognition, intelligent vision, seeing robots, robot programming, robot tactile sensing, and autonomous mobile robots.
W. Eric L. Grimson and Ramesh S. Patil are both Assistant Professors in the Department of Electrical Engineering and Computer Science at MIT. AI in the 1980s and Beyond is included in the Artificial Intelligence series, edited by Patrick H. Winston, Michael Brady, and Daniel Bobrow.