Symbolic and statistical approaches to language have historically been at odds—the former viewed as difficult to test and therefore perhaps impossible to define, and the latter as descriptive but possibly inadequate. At the heart of the debate are fundamental questions concerning the nature of language, the role of data in building a model or theory, and the impact of the competence-performance distinction on the field of computational linguistics. Currently, there is an increasing realization in both camps that the two approaches have something to offer in achieving common goals.
The eight contributions in this book explore the inevitable "balancing act" that must take place when symbolic and statistical approaches are brought together—including basic choices about what knowledge will be represented symbolically and how it will be obtained, what assumptions underlie the statistical model, what principles motivate the symbolic model, and what the researcher gains by combining approaches.
The topics covered include an examination of the relationship between traditional linguistics and statistical methods, qualitative and quantitative methods of speech translation, the study and implementation of combined techniques for automatic extraction of terminology, a comparative analysis of the contributions of linguistic cues to a statistical word grouping system, the automatic construction of a symbolic parser via statistical techniques, the combination of linguistic and statistical methods in automatic speech understanding, an exploration of the nature of transformation-based learning, and a hybrid symbolic/statistical approach to recovering from parser failures.
“The statistical and symbolic approaches to language have emerged from different starting points and methodologies and have tended to focus on different goals. The resulting tension and confusion has obscured the fact that both approaches can make crucial and often complementary contributions to a deeper understanding of how language works. The papers in this volume show that this is indeed the case: they carefully articulate the theoretical advantages of combining techniques and describe a number of concrete experiments that illustrate and support a synthesis of both approaches.”
—Ronald M. Kaplan, Research Fellow, Xerox Palo Alto Research Center