Symbolic and statistical approaches to language have historically been at odds—the former viewed as difficult to test and therefore perhaps impossible to define, and the latter as descriptive but possibly inadequate. At the heart of the debate are fundamental questions concerning the nature of language, the role of data in building a model or theory, and the impact of the competence-performance distinction on the field of computational linguistics. Currently, there is an increasing realization in both camps that the two approaches have something to offer in achieving common goals.
The eight contributions in this book explore the inevitable "balancing act" that must take place when symbolic and statistical approaches are brought together—including basic choices about what knowledge will be represented symbolically and how it will be obtained, what assumptions underlie the statistical model, what principles motivate the symbolic model, and what the researcher gains by combining approaches.
The topics covered include an examination of the relationship between traditional linguistics and statistical methods, qualitative and quantitative methods of speech translation, study and implementation of combined techniques for automatic extraction of terminology, comparative analysis of the contributions of linguistic cues to a statistical word grouping system, automatic construction of a symbolic parser via statistical techniques, combining linguistic with statistical methods in automatic speech understanding, exploring the nature of transformation-based learning, and a hybrid symbolic/statistical approach to recovering from parser failures.
“The statistical and symbolic approaches to language have emerged from different starting points and methodologies and have tended to focus on different goals. The resulting tension and confusion has obscured the fact that both approaches can make crucial and often complementary contributions to a deeper understanding of how language works. The papers in this volume show that this is indeed the case: they carefully articulate the theoretical advantages of combining techniques and describe a number of concrete experiments that illustrate and support a synthesis of both approaches.”
—Ronald M. Kaplan, Research Fellow, Xerox Palo Alto Research Center
“The Balancing Act, in presenting a collection of excellent, in-depth, current studies of many areas in NLP, ranging from grammar acquisition to lexical semantics, definitively establishes the clear advantages to be gained from combining statistical techniques with linguistic analysis. As such it marks a major turning point in the field, and is essential reading for anyone interested in new and exciting research techniques.”
—Martha Palmer, Senior Research Associate and Adjunct Professor, CIS Department, University of Pennsylvania; Co-Chair, ACL-96
“The subject of The Balancing Act, the integration of statistical and symbolic models of natural language, is perhaps the leading research area among the community of speech and language researchers today. This book does an excellent job of conveying the wide variety of methods currently being pursued that bring together these two types of models, and the central issues that arise in attempting a rapprochement.”
—Stuart Shieber, Gordon McKay Professor of Computer Science, Harvard University
“The first book to address the increasingly important problem of combining symbolic and statistical information in language processing. A valuable resource both for linguists interested in the emerging role of statistics in the understanding of the nature of language, and for computational linguists looking for methodological suggestions and empirical results.”
—Marti Hearst, PhD, Xerox PARC
“This balancing act succeeds, judiciously weighing against each other the quantitative and qualitative methods in computing over natural languages. Both are right and both are inadequate, taken separately, and this book is a serious contribution to resolving this technological paradox.”
—Yorick Wilks, Professor of Computer Science, University of Sheffield, UK
“Empiricism has come of age. I have to admit that I enjoyed the good old days when no one took us seriously and we could afford to lob cheap shots at the establishment, but I find it far more rewarding these days to see the issues being discussed in such a responsible way.”
—Kenneth W. Church, Department Head, Information Analysis and Display Research, AT&T Research Labs
“The Balancing Act celebrates the end of the stand-off between statistical and symbolic approaches to language, for which we can all be grateful. Though the focus is on computer systems, the issues affect us all: how do text frequencies affect the way we learn language, store it, modify it, and use it? The eight chapters all argue clearly and persuasively that we must learn a new balancing trick: how to combine notions like ‘rule,’ ‘principle,’ and ‘lexical item’ with statistical ideas about weighting and probability. A very positive message which I enjoyed reading.”
—Richard Hudson, Professor of Phonetics and Linguistics, University College London
“Both symbolic and statistical approaches surely have much to contribute to our understanding of language, yet they have typically been presented as irreconcilable opposites, in effect forcing researchers to choose which of the two ‘camps’ they wish to belong to. The papers collected into this book make the refreshing argument that statistical and symbolic approaches are not at odds with one another, but that one can fruitfully combine insights from both. The Balancing Act is thus a ‘must read’ for anyone who wants an unbiased understanding of the computational modeling of natural language.”
—Richard Sproat, Bell Laboratories