Empirical Methods for Exploiting Parallel Texts
Parallel texts (bitexts) are a goldmine of linguistic knowledge, because the translation of a text into another language can be viewed as a detailed annotation of what that text means. Knowledge about translational equivalence, which can be gleaned from bitexts, is of central importance for applications such as manual and machine translation, cross-language information retrieval, and corpus linguistics. The availability of bitexts has increased dramatically since the advent of the Web, making their study an exciting new area of research in natural language processing. This book lays out the theory and the practical techniques for discovering and applying translational equivalence at the lexical level. It is a start-to-finish guide to designing and evaluating many translingual applications.