Introduction to Natural Language Processing
A survey of computational methods for understanding, generating, and manipulating human language, synthesizing classical representations and algorithms with contemporary machine learning techniques.
This textbook provides a technical perspective on natural language processing—methods for building computer software that understands, generates, and manipulates human language. It emphasizes contemporary data-driven approaches, focusing on techniques from supervised and unsupervised machine learning. The first section establishes a foundation in machine learning by building a set of tools that will be used throughout the book and applying them to word-based textual analysis. The second section introduces structured representations of language, including sequences, trees, and graphs. The third section explores different approaches to the representation and analysis of linguistic meaning, ranging from formal logic to neural word embeddings. The final section offers chapter-length treatments of three transformative applications of natural language processing: information extraction, machine translation, and text generation. End-of-chapter exercises include both paper-and-pencil analysis and software implementation.
The text synthesizes and distills a broad and diverse research literature, linking contemporary machine learning techniques with the field's linguistic and computational foundations. It is suitable for use in advanced undergraduate and graduate-level courses and as a reference for software engineers and data scientists. Readers should have a background in computer programming and college-level mathematics. After mastering the material presented, students will have the technical skill to build and analyze novel natural language processing systems and to understand the latest research in the field.
“Natural language processing is a critically important and rapidly developing area of computer science. Any modern practitioner needs a unified understanding of both machine learning algorithms and linguistic fundamentals. Jacob Eisenstein is an essential guide through the core technical methodologies of the field and their application in challenging real-world problems. His wonderful textbook is a much-needed resource for any student or researcher interested in mastering contemporary data-driven NLP and gaining a strong foundation for following, and contributing to, future advances.”
Alexander Rush, Associate Professor, Cornell University
“This book is a must-read for anyone studying natural language processing. It presents a unified view of the entire field, ranging from linguistic foundations to modern deep learning algorithms, that is both technically rigorous and also easily accessible.”
Luke Zettlemoyer, Associate Professor of Computer Science and Engineering, University of Washington; Research Manager, Facebook AI Research
“This book is the most comprehensive and up-to-date reference on natural language processing since the beginning of the deep learning revolution. It covers the basics as well as more advanced materials and will expose its readers to most of the necessary ingredients of state-of-the-art AI and NLP algorithms.”
Richard Socher, Chief Scientist, Salesforce
“This book provides an excellent introduction to natural language processing, with emphasis on foundational methods and algorithms. I highly recommend it to every serious researcher and student in natural language processing.”
Hwee Tou Ng, Professor of Computer Science, National University of Singapore