Information retrieval is the foundation for modern search engines. This text offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus, a multiuser open-source information-retrieval system developed by one of the authors and available online, provides model implementations and a basis for student work. The modular structure of the book allows instructors to use it in a variety of graduate-level courses, including courses taught from a database systems perspective, traditional information retrieval courses with a focus on IR theory, and courses covering the basics of Web retrieval.

After an introduction to the basics of information retrieval, the text covers three major topic areas (indexing, retrieval, and evaluation) in self-contained parts. The final part of the book draws on and extends the general material in the earlier parts, treating such specific applications as parallel search engines, Web search, and XML retrieval. End-of-chapter references point to further reading; exercises range from pencil-and-paper problems to substantial programming projects. In addition to its classroom use, Information Retrieval will be a valuable reference for professionals in computer science, computer engineering, and software engineering.
About the Authors
Stefan Büttcher is a Site Reliability Engineer at Google.
Charles L. A. Clarke is Professor of Computer Science at the University of Waterloo's David R. Cheriton School of Computer Science.
Gordon V. Cormack is Professor of Computer Science at the University of Waterloo's David R. Cheriton School of Computer Science.
From the foreword by Amit Singhal
Honorable Mention, 2010 American Publishers Award for Professional and Scholarly Excellence (PROSE) in the Computing and Information Sciences category