Information retrieval is the foundation for modern search engines. This text offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus—a multiuser open-source information-retrieval system developed by one of the authors and available online—provides model implementations and a basis for student work.
Advances in information and communication technology are transforming the way scholarly research is conducted across all disciplines. The use of increasingly powerful and versatile computer-based and networked systems promises to change research activity as profoundly as the mobile phone, the Internet, and email have changed everyday life. This book offers a comprehensive and accessible view of the use of these new approaches--called “e-Research”--and their ethical, legal, and institutional implications.
In Theorizing Digital Cultural Heritage, experts offer a critical and theoretical appraisal of the uses of digital media by cultural heritage institutions. Previous discussions of cultural heritage and digital technology have left the subject largely unmapped in terms of critical theory; the essays in this volume offer this long-missing perspective on the challenges of using digital media in the research, preservation, management, interpretation, and representation of cultural heritage.
Information retrieval in the age of Internet search engines has become part of ordinary discourse and everyday practice: “Google” is a verb in common usage. Thus far, more attention has been given to practical understanding of information retrieval than to a full theoretical account.
Distributed business component computing--the assembling of business components into electronic business processes, which interact via the Internet--caters to a new breed of enterprise systems that are flexible, relatively easy to maintain and upgrade to accommodate new business processes, and relatively simple to integrate with other enterprise systems. Companies with unwieldy, large, and heterogeneous inherited information systems--known as legacy systems--find it extremely difficult to align their old systems with novel business processes.
All organizations today confront data quality problems, both systemic and structural. Neither ad hoc approaches nor fixes at the systems level—installing the latest software or developing an expensive data warehouse—solve the basic problem of bad data quality practices. Journey to Data Quality offers a roadmap that can be used by practitioners, executives, and students for planning and implementing a viable data and information quality management program.
Service-Oriented Applications and Architectures (SOAs) have captured the interest of industry as a way to support business-to-business interaction, and the SOA market grew by $4.9 billion in 2005. SOAs and in particular service-oriented computing (SOC) represent a promising approach in the development of adaptive distributed systems. With SOC, applications can open themselves to services offered by third parties and accessed through standard, well-defined interfaces.
Questions about access to scholarship go back farther than recent debates over subscription prices, rights, and electronic archives suggest. The great libraries of the past—from the fabled collection at Alexandria to the early public libraries of nineteenth-century America—stood as arguments for increasing access. In The Access Principle, John Willinsky describes the latest chapter in this ongoing story—online open access publishing by scholarly journals—and makes a case for open access as a public good.
Instant electronic access to digital information is the single most distinguishing attribute of the information age. The elaborate retrieval mechanisms that support such access are a product of technology. But technology is not enough. The effectiveness of a system for accessing information is a direct function of the intelligence put into organizing it. Just as the practical field of engineering has theoretical physics as its underlying base, the design of systems for organizing information rests on an intellectual foundation.
Georeferencing--relating information to geographic location--has been incorporated into today's information systems in various ways. We use online services to map our route from one place to another; science, business, and government increasingly use geographic information systems (GIS) to hold and analyze data. Most georeferenced information searches using today's information systems are done by text query. But text searches for placenames fall short--when, for example, a place is known by several names (or by none).