We live in the era of Big Data, with storage and transmission capacity measured not just in terabytes but in petabytes (where peta- denotes a quadrillion, or a thousand trillion). Data collection is constant and even insidious, with every click and every “like” stored somewhere for something. This book reminds us that data is anything but “raw,” that we shouldn’t think of data as a natural resource but as a cultural one that needs to be generated, protected, and interpreted.
The development of the Semantic Web, with machine-readable content, has the potential to revolutionize the World Wide Web and its uses. A Semantic Web Primer provides an introduction and guide to this continuously evolving field, describing its key ideas, languages, and technologies.
The introduction of high-throughput methods has transformed biology into a data-rich science. Knowledge about biological entities and processes has traditionally been acquired by thousands of scientists through decades of experimentation and analysis. The current abundance of biomedical data is accompanied by the creation and quick dissemination of new information. Much of this information and knowledge, however, is represented only in text form: in the biomedical literature, lab notebooks, Web pages, and other sources.
Information retrieval is the foundation for modern search engines. This text offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus—a multiuser open-source information retrieval system developed by one of the authors and available online—provides model implementations and a basis for student work.
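Since the emphasis is on implementation and experimentation, a concrete illustration may help. Below is a minimal sketch of an inverted index with conjunctive retrieval, the data structure underlying the indexing and retrieval topics named above. It is not drawn from Wumpus or from the book itself; the function names and toy documents are hypothetical.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Conjunctive (AND) retrieval: intersect each query term's postings."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()

docs = {
    1: "information retrieval evaluation",
    2: "indexing and retrieval algorithms",
    3: "search engine data structures",
}
index = build_index(docs)
print(search(index, "retrieval algorithms"))  # -> {2}
```

Production systems extend this idea with positional postings, index compression, and on-disk storage rather than an in-memory dictionary.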
The way we record knowledge, and the web of technical, formal, and social practices that surrounds it, inevitably affects the knowledge that we record. The ways we hold knowledge about the past—in handwritten manuscripts, in printed books, in file folders, in databases—shape the kind of stories we tell about that past. In this lively and erudite look at the relation of our information infrastructures to our information, Geoffrey Bowker examines how, over the past two hundred years, information technology has converged with the nature and production of scientific knowledge.
Lessons from database research have been applied in academic fields ranging from bioinformatics to next-generation Internet architecture and in industrial uses including Web-based e-commerce and search engines. The core ideas in the field have become increasingly influential. This text provides both students and professionals with a grounding in database research and a technical context for understanding recent innovations in the field. The readings included treat the most important issues in the database area—the basic material for any DBMS professional.
In Ontological Semantics, Sergei Nirenburg and Victor Raskin introduce a comprehensive approach to the treatment of text meaning by computer. Arguing that being able to use meaning is crucial to the success of natural language processing (NLP) applications, they depart from the ad hoc approach to meaning taken by much of the NLP community and propose theory-based semantic methods.
As the World Wide Web continues to expand, it becomes increasingly difficult for users to obtain information efficiently. Because most search engines parse markup languages such as HTML or SGML, search results reflect formatting tags more than the actual page content, which is expressed in natural language.
The growing interest in data mining is motivated by a common problem across disciplines: how does one store, access, model, and ultimately describe and understand very large data sets? Historically, different aspects of data mining have been addressed independently by different disciplines. This is the first truly interdisciplinary text on data mining, blending the contributions of information science, computer science, and statistics.
The idea of knowledge bases lies at the heart of symbolic, or "traditional," artificial intelligence. A knowledge-based system decides how to act by running formal reasoning procedures over a body of explicitly represented knowledge—a knowledge base. The system is not programmed for specific tasks; rather, it is told what it needs to know and expected to infer the rest.
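To make concrete the idea of telling a system what it needs to know and letting it infer the rest, here is a minimal forward-chaining sketch. It illustrates the general knowledge-base idea, not any particular book's formalism; the facts, rule format, and function names are invented for the example.

```python
def match(pattern, fact, bindings):
    """Extend bindings so that pattern (with ?variables) equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def all_matches(premises, facts, bindings):
    """Yield every binding that satisfies all premises against the fact set."""
    if not premises:
        yield bindings
        return
    for fact in facts:
        b = match(premises[0], fact, bindings)
        if b is not None:
            yield from all_matches(premises[1:], facts, b)

def substitute(pattern, bindings):
    """Replace ?variables in a pattern with their bound values."""
    return tuple(bindings.get(t, t) for t in pattern)

def forward_chain(facts, rules):
    """Apply rules to known facts until no new facts can be derived."""
    facts = set(facts)
    while True:
        derived = {
            substitute(conclusion, b)
            for premises, conclusion in rules
            for b in all_matches(premises, facts, {})
        }
        new = derived - facts
        if not new:
            return facts
        facts |= new

# The knowledge base: what the system is told.
facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}
rules = [
    # parent(?x, ?y) and parent(?y, ?z)  =>  grandparent(?x, ?z)
    ((("parent", "?x", "?y"), ("parent", "?y", "?z")),
     ("grandparent", "?x", "?z")),
]

for fact in sorted(forward_chain(facts, rules)):
    print(fact)
# Infers ("grandparent", "alice", "carol") without any task-specific code.
```

Real knowledge-based systems replace this toy loop with efficient matching algorithms such as Rete and with far richer logics, but the division of labor is the same: declarative knowledge plus a general inference procedure.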