The genetic code is the Rosetta Stone by which we interpret the 3.3 billion letters of human DNA, the alphabet of life, and the discovery of the code has had an immeasurable impact on science and society. In 1968, Marshall Nirenberg, an unassuming government scientist working at the National Institutes of Health, shared the Nobel Prize for cracking the genetic code. He was the least likely man to make such an earth-shaking discovery, and yet he had gotten there before such members of the scientific elite as James Watson and Francis Crick.
The goal of structured prediction is to build machine learning models that predict relational information with internal structure, such as outputs composed of multiple interrelated parts. These models, which reflect prior knowledge, task-specific relations, and constraints, are used in fields including computer vision, speech recognition, natural language processing, and computational biology. They can carry out such tasks as predicting a natural language sentence or segmenting an image into meaningful components.
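A minimal sketch of what structured-prediction inference looks like in practice: the Viterbi algorithm finds the highest-scoring label sequence for a chain-structured output, where each part (a word's tag) depends on its neighbors. The part-of-speech example and all scores below are invented for illustration, not drawn from any particular book or system.

```python
def viterbi(obs, states, start, trans, emit):
    """Return the highest-scoring state sequence for the observations.

    Scores are additive (e.g. log-probabilities); higher is better.
    """
    # best[t][s] = score of the best path ending in state s at position t
    best = [{s: start[s] + emit[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            # choose the best predecessor state for s at position t
            prev, score = max(
                ((p, best[t - 1][p] + trans[p][s]) for p in states),
                key=lambda x: x[1],
            )
            best[t][s] = score + emit[s][obs[t]]
            back[t][s] = prev
    # trace the best path backward from the final position
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Toy tagging example (all scores hypothetical):
STATES = ["Noun", "Verb"]
START = {"Noun": 0.0, "Verb": -1.0}
TRANS = {"Noun": {"Noun": -1.0, "Verb": 0.0},
         "Verb": {"Noun": 0.0, "Verb": -1.0}}
EMIT = {"Noun": {"dogs": 0.0, "bark": -2.0},
        "Verb": {"dogs": -2.0, "bark": 0.0}}
tags = viterbi(["dogs", "bark"], STATES, START, TRANS, EMIT)
```

Because the transition scores couple adjacent tags, the decoder picks the jointly best sequence rather than tagging each word independently, which is the essence of predicting interrelated parts.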
Sparse modeling is a rapidly developing area at the intersection of statistical learning and signal processing, motivated by the age-old statistical problem of selecting a small number of predictive variables in high-dimensional datasets. This collection describes key approaches in sparse modeling, focusing on its applications in fields including neuroscience, computational biology, and computer vision.
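One canonical instance of selecting a few predictive variables is the lasso, which this sketch solves by coordinate descent with soft-thresholding. The data, the penalty value, and the unit-norm assumption are all toy choices for illustration, not a reference implementation.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; the source of exact zeros in the lasso."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + lam*||b||_1 by coordinate descent.

    Assumes each column of X has unit mean-squared norm, so each
    coordinate update reduces to a single soft-thresholding step.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # partial residual: remove every feature's contribution except j
            r = [y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                 for i in range(n)]
            z = sum(X[i][j] * r[i] for i in range(n)) / n
            b[j] = soft_threshold(z, lam)
    return b

# Toy data: y depends only on the first of two features.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2, 2, -2, -2]
coef = lasso(X, y, lam=0.5)
```

The penalty drives the irrelevant coefficient exactly to zero rather than merely making it small, which is what makes the fitted model sparse and the selected variables easy to read off.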
In this book, Dan Gusfield examines combinatorial algorithms to construct genealogical and exact phylogenetic networks, particularly ancestral recombination graphs (ARGs). The algorithms produce networks (or information about networks) that serve as hypotheses about the true genealogical history of observed biological sequences and can be applied to practical biological problems.
Systems techniques are integral to current research in molecular cell biology, and system-level investigations are often accompanied by mathematical models. These models serve as working hypotheses: they help us to understand and predict the behavior of complex systems. This book offers an introduction to mathematical concepts and techniques needed for the construction and interpretation of models in molecular systems biology.
The introduction of high-throughput methods has transformed biology into a data-rich science. Knowledge about biological entities and processes has traditionally been acquired by thousands of scientists through decades of experimentation and analysis. The current abundance of biomedical data is accompanied by the creation and quick dissemination of new information. Much of this information and knowledge, however, is represented only in text form: in the biomedical literature, lab notebooks, Web pages, and other sources.
Recent research in molecular biology has produced a remarkably detailed understanding of how living things operate. Becoming conversant with the intricacies of molecular biology and its extensive technical vocabulary can be a challenge, though, as introductory materials often seem more like a barrier than an invitation to the study of life.
Using the tools of information technology to understand the molecular machinery of the cell offers both challenges and opportunities to computational scientists. Over the past decade, novel algorithms have been developed both for analyzing biological data and for synthetic biology problems such as protein engineering. This book explains the algorithmic foundations and computational approaches underlying areas of structural biology including NMR (nuclear magnetic resonance); X-ray crystallography; and the design and analysis of proteins, peptides, and small molecules.
Biomedical signal analysis has become one of the most important visualization and interpretation methods in biology and medicine. Many new and powerful instruments for detecting, storing, transmitting, analyzing, and displaying biomedical signals and images have been developed in recent years, allowing scientists and physicians to obtain quantitative measurements to support scientific hypotheses and medical diagnoses.
In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no labeled data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics.
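To make the middle ground concrete, here is a sketch of self-training, one simple SSL strategy: fit a classifier on the labeled examples, then iteratively pseudo-label the unlabeled points it is most confident about. A nearest-centroid classifier stands in for the base learner, and all points and labels below are invented toy values.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def self_train(labeled, unlabeled, n_rounds=10):
    """labeled: list of (point, label) pairs; unlabeled: list of points.

    Each round, label the single unlabeled point whose classification is
    most confident (largest margin between its two nearest centroids),
    then refit the centroids on the enlarged labeled set.
    """
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    for _ in range(n_rounds):
        if not unlabeled:
            break
        cents = {lab: centroid([p for p, l in labeled if l == lab])
                 for lab in {l for _, l in labeled}}

        def margin(p):
            d = sorted(dist2(p, c) for c in cents.values())
            return d[1] - d[0] if len(d) > 1 else d[0]

        best = max(unlabeled, key=margin)
        lab = min(cents, key=lambda l: dist2(best, cents[l]))
        labeled.append((best, lab))
        unlabeled.remove(best)
    return labeled

# Two labeled seeds, two unlabeled points near each seed:
LABELED = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
UNLABELED = [(1.0, 1.0), (9.0, 9.0)]
result = self_train(LABELED, UNLABELED)
```

The unlabeled points never receive human labels, yet they still shift the class centroids and so sharpen the decision boundary, which is the basic bet SSL makes when unlabeled data are plentiful.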