I am very pleased to introduce the June issue of XRDS on computational biology. I had the privilege of working as Issue Editor for this issue alongside Guest Editor Cristina Pop, who recently received her Ph.D. from Stanford University.
Computational biology is ubiquitous. Every modern bioscience lab relies on computational biology and bioinformatics techniques to some extent, whether for gene and protein sequencing or data storage. Moreover, advances in computational biology techniques allow researchers to gain deeper insights into biological mechanisms, simplify lab-bench methods, and develop more reliable and sophisticated methods for diagnosis and clinical applications. Computational biology drives biological research and even bounds the types of questions that researchers and clinicians can ask. This is why it is such an exciting and rapidly growing area of computer science.
We chose to structure this issue loosely around five stages of biological experimentation that make significant use of computational biology. These stages are: data gathering, data storage, refining and visualizing data, modeling data, and drawing conclusions from data. These may sound like standard steps in most data science workflows, and that’s because they are. With techniques like next-generation sequencing and electronic patient records producing biological and healthcare data in ever-greater volumes, computational biology leverages many big data approaches and applies them to scientific data. The field of computational biology focuses on developing algorithms and techniques well attuned to biological data in particular.
Our features and interviews present several different angles into some of the most recent advances in computational biology. Russ Altman, Director of Biomedical Informatics at Stanford University, discusses his role in leading an interdisciplinary research program as well as his work in personalized medicine and pharmacogenetics, the use of genetic data in selecting and prescribing drugs.
Many of this issue’s articles focus on drawing inferences from large-scale studies and datasets using computational methods. David Heckerman and Christoph Lippert of Microsoft Research describe machine-learning techniques for mapping genetic differences to phenotype in large-scale genetic studies known as genome-wide association studies. Their work includes methods for disentangling correlation and causation when genetic differences in distinct populations also coincide with phenotypic variations.
We profile Suchi Saria of Johns Hopkins University, whose many valuable contributions to research and industry include developing techniques for predicting patient outcomes and treatments from electronic records. She discusses challenges and ongoing research in this area as well as her contributions to a variety of startups.
We also provide insights on how modern biology uses computer simulations. We speak with Vijay Pande, Director of Folding@home, on using software to simulate protein folding. Folding@home is a widely used protein-folding program that lets users anywhere in the world donate, via the Internet, a portion of their computer’s CPU to solving protein-folding problems.
Marina Sirota and Bin Chen of the University of California at San Francisco write about computational and statistical techniques for drug discovery, which greatly increase the speed and reduce the cost at which new drug designs can be identified and then developed and tested at the bench.
This issue also introduces some cutting-edge techniques for computational biology. Karen Sachs of the Stanford School of Medicine and Tiffany Chen, Director of Informatics at Cytobank, Inc., discuss computational approaches in single-cell measurement techniques. These techniques are extremely powerful. Traditional medical measurements often involve patient-wide approaches, such as average measurements in blood tests. Single-cell measurements, by contrast, are valuable in studying diseases like cancer, where individual cells can wreak havoc on the body.
Malay Bhattacharyya of the Indian Statistical Institute contributes an article on dietomics, which uses computational techniques, much as personalized medicine does, to make predictions about what individuals should eat for improved health and disease avoidance.
Adam A. Smith of the University of Puget Sound writes about using Markov models to model mouse vocalizations, which can be used to infer a mouse’s mental state and may provide models for human mental disorders like schizophrenia and autism.
Sarah Aerni, Hulya Farinas, and Gautam Muralidhar of Pivotal Software outline their work in data storage, an important aspect of managing and using computational biology data.
While these feature articles present only some of the manifold exciting areas of computational biology research, we hope this issue presents a diverse range of ideas for our readers, from how computers influence large-scale studies and healthcare decisions to how they reveal the microscopic details of protein folding and subcellular structure and function.
Our Departments section includes some exciting articles, from a discussion with Sriram Kosuri of the University of California, Los Angeles, who is paving the way forward in the field of DNA computing and storage, to tips on how to write an effective scientific paper.
Having worked for several years in a biochemistry lab myself, both at the bench and on the computational side, I witnessed just how much technological advances transformed the types of biological questions we could ask. We hope that these perspectives on some groundbreaking approaches to computational biology help spark your own questions that may one day help us solve pressing biological problems, or at least pique your interest in the exciting computer science subfield that is computational biology.