On the Human Condition
Volume XXVIII Number 2
Sean Mooney, center, discusses a computer visualization with lab member Brandon Peters at the Center for Computational Biology and Bioinformatics at IUPUI.
Photo by Rockey Rothrock, IU School of Medicine Visual Media
Most people define "biologist" as a person who designs experiments that address questions about the natural world. Those experiments might take place in thousands of test tubes, or they may require surveys of organisms across a landscape, but they center on some kind of tangible, living material--cells, DNA, entire organisms.
Sean Mooney, assistant professor of medical and molecular genetics at the Indiana University School of Medicine, is not one of those biologists. For starters, he doesn't even have a laboratory, just a tidy office overlooking the canal on the edge of downtown Indianapolis, decorated with photos of the California mountains.
Mooney doesn't need a lab in the traditional sense--he's a computational biologist, a pioneer in a field that blurs the line between biology, computer science, and information technology. He's one of a growing number of researchers turning to computer-based approaches to biological problems, performing his experiments in silico--that is to say, in the computer, rather than in living organisms.
Computational biology and its sister discipline, bioinformatics, are fields born of necessity. With the dawn of the genomics revolution came a glut of information. New technologies for quickly decoding an organism's DNA gave scientists massive quantities of genetic data, promising new advances in medicine and drug development. But to get there, somebody first has to make some sense of the flood of genetic data pouring out of research labs all across the world, Mooney says.
"Experimentalists today generate far more biological data than they ever could before, whether it's large amounts of genetic sequence, real-time expression of genes, or cataloguing all the proteins in a cell," he says. "The big picture in my research is to apply computer technology to answer biological problems. Today, you just can't use a simple Excel spreadsheet to solve your problem in the laboratory."
To that end, Mooney developed and now oversees a comprehensive database called MutDB, which helps researchers around the world better understand the kinds of gene mutations associated with human disease.
Genes are specific lengths of DNA, the genetic code made up of chemicals represented by the letters A, C, G, and T. Genes contain the blueprints a cell needs to make proteins, the large molecules that carry out all the functions of a living organism, such as digesting food, carrying oxygen, or protecting the body from pathogens. The precise ordering of letters determines the type of protein a gene makes, and the shape of that protein determines whether the protein will be able to do its job.
A change as slight as just one letter in a gene's code can throw off the structure of the resulting protein. To appreciate just how tightly a protein's function relies on its structure, consider the inherited disorder sickle-cell anemia. At the root of this condition is a single change, where a "T" replaces an "A" in the gene coding for the protein that carries oxygen in the blood.
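The effect of a one-letter change can be sketched in a few lines of Python. This is a minimal illustration, not code from MutDB: the four-entry codon table, the short DNA fragment, and the function names `translate` and `substitute` are all assumptions made for the example. The substitution shown, GAG to GTG, is the textbook sickle-cell change, which swaps the amino acid glutamate for valine.

```python
# Partial codon table (DNA triplet -> amino acid); only the entries this
# example needs. A real table has 64 entries.
CODON_TABLE = {"CCT": "Pro", "GAG": "Glu", "GTG": "Val", "AAG": "Lys"}


def translate(dna):
    """Read a DNA string three letters at a time and return amino acid names."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial triplet
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3)]


def substitute(dna, position, new_base):
    """Return a copy of dna with the letter at a 0-based position replaced."""
    return dna[:position] + new_base + dna[position + 1:]


# Hypothetical fragment chosen for illustration, not a full gene.
normal = "CCTGAGGAGAAG"
mutant = substitute(normal, 4, "T")  # the 'A' in the first GAG becomes 'T'

print(translate(normal))  # ['Pro', 'Glu', 'Glu', 'Lys']
print(translate(mutant))  # ['Pro', 'Val', 'Glu', 'Lys']
```

Only one of the twelve letters differs between the two strings, yet the second amino acid in the chain changes.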
In the language of genetics, the most common of these changes are called "SNPs," which stands for "single nucleotide polymorphisms." Scientists have identified millions of these kinds of mutations in the human genome since its full sequence was published in 2001, and Mooney has annotated more than 8,000 of them in the free, publicly accessible MutDB. His goal is to make this kind of information easily available to the growing number of researchers investigating the genetics of human disease. Understanding how these DNA-level changes cause disease is key to developing potential therapies for them, Mooney says: "If we can understand what the molecular effects of genetic variation are, that has a huge impact for human health as a whole."
Working with his mentor, Teri Klein, at Stanford University, Mooney explored how mutations cause a condition called osteogenesis imperfecta, commonly known as brittle bone disease. An inherited disorder, brittle bone disease manifests differently in different people. In some, the disorder is so mild as to be virtually undetectable, while in others, it's prenatally fatal. Scientists know that mutations in the genes that produce collagen--a structural protein found in hair, skin, teeth, and bones--are associated with the disease. Yet different mutations that appear to produce similar types of changes in collagen structure can lead to either very severe or very mild forms of the disease. Mooney wants to know not only the ultimate cause of these differences, but also how to predict which kinds of mutations are more likely to cause severe forms of the disorder.
Approximately 600 different mutations are associated with brittle bone disease, and so to begin making sense of the disorder, Mooney turned to computer science for help. He developed an algorithm that could identify features in the genome which could predict the severity of the disease, based on the location and type of mutation. "Not all positions in the genome are created equal," he says. "We know a lot about protein structure and function, and we can make predictions about mutations in proteins pretty readily." It was this research that spurred Mooney's interest in the molecular basis of disease.
Proteins are made up of amino acids, chemical building blocks whose order is dictated by a gene's DNA sequence. Just 20 amino acids, combined in countless ways, produce all the proteins found in all living organisms. Each amino acid is encoded by a triplet of DNA letters. A quirk of the genetic code, however, is its built-in redundancy; most amino acids are produced by more than one triplet of letters. The triplet AAA produces the amino acid lysine, for example, but so does the triplet AAG. Thanks to this redundancy, mutations in the genetic code don't always cause organisms significant harm: a one-letter mistake doesn't necessarily produce an incorrect amino acid.
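The redundancy described above can be checked directly against the standard genetic code. The sketch below is an illustration only: it builds the 64-entry table from a compact one-letter encoding (K is lysine, * marks a stop signal), and the helper names `codons_for` and `silent_changes` are assumptions introduced for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per triplet, in TCAG order for each position.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """All triplets that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def silent_changes(codon):
    """Single-letter variants of codon that still encode the same amino acid."""
    variants = []
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                variant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[variant] == CODON_TABLE[codon]:
                    variants.append(variant)
    return variants


print(codons_for("K"))        # ['AAA', 'AAG']
print(silent_changes("AAA"))  # ['AAG']
```

Of the nine possible one-letter changes to AAA, only the change to AAG leaves lysine in place; the other eight produce a different amino acid or a stop signal.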
But those mistakes sometimes do cause harm, and Mooney's goal is to determine how much of an effect those single-letter changes ultimately have. "Amino acids fall into several groups, and mutating within a group is less likely to have a detrimental effect than mutation between groups," he says. One group of amino acids, for example, might be repelled by water while water attracts another group of amino acids. The way in which an amino acid reacts with water can have a profound influence on the shape and behavior of the protein that contains it. That effect, in turn, can determine whether a protein maintains its intended function.
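The idea of mutating within a group versus between groups can be sketched as a simple lookup. This is not Mooney's method, only a rough illustration: the four groups below follow one common textbook simplification (classifications vary between sources), and the names `GROUPS`, `group_of`, and `same_group` are assumptions made for the example.

```python
# One simplified grouping of the 20 amino acids by how their side chains
# interact with water. One-letter codes; other schemes draw the lines differently.
GROUPS = {
    "nonpolar (water-repelling)": "AVLIMFWPG",
    "polar, uncharged": "STCYNQ",
    "positively charged": "KRH",
    "negatively charged": "DE",
}


def group_of(amino_acid):
    """Return the name of the group that contains the amino acid."""
    for name, members in GROUPS.items():
        if amino_acid in members:
            return name
    raise ValueError(f"unknown amino acid: {amino_acid}")


def same_group(original, replacement):
    """True if a substitution stays inside one group (a crude 'milder' signal)."""
    return group_of(original) == group_of(replacement)


print(same_group("L", "I"))  # True: leucine to isoleucine, both nonpolar
print(same_group("E", "V"))  # False: glutamate (charged) to valine (nonpolar)
```

A check like this captures only one factor; as the next paragraph explains, where the change falls within the protein also matters.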
A mutated amino acid's location in a protein also has an impact on how severe its effects are, Mooney says. "The amino acids inside of proteins are very tightly packed. A mutation on the interior of the protein is more likely to disrupt that packing and therefore, the structure of that protein," he says.
Ultimately, Mooney hopes to develop computer-based tools that will allow researchers to predict how newly discovered mutations in the human genome cause disease. "It's a very, very complicated problem," he says. "We're trying to make predictions about new mutations--changes in the sequence where we haven't yet discovered the health consequences. That is a new frontier for us." With more than 3 billion letters in the human genome, Mooney's computers will be busy for years to come.
Mooney's current position is a perfect fit with the twin interests in biology and programming he's harbored since he was a kid growing up in Seattle.
A self-proclaimed computer geek, Mooney got his first computer, an IBM PC Junior, when he was 10 years old, and soon learned how to program. After high school, he attended the University of Wisconsin-Madison, where he majored in biochemistry but continued programming on the side, paying his way through college by writing budgeting software for various administrative offices of the university.
While he enjoyed both the programming work and his studies, Mooney nevertheless found his early years as an undergraduate frustrating. "I was really interested in biology as well as programming, but there wasn't a connection between them," he recalls. A turning point came during his junior year, when he learned of an interdisciplinary research center called the Santa Fe Institute in Santa Fe, New Mexico. He applied for and received a research internship for the following summer. "That was one of the greatest summers of my life," Mooney says. "That was the big moment that led me into the field I'm in today."
At the Santa Fe Institute, Mooney rubbed shoulders with Nobel laureates and other prominent scientists, but what inspired him most was the nature of the research he was exposed to there. "What really got me excited was finally putting together the things I'd been doing all my life," he says. "I was combining what I'd learned in my biology classes with programming. Up until that point, I hadn't met anyone who combined the two fields."
After a second summer at the institute, Mooney entered graduate school at the University of California-San Francisco, where he began applying computer technology to biomedical problems. He transferred to Stanford University when his advisor, Klein, accepted a position there, and he began building the MutDB, a resource he brought with him when he came to IU in 2001.
MutDB pulls together data from several publicly available sources, providing one-stop shopping for researchers who want to see the products of different mutations, or changes in genes. "A big problem in bioinformatics is that there's too much data and not any standardized format for storing or displaying it," says Jessica Danzer, a former graduate student in the Mooney lab now working at the National Institutes of Health. "The MutDB database makes the information researchers need easier to find."
It also promises to open up a wealth of new information to scientists all over the world. Mooney is collaborating with IU's Pervasive Technology Laboratory to develop a Web-based system that allows researchers anywhere, regardless of their computer operating system, to use the MutDB. This system, called a Web service, is an emerging application of computer technology in the life sciences, says Randy Heiland, associate director of the Scientific Analysis Lab at IU's Pervasive Technology Laboratory.
"Web services are starting to revolutionize the way science is done," Heiland says. "We're moving away from science being done in specialized centers, which are solitary keepers of vital information, and into a much more open system."
Web services aren't a new technology--they've been used widely in business and allow sites such as Amazon.com to track book sales, reviews, rankings, and other information. Essentially, a Web service is a resource that allows applications to exchange or share information over the Internet. Mooney and Heiland have applied a Web services approach to integrating information from multiple life science datasets, hoping to make the material more easily accessible.
"Sean is very interested in making research widely available," Heiland says. "He's not shy about information technology, and he immediately saw the value in using a Web service approach."
Mooney may be setting a new paradigm for research in life science. Still, his goals are no different from those of bench and field scientists all over the world: to solve scientific problems and gain a better understanding of the natural world.
"I want to solve biological problems using computers," he says. "It's not easy. But ultimately, I want to help physicians know how mutations affect what they see in their patients and lead them to understand how patients will respond to treatment."
Jennifer Cutraro is a freelance science writer in Boston, Mass.