Colorful coiling ribbons float across monitors throughout the University of Washington’s Baker Laboratory, intricate three-dimensional models of the amino acid chains that get stuff done inside organisms.
The head of the lab is David Baker, whose shaggy hair and trail shoes betray his weekend plans, hiking the mountains around Seattle.
A professor of biochemistry at the university, Baker pioneered methods for predicting the shape and function of proteins based on DNA sequences, including the development of a set of widely adopted software tools. Now he’s focused on creating proteins from scratch by manipulating the code of life itself.
This goes beyond the popular conception of genetic engineering, tinkering on the margins of what Mother Nature handed us after billions of years of evolution, through random mutations that only very occasionally landed upon something useful.
Baker and the researchers at his lab are custom designing proteins to carry out specific tasks, building life-forms that never existed in nature. Among many other projects, they’re attempting to develop therapeutics and vaccines to combat HIV, Ebola, influenza and other viruses.
“We’re now doing post-evolutionary biology,” he said. “We’re able to take all the stuff we’ve learned from studying nature and make a whole new world of molecules that are designed according to principles to solve current-day world problems.”
The Baker Lab stands on the top floor of the modern Molecular Engineering and Sciences Building, a broad glass curtain framed by stone panel walls, announcing itself loudly amid the collegiate gothic architecture that dominates campus. Among researchers in the field, the facility is revered for cranking out promising proteins.
“I grew up in science hearing about this mythical lab,” said Shawn Douglas, an assistant professor at UC San Francisco who focused on manipulating DNA to create therapeutics in his postdoctoral research at Harvard. “They had all these amazing projects and really top-notch scientists who started their training there and have gone on to do great things in their own right.”
Engineering proteins is generally considered a subfield of what’s known as synthetic biology. It’s a broad umbrella for a promising class of tools and techniques allowing scientists to rearrange the building blocks of life in the hope of creating novel drugs, vaccines, clean fuels, nanomaterials and much more.
So how does this all work?
The human genome is a long sequence of nucleotides, some three billion base pairs of adenine, cytosine, guanine and thymine. Along the segments of DNA known as genes, the order of nucleotides, read three at a time, specifies which amino acids are produced.
The amino acids link together, forming polypeptide chains, the order of which determines the final protein structure. These chains fold into shapes that are critical for carrying out biological functions, opening and closing pathways depending on how they bind and interact with other molecules and atoms.
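For readers who think in code, the step from DNA to an amino acid chain can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard genetic code and a clean, in-frame coding sequence. The function name `translate` and the example sequence are made up for this sketch, and real cells do considerably more work than this.

```python
from itertools import product

# Standard genetic code, written compactly: the 64 three-letter codons
# in T, C, A, G order, each paired with a one-letter amino acid code.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read a DNA coding sequence three letters at a time (hypothetical helper).

    Returns the amino acid chain as one-letter codes, stopping at the
    first stop codon.
    """
    chain = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        chain.append(amino_acid)
    return "".join(chain)


# Illustrative sequence only: ATG (M), GCC (A), AAA (K), TGA (stop).
print(translate("ATGGCCAAATGA"))  # prints "MAK"
```

The output is only the order of amino acids. Working out how that chain folds into a three-dimensional shape is the much harder problem.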
Proteins are the antibodies that bind to viruses and bacteria, the enzymes that catalyze chemical reactions, the messengers that tell other cells what to do and the structural components of organs and tissues.
A series of advances has allowed researchers to exert greater control over the process, including better software tools for predicting the shape of proteins or designing them, notably one developed by the University of Washington’s Baker Laboratory known as Rosetta.
In addition, there have been major advances in DNA synthesis technology, such as Gen9’s BioFab system, which produces custom-ordered sequences. Email a synthesis company the DNA code and it will link together the nucleotides in the right order and ship back test tubes filled with strands of DNA.
These can be inserted into bacteria or yeast cells, which can begin self-replicating and producing engineered proteins.
In the wrong hands, however, the same technology could be used for malicious purposes, such as creating deadly new viruses and personalized bio-weapons. These possibilities have already sparked debate over appropriate regulatory guidelines.
“Best in class”
Many advances are driving the field forward, but probably none as fundamental as the accelerating capabilities of computer science.
The increasing horsepower of computer networks enables scientists to more easily analyze and manipulate massive datasets like genomes. Researchers are also developing better design tools and algorithms for modeling the shapes of DNA and proteins based on the configurations of molecular bonds, the behavior of atoms and other complicated rules.
One of the critical contributions of the Baker Lab is a software tool called Rosetta, which can be used to predict or design the form of proteins or nucleic acids. The lab began working on the technology in the mid-1990s. About 20 corporations now license it for research projects, as do thousands of academic labs.
“It’s definitely best in class right now,” said J. Christopher Anderson, an assistant professor of bioengineering at UC Berkeley.
To supplement the processing power at their disposal, the lab also launched Rosetta@Home, a distributed computing project that allows anyone to donate their hardware to the mission. It simply runs as a screen saver when a desktop or laptop isn’t in use. About 350,000 users have registered.
“When we design a new protein, we need to verify that the amino acid sequence will fold up to the structure we want — and that’s a very time-intensive task,” Baker said. “Rosetta@Home has been absolutely critical to everything we’ve done.”
He said that commercial cloud tools like Amazon Web Services or Microsoft Azure — both companies in his backyard — are simply beyond the facility’s price range. The lab is funded by the Howard Hughes Medical Institute, as well as multiple federal science agencies.
Before long, the Baker Lab realized that Rosetta@Home users could often see when the algorithms were wasting time, exploring every possible degree of freedom in the protein structure, when the best answer was obvious to human eyes.
So in 2008 they collaborated with peers at UW’s Center for Game Science to launch FoldIt, an online video game that tapped into human intuition, gamification and the wisdom of the crowd to help solve these complex puzzles.
It worked. In 2011, they published a paper in Nature Structural & Molecular Biology after players figured out the shape of a protein that causes AIDS in rhesus monkeys, a goal that had stumped scientists for a decade. It took the players three weeks.
More recently, players moved from prediction to design, creating an enzyme that was 18 times more active than one that experts in the field had engineered, promising more efficient synthesis of certain chemicals and drugs.
“The way they’ve maintained this software project, which has branched out and gone all sorts of interesting directions, is very inspiring,” said Douglas, who developed his own software tool for DNA design, known as Cadnano.
Into the world
Some of the work has already left the lab.
Baker’s former researchers have set up at least two private companies, including Arzeda and BAL Biofuels, focused on using synthetic biology to create enzymes for biofuels, biodegradable plastics, high-yield crops and more.
In 2012, he helped establish the Institute for Protein Design, specifically to identify promising protein designs and advance research to the point that it can be commercialized.
The raw possibilities for new protein designs are nearly endless. Or, in any case, a very, very large number: technically something like 20 to the 200th power’s worth of potential configurations, which is what you get for a chain 200 amino acids long with 20 possible amino acids at each position. That’s a number with 261 digits before you get to the decimal point.
Evolution has only sampled a tiny fraction of those — and scientists are just getting started.
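The scale of that design space is easy to check. The short Python sketch below assumes the figure comes from 20 possible amino acids at each of 200 positions in a chain. That reading, and the variable names, are assumptions made for illustration.

```python
import math

AMINO_ACID_TYPES = 20   # assumed: the 20 standard amino acids
CHAIN_LENGTH = 200      # assumed: positions in the protein chain

# Python integers have arbitrary precision, so the exact value is available.
possible_sequences = AMINO_ACID_TYPES ** CHAIN_LENGTH
digit_count = len(str(possible_sequences))

# Cross-check with logarithms: digits = floor(200 * log10(20)) + 1.
digits_from_log = math.floor(CHAIN_LENGTH * math.log10(AMINO_ACID_TYPES)) + 1

print(digit_count)      # prints 261
print(digits_from_log)  # prints 261
```

Put another way, the number is roughly 1.6 followed by 260 more digits.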
“Basically, proteins are involved everywhere,” Baker said. “So our ability to make proteins is pretty broadly useful. One of the challenges is to figure out what the most important, pressing problems are to address.”