Design, Build, Test
If you’ve spent time around technologists lately, you’ve probably heard the phrase “design, build, test.” The main idea behind this mantra, borrowed from engineering disciplines, is that technological progress is made iteratively by stepping through this cycle. With a hypothesis in mind, you first design an experiment, then build the experiment, and finally test the hypothesis. The information gleaned from this process is used to update your hypothesis, which starts another iteration. The central question in this approach has traditionally been how to decrease the cycle time between iterations by automating, standardizing, or batch-processing each experimental step.
DNA as Rapid Prototype
Since the cracking of the genetic code in the 1960s, researchers have posed biological questions in terms of DNA sequences. For example, assume we are interested in increasing the efficiency of an industrially relevant enzyme. In terms of the design, build, test cycle, the process would look something like this: we would first design a change in the amino acid sequence of our target protein enzyme. We would then build a DNA sequence that encodes that amino acid change, and use a microbe like E. coli to translate the modified DNA sequence into our modified protein. Finally, we would test the effect of the amino acid change on the function of our enzyme, completing one loop of the cycle. In a typical set of experiments, each loop could take a few weeks. However, technological advances in the past decade, such as the advent of next-generation sequencing (NGS) and DNA microarray synthesis, have fundamentally changed the way in which researchers can investigate biology. The ability to rapidly test more iterations in parallel and to accelerate the cycle time for each loop of work has unleashed what is effectively a rapid prototyping model for working with biology.
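To make the loop concrete, here is a minimal Python sketch of a single iteration for our enzyme example. The toy protein sequence, the tiny codon table, and the placeholder test step are illustrative assumptions, not a real pipeline.

```python
# One design-build-test iteration, sketched in code.
# Sequences, codon table, and the test step are hypothetical placeholders.

CODON = {"A": "GCT", "V": "GTT", "L": "CTG", "I": "ATT"}  # tiny codon table for illustration


def design(wild_type_protein, position, new_aa):
    """Design step: propose a single amino acid substitution."""
    variant = list(wild_type_protein)
    variant[position] = new_aa
    return "".join(variant)


def build(variant_protein):
    """Build step: reverse-translate the variant into a DNA sequence to synthesize."""
    return "".join(CODON[aa] for aa in variant_protein)


def test(dna_sequence):
    """Test step: stand-in for expressing the variant in E. coli and assaying activity."""
    raise NotImplementedError("wet-lab measurement happens here")


wild_type = "AVLI"                     # toy protein sequence
variant = design(wild_type, 1, "L")    # e.g. a V-to-L substitution at position 2
construct = build(variant)             # DNA sequence to order and synthesize
# test(construct) would close the loop and inform the next design
```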
Dramatic reductions in the cost of the basic inputs of biology, reading DNA and writing DNA, have made this possible. Nowadays, when people think about NGS, they most likely think of advances in personal genome sequencing. In about 7 years, the cost of sequencing a human genome fell ~10,000x. To put that in perspective, imagine if the price of the iPhone dropped from ~$600 when it was first released to about 6 cents for the iPhone 7. While NGS is certainly amazing for sequencing whole genomes, researchers can leverage its power to make precise measurements of shorter pieces of DNA as well. The trick is to code both the hypothesis and the output of the experiment into DNA. Turning back to our enzyme example, we would code the amino acid change in our enzyme of interest into DNA, and link the function of that variant to a short DNA molecule (known as a barcode) whose abundance can be read out by NGS. We could then compare the relative level of our variant barcode to a wild-type barcode (used as the control) to measure protein function. If the variant barcode is enriched relative to the control, we know our modification of the enzyme is moving us in the right direction.
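As a rough sketch of that readout, the snippet below compares made-up NGS read counts for a variant barcode against a wild-type control barcode. A real analysis would also normalize for sequencing depth and compare selected versus unselected samples.

```python
# Toy barcode readout: compare a variant barcode's abundance to the wild-type control.
# Read counts are invented numbers for illustration only.

barcode_counts = {
    "WT_BARCODE": 10_000,   # control barcode linked to the wild-type enzyme
    "VAR_BARCODE": 25_000,  # barcode linked to our enzyme variant
}


def enrichment(variant_count, wildtype_count):
    """Relative abundance of the variant barcode versus the wild-type control."""
    return variant_count / wildtype_count


ratio = enrichment(barcode_counts["VAR_BARCODE"], barcode_counts["WT_BARCODE"])
if ratio > 1:
    print(f"variant enriched {ratio:.1f}x over wild type -- likely improved function")
else:
    print(f"variant at {ratio:.1f}x of wild type -- likely neutral or worse")
```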
If we then wanted to test multiple enzyme variants, we would first link them all to unique barcodes. In the past, this would mean synthesizing variants serially, one at a time, in a “low-throughput” manner. However, much like reading DNA, the cost of writing DNA has also fallen significantly (roughly 10x over the past 5 years). As a result of the low cost, researchers now routinely order libraries of tens of thousands of DNA sequence variants from various vendors. Since we can synthesize a greater number of variants than we could possibly test serially, researchers have started to “multiplex” their experiments: they pool all of the variants into a single experiment, run them in parallel, and then deconvolute the results afterwards. Turning to our favorite example, a “multiplexed” assay of enzyme function would look roughly like this: we would first synthesize thousands of different DNA sequence variations, each corresponding to an amino acid change in our target enzyme, effectively producing all of the relevant prototype possibilities. We would give each variant a unique barcode identifier and then pool these barcode-variant pairs into a 96-well plate, where they would be transformed into E. coli. In this context, “transform” simply means that we would introduce the DNA into E. coli; the cells would take up the new DNA and express it, so that we could then measure each variant’s relative function in a screen. Lastly, we would extract all of the DNA barcodes and measure their relative abundances with NGS. This would all be done in parallel, as a single workflow, within one experiment.
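To give a feel for the deconvolution step, here is a toy Python sketch that maps barcodes back to variants, counts them in a pooled set of NGS reads, and scores each variant against the wild-type control. The barcode map and reads are made-up stand-ins for a library of tens of thousands.

```python
# Deconvoluting a pooled (multiplexed) assay: count each barcode among the NGS
# reads, then score every variant against the wild-type control in one pass.
from collections import Counter

# Hypothetical barcode-to-variant map; a real library would hold tens of thousands.
barcode_to_variant = {
    "AAGT": "WT",      # wild-type control
    "CCTA": "E102K",
    "GGAC": "L45F",
}

# Toy read set standing in for millions of sequencing reads.
ngs_reads = ["AAGT", "CCTA", "CCTA", "GGAC", "AAGT", "CCTA"]

# Tally reads per variant (unknown barcodes are bucketed as "unmapped").
counts = Counter(barcode_to_variant.get(bc, "unmapped") for bc in ngs_reads)
wt = counts["WT"]

# Relative abundance of each variant versus the wild-type control.
scores = {
    variant: counts[variant] / wt
    for variant in barcode_to_variant.values()
    if variant != "WT"
}

for variant, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{variant}: {score:.2f}x wild type")
```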
Multiplexing is the Future of Design, Build, Test
Multiplexed assays radically increase the amount of information a researcher can generate in a single design, build, test loop. Instead of serially testing a few dozen variants in a low-throughput manner, or even using automated lab equipment to test thousands of variants individually in a traditional high-throughput approach, researchers can now use multiplexed assays to test all of these variants in a single “massively parallelized” experiment. This means we can effectively explore the broadest range of possible solutions with the fewest experimental steps. Multiplexed assays have the potential to rapidly accelerate progress in understanding and manipulating biology, as researchers can obtain the same density of information as large high-throughput screening campaigns for a fraction of the cost (far less automation and robotics are needed because there are fewer steps). However, multiplexed assays shift the traditional challenge of building and testing single variants to designing and interpreting tens of thousands of experiments run simultaneously. With the advent of Big Data in biology, we are becoming more adept at deriving insights from such expansive datasets. Nonetheless, deep questions remain as to how best to leverage our newfound synthesis capabilities to design the best set of thousands of experiments.
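As a back-of-the-envelope illustration of how quickly that design space grows, the snippet below counts every possible single amino acid substitution for a protein; the protein length is an assumed value chosen only to show the scale.

```python
# Rough size of a saturation mutagenesis design space: every single amino acid
# substitution at every position of a protein of assumed length.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
protein_length = 300  # assumed length of a typical enzyme, for illustration

single_mutants = protein_length * (len(AMINO_ACIDS) - 1)
print(f"{single_mutants} single-substitution variants")  # 5,700 designs in one pooled experiment
```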
Note: special thanks to Nathan Lubock at UCLA for sharing expertise on this topic.