The high-throughput frontier
By Rebecca Guenard
- The drive to solve problems faster spurred the development of computer-controlled instruments that generate exploratory data on hundreds of samples at a time.
- High-throughput methods are increasingly being applied to create solutions in the fats and oils industries.
- However, even after decades of use, these methods still suffer from a lack of standardization that makes cross-referencing data difficult.
There was a time when drug discovery was mostly serendipitous—take penicillin, for example. Then came a new way to screen molecules, and the discoveries ramped up exponentially. In 2019, the US Food and Drug Administration alone approved 48 novel drugs. The year before, it was 59. This rapid development of new treatments in the past 30 years is due to two things: the identification of targets (such as proteins) through genome sequencing, and the high-throughput screening of compounds that could be used to affect those targets.
Many research centers across the United States opened their doors to assist with finding answers to biological questions. The University of California, Los Angeles, has a facility called the Molecular Screening Shared Resources (MSSR) laboratory that routinely screens 100,000 molecules a day (http://www.mssr.ucla.edu). Similar facilities exist at the University of Illinois at Urbana-Champaign, Illinois (https://scs.illinois.edu), and at Rockefeller University in New York, New York (https://www.rockefeller.edu/htsrc). These programs have cataloged as many as 300,000 small molecules. Researchers test hundreds of molecules at a time against various biological assays, dispensed by automation into well plates at microliter or nanoliter volumes (Fig. 1). The plates are then analyzed by spectroscopic methods, churning out a wealth of data on biological reactions that often leads to a new drug treatment in record time.
“Contract research organizations are now carrying out large amounts of screening without the need to sink a large capital investment; it gives smaller companies a lot more flexibility, and in fact many large companies are taking this approach too,” said Bill Janzen, an expert in drug development and editor of the book High Throughput Screening, in an interview with the online magazine Technology Networks.
The big mechanical arms, robotic instruments, and data processing machinery used for high-throughput technology were once financially out of reach for smaller companies or start-ups (Fig. 2). Today, that is no longer the case thanks to the increased availability of reliable and cost-effective outsourcing.
FIG. 1. An example of 96, 384, and 1536 well plates for high-throughput micro- and nanoscale experiments. Source: www.abautopedia.com
Given the capacity for speed such technology offers, it is only natural that other high-throughput methods are now being developed and used for a wide range of applications that go beyond biological systems. Chemical synthesis is a particularly tedious process that requires many iterations of failed experiments before a researcher successfully produces a new molecule. New contract labs are teaching the next generation of scientists to be fluent in high-throughput systems and gathering data that will eventually lead to completely automated chemical synthesis. Formulations are also benefiting from a high-throughput advantage. The personal care and coatings industries are applying high-throughput methods to speed the production of new formulations and bring products to market faster.
The future of formulations
The United Kingdom has established a network of technology centers, known as the Catapult Network, with the mission of driving innovation by bridging the ideas of academia with the know-how of industry. The Center for Process Innovation (CPI), whose head office is in Wilton, UK, was established nearly 15 years ago to help companies launch new products (https://www.uk-cpi.com). A little over four years ago, they started applying high-throughput methods to evaluate and optimize formulations. “We support businesses across all sectors in the formulations space,” says Tony Jackson, formulations business unit director at CPI.
CPI has the capability to evaluate the performance of hundreds of variations of a formulation. Automated robots dispense a range of ingredients in complex mixtures to optimize a desired formulation. Like the screening labs mentioned earlier, CPI is a facility open to companies that are interested in developing or improving a product, without investing in pricey, high-throughput equipment. According to Jackson, this is crucial to the way today’s consumer goods businesses are trying to keep up with customer expectations.
“They are thinking about, in the future, going from making big batches of paint or washing powder to making more personalized products,” Jackson says. His business unit at CPI works with companies that want fast analytical measurements so they can understand formulation properties and quickly scale up to batch processes. Companies are also interested in exploring options for continuous production that will give them the versatility to alter their manufacturing schedule according to consumer trends.
“Consumers expect their products to solve the problems that they have today,” says Jackson. That might mean personalized medicine that is tuned to seasonal cold outbreaks, or laundry detergents with more stain-fighting ingredients in summer, when clothes get dirtier from more time spent outdoors. To achieve seamless conversion of products during continuous production, companies must first understand the chemical and physical properties of a wide range of formulations, he explains. High-throughput methods are ideal for gathering this information.
High-speed stability testing
Aside from assisting with how best to incorporate ingredients into a formulation, CPI has spent the past two years designing a new high-throughput way to test formulation stability. Currently, even with these multiplate, robotic devices, formulas require periodic testing over long periods of time to determine if or when a formulation becomes unstable. This slows down the launch of new products dramatically. Jackson and his team at CPI have been collaborating with two UK universities to take the wait out of stability testing by developing a microfluidics technology that can cut testing down to days, possibly hours.
FIG. 2. A robotic arm retrieves well plates for analysis in a high-throughput screening contract lab maintained by the National Institutes of Health in Bethesda, Maryland, USA. Source: https://ncats.nih.gov
“Instead of making a formulation, sticking it in a vial, and checking it periodically, this method uses microfluidics to impart mechanical and temperature stresses onto products,” says Jackson (Fig. 3). “You can detect changes more quickly and with much less volume.”
Working with Procter & Gamble and BP, CPI is in the process of testing and validating their high-throughput stability system at the pilot level. Jackson says they hope that in another year, the method will be useful as both a pass/fail stability screen and a mechanistic means for understanding why a formulation fails. “It will be a significant tool in the formulations armory,” he says.
High-throughput screening is being applied to other types of formulations testing as well. At the AOCS annual meeting last year, Carol Mohler, of the Dow Chemical Company in Midland, Michigan, USA, described a high-throughput method of measuring viscosity. The viscometer the company created for internal use can measure hundreds of samples in an hour. The instrument quickly identifies samples with undesirable flow behaviors, narrowing the set of samples that require detailed rheological characterization.
As more companies recruit these computer-controlled methods to conduct research, scientists with high-throughput savvy will have an advantage in the workplace.
Training scientists to train robots
Imperial College London offers a doctoral program, called Next Generation Synthesis and Reaction Technology, in which students learn to use continuous-flow reactors and high-throughput robotics platforms to create new molecules. The training occurs at Imperial’s Center for Rapid Online Analysis of Reactions (ROAR), a public facility that opened in early 2019. ROAR has the capability to evaluate reactions in real time with instruments that identify intermediates, providing insight into the synthesis at each step of a reaction. The students in the program also learn data science, a burgeoning area of chemistry and biology that uses statistical analysis to interpret the data produced by thousands of iterations of the same reaction, each run under slightly different conditions.
Like CPI and others, ROAR is open to anyone in the chemistry community who proposes an experiment to the lab’s full-time staff (http://www.imperial.ac.uk). While exposing future scientists to the variety of problems that can be solved with high-throughput systems, ROAR is also teaching the computers.
A wealth of synthesis instructions can be found in the scientific literature, and researchers hope to teach machines to use it. Many have tried, but creating an algorithm that extracts the information published in scientific papers is not like teaching a supercomputer to play chess. The instructions are written out in an analog way that is not easy for a computer to comprehend. In addition, literature synthesis instructions can be vague or incomplete, and they do not include mistakes: scientists do not publish the starting materials and reaction conditions that were attempted but did not result in a product. These are details a machine would need in order to think like a synthetic chemist.
“Synthetic chemists in academic labs are not collecting the right data and not reporting it in the right way,” ROAR’s facility manager, Benjamin J. Deadman, said in an October 2019 article in C&EN (https://cen.acs.org/synthesis/Automation-people-Training-new-generation/97/i42).
However, as ROAR begins to routinely collect large amounts of data on a variety of synthetic reactions, that data can be used to establish machine-learning algorithms to eventually teach robots to do independent synthesis. Some scientists are imagining a future of robot laboratory technicians. Tell the robot what molecule to synthesize and, using artificial intelligence, it will get the job done. Collecting synthesis data in real time, as ROAR does, gets us to that future sooner.
Though high-throughput methods may speed up the process of machine learning, they are not going to replace human scientists any time soon. As data collection from high-throughput screening for drug development has proven, having large amounts of data can cause a whole new set of problems.
FIG. 3. An example of an instrument that can be used to dispense reagents for high-throughput experiments. Source: https://hudsonrobotics.com
Handling high-throughput data
The intention of any high-throughput process is to perform research faster. Though employing computers to perform thousands of experiments may speed up the work, researchers are left with the burden of making sure their work has meaning. For the purpose of drug development or any other biotech application, the use of living organisms further complicates the analysis.
At each step in the high-throughput screening process, a scientist must make a choice about algorithms. When determining the genes expressed in thousands of single cells, for example, a mathematical function is used to visualize the resulting, extensive data set. More than 100 such tools exist, according to Catalina Vallejos, a researcher at the University of Edinburgh, Scotland, who argues that scientists need to prioritize the task of establishing reproducible benchmarking schemes for high-throughput methods.
Measuring the activity of a compound on a drug target is one thing, but quantifying efficacy requires non-linear regression, which means selecting from several mathematical models. The wrong model could result in a false positive or in a viable compound being overlooked. Given the size and complexity of high-throughput data sets, performing uncertainty calculations for every result by hand is impractical, negating the benefit of the time-saving technique.
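As a concrete illustration of the kind of model choice involved, the sketch below fits one common option, a four-parameter logistic (Hill) curve, to a dose-response data set using SciPy. The concentrations, activity values, and starting guesses are all invented for demonstration; they are not from any study mentioned in this article.

```python
import numpy as np
from scipy.optimize import curve_fit

def four_param_logistic(conc, bottom, top, ic50, hill):
    """Four-parameter logistic (Hill) dose-response model."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)

# Synthetic dose-response data: percent activity vs. concentration (uM)
conc = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0, 30.0])
activity = np.array([98.0, 96.0, 90.0, 75.0, 50.0, 25.0, 10.0, 4.0])

# Non-linear least-squares fit; initial guesses (p0) matter here
popt, pcov = curve_fit(four_param_logistic, conc, activity,
                       p0=[0.0, 100.0, 1.0, 1.0], maxfev=10000)
bottom, top, ic50, hill = popt
perr = np.sqrt(np.diag(pcov))  # one-sigma parameter uncertainties
print(f"IC50 = {ic50:.2f} uM (+/- {perr[2]:.2f})")
```

Refitting the same points with a different model, say a three-parameter curve with the bottom fixed at zero, can shift the estimated IC50, which is exactly why the choice of model matters for screening decisions.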
In recent years, software developers have built models that specifically address data handling for high-throughput analysis. These products can boost confidence in results within an organization, but for data to be shared or evaluated outside a company, researchers using high-throughput methods need standardized procedures. Continuous-flow systems run by robots often do not record fine-grained data, such as batch and plate numbers, which makes evaluating another researcher’s results difficult. There is no way to make sure everyone is looking at the same thing.
Pharmaceutical companies are making efforts to remove these pitfalls. AstraZeneca, GlaxoSmithKline, Novartis, and Pfizer founded a collaboration called the Pistoia Alliance, meant to lower the barriers to innovation in healthcare. Last year, they released the Unified Data Model (https://www.pistoiaalliance.org/news), which specifies an agreed-upon file format for chemical reaction information. As more ingredient manufacturers rely on biotechnology as a source for their products, eventually they may realize the need to reach a similar agreement on data collection and management.
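A minimal, invented sketch of the kind of per-measurement record such an agreement might standardize is shown below. It is not the Unified Data Model or any real schema; the field names and values are illustrative. The point is simply that plate, well, and batch identifiers travel with every result, so another lab can tell exactly what was measured.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class WellMeasurement:
    plate_id: str      # which physical plate the sample sat in
    well: str          # position, e.g. "A01" on a 96/384/1536-well plate
    batch_id: str      # reagent batch used for this well
    assay: str         # what was measured
    value: float
    units: str
    measured_at: str   # ISO 8601 timestamp

record = WellMeasurement(
    plate_id="PLT-0042", well="A01", batch_id="B-2019-117",
    assay="viscosity", value=12.4, units="mPa.s",
    measured_at=datetime(2019, 10, 1, tzinfo=timezone.utc).isoformat(),
)
print(json.dumps(asdict(record)))  # a shareable, self-describing record
```

Serializing to a plain-text format such as JSON keeps the record readable by any collaborator's tools, with no proprietary software required.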
Despite the lack of standardized methods, high-throughput technology continues to expand into more applications. Its platforms provide formulators with information about new materials and new mixtures on a faster timescale. In turn, formulation scientists can correlate this information with findings from traditional methods and provide the product attributes that appeal to the consumer. With high-throughput technology available to measure fundamental properties like rheology, friction, and wear, the technique continues to prove its relevance in the consumer goods market.
Further reading
The digitization of organic synthesis, Davies, I.W., Nature 570: 175–181, 2019.
A novel high-throughput viscometer, Deshmukh, S.S., et al., ACS Comb. Sci. 18: 405–414, 2016.
Exploring a world of a thousand dimensions, Vallejos, C.A., Nat. Biotechnol. 37: 1423–1424, 2019.
A systematic evaluation of single cell RNA-seq analysis pipelines, Vieth, B., et al., Nat. Commun. 10: 4667, 2019.
Uncertainty quantification in ToxCast high throughput screening, Watt, E.D. and R.S. Judson, PLOS ONE 13: 7, 2018.
Automation for the people: training a new generation of chemists in data-driven synthesis, Peplow, M., C&EN 97: 42, 2019.