TOOLS FOR PROTEIN SCIENCE
ShiftScan: A tool for rapid analysis of high-throughput differential scanning fluorimetry data and compound prioritization
Abstract
Differential scanning fluorimetry can be an effective high-throughput screening assay in drug discovery for detecting protein-compound interactions that stabilize or destabilize macromolecules. Because of the magnitude and quality of the data produced by this biophysical assay, analyzing and prioritizing compounds from large-scale differential scanning fluorimetry data sets has proven challenging for the research community. Here, we present ShiftScan, a powerful, stand-alone tool designed for the rapid analysis of differential scanning fluorimetry data and compound prioritization based on thermal transition patterns. ShiftScan accurately and quickly predicts melting temperatures from both canonical and non-canonical transition patterns, efficiently filtering out spurious data to minimize false positives. We report on the use of this tool for data analysis of screens involving both pure compound and natural product fraction libraries and provide the software to the screening community to aid in the discovery of molecularly targeted compounds. Instructions for installation and usage of ShiftScan can be found at our GitHub repository.
1 INTRODUCTION
Differential scanning fluorimetry, a type of thermal shift assay, can be used to monitor temperature-induced unfolding of either protein or polynucleotide macromolecules. In a typical assay for changes in protein stability, a protein of interest is incubated with a fluorescent dye, most commonly Sypro Orange, in a multi-well plate and heated in a real-time polymerase chain reaction instrument. The plate is heated uniformly over a temperature gradient, and as the protein unfolds, hydrophobic residues from the interior of the protein structure become exposed and bind the Sypro Orange dye, increasing the dye's fluorescence intensity. Fluorescence is routinely measured more than 250 times per well over the course of an experiment. If the protein unfolds in a two-state (folded to unfolded) manner, the change in fluorescence with respect to temperature typically follows a sigmoidal pattern, and the midpoint of the rise in the curve can be taken as the protein's melting temperature. This melting temperature can be monitored in the presence of small-molecule libraries; if the binding of a molecule causes a structural perturbation in the protein, this is often reflected in a shift in the protein's melting temperature. The shift can occur in either direction, corresponding to either a stabilization (higher melting temperature) or a destabilization (lower melting temperature) of the protein's tertiary structure upon ligand binding.
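As an illustration of the midpoint idea described above, the following minimal sketch generates a synthetic two-state melt curve from a Boltzmann-type sigmoid and estimates the melting temperature as the temperature of steepest fluorescence rise. This is a generic first-derivative approach and is not presented as ShiftScan's algorithm; the function names, the 25 to 89.75 °C range, the 0.25 °C step, and the fluorescence values are all assumptions chosen for the example.

```python
import math


def boltzmann(temp, f_min, f_max, tm, slope):
    """Two-state sigmoid: fluorescence as a function of temperature."""
    return f_min + (f_max - f_min) / (1.0 + math.exp((tm - temp) / slope))


def estimate_tm(temps, fluorescence):
    """Return the temperature with the largest central-difference dF/dT."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        d = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if d > best_slope:
            best_temp, best_slope = temps[i], d
    return best_temp


# Synthetic data (assumed values): 260 readings in 0.25 degree steps.
temps = [25.0 + 0.25 * i for i in range(260)]
apo = [boltzmann(t, 1000.0, 9000.0, 52.0, 1.5) for t in temps]    # protein alone
bound = [boltzmann(t, 1000.0, 9000.0, 55.5, 1.5) for t in temps]  # protein + ligand

tm_apo = estimate_tm(temps, apo)
tm_bound = estimate_tm(temps, bound)
delta_tm = tm_bound - tm_apo  # positive shift indicates stabilization
print(tm_apo, tm_bound, delta_tm)
```

A derivative maximum is simple and fast, but it only locates a single dominant transition; curves with multiple transitions or high initial fluorescence would need a more careful treatment.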
Differential scanning fluorimetry is used in numerous applications, including as a primary screening platform for detecting novel macromolecule-compound interactions. In these screens, compounds (or mixtures of compounds from fractionated natural product extracts) can be assayed against a protein of interest in a high-throughput manner, with samples tested in 96-well or 384-well plates. The raw fluorescence readings upon heating are then measured, and the extrapolated melting temperatures of the sample wells relative to those of the controls determine which samples may be worth further consideration. Differential scanning fluorimetry is a cost-effective, sensitive assay for early-stage drug discovery that can be performed in a high-throughput manner on widely available real-time polymerase chain reaction instruments. While the ideal output of a differential scanning fluorimetry assay is a series of sigmoidal curves that can be easily compared with one another, non-canonical transition patterns often occur because of interference from the intrinsic fluorescence of a compound or extract. Sub-optimal experimental conditions and non-two-state protein unfolding events caused by sample effects can also produce non-canonical transition curves, making melting temperature extrapolation and compound prioritization more challenging.
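The comparison of sample wells against controls can be sketched as follows. This is a hypothetical prioritization rule, not the criteria used by ShiftScan: a well is flagged when its melting temperature differs from the control mean by more than a chosen multiple of the control standard deviation. The function name `prioritize`, the `n_sd` parameter, the well labels, and all temperatures are assumptions for illustration; wells without a usable melting temperature are represented by `None` and skipped.

```python
from statistics import mean, stdev


def prioritize(control_tms, sample_tms, n_sd=3.0):
    """Flag wells whose Tm shift from the control mean exceeds n_sd control SDs."""
    ctrl_mean = mean(control_tms)
    cutoff = n_sd * stdev(control_tms)
    hits = {}
    for well, tm in sample_tms.items():
        if tm is None:  # no usable melting temperature for this well
            continue
        delta = tm - ctrl_mean
        if abs(delta) > cutoff:
            label = "stabilizer" if delta > 0 else "destabilizer"
            hits[well] = (label, round(delta, 2))
    return hits


# Assumed example values (degrees Celsius).
controls = [52.0, 52.25, 51.75, 52.0]
samples = {"A3": 55.5, "A4": 52.25, "A5": 49.0, "A6": None}

hits = prioritize(controls, samples)
print(hits)
```

Using the spread of the controls to set the cutoff keeps the rule tied to the plate's own noise level; a fixed shift in degrees would be an equally reasonable choice.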
In a 384-well plate setup with a 260-step temperature increase, a single plate generates approximately 100,000 data points. In a high-throughput setting, datasets can therefore rapidly grow to millions of data points. Various tools are available for processing and inspecting the data, including DSFworld, SimpleDSFviewer, the differential scanning fluorimetry workflow in KNIME, and HTSDSF Explorer, among others. DSFworld is a recently developed and useful tool that exploits four robust mathematical models for fitting canonical and non-canonical differential scanning fluorimetry curve data. However, to our knowledge, neither DSFworld nor SimpleDSFviewer processes data from high-throughput screening campaigns. The KNIME workflow is powerful but relatively slow: it processes one 384-well plate at a time and requires heavy user interaction or input at no fewer than 11 points, as well as visual inspection of individual melting curves for "well-behaved" data, which is not feasible for high-throughput initiatives. This approach also excludes curves that follow non-canonical transition patterns, which may be of interest to researchers. HTSDSF Explorer processes differential scanning fluorimetry data in a high-throughput manner but aims to provide preliminary binding constants from concentration-response assays rather than to prioritize compounds at a single concentration (as is often screened in preliminary high-throughput campaigns).
A recent investigation into strategies for compound prioritization found differential scanning fluorimetry to be an optimal first step in identifying hits before confirmation with either surface plasmon resonance or temperature-related intensity change assays. Similarly, major pharmaceutical companies such as AstraZeneca have recently reported automating the differential scanning fluorimetry assay for the high-throughput screening of approximately 100,000 compounds. At the National Cancer Institute, we have developed a workflow for screening large libraries of pure compounds and pre-fractionated natural product extracts against protein targets of interest using differential scanning fluorimetry. The challenge has been the magnitude of the resulting data: for every 100 384-well plates (35,200 test samples) assayed across a 260-step temperature gradient, approximately 10 million data points are produced. Beyond the size of the data set, assaying natural product mixtures often brings additional challenges, such as a variety of non-canonical transition patterns and otherwise noisy data.
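The data volumes quoted above follow from simple arithmetic, reproduced in the short sketch below. The plate, step, and sample counts are taken from the text; the variable names are illustrative, and the reading that the wells not counted as test samples are reserved for controls or left empty is an assumption, not something the text states.

```python
WELLS_PER_PLATE = 384
TEMPERATURE_STEPS = 260
PLATES = 100
TEST_SAMPLES = 35_200  # test samples across the 100 plates, from the text

points_per_plate = WELLS_PER_PLATE * TEMPERATURE_STEPS  # one reading per well per step
points_per_campaign = points_per_plate * PLATES
samples_per_plate = TEST_SAMPLES // PLATES
# Assumption: the remaining wells hold controls or are unused.
non_sample_wells = WELLS_PER_PLATE - samples_per_plate

print(points_per_plate)     # 99840, i.e. approximately 100,000
print(points_per_campaign)  # 9984000, i.e. approximately 10 million
print(samples_per_plate, non_sample_wells)  # 352 32
```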
To address these challenges, we developed ShiftScan, a stand-alone tool that can be run locally or on large computing clusters, with a choice of RAM-intensive or disk-intensive modes to suit different system capabilities. Alongside the main processing algorithm, we have developed a companion visualization tool for the rapid identification of hits as defined by a user's criteria. ShiftScan is also available as a Google Colab notebook and as a stand-alone GUI application, although the greater user-friendliness of these implementations comes at the cost of processing speed. ShiftScan processes data in a plate-wise fashion and can simultaneously analyze data from sample sets assaying different proteins of interest. We hope that ShiftScan will be a valuable tool for the scientific community in the pursuit of novel therapeutics.