Allosteric and Energetic Remodeling by Protein Domain Extensions
Abstract
Many functions of proteins are performed by independently folding structural units called domains. The structures of domains are conserved during evolution but they are not identical. For example, the more than two hundred seventy human PDZ domains vary in the number of secondary structure elements and in the length of loops. An important but largely unexplored question is the impact of these extensions on protein energy landscapes: beyond any immediate functional effects, do extensions also alter the consequences of perturbations elsewhere in the domain, altering the potential for regulation and evolvability? Here we perform massively parallel energetic measurements on a model human PDZ domain to directly and comprehensively answer this question. In total we measure the ligand binding and abundance of approximately one hundred ninety thousand protein variants to quantify free energy changes for mutations throughout the canonical domain fold and approximately seven thousand energetic couplings between these mutations and the two domain extensions, both alone and in combination. We find that both extensions, one structured and one more dynamic, substantially and specifically reshape the energy landscape of the domain, with the removal of an alpha-helix altering the energetic consequences of four hundred twenty-four mutations in fifty-four sites on fold stability and four hundred twenty mutations in fifty-six sites on binding to a ligand. These changes to the energy landscape alter the effects of three hundred thirty allosteric mutations, including at solvent-accessible surface sites. Extending or pruning the domain therefore reshapes its energetic and allosteric landscape, adding and removing opportunities for the allosteric control of protein function.
Introduction
Independently folding domains are often considered the functional and evolutionary structural units of proteins. For example, the human genome encodes more than eight thousand distinct domain families, with a median of two different domains per protein and many proteins having substantially more. Individual domains from each family have conserved secondary structure topologies and folds, but they can also differ through frequent amino acid insertions or deletions in loops and through the addition of secondary structure elements, particularly at their N- and C-termini. These 'domain extensions' can have important functional consequences, for example altering protein stability or the affinity of binding to ligands.
The impact of a domain extension could be direct, for example adding new ligand contacts to a binding interface, or it could be indirect, influencing function at another site in the protein. Such indirect or allosteric effects are not well understood and are difficult to predict. In addition, a domain extension could potentially alter the consequences of perturbations elsewhere in the protein. For example, an extension might alter the energy landscape of a protein such that a distant allosteric regulatory site is strengthened or weakened. It is this question that we seek to address in this manuscript: how do domain extensions alter the effects of perturbations throughout a domain?
Perturbations to proteins include binding to other proteins, nucleic acids or small molecules, covalent post-translational modifications, and mutations. Mutations are particularly powerful experimental perturbations as they can be introduced at every site throughout a protein. This approach - often called deep mutational scanning - uses pooled libraries of variants and sequencing to quantify changes in variant frequencies during selection experiments. Coupled to selections for a particular protein property or function, mutational scanning can quantify the effects of thousands of different perturbations to a protein in a single pooled experiment.
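As a minimal sketch of how variant frequencies from a pooled selection can be turned into per-variant scores, the following Python snippet computes a log enrichment ratio for each variant relative to the wild type from sequencing read counts before and after selection. The function name, the pseudocount, the variant labels and the read counts are all illustrative assumptions; this is a generic enrichment score and does not reproduce the analysis pipeline used in this study.

```python
import math

def log_enrichment(counts_in, counts_out, wt="WT", pseudocount=0.5):
    """Log enrichment of each variant relative to wild type.

    counts_in / counts_out map variant name -> read count before and
    after selection. The pseudocount (an assumption) avoids log(0).
    """
    total_in = sum(counts_in.values())
    total_out = sum(counts_out.values())

    def log_ratio(variant):
        # Frequency of the variant in the input and output pools
        f_in = (counts_in[variant] + pseudocount) / total_in
        f_out = (counts_out.get(variant, 0) + pseudocount) / total_out
        return math.log(f_out / f_in)

    wt_ratio = log_ratio(wt)
    # Normalise to wild type so that the wild-type score is zero
    return {v: log_ratio(v) - wt_ratio for v in counts_in}

# Hypothetical read counts, for illustration only
counts_before = {"WT": 1000, "mut_A": 500, "mut_B": 500}
counts_after = {"WT": 2000, "mut_A": 200, "mut_B": 1000}

scores = log_enrichment(counts_before, counts_after)
for variant, score in scores.items():
    print(f"{variant}: {score:+.3f}")
```

In this toy example mut_A is depleted relative to the wild type (negative score), whereas mut_B changes in frequency almost exactly as the wild type does (score close to zero).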
To quantify the impact of domain extensions using mutations, we ask whether adding or removing a domain extension changes the effect of each mutation elsewhere in the domain. For biophysical properties, this question is whether adding or removing a domain extension alters the energetic effect of a mutation. Formally, the question is whether each mutation is energetically coupled to loss or gain of the extension. Energetic couplings (also called genetic interactions or epistasis) between a mutation and an extension might affect one or multiple properties of a protein, for example its stability or affinity of binding to a ligand.
At one extreme, domain extensions might be 'modular' changes, having functional consequences but effects that are not coupled to (i.e. are energetically additive with) perturbations elsewhere in the protein. At another extreme, domain extensions might be strongly energetically coupled to other sites, altering the effects of perturbations throughout the domain. The extent of this coupling will have important evolutionary and functional consequences. For example, strong energetic coupling between domain extensions and the rest of a domain could result in the emergence of new allosteric sites: extension of the domain could change the sites elsewhere in the domain where mutations or other modifications alter the activity of the protein. The extension of a domain by structured or unstructured sequences could therefore create or remove regulatory sites. Similarly, strong energetic coupling could result in domain extensions creating and destroying opportunities for therapeutic targeting by the gain or loss of allosteric activity at accessible sites. Finally, if a domain extension is energetically coupled to other sites in the domain it will by definition alter the consequences of mutations at those sites, and so alter the potential for evolution, i.e. evolvability.
Recently, we and others have developed methods to comprehensively quantify the energetic effects of mutations throughout protein domains using massively parallel assays and model fitting. Using different experimental selections it is possible to quantify the effects of mutations on different protein properties, including fold stability (quantified as the Gibbs free energy of folding, Delta G folding) and binding to interaction partners (quantified as the Gibbs free energy of binding, Delta G binding). These multiplexed experimental approaches present an opportunity to comprehensively quantify the impact of domain extensions on protein energy landscapes.
Here we apply this approach at scale. First, we measure the energetic effect of each mutation throughout a domain on fold stability (Delta G folding) and binding to a ligand (Delta G binding). We then repeat the measurements after removing the domain extension. The difference in the energetic effect of each mutation with and without the extension quantifies the energetic coupling between the extension and the mutation. For fold stability this coupling is quantified as Delta Delta G folding, extension present minus Delta Delta G folding, extension removed, where Delta Delta G folding, extension removed is the folding energy change of the mutation in the domain without the extension. For ligand binding it is quantified as Delta Delta G binding, extension present minus Delta Delta G binding, extension removed, where Delta Delta G binding, extension removed is the binding energy change of the mutation in the domain without the extension. These pairwise (second order) energetic couplings (genetic interactions) provide a complete picture of how removing a domain extension alters the energetic effects of mutations throughout a domain.
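The coupling calculation described above can be sketched in a few lines of Python: for each mutation measured in both contexts, the free energy change with the extension removed is subtracted from the free energy change with the extension present. The function name, the variant labels, the numerical values and the choice of kcal/mol as unit are hypothetical and serve only to illustrate the arithmetic; the same function applies equally to folding and to binding energies.

```python
def energetic_couplings(ddg_with, ddg_without):
    """Coupling per mutation: free energy change with the extension
    present minus free energy change with the extension removed.

    Only mutations measured in both contexts are returned.
    """
    shared = ddg_with.keys() & ddg_without.keys()
    return {m: ddg_with[m] - ddg_without[m] for m in sorted(shared)}

# Hypothetical folding free energy changes (assumed unit: kcal/mol)
ddg_folding_with_ext = {"mut_A": 1.2, "mut_B": 0.3, "mut_C": 2.0}
ddg_folding_without_ext = {"mut_A": 0.4, "mut_B": 0.3, "mut_D": 1.1}

couplings = energetic_couplings(ddg_folding_with_ext, ddg_folding_without_ext)
for mutation, coupling in couplings.items():
    print(f"{mutation}: {coupling:+.2f} kcal/mol")
```

A coupling of zero (mut_B here) means the mutation has the same energetic effect in both contexts, i.e. it is energetically additive with the extension. A non-zero coupling (mut_A) means the extension changes the consequence of that mutation.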
As a model system we use a well-studied domain, the third PDZ domain from the protein PSD-ninety-five, henceforth referred to as PDZ three. PDZ domains are named for the three founding members of the family: Postsynaptic density protein (PSD-ninety-five), Drosophila discs large tumor suppressor (Dlg), and Zonula occludens-one protein (ZO-one). PSD-ninety-five is a member of the MAGUK (membrane-associated guanylate kinase) family and a key scaffolding protein found in excitatory synapses, where it anchors proteins to the postsynaptic membrane. PDZ domains are the largest family of human protein-protein interaction domains, with more than two hundred seventy PDZ domains in more than one hundred fifty different human proteins that participate in a wide range of cellular processes, including signaling, cell polarity, cell adhesion, and neuronal synaptic transmission. Many PDZ domains are considered important therapeutic targets, but very few have been successfully targeted, even experimentally. PDZ domains bind diverse short peptide ligands, normally located at the C-terminus of the partner protein. Despite low sequence identity, PDZ domains share a canonical fold composed of five to six beta-strands and two or three alpha-helices. Ligand recognition occurs in a pocket composed of the beta two strand, the alpha two helix, and the carboxylate-binding beta one-beta two loop. Although PDZ domains share a conserved structural fold, they vary in the precise number of secondary structure elements and in the length of loops. These extensions are frequently at the N- and C-termini of the domains.
PDZ three from PSD-ninety-five contains two domain extensions in addition to the canonical PDZ domain fold: an extra alpha-helix of nine amino acids (residues three hundred ninety-four to four hundred two) at the C-terminus, called the alpha three helix, and a more dynamic N-terminal extension of eight amino acids (residues three hundred three to three hundred ten). The alpha three helix increases the affinity of PDZ three binding to peptide ligands approximately twenty-fold. The alpha three helix does not directly contact the ligand, so the change in binding affinity must have an allosteric mechanism. The allosteric potential of the alpha three helix is further demonstrated by the consequences of phosphorylation of Y three hundred ninety-seven within the helix: this post-translational modification further increases ligand binding affinity. Engineering a photoconvertible residue into the alpha three helix has also elegantly demonstrated its allosteric impact on binding affinity. The impact of the N-terminal extension to PDZ three is less studied.
Here we use an experimental design that allows us to comprehensively quantify the changes in folding energy and ligand binding energy for mutations throughout the PDZ three domain in four different contexts: in the presence of both domain extensions, in the absence of the alpha three helix, in the absence of the N-terminal extension, and in the absence of both extensions. In total we measure one hundred eighty-seven thousand six hundred twenty-two mutational effects on protein abundance and one hundred ninety-five thousand forty-seven mutational effects on ligand binding to quantify the energetic couplings between all amino acid substitutions and deletion of each of the N- and C-terminal extensions, as well as the energetic interactions with deletion of both extensions together. Both extensions alter the stability of the domain and its ligand binding affinity. In addition, both extensions are energetically coupled to a subset of sites throughout the core PDZ domain, and these couplings quantify how the extensions modulate the core domain's folding and binding energy landscapes. One consequence of this is a transformation of the domain's allosteric landscape, altering evolvability and the potential for regulation and therapeutic targeting.