Head: Prof. Dr. Heinrich Sticht
Bioinformatics, the application of computers in biological sciences and especially the analysis of biological sequence and structure data, has become an important tool in molecular biology. Genome sequencing projects have generated vast quantities of sequence data and the emerging projects in structural genomics are expected to produce a similar wealth of information with respect to protein structures in the future.
The aim of bioinformatics is to decipher this information and to convert it into biochemical, biophysical and medical knowledge using computational tools. In this context, it is not sufficient to assign molecular functions to isolated proteins; their cellular role must also be predicted (e.g. the metabolic or signaling pathway to which a certain protein belongs). The latter goal requires an understanding of the molecular factors that govern the affinity and specificity of molecular interactions.
My group is primarily interested in investigating molecular interactions with a variety of computational tools (e.g. sequence data analysis, molecular modeling, molecular dynamics) to gain information about the determinants of molecular recognition. We apply these methods to a number of biological systems of medical relevance, which are described in more detail below:
The project focuses on the prediction and structural characterization of host-pathogen protein interactions using computational tools. Such recognition processes occur either between short sequence motifs that bind complementary adapter modules or between pairs of globular protein domains. These types of interactions differ not only from a structural point of view but also with respect to the computational tools required for their prediction and analysis.
One particular problem for the prediction of functional interaction motifs is the short length of the respective sequence patterns, which results in a large number of false-positive hits that prove to be non-functional in subsequent experiments. Therefore, we aim to improve the specificity of the predictions by assessing the importance of motif-specific flanking sequence regions. In order to further increase the reliability of the predictions, sequence motifs are modeled in complex with the respective adapter domains, which allows the likelihood of an interaction to be judged on the basis of a three-dimensional structure. For the host-pathogen interactions formed between globular protein domains, a combination of molecular modeling, docking, and molecular dynamics simulations is used. The latter technique provides information about the conformational stability and energetics of an interaction that can hardly be deduced from static structures alone. These methods are applied, for example, to study the structure of cytomegaloviral regulator proteins and their interactions with their cellular targets.
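The motif search with flanking regions described above can be illustrated with a minimal Python sketch. It is not the group's actual pipeline: the pattern (a simplified class I SH3-binding consensus, `[RK]xxPxxP`), the flank length, and the toy `flank_score` function are all assumptions chosen for illustration only.

```python
import re

# Assumed, simplified consensus for a class I SH3-binding motif: [RK]xxPxxP.
# The lookahead lets overlapping occurrences be reported as separate hits.
MOTIF = re.compile(r"(?=([RK]..P..P))")


def scan_motif(sequence, flank=3):
    """Return (start, motif, left_flank, right_flank) for every motif hit."""
    hits = []
    for match in MOTIF.finditer(sequence):
        start = match.start()
        motif = match.group(1)
        end = start + len(motif)
        left = sequence[max(0, start - flank):start]
        right = sequence[end:end + flank]
        hits.append((start, motif, left, right))
    return hits


def flank_score(left, right, preferred="P"):
    """Toy score (hypothetical): fraction of flanking residues in a preferred set."""
    flank = left + right
    if not flank:
        return 0.0
    return sum(residue in preferred for residue in flank) / len(flank)


if __name__ == "__main__":
    for start, motif, left, right in scan_motif("AAARPLPPLPGGG"):
        print(start, motif, left, right, flank_score(left, right))
```

Because such a short pattern matches many sequences by chance, a real filter would replace `flank_score` with statistics derived from experimentally verified motifs; the sketch only shows where flanking information enters the prediction.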
The Human Immunodeficiency Virus (HIV), which causes the acquired immunodeficiency syndrome (AIDS), is a member of the retrovirus family. The HIV-protease is essential for replication and assembly of the virus and has therefore become an important target for the design of antiviral agents. These drugs generally bind to the active site of the protease, thus blocking access of the substrate and resulting in a catalytically inactive enzyme. The major problem shared by current therapies is the rapid development of resistance to antiretroviral drugs resulting from mutations of amino acids in the protease. Mutations can occur at a large variety of locations in HIV-protease and can confer different levels of resistance to distinct inhibitors. For most of these mutations, the mechanism cannot be explained on the basis of the rigid three-dimensional structures available. Therefore, it was suggested that these mutations alter the dynamics of the protease, thereby reducing its affinity for the inhibitor. Using molecular dynamics simulations, we were able to show that several mutations in HIV-protease actually affect the dynamics of the protein and decrease the affinity of inhibitor binding. The results from these simulations are expected to facilitate the design of novel and more effective drugs, e.g. by targeting different residues or by developing allosteric inhibitors that are capable of modulating protease dynamics.
3. Prediction of networks of interacting proteins using computational tools
Protein-protein interactions play a crucial role in the transduction of information in biological signaling pathways. Identifying the underlying principles of molecular recognition is important for understanding regulatory mechanisms and for predicting novel, physiologically relevant protein interactions.
Existing computational methods use information from gene fusion, the relative order of genes, phylogenetic profiles or a combination of these in order to predict functional relationships and thus putative protein-protein interactions. We pursue an alternative computational approach that uses the information available from the experimentally determined three-dimensional structures of proteins and protein-protein complexes. At present, more than 2000 structures of protein-protein complexes have been determined by NMR spectroscopy or X-ray crystallography. Although this number of complexes is still rather small compared to the number of possible pairs of interacting proteins within a cell, it can be drastically increased by generating models of homologous complexes. However, it cannot be presumed a priori that two proteins that are homologous to the components of a complex of known structure interact in the same fashion, or even interact at all. Therefore, one important goal of bioinformatics is the development of algorithms that are able to assess the affinity and specificity of intermolecular interactions.
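As a point of comparison, one of the existing sequence-based methods mentioned above, phylogenetic profiling, can be sketched in a few lines of Python. This illustrates the general idea only and is not the structure-based approach pursued here: the protein names, the toy profiles, the Jaccard similarity measure, and the threshold are all assumptions.

```python
from itertools import combinations


def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles (tuples of 0/1)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    both = sum(a and b for a, b in zip(profile_a, profile_b))
    either = sum(a or b for a, b in zip(profile_a, profile_b))
    return both / either if either else 0.0


def candidate_pairs(profiles, threshold=0.8):
    """Return protein pairs whose profiles are at least `threshold` similar."""
    pairs = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        score = jaccard(prof_a, prof_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda pair: -pair[2])


if __name__ == "__main__":
    # Hypothetical profiles: one column per genome, 1 = a homolog is present.
    toy_profiles = {
        "protA": (1, 0, 1, 1),
        "protB": (1, 0, 1, 1),
        "protC": (0, 1, 0, 0),
    }
    print(candidate_pairs(toy_profiles))
```

Proteins that are consistently present or absent together across genomes are flagged as functionally related, which hints at a possible interaction but says nothing about its structural basis, affinity, or specificity.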
At the moment, we pursue an approach that attempts to transfer the knowledge gained from the design of globular proteins to the design of protein interfaces. Protein design allows the identification of sequences that are compatible with a given fold and has already been applied successfully to the design of protein cores and even of novel folds. In the present studies, we use protein design algorithms in a similar way to identify those sequences that are compatible with a particular protein interface. Converting this sequence-structure relationship into scoring functions will allow a genome-wide search for physiological interaction partners. We are currently investigating several model systems in which protein-protein interactions play a crucial role in signal transduction (e.g. cytokine-receptor complexes, cyclin-cdk complexes).
Another project, which mainly uses molecular modeling, protein docking and molecular dynamics as bioinformatic tools, focuses on characterizing the molecular basis of the specific recognition of the HPr protein. HPr from Gram-positive bacteria is a multifunctional protein that plays a role both in sugar uptake by the PEP:phosphotransferase system and in transcriptional control, where it mediates both the induction of numerous catabolic operons and catabolite repression. In order to gain a better understanding of the complex regulatory processes controlling carbohydrate metabolism, we are using computational strategies to identify those structural features that ensure the specific recognition of HPr by its known interaction partners. This information will subsequently also be applied to identify as yet unknown interaction partners of HPr, with a particular focus on its role in transcriptional control.
Molecular docking represents a versatile and important computational method for determining the structure of protein-protein complexes. Despite considerable efforts during the past years, a general solution to this problem is not yet within reach. One major problem is the definition of suitable criteria for a scoring function that allows the identification of a good docking solution among many false arrangements.
Our present work demonstrates that concepts from information theory can be adapted to treat the biological problem of protein-protein docking. We have developed a formalism based on the concept of mutual information (MI) to investigate different features with respect to their information content in protein docking, and we have also shown that MI values can successfully be converted into a scoring function. Current work includes the analysis of larger datasets and more sophisticated structural features to obtain a robust and widely applicable approach.
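To make the idea of information content concrete, the following Python sketch computes the standard mutual information between a discretized feature of a docking solution and a label indicating whether that solution is near-native. It uses the textbook definition of MI, not the group's specific formalism; the feature bins and labels in the example are hypothetical.

```python
import math
from collections import Counter


def mutual_information(features, labels):
    """Mutual information (in bits) between two equally long discrete sequences."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    n = len(features)
    if n == 0:
        return 0.0
    joint_counts = Counter(zip(features, labels))
    feature_counts = Counter(features)
    label_counts = Counter(labels)
    mi = 0.0
    for (feature, label), count in joint_counts.items():
        p_joint = count / n
        # p(f, l) / (p(f) * p(l)) expressed directly in terms of counts.
        ratio = count * n / (feature_counts[feature] * label_counts[label])
        mi += p_joint * math.log2(ratio)
    return mi


if __name__ == "__main__":
    # Hypothetical data: binned interface feature vs. near-native label (1) or decoy (0).
    feature_bins = ["high", "high", "low", "low"]
    near_native = [1, 1, 0, 0]
    print(mutual_information(feature_bins, near_native))
```

A feature whose MI with the near-native label is close to zero carries little information for discriminating good docking solutions from false arrangements, whereas features with high MI are natural candidates for terms in a scoring function.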