Predicting protein function and protein-ligand interactions through machine learning [electronic resource]
- Grace Wonlyn Tang.
- Physical description
- 1 online resource.
- With modern-day advancements in high-throughput technology, we have more genomes, sequences, and protein structures available. An important scientific endeavor is to apply this information towards combating human diseases and disorders. Two key steps in this task involve understanding the function of proteins and developing the means to modulate their behavior. Experimental assays do not possess the necessary throughput to characterize in full the function and drug-binding preferences of these many newly identified proteins. Computational assessment is an attractive alternative, but current algorithms have many shortcomings. Function prediction tools struggle to annotate proteins that are novel in sequence and structure; ligand-binding predictors have limited accuracy, as they are largely physics-based with many approximations built into the calculation of intermolecular interactions. The wealth of biological information, specifically protein structure data, presents an opportunity to take data-driven, machine learning approaches to these scientific questions. This dissertation thus presents novel computational algorithms for predicting protein function and small-molecule interactions that merge protein structure data with machine learning. The first method (HMMDF) combines protein sequence models (Hidden Markov Models) with protein structure models augmented by structural dynamics information (Dynamic FEATURE) to identify the function of proteins that are novel in sequence and structure. HMMDF applied to thioredoxin function prediction shows high precision and recall. The second method (FragFEATURE) addresses the prediction of protein-ligand interactions using an innovative knowledge base of protein structural environments annotated with the small-molecule substructures (fragments) they bind. Given a protein structure of interest, FragFEATURE searches the knowledge base for environments similar to the query to identify statistically enriched fragments. FragFEATURE predicts fragments corresponding to known ligands of a protein target with high accuracy; in many cases, FragFEATURE predicts fragments corresponding to known inhibitors of a protein target. Using this fragment binding predictor, I identified fragments for two bacterial proteins involved in pathogenesis and antibiotic resistance. These fragments may lead to the development of inhibitors for these therapeutically important protein targets. In summary, the work presented in this dissertation represents novel and powerful methods for interrogating protein function and protein-ligand interactions, strengthening the repertoire of computational tools to assist in the understanding and treatment of human diseases and disorders.
- Submitted to the Department of Bioengineering.
- Thesis (Ph.D.)--Stanford University, 2014.