ConJ/MCB 544: Protein Structure, Modification, Diversification and Regulation
Course is Offered:
First five weeks of winter quarter, 2017 (Tuesday, January 3 thru Thursday, February 2) 1.5 credits
Note: This class is taught every year, with course coordination alternating between Barry Stoddard and Roland Strong. Stoddard runs the class and teaches in odd years (i.e. 2015, 2017, etc).
T, Th 3:15 to 4:45 FHCRC Day Campus, Weintraub Bldg, room B-072/074
Shuttle from UW: Departs UW Medical Center at 2:45 for a 3:05 arrival
Shuttle to UW: Departs FHCRC at 4:58 pm for a 5:10 arrival at UW Medical Center
Barry Stoddard (FHCRC Basic Sciences/UW Biochemistry) (Weeks 1 to 4)
Phil Bradley (FHCRC Public Health Sciences and Basic Sciences/UW Genome Sciences) (Week 5)
Rationale and Background:
While life may have arisen from an RNA-centric origin, and many of the fundamental processes comprising the flow of genetic information are governed by nucleic acids, proteins have evolved to carry out many of the core biochemical and biophysical processes required for life, such as generation of force and motion, creation of physical structures, transmission of material and information, and catalysis of biological reactions. Over the past 10 years, the development and use of proteins as therapeutic agents has increased dramatically, an advance that can be attributed to the enormous number of protein structures and mechanisms that have been elucidated over the past 40+ years, and especially the recent development of powerful methods for protein engineering.
This course will provide a graduate-level survey of many of the fundamental properties of proteins that govern and define their folding, structures and function, and at the same time will introduce students to some of the most commonly used tools (and best practices) for modeling, analyzing and modifying protein structures and properties. After devoting the first week's sessions to an introduction, refresher and enhancement of our understanding of basic structural and dynamic features of proteins (and the ways that such information is organized on the web), the remaining four weeks will provide a detailed examination of how protein chemistry and structure/function analyses are employed, covering four separate interesting aspects of protein structure and function: (1) shape-shifting and moonlighting; (2) post-transcriptional and post-translational splicing; (3) multimer-facilitated cooperativity and allostery; and (4) modular and repeat proteins and molecular recognition.
The course will assume knowledge at the level of an advanced undergraduate biochemistry course, and will be heavily skewed towards the use of structural information to understand the physical basis of protein behavior. Emphasis will be placed upon the use and visualization of protein structural models as an essential component of fully appreciating protein behavior and function. Background and understanding in the areas we will discuss at the level of Stryer, Biochemistry or Alberts, Molecular Biology of the Cell will be assumed.
Assignments and grading:
Students will be graded on any and all of the following:
(A) In-class assignments or quizzes on the assigned readings, at the discretion of the instructor.
(B) Demonstration of familiarity with assigned reading during in-class discussion, including oral presentation of answers to discussion questions.
(C) (MAYBE) Completion of a mid-class protein modeling assignment.
(D) A final written assignment (due Friday, February 13th) that demonstrates your total combined knowledge, intuition and imagination regarding functional annotation of a protein structure, based on the "Functional Sleuth" database.
SESSIONS AND READING ASSIGNMENTS (ALL PAPERS WILL BE MADE AVAILABLE VIA DROPBOX):
Readings in parentheses are suggested and encouraged; those not in parentheses are VERY STRONGLY suggested.
Students will be called on to answer questions and/or lead discussion on these. We limit assigned papers to one (infrequently two) per session, so please read them thoroughly and be prepared to participate in discussion. Please don't be 'that person' who comes to a class with no clue about what the papers were about!
Week 1 (January 3 and 5):
Topics: Basics and beyond basics of protein folds and structures.
1. Berman et al. (2016) "The archiving and dissemination of biological structure data" Current Opinion in Structural Biology 40: 17 - 22. (PubMed 27450113)
(2. Finn et al. (2016) "The Pfam protein families database: towards a more sustainable future" Nucleic Acids Research 44: D279 - D285) (PubMed 26673716).
(3. Fox et al. (2014) "SCOPe: Structural Classification of Proteins -- extended, integrating SCOP and ASTRAL data and classification of new structures" Nucleic Acids Research 42: D304 - D309) (PubMed 24304899)
(4. Dawson et al. (2016) "CATH: an expanded resource to predict protein function through structure and sequence" Nucleic Acids Research doi: 10.1093/nar/gkw1098) (PubMed 27899584)
Week 2 (January 10 and 12): (Barry Stoddard)
Topics: Shape Shifting and Moonlighting Proteins
1. Tuinstra et al. (2008) “Interconversion between two unrelated protein folds in the lymphotactin native state” PNAS USA 105: 5057 - 5062. (PubMed 18364395)
2. Walden et al. (2006) “Structure of dual function iron regulatory protein 1 complexed with ferritin IRE-RNA” Science 314: 1903 - 1908. (PubMed 17185597)
(3. Kosloff and Kolodny (2008) "Sequence-similar, structure-dissimilar protein pairs in the PDB" Proteins 71: 891 - 902) (PubMed 18004789)
Week 3 (January 17 and 19): (Barry Stoddard)
Topic: Alternatively spliced and self-splicing proteins
1. Garcia et al. (2004) “A conformational switch in the Piccolo C2A domain regulated by alternative splicing” Nature Struct. Mol. Biol. 11: 45 - 53. (PubMed 14718922)
2. Ciragan et al. (2016) “Salt-inducible protein splicing in cis and trans by inteins from extremely halophilic archaea as a novel protein-engineering tool” J. Mol. Biol. 428: 4573 - 4588 (PubMed 27720988).
(3. Kelley et al. (2015) "The Phyre2 web portal for protein modeling, prediction and analysis" Nature Protocols 10 (6): 845 - 858) (PubMed 25950237).
(4. Kiriakidou et al. (2007) "An mRNA m7G cap binding-like motif within human Ago2 represses translation" Cell 129: 1141 - 1151 (PubMed 17424464) VERSUS Kinch and Grishin (2009) "The human Ago2 AMC region does not contain an eIF4E-like mRNA cap binding motif" Biology Direct 4: 2 (PubMed 19159466))
Week 4 (January 24 and 26): (Barry Stoddard)
Topic: Cooperative and allosteric protein assemblages and behavior.
Tools: Protein optimization and engineering server (PROSS: PRotein One Stop Shopping)
1. Dombrauckas et al. (2005) "Structural basis for tumor pyruvate kinase M2 allosteric regulation and catalysis" Biochemistry 44: 9417 - 9429. (PubMed 15996096)
2. Goldenzweig et al. (2016) "Automated structure- and sequence-based design of proteins for high bacterial expression and stability" Molecular Cell 63: 337 - 346. (PubMed 27425410)
(3. Johansson, R., et al. (2016) "Structural mechanism of allosteric activity regulation in a ribonucleotide reductase with double ATP cones" Structure 24 (6): 906 - 917) (PubMed 27133024)
Week 5 (January 31 and February 2) (Phil Bradley)
Topic: Modular and Repeat Proteins and Computational Protein Engineering
1. Mak, A. N., Bradley, P., Cernadas, R. A., Bogdanove, A. J., and Stoddard, B. L. (2012) "The crystal structure of TAL effector PthXo1 bound to its DNA target" Science 335: 716 - 719. PubMed 22223736.
2. Bradley, P. (2012) "Structural modeling of TAL effector-DNA interactions" Protein Science 21 (4): 471 - 474. PubMed 22334576.
3. Doyle, L., Hallinan, J., Bolduc, J., Parmeggiani, F., Baker, D., Stoddard, B. L., and Bradley, P. (2015) "Rational design of α-helical tandem repeat proteins with closed architectures" Nature 528 (7583): 585 - 588. (PubMed 26675735).
As you may know, the NIH and several other funding agencies around the world have for about ten years funded large consortiums of investigators (composed of individual academic labs, national laboratory facilities, research centers and industrial groups) to determine structures of as many proteins as possible (from specific model organisms, putative biochemical pathways and large homologous gene superfamilies). Termed "Structural Genomics", this effort has thus far led to the determination of several thousand distinct protein crystal structures just by the consortiums funded by the NIH through their "Protein Structure Initiative".
Because the laboratories engaged in Structural Genomics will solve structures of pretty much anything that they can express, purify and crystallize without undue concern for biological context or information, a large number of protein structures now exist for which there is no functional annotation. Here is your chance to get involved in this area of investigation, using only your understanding of protein sequences, structure, and function, your internet service provider, and your imagination.
The gallery of these "nonannotated" protein structures, with links to their PDB entries, is provided at the somewhat stupidly named "Functional Sleuth" website:
http://sbkb.org/kb/search.do?SSIDSearch=UnkStruc (Start by clicking on "View by PSI Center")
...with the invitation for really smart people such as yourself to conduct "further research for proteins in the Protein Data Bank archive whose functions are unknown or minimally characterized" (i.e., the crystallographers are too busy collecting X-ray data to spend time actually investigating the structures they solve).
As of August 22, 2016, they list 1682 protein structures that "lack functional annotation".
Your assignment: Choose any one of these many hundreds of functionally nonannotated structures from the Functional Sleuth website (how will you choose?).
1. Try to annotate and describe the function and properties of the protein based only on its sequence, ignoring and avoiding all information that might be available based upon its 3-D structure.
2. Then subject the structure (and its corresponding sequence) to EVERY POSSIBLE method of structure-based analysis that you have learned in this class and any others that you can think of, to do the same thing: annotate and describe the function and properties of the protein.
For both parts of this process, don't limit yourself to only analyses of the coordinates--feel free to find the reading frame in the NCBI database and look at its surrounding genetic context; also see if it's popped up in any phenotypic screens.
Write up a report, with original figures, that summarizes your findings. Limit the length of your write-up to no more than two pages of single-spaced text (11-point Arial font or something similar) plus citations and figures.
At the end of the report, please summarize what information and clues regarding function, IF ANY, were provided as a result of having a crystal structure of the exact protein available, rather than only having its sequence available.
This write-up must be of publication quality; i.e. it needs to be well-written, with actual complete sentences and paragraphs, correct grammar, and original figures (NOT snapshots of desktop output!!!). If you send me a poorly written, unclear analysis, I will send it back to you ungraded and respectfully ask that you redo it.
Properly cite all webservers and on-line tools (they all provide a seminal paper that they want cited if you use their tool). Also cite the PDB and the corresponding entry, and the structural genomics consortium from which the structure originated.
Things to address in your analysis MIGHT include (not necessarily in this order)
1. The biological source of the protein, and multisequence alignment analyses:
What fold family(s) are represented in the structure?
What are the most closely related folds from SCOP, CATH or PFAM?
Where are the most conserved residues on the surface of the protein? What are their chemical properties and abilities?
What does the electrostatic surface potential of the protein look like? Are there obvious highly charged or uncharged surfaces that might be involved in molecular recognition?
2. The similarity of the protein structure to other protein structures in the database (DALI analysis). Are the critical residues conserved with those proteins that most closely resemble your candidate?
3. Single protein fold or domain per subunit or multiple protein folds or domains per subunit? If the latter, is there evidence for a cleft and/or 'hinge points' that might indicate a ligand binding site and/or conformational changes?
4. Dynamic information: B-factor plots, disordered regions. Disordered regions are often involved in binding interactions.
5. Sites of possible covalent modifications? Examine the sequence for known sites of glycosylation, lipidation, etc.
6. Back to the gene: is it part of an operon or gene cluster with annotated proteins that might provide clues to function?
7. Are there knockout or knockdown studies in the model organism source that might indicate importance or function?
8. What happens if you submit the sequence of the protein to a structural threading algorithm? Other than the actual structure, what other structures does the sequence 'hit' on?
9. Oligomeric state of the protein (monomer, dimer, tetramer, etc)--any signs of cooperativity? allostery?
10. Higher symmetry--what types of biological functions have involved proteins with 6-fold, 7-fold, 8-fold or higher symmetry?
11. Secondary structure composition: what types of motions, functions etc are associated with all helical bundles, with extended beta sheets, etc? What types of functions can you comfortably rule out by looking at the structure?
12. Bound ligands: did anything 'come along for the ride' in the purification and crystallization experiment? If so, perhaps it gives a clue regarding high affinity binding by the protein?
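As a starting point for items 4 and 5 above, here is a minimal Python sketch (standard library only) of two of the simplest checks: averaging B-factors per residue from the ATOM records of a PDB-format file, and scanning a sequence for the canonical N-linked glycosylation sequon N-X-S/T (X not proline). The function names, the atom_line helper and the demo data are hypothetical and invented purely for illustration; they are not part of any course tool or server. A sequon match only flags a candidate site, and whether it is actually modified depends on the organism and cellular compartment.

```python
import re
from collections import defaultdict

# N-X-[S/T] with X != Pro; the lookahead lets overlapping sequons all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glyc_sequons(seq):
    """Return 1-based positions of Asn residues in candidate N-X-S/T sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


def mean_b_per_residue(pdb_text):
    """Average the B-factor over the atoms of each residue.

    Reads fixed-column PDB ATOM records: residue name in columns 18-20,
    chain ID in column 22, residue number in columns 23-26, and the
    temperature factor in columns 61-66.
    Returns {(chain, residue_number, residue_name): mean_B}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        key = (line[21], int(line[22:26]), line[17:20].strip())
        sums[key][0] += float(line[60:66])
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def atom_line(serial, name, res, chain, resseq, b):
    """Hypothetical helper: build a PDB-style ATOM line for the demo below."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}")


if __name__ == "__main__":
    # Made-up coordinates and B-factors, not taken from any real PDB entry.
    demo = "\n".join([
        atom_line(1, "N", "ASN", "A", 1, 20.0),
        atom_line(2, "CA", "ASN", "A", 1, 30.0),
        atom_line(3, "N", "GLY", "A", 2, 50.0),
    ])
    for residue, b in sorted(mean_b_per_residue(demo).items()):
        print(residue, round(b, 2))
    print(n_glyc_sequons("MNGTANPSNNTS"))
```

For a real entry, read the downloaded coordinate file into a string and pass it to mean_b_per_residue, then plot the values against residue number; stretches with unusually high values, or residues missing from the model altogether, are the regions worth a closer look.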
These are just a few suggestions. I encourage you to go wild, think up additional ways to look at the protein, and above all else HAVE FUN with this assignment. I look forward to seeing your answers.
Best regards, and thanks for taking this class; I hope you enjoyed it.