Syllabus, Readings and Lecture Notes
Course Overview
Motif and cis-Regulatory Module (CRM) Modeling
- topics: learning motif models, learning models of cis-regulatory modules, Gibbs sampling, Dirichlet priors, parameter tying, sequence entropy, mutual information
- required reading
- T. Bailey and C. Elkan. The value of prior knowledge in discovering motifs with MEME. In Proceedings of the 3rd International Conference on Intelligent Systems for Molecular Biology, pp. 21-29, 1995.
- C. Lawrence, S. Altschul, M. Boguski, J. Liu, A. Neuwald, and J. Wootton. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262:208-214, 1993.
- O. Elemento, N. Slonim and S. Tavazoie. A universal framework for regulatory element discovery across all genomes and data types. Molecular Cell 28(2):337-350, 2007. (Supplemental materials containing key methodological details)
- optional reading
- optional viewing
- lecture notes
- Learning Sequence Motif Models using EM (PDF, PPTX) (1/19, 1/24, 1/26)
- Learning Sequence Motif Models using Gibbs Sampling (PDF, PPTX, Gamma example, Dirichlet example) (1/26, 1/31, 2/2)
- Inferring Models of cis-Regulatory Modules using Information Theory (PDF, PPTX) (2/2, 2/7, 2/9)
Genotype Analysis
- topics: haplotype inference, genome-wide association studies (GWAS), quantitative trait loci (QTL) mapping, multiple hypothesis testing
- required reading
- optional reading
- lecture notes
- Linking Genetic Variation to Important Phenotypes (PDF, PPTX) (2/9, 2/14)
- GWAS and multiple testing correction (PDF, PPTX) (2/14, 2/16, 2/21)
Epigenomics
- topics: epigenomic data types, DNase I hypersensitivity, Gaussian processes, convolutional neural networks, interpreting noncoding genetic variants
- required reading
- R.I. Sherwood, T. Hashimoto, C.W. O'Donnell, S. Lewis, A.A. Barkal, J.P. van Hoff, V. Karun, T. Jaakkola, and D.K. Gifford. Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape. Nat Biotechnol 32(2):171–178, 2014.
- J. Lever, M. Krzywinski, and N. Altman. Points of Significance: Classification evaluation. Nat Methods 13(8):603-604, 2016.
- C. Angermueller, T. Pärnamaa, L. Parts, and O. Stegle. Deep learning for computational biology. Mol Syst Biol 12(7):878, 2016.
- optional reading
- lecture notes
RNA-Seq and Mass Spectrometry
- topics: RNA-Seq technology, transcript quantification, peptide and protein identification with mass spectrometry
- required reading
- optional reading
- lecture notes
- Transcript quantification with RNA-Seq (PDF, PPTX) (3/14, 3/16)
- Mass spectrometry (PDF, PPTX) (3/28, 3/30, 4/4)
Biological Network Analysis
- topics: protein interactions, pathway identification, linear programming, min cost flow
- required reading
- E. Yeger-Lotem, L. Riva, L.J. Su, A.D. Gitler, A.G. Cashikar, O.D. King, P.K. Auluck, M.L. Geddie, J.S. Valastyan, D.R. Karger, S. Lindquist, and E. Fraenkel. Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity. Nat Genet 41(3):316-323, 2009.
- optional reading
- lecture notes
- Identifying signaling pathways (PDF, PPTX) (4/4, 4/6)
Gene Finding
- topics: gene finding, interpolated Markov models, generalized HMMs, pair HMMs
- required reading
- optional reading
- lecture notes
- Interpolated Markov Models for Gene Finding (PDF, PPTX) (4/11, 4/13)
- Eukaryotic Gene Finding (PDF, PPTX) (4/13)
Large-Scale and Whole-Genome Sequence Alignment
- topics: large-scale alignment, whole-genome alignment, suffix trees, k-mer tries, longest increasing subsequence problem, MUMmer
- required reading
- A. Delcher, S. Kasif, R. Fleischmann, J. Peterson, O. White and S. Salzberg. Alignment of Whole Genomes. Nucleic Acids Research 27(11):2369-2376, 1999.
- optional reading
- E. Ukkonen. On-line Construction of Suffix Trees. Algorithmica 14(3):249-260, 1995.
- M. Brudno, C. Do, G. Cooper, M. Kim, E. Davydov, NISC Comparative Sequencing Program, E. Green, A. Sidow, and S. Batzoglou. LAGAN and Multi-LAGAN: Efficient Tools for Large-Scale Multiple Alignment of Genomic DNA. Genome Research 13:721-731, 2003.
- Chapter 3 of C. Dewey. Whole-genome alignments and polytopes for comparative genomics. PhD thesis. University of California, Berkeley, 2006.
- lecture notes
- Alignment of Long Sequences (PDF, PPTX) (4/18, 4/20, 4/25)
RNA Structure Analysis
- topics: predicting RNA secondary structure, Nussinov/energy-minimization algorithms, stochastic context free grammars
- required reading
- Chapter 9 in Durbin et al.
- Sections 10.1, 10.2 in Durbin et al.
- optional reading
- lecture notes
- RNA Secondary Structure Prediction (PDF, PPTX) (4/25, 4/27)
- Stochastic Context Free Grammars for RNA Structure Modeling (PDF, PPTX) (4/27, 5/2)
Protein Structure Prediction
- topics: secondary structure prediction, threading, branch and bound search
- required reading
- optional reading
- lecture notes
- Introduction to Protein Structure Prediction (PDF, PPTX) (5/2)
- Protein Threading (PDF, PPTX) (5/4)
Lecture Notes
Thank you to Professors Mark Craven and Colin Dewey for providing lecture material. These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by Mark Craven, Colin Dewey, and Anthony Gitter.