Syllabus, Readings and Lecture Notes
Course Overview
Motif and cis-Regulatory Module (CRM) Modeling
- topics: learning motif models, learning models of cis-regulatory modules, Gibbs sampling, Dirichlet priors,
parameter tying, heuristic search, HMM structure search, sequence entropy and mutual information,
duration modeling, semi-Markov models
- required reading
- T. Bailey and C. Elkan. The value of prior knowledge in discovering motifs with MEME.
In Proceedings of the 3rd International Conference on Intelligent Systems for Molecular Biology, pp. 21-29, 1995.
- C. Lawrence, S. Altschul, M. Boguski, J. Liu, A. Neuwald, and J. Wootton.
Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment.
Science 262:208-214, 1993.
- K. Noto and M. Craven. Learning probabilistic models of cis-regulatory modules that represent logical and spatial aspects.
Bioinformatics 23(2):e156-e162, 2007.
- O. Elemento, N. Slonim and S. Tavazoie.
A universal framework for regulatory element discovery across all genomes and data types.
Molecular Cell 28(2):337-350, 2007.
(Supplemental materials containing key methodological details)
- optional reading
- lecture notes
- Learning Sequence Motif Models using EM (PDF, PPTX) (1/22, 1/27)
- Learning Sequence Motif Models using Gibbs Sampling (PDF, PPTX) (1/27, 1/29, 2/3)
- Inferring Probabilistic Models of cis-Regulatory Modules (PDF, PPTX) ("Seeing Single Molecules Move" animation) (2/5, 2/10)
- Inferring Models of cis-Regulatory Modules using Information Theory (PDF, PPTX) (2/12, 2/17)
Gene Finding
- topics: the gene finding task, maximal dependence decomposition,
interpolated Markov models, back-off models, pairwise HMMs, Genscan, Twinscan, SLAM
- required reading
- S. Salzberg, A. Delcher, S. Kasif, and O. White. Microbial gene identification using interpolated Markov models.
Nucleic Acids Research 26(2):544-548, 1998.
- Sections 3.4, 3.5 in Durbin et al.
- C. Burge and S. Karlin. Prediction of complete gene structures in human
genomic DNA. Journal of Molecular Biology 268(1):78-94, 1997.
- Sections 4.1, 4.2 in Durbin et al.
- L. Pachter, M. Alexandersson and S. Cawley. Applications of
generalized pair hidden Markov models to alignment and gene finding problems.
Proceedings of the Fifth Annual International Conference on Computational Biology (RECOMB), 241-248, 2001.
- optional reading
- lecture notes
- Interpolated Markov Models for Gene Finding (PDF, PPTX) (2/17, 2/19, 2/24)
- Eukaryotic Gene Finding: The GENSCAN System (PDF, PPTX) (2/24, 2/26, 3/3)
- Comparative Gene Finding (PDF, PPTX) (3/3, 3/5)
RNA-Seq
- topics: RNA-Seq technology, transcript quantification with RNA-Seq
- required reading
- optional reading
- lecture notes
- Transcript quantification with RNA-Seq (PDF, PPTX) (3/12, 3/19)
- Analysis of alternative splicing with RNA-Seq and probabilistic splice graphs (PDF, PPTX) (3/24)
Identification of Signaling Pathways
- required reading
- lecture notes
RNA Analysis
- topics: predicting RNA secondary structure, Nussinov/energy-minimization algorithms,
stochastic context-free grammars, Inside/Inside-Outside/CYK algorithms,
searching sequences for a given RNA secondary structure, RSEARCH,
RNA gene recognition via comparative sequence analysis, microRNA gene/target prediction
- required reading
- Chapter 9 in Durbin et al.
- Sections 10.1, 10.2 in Durbin et al.
- optional reading
- lecture notes
Large-Scale and Whole-Genome Sequence Alignment
- topics: large-scale alignment, whole-genome alignment, parametric alignment,
suffix trees, locality-sensitive hashing, k-mer tries, sparse dynamic programming, longest increasing
subsequence problem, Markov random fields,
MUMmer, LAGAN/MLAGAN, Mauve, Mercator
- required reading
- A. Delcher, S. Kasif, R. Fleischmann, J. Peterson, O. White and S. Salzberg.
Alignment of Whole Genomes.
Nucleic Acids Research 27(11):2369-2376, 1999.
- M. Brudno, C. Do, G. Cooper, M. Kim, E. Davydov, NISC Comparative Sequencing Program, E. Green, A. Sidow, and S. Batzoglou.
LAGAN and Multi-LAGAN: Efficient Tools for Large-Scale Multiple Alignment of Genomic DNA.
Genome Research 13:721-731, 2003.
- optional reading
- lecture notes
- Alignment of Long Sequences (PDF, PPTX) (4/16, 4/21)
- Multiple Whole Genome Alignment (PDF, PPTX)
Biological Network Inference and Evolution
- topics: network inference, models of biological network evolution, network alignment
- required reading
- optional reading
- R. De Smet and K. Marchal. Advantages and limitations of current network inference methods. Nature Reviews Microbiology 8(10):717-729, 2010.
- D. Wohlbach, D. Thompson, A. Gasch, and A. Regev. From elements to modules: regulatory evolution in Ascomycota fungi. Current Opinion in Genetics & Development 19(6):571-578, 2009.
- lecture notes
- Comparative network algorithms (PDF, PPTX) (4/23, 4/28)
Genotype Analysis
- topics: haplotype inference, genome-wide association studies (GWAS), quantitative trait loci (QTL) mapping
- recommended reading
- lecture notes
- Linking Genetic Variation to Important Phenotypes (PDF, PPTX) (4/28, 4/30)
- GWAS and multiple testing correction (PDF, PPTX) (4/30, 5/5)
Protein Structure Prediction
- topics: secondary structure prediction, threading, branch and bound search, ROSETTA
- required reading
- recommended reading
- lecture notes
- Introduction to Protein Structure Prediction (PDF, PPTX) (5/5, 5/7)
- Protein Threading (PDF, PPTX) (5/7)