Syllabus, Readings and Lecture Notes
Course Overview
Motif and cis-Regulatory Module (CRM) Modeling
- topics: Expectation Maximization (EM) algorithm, learning motif models, learning models of cis-regulatory modules, Gibbs sampling, sequence entropy, mutual information
- required reading
- T. Bailey and C. Elkan. The value of prior knowledge in discovering motifs with MEME. In Proceedings of the 3rd International Conference on Intelligent Systems for Molecular Biology, pp. 21-29, 1995.
- C. Lawrence, S. Altschul, M. Boguski, J. Liu, A. Neuwald, and J. Wootton. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262:208-214, 1993.
- O. Elemento, N. Slonim, and S. Tavazoie. A universal framework for regulatory element discovery across all genomes and data types. Molecular Cell 28(2):337-350, 2007. (Supplemental materials containing key methodological details)
- optional reading
- optional viewing
- lecture notes
Genotype Analysis
- topics: haplotype inference, genome-wide association studies (GWAS), quantitative trait loci (QTL) mapping, multiple hypothesis testing
- required reading
- optional reading
- lecture notes
- Linking Genetic Variation to Phenotypes (PDF, PPTX) (2/10)
- GWAS, multiple testing correction and QTLs (PDF, PPTX) (2/15, 2/17)
Epigenomics
- topics: epigenomic data types, DNase I hypersensitivity, Gaussian processes, ROC curves
- required reading
- R.I. Sherwood, T. Hashimoto, C.W. O'Donnell, S. Lewis, A.A. Barkal, J.P. van Hoff, V. Karun, T. Jaakkola, and D.K. Gifford. Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape. Nat Biotechnol 32(2):171-178, 2014.
- J. Lever, M. Krzywinski, and N. Altman. Points of Significance: Classification evaluation. Nat Methods 13(8):603-604, 2016.
- optional reading
- lecture notes
RNA-seq Analysis and Gene Discovery
- topics: RNA-seq technology, transcript quantification, gene finding, interpolated Markov models
- required reading
- B. Li, V. Ruotti, R.M. Stewart, J.A. Thomson, and C.N. Dewey. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics 26(4):493-500, 2010.
- S. Salzberg, A. Delcher, S. Kasif, and O. White. Microbial gene identification using interpolated Markov models. Nucleic Acids Research 26(2):544-548, 1998.
- Sections 3.1, 3.5 in Durbin et al.
- optional reading
- L.H. LeGault and C.N. Dewey. Inference of alternative splicing from RNA-Seq data with probabilistic splice graphs. Bioinformatics 29(18):2300-2310, 2013.
- A. Conesa, P. Madrigal, S. Tarazona, D. Gomez-Cabrero, A. Cervera, A. McPherson, M.W. Szczesniak, D.J. Gaffney, L.L. Elo, X. Zhang, and A. Mortazavi. A survey of best practices for RNA-seq data analysis. Genome Biology 17:13, 2016.
- Sections 3.4, 4.1 in Durbin et al.
- C. Burge and S. Karlin. Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology 268(1):78-94, 1997.
- I. Korf, P. Flicek, D. Duan, and M. Brent. Integrating genomic homology into gene structure prediction. Bioinformatics 17(Suppl. 1):S140-S148, 2001.
- lecture notes
- RNA-Seq analysis and gene discovery (PDF, PPTX) (3/3, 3/8)
Network Biology
- topics: biological network analysis, protein interactions, pathway identification, linear programming, min cost flow
- required reading
- E. Yeger-Lotem, L. Riva, L.J. Su, A.D. Gitler, A.G. Cashikar, O.D. King, P.K. Auluck, M.L. Geddie, J.S. Valastyan, D.R. Karger, S. Lindquist, and E. Fraenkel. Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity. Nat Genet 41(3):316-323, 2009.
- optional reading
- T. Ideker and R. Nussinov. Network approaches and applications in biology. PLoS Comput Biol 13(10):e1005771, 2017.
- D-Y. Cho, Y-A. Kim, and T.M. Przytycka. Chapter 5: Network Biology Approach to Complex Diseases. PLoS Comput Biol 8(12):e1002820, 2012.
- A. Barabasi and Z.N. Oltvai. Network biology: understanding the cell's functional organization. Nat Rev Genet 5:101-113, 2004.
- J.W. Chinneck. Practical Optimization: A Gentle Introduction.
- lecture notes
- Network biology (PDF, PPTX) (3/22, 3/24)
Machine Learning in Bioinformatics
- topics: unsupervised learning, partitioning vs. hierarchical clustering, classification, support vector machines
- required reading
- optional reading
- lecture notes
- Machine Learning in Bioinformatics (PDF, PPTX) (3/29, 3/31)
Deep Learning Applications
- topics: deep learning, convolutional neural networks, interpreting noncoding genetic variants
- required reading
- optional reading
- lecture notes
- Interpreting noncoding variants by deep learning (PDF, PPTX) (4/5, 4/7)
Single-cell RNA-seq Analysis
- topics: single-cell RNA-seq processing and analysis, cell clustering, cell-type regulatory networks, single-cell deconvolution
- required reading
- optional reading
- lecture notes
- Single-cell RNA-seq processing and analysis (PDF, PPTX) (4/12, 4/14)
- Cell-type annotation, regulatory networks and deconvolution (PDF, PPTX) (4/19, 4/21)
Multi-modal Data Analysis
- topics: multi-modal data, multi-view learning, manifold alignment
- reading
- C. Zhu, S. Preissl, and B. Ren. Single-cell multimodal omics: the power of many. Nat Methods 17:11-14, 2020.
- N. Nguyen and D. Wang. Multiview learning for understanding functional multiomics. PLoS Comput Biol 16(4):e1007677, 2020.
- J. Huang, J. Sheng, and D. Wang. Manifold learning analysis suggests strategies to align single-cell multimodal data of neuronal electrophysiology and transcriptomics. Commun Biol 4:1308, 2021.
- N. Nguyen, J. Huang, and D. Wang. A deep manifold-regularized learning model for improving phenotype prediction from multi-modal data. Nat Comput Sci 2:38-46, 2022.
- Y. Hao et al. Integrated analysis of multimodal single-cell data. Cell 184(13):3573-3587, 2021.
- lecture notes
- Multi-modal data integration and analysis (PDF, PPTX) (4/26, 4/28)
Lecture Notes
Thank you to Professors Mark Craven, Tony Gitter and Colin Dewey for providing lecture material. These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by Mark Craven, Colin Dewey, Anthony Gitter and Daifeng Wang.