Syllabus, Readings and Lecture Notes
Course Overview
Motif and cis-Regulatory Module (CRM) Modeling
- topics: learning motif models, learning models of cis-regulatory
modules, Gibbs sampling, Dirichlet priors,
parameter tying, sequence entropy, mutual information
- required reading
- T. Bailey and C. Elkan.
The value
of prior knowledge in discovering motifs with MEME.
In Proceedings of the 3rd International Conference on
Intelligent Systems for Molecular Biology, pp. 21-29, 1995.
- C. Lawrence, S. Altschul, M. Boguski, J. Liu, A. Neuwald, and
J. Wootton. Detecting
subtle sequence signals: a Gibbs sampling strategy for multiple alignment.
Science 262:208-214, 1993.
- O. Elemento, N. Slonim and S. Tavazoie.
A universal framework for regulatory element discovery across all genomes and data types.
Molecular Cell 28(2):337-350, 2007.
(Supplemental materials containing key methodological details)
- optional reading
- optional viewing
- lecture notes
- Learning Sequence
Motif Models using EM
(PDF, PPTX)
(1/24, 1/29)
- Learning Sequence Motif
Models using Gibbs Sampling
(PDF, PPTX, Gamma example, Dirichlet example)
(1/31, 2/5)
- Inferring Models of cis-Regulatory Modules using Information
Theory
(PDF, PPTX)
(2/7, 2/12)
RNA-seq and Transcript assembly
- topics: RNA-seq technology, transcript quantification,
alternative splicing, splice graphs, transcript assembly
- required reading
- B. Li, V. Ruotti, R.M. Stewart, J.A. Thomson, and C.N. Dewey. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics 26(4): 493-500, 2010.
- L.H. LeGault and C.N. Dewey. Inference of alternative splicing from RNA-Seq data with probabilistic splice graphs. Bioinformatics 29(18): 2300-2310, 2013.
- Pertea M, Pertea GM, Antonescu CM, Chang T-C, Mendell JT,
Salzberg SL. StringTie enables improved reconstruction of a
transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33:
290–295.
- Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et
al. Transcript assembly and quantification by RNA-Seq reveals
unannotated transcripts and isoform switching during cell
differentiation. Nat Biotechnol. 2010;28: 511–515.
- optional reading
- Z. Wang, M. Gerstein, and M. Snyder. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10(1): 57-63, 2009.
- A. Conesa, P. Madrigal, S. Tarazona, D. Gomez-Cabrero, A. Cervera, A. McPherson, M.W. Szczesniak, D.J. Gaffney, L.L. Elo, X. Zhang, and A. Mortazavi. A survey of best practices for RNA-seq data analysis. Genome Biology 17:13, 2016.
- Stewart R, Rascon CA, Tian S, Nie J, Barry C, Chu L-F, et
al. Comparative RNA-seq analysis in the unsequenced axolotl:
the oncogene burst highlights early gene expression in the
blastema. PLoS Comput Biol. 2013;9: e1002936.
- lecture notes
- Transcript quantification with RNA-Seq
(PDF, PPTX)
(2/12, 2/14, 2/19)
- Analysis of alternative splicing with RNA-Seq and probabilistic splice graphs
(PDF, PPTX)
(2/19, 2/21)
- Assembling transcriptomes from RNA-seq data
(PDF, PPTX)
(2/26, 2/28)
- Comparative RNA-seq for analysis of regeneration in axolotl
(PDF, PPTX)
RNA Structure Analysis
- topics: predicting RNA secondary structure, Nussinov/energy-minimization algorithms,
stochastic context-free grammars,
searching sequences for a given RNA secondary structure, RSEARCH,
RNA gene recognition via comparative sequence analysis,
microRNA gene/target prediction, Inside/Inside-Outside/CYK algorithms
- required reading
- optional reading
- lecture notes
Epigenomics
- topics: epigenomic data types, DNase I hypersensitivity, Gaussian processes,
convolutional neural networks, interpreting noncoding genetic variants
- required reading
- R.I. Sherwood, T. Hashimoto, C.W. O'Donnell, S. Lewis, A.A. Barkal, J.P. van Hoff, V. Karun, T. Jaakkola, and D.K. Gifford. Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape. Nat Biotechnol 32(2): 171-178, 2014.
- J. Lever, M. Krzywinski, and N. Altman. Points of Significance: Classification evaluation. Nat Methods 13(8):603-604, 2016.
- optional reading
- lecture notes
Genotype Analysis
- topics: haplotype inference, genome-wide association studies (GWAS),
quantitative trait loci (QTL) mapping, multiple hypothesis testing
- required reading
- optional reading
- lecture notes
- Linking Genetic Variation to Important Phenotypes
(PDF, PPTX) (4/2)
- GWAS and multiple testing correction
(PDF, PPTX)
(4/4, 4/9)
- Interpreting noncoding variants
(PDF, PPTX)
(4/9, 4/11, 4/16)
Mass Spectrometry
- topics: peptide and protein identification with mass spectrometry
- required reading
- optional reading
- lecture notes
- Mass spectrometry
(PDF, PPTX)
(4/16, 4/18)
Biological Network Analysis
- topics: protein interactions, pathway identification, linear programming, min cost flow
- required reading
- E. Yeger-Lotem, L. Riva, L.J. Su, A.D. Gitler, A.G. Cashikar, O.D. King, P.K. Auluck, M.L. Geddie, J.S. Valastyan, D.R. Karger, S. Lindquist, and E. Fraenkel. Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity. Nat Genet 41(3):316-323, 2009.
- optional reading
- lecture notes
- Identifying signaling pathways
(PDF, PPTX)
(4/18, 4/23)
Large-Scale and Whole-Genome Sequence Alignment
- topics: large-scale alignment, whole-genome alignment,
suffix trees, k-mer tries, longest increasing
subsequence problem, MUMmer
- required reading
- A. Delcher, S. Kasif, R. Fleischmann, J. Peterson, O. White
and S. Salzberg.
Alignment of Whole Genomes.
Nucleic Acids Research 27(11):2369-2376, 1999.
- M. Brudno, C. Do, G. Cooper, M. Kim, E. Davydov, NISC Comparative
Sequencing Program, E. Green, A. Sidow, and S. Batzoglou.
LAGAN and Multi-LAGAN: Efficient Tools for Large-Scale
Multiple Alignment of Genomic DNA.
Genome Research 13:721-731, 2003.
- optional reading
- lecture notes
- Alignment of Long Sequences - MUMmer
(PDF, PPTX)
(4/25, 4/30)
- Alignment of Long Sequences - LAGAN
(PDF, PPTX)
(4/30, 5/2)
- Multiple Whole Genome Alignment
(PDF, PPTX) (5/2)
Lecture Notes
Thank you to Professors Mark Craven and Tony Gitter for providing
lecture material. These slides, excluding third-party material, are
licensed under CC BY-NC 4.0 by Mark Craven, Colin Dewey, and Anthony Gitter.