Course Overview

This course provides a foundation in the following four areas: evolutionary and population genetics; comparative genomics; structural genomics and proteomics; and functional genomics and regulation. Each module consists of four lectures providing key background material, one lecture providing clinical correlates, and one guest lecture by a leader in the field. This course is required for all HST students in the Bioinformatics and Integrative Genomics training program within the doctoral program in Medical Engineering and Medical Physics (MEMP).

Module 1: Evolutionary and Population Genetics

  • The basic forces of evolution: Mutation, recombination, mating, migration. Neutral evolution and drift, effective population size, coalescent theory.
  • Selection, fitness, and diffusion models. Selection at genetic and higher levels.
  • Phylogenetic analysis. Models of nucleotide evolution: Jukes-Cantor, Kimura, maximum likelihood models; Human/mouse/rat examples.
  • Measuring selection: From 'classical' methods to maximum likelihood (with applications to disease evolution, HIV and influenza).
  • Medical Lecture: Genetic diversity and evolution of hepatitis C virus.

Module 2: Comparative Genomics

  • Sequence comparison, substitution matrices, alignment methods, alignment statistics. Multiple alignments, profiles and PSSMs.
  • Genome comparison and genome evolution: Duplication, recombination, insertions, repeats. Orthologs, paralogs, in/out-paralogs. Algorithms of genome alignment. Conserved non-coding, positive selection. Motif discovery.
  • Prediction of gene function using: Homology, context, structure, networks.
  • SNPs: Microevolution, history of population, markers, medical applications.
  • Medical Lecture: Finding the keys to human heart disease in the genomes of other animals.

Module 3: Structural Genomics and Proteomics

  • Overview of protein structures, domain architecture. Sequence-structure mapping, protein folding, forces and interactions.
  • Structure-based substitution matrices. Protein structure prediction. Threading.
  • Protein function: Binding and kinetics. Michaelis-Menten kinetics, inhibition. Protein-DNA recognition: Models and algorithms.
  • Proteomics: Networks of protein-protein interactions, complexes, modules. Power-law distributions, clustering coefficient. Evolution of networks.
  • Medical Lecture: Hemoglobin and the anemias.

Module 4: Functional Genomics and Networks

  • Gene regulation and function, conservation, detecting regulatory elements.
  • RNA expression: Clustering and classification.
  • RNA expression: Classification, 2-way clustering, regulatory modules. Integration of expression and proteomic data.
  • Dynamics of biological networks: metabolic, regulatory. Flux balance analysis (FBA), signaling, regulation of gene expression.
  • Medical Lecture: Two examples: Phenylketonuria (monogenic) and diabetes type 2 (multigenic+). "Disease" genes vs. "susceptibility" genes. "Environmental" vs. "Developmental" regulation of gene expression.


There are three problem sets for this course and a final project that includes an oral presentation.


Problem Sets 50%
Final Project 50%