Microarray Data Analysis



Class outline:
Date Lecture Title Description Suggested Reading
1/21 Introduction [PDF] We give a brief description of the central dogma of molecular biology and describe high-density oligonucleotide and spotted cDNA technologies. Cartoon Guide to Genetics
1/23 Introduction to Data Analysis [PDF, R] We will review inference, t-tests, hypothesis testing, and the multiple comparison problem. We introduce the MA and volcano plots. None
1/29 Feature Level Data [PDF, R] We define feature level data for both high-density and spotted arrays. We demonstrate the background noise problem and some solutions, and discuss the probe sequence effect. Throughout we will use the Spike-in experiment data as an example. Chapters 1, 2
Lab Lab: Introduction to Bioconductor [PDF, R] We introduce the class and methods system used by Bioconductor and analyze part of the spike-in experiment. None
1/31 Normalization [PDF, R] We demonstrate why we need to normalize. We will describe various normalization procedures. Chapters 2, 4
2/5 Preprocessing Affymetrix GeneChips [PDF] We will describe different strategies for summarizing Affymetrix feature-level data as well as detection call methodology. Chapter 2
2/19 Differential Expression [PDF, R] Introduction to Empirical Bayes and Multiple Comparisons Chapter 14
2/14 Snow Day
Lab Lab: limma [Hoptag example code] We demonstrate how to use the limma package on a simple two color experiment. Chapter 23, Limma User Guide
2/21 Gene enrichment analysis [PDF] We describe ways to combine results from various genes to obtain more powerful conclusions about biologically relevant processes. Mootha et al. Nat Gen (2003), Kim and Volsky BMC Bioinf (2005), Tian et al. PNAS (2005), Subramanian et al. PNAS (2005)
2/26 Genotyping with SNP chips [PDF] Benilton will teach this class. We will cover pre-processing and genotyping algorithms for Affymetrix SNP chips. Carvalho et al. 2007
2/28 Clustering and Prediction [PDF] We describe the basics of clustering and classification techniques. Chapters 12, 13, 16, 24
3/5 Experimental Design [PDF] We will discuss the different types of designs that are available for two-color platforms. We will also describe statistical considerations related to pooling RNA samples. Kendziorski et al. PNAS 2005
3/7 Tiling Arrays [PDF] We will describe tiling arrays and their most popular application: ChIP-chip.
End of class. Slides for other topics below.
Time course analysis [PDF] We will describe experiments and statistical methods for time course experiments and data. Tai and Speed
Across platform comparisons [PDF] We describe an experiment in which three platforms were compared. Data from multiple labs were used in the assessment. Irizarry et al.
Lab: heatmaps and MLInterfaces [R] Try the lab on your own. Any questions will be answered in class.
Quality Assessment [PDF] We will show examples of basic exploratory plots that illustrate problems with cDNA arrays. We will then describe how one can use the RMA model to obtain useful quality assessment summaries for Affymetrix arrays. Chapter 3
Annotation [PDF (11MB)] We describe various ways used to define genes for genomic studies. We then demonstrate how to use Bioconductor to quickly get this information. Section II
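
The MA plot mentioned in the 1/23 lecture is built from two per-feature quantities: the log ratio M and the average log intensity A. The sketch below is only an illustration, not the course's R code; the function name `ma_values` and the example intensities are made up for this page, and it assumes strictly positive, background-corrected intensities from two channels (or two arrays).

```python
import math

def ma_values(red, green):
    """Return (M, A) for paired positive intensities.

    M = log2(red) - log2(green)        (log ratio)
    A = (log2(red) + log2(green)) / 2  (average log intensity)
    """
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [(math.log2(r) + math.log2(g)) / 2 for r, g in zip(red, green)]
    return m, a

# Hypothetical intensities for three features (not course data).
m, a = ma_values([1024.0, 256.0, 512.0], [256.0, 256.0, 2048.0])
for mi, ai in zip(m, a):
    print(f"M = {mi:+.2f}, A = {ai:.2f}")
```

Plotting M against A, instead of one channel against the other, makes intensity-dependent bias easier to see, which is one reason normalization is usually discussed in terms of these two quantities.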



Data-sets:


Practice problem:

  1. Read in the two replicate GPR files given in the data section :
  2. Read the Affymetrix expression data and make a list of the genes you think are differentially expressed. Use graphs and numerical summaries to support your claim.
  3. Download the SNP Chip M and A data. Implement your own genotyping algorithm and make calls (calls can be "No Call") for all samples/SNPs. Show plots of the Ms (with your calls in color) for the following SNPs: SNP_A-1675105, SNP_A-1719411, SNP_A-1670870, SNP_A-1654931, SNP_A-1738777, SNP_A-1643325, SNP_A-1643594, SNP_A-1644414, SNP_A-1646804, SNP_A-1649732. Use color and/or numbers to denote your call. Give a rough estimate of how many calls you got wrong.
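
As a possible starting point for problem 3, here is a deliberately naive baseline that calls a genotype from a single M value by fixed thresholds. It assumes M is the log ratio of allele A to allele B intensity, so large positive values suggest AA, values near zero suggest AB, and large negative values suggest BB. The function name `call_genotype` and the `cutoff` and `margin` values are hypothetical and would need tuning on the real data; this is not the algorithm covered in the 2/26 lecture.

```python
def call_genotype(m, cutoff=1.0, margin=0.25):
    """Naive genotype call from one M value (assumed log2 A/B ratio).

    Values within `margin` of +/- `cutoff` are treated as ambiguous
    and returned as "No Call". Both parameters are illustrative guesses.
    """
    if m >= cutoff + margin:
        return "AA"
    if m <= -(cutoff + margin):
        return "BB"
    if abs(m) <= cutoff - margin:
        return "AB"
    return "No Call"

# Hypothetical M values for one SNP across five samples.
ms = [2.1, 0.1, -1.8, 1.0, -0.3]
calls = [call_genotype(m) for m in ms]
print(calls)
```

A fixed threshold ignores SNP-to-SNP differences in where the three clusters sit, so comparing these calls against the plots of M for the listed SNPs is a quick way to see where a per-SNP or model-based approach would do better.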

Project topics:


Class Information


