**Why take this course?**

Rapidly advancing technology lets us determine the sequences of biological molecules such as DNA, RNA, and protein, and produces tremendous amounts of data. Understanding these sequence data requires a combination of biological, mathematical, and computational expertise. This course presents algorithms and methods for working with and thinking about biological sequences, providing the first steps toward proficiency in this growing field.

**Course description:**

Presents a variety of methods for assigning function to biological sequences, emphasizing biologically informed algorithm design. Topics include the history and methods of low- and high-throughput sequencing; multiple classes of sequence alignment problems (one-to-one, multiple alignment, alignment of a few sequences to a database, and alignment to a reference genome); interpreting sequence alignments; discovering patterns in sequences; and visualizing data.

**Learning objectives:**

Upon successfully completing this course, students will be able to describe the algorithms used in assigning function to biological sequences, determine which methods are appropriate for analyzing sequences derived from different experiments, and design analysis pipelines that are biologically meaningful and mathematically rigorous.

**Detailed course outline**

**Homework links and information**

**Course times:** Tuesday and Thursday, 3:30-4:50

**Classroom:** W4007

**Instructor:** Sarah Wheelan

**Grading:**

Homework (3-4 assignments) 80%, final project (written critique of a publication) 20%. Late homework is not accepted except by prior agreement.

**Textbook:**

The course is taught primarily from notes, and no single textbook is required, but some textbooks are helpful as references. Feel free to consult me before buying any books.

**Pavel Pevzner's** books provide clever biological motivation and readable descriptions of the relevant algorithms

**Biological Sequence Analysis** by Durbin et al. is a classic, though mathematically sophisticated

**Bioinformatics and Functional Genomics** by Jonathan Pevsner gives a strong biological motivation for the tools used in computational biology and a good description of the algorithmic basis of those tools

**Statistical Methods in Bioinformatics** by Ewens and Grant is a good start for those with biological backgrounds

**Bioinformatics: Sequence and Genome Analysis** by Mount is also fairly basic, though solid, in its descriptions of algorithms

The class will also be taught in part from current literature.