This course provides a systematic introduction to the algorithms behind the most commonly used tools in computational biology. While the course will survey a wide range of methods in the field and provide exposure to actual tools, its primary emphasis will be on understanding and analyzing the algorithms behind these tools. In the process, students will be introduced to common techniques in algorithm design and analysis, including the design of data structures and the analysis of running time.
Topics to be covered include dynamic programming, string matching, probabilistic techniques, geometric algorithms, hidden Markov models, data mining, and complexity analysis. As time permits, these topics will be explored in the context of applications such as genome sequence assembly, protein and DNA homology detection, gene and promoter finding, protein structure prediction, motif identification, analysis of gene expression data, functional genomics, phylogenetic tree construction, and evolutionary sequence comparison.
Assignments will be primarily in the form of problem sets with a mix of algorithm analysis and application. Students will also complete a group research project to develop hands-on experience with some aspect of the field.
Students are expected to have previous exposure to probability theory and statistics, as well as familiarity with basic concepts of cell biology. Some exposure to a programming language would also be helpful (for example, students should know what a loop is and how to call a function or procedure with arguments). Almost all necessary background will be provided as review, but at a relatively brisk pace. Students who are interested in the course but concerned about prerequisites are encouraged to speak with the instructor.
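As a rough gauge of the programming background described above, the short sketch below uses a loop and a function call with an argument. Python is chosen here only for illustration, not because the course requires it, and the function name gc_content is hypothetical rather than taken from any course material.

```python
def gc_content(sequence):
    """Return the fraction of bases in a DNA sequence that are G or C."""
    if not sequence:
        return 0.0  # avoid dividing by zero for an empty sequence
    gc_count = 0
    # A loop: visit each base in the sequence, ignoring letter case.
    for base in sequence.upper():
        if base in "GC":
            gc_count += 1
    return gc_count / len(sequence)


# A function call with one argument.
fraction = gc_content("ATGCGC")
print(fraction)
```

Students who can follow what this snippet does should be comfortable with the level of programming assumed.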
All online materials for this course, including course details, the syllabus, reading materials, lecture notes, and assignments, are provided through Duke's Blackboard site.
If you have any trouble accessing the Blackboard site, please let me know.