Note: This page will be updated every week as the course progresses.
Here you can get back to the main page of the course.
Final exam (either 2-hour version or 3-hour version) on Tuesday, 5 June 2012, in class.
The exam will take place in Aula D. Be there at 13.00.
Here is a list of things you should know.
New: Final exam results here.
There is now a projects webpage with descriptions of the projects.
Note: This schedule is supposed to give information about what we covered. You still need the notes from the lecture; if you cannot attend, please get in touch with someone who does. Here I try to give the corresponding chapters in the two books Setubal & Meidanis (S&M) and Jones & Pevzner (J&P). For more books, see the main page of the course.
There are very nice lecture notes on Sequence Analysis by Jens Stoye and others from Bielefeld University. They are much too detailed for our purposes, but I do use some chapters. I refer to them below by BielefeldLN.
I once (or twice) gave a one-week intensive course on algorithms for biology masters students at the University of the Western Cape, Cape Town, South Africa. The CT course notes are short and easy to understand. All of it is useful for you except the part on exact matching (sec. 3.3 to 3.6). (But if you are interested, read that part, too. It is explained in a very simple and straightforward manner.)
For the phylogenetics part, I use mostly the Bielefeld Phylogenetics Lecture Notes: BielefeldPhylo.
No. | Date | Topic | Chapters | Other materials (supporting material, suggested reading) |
---|---|---|---|---|
1+2 | Tue+Wed, 6+7/3/2012 | Overview; Pairwise sequence alignment 1: Def. of alignment, exhaustive search, DP algorithm of Needleman-Wunsch; Formalism on strings | S&M 3.1, 3.2.1; J&P 6.4, 6.6 | Section 3.1 of the CT course notes (formalism on strings). Section 2.4 of the BielefeldLN. |
3 | Tue, 13/3/2012 | Pairwise sequence alignment 2: Analysis of the DP algorithm of Needleman-Wunsch; Introduction to algorithm analysis, Big-Oh notation |
S&M 2.3, J&P 2.8 | Please read section 4 (except 4.3) of these CT course notes. Section 2 could also be useful for non-computer scientists (a very basic explanation of algorithms). Here is also the "dirty trick version" of how to identify O-classes. Section 2.3 of the BielefeldLN. |
4 | Wed, 14/3/2012 | Pairwise sequence alignment 3: Local alignment (Smith-Waterman); other variants (sketch);
PAM scoring matrices 1: motivation |
S&M 3.2.2, J&P 6.8 | |
5 | Tue, 20/3/2012 |
PAM scoring matrices 2: how to compute PAM1, how to compute PAMk, a bit of probability, examples | S&M 3.5.1, J&P 6.7 | Here is a nice set of slides by David Gilbert, Univ. of Glasgow, on scoring matrices. You can also read the original article on PAM matrices (more exactly, book chapter), by Margret Dayhoff and others. It is very understandable! See also this nice little article by Sean Eddy about the BLOSUM matrices (S. Eddy, Where did the BLOSUM62 alignment score matrix come from?, Nature Biotechnology 22, 1035 - 1036, 2004). |
6 | Wed, 21/3/2012 |
Heuristics for sequence alignment 1: Dotplots, q-grams, FAST/FASTA |
S&M 3.5.3, J&P 9.6 | Sec. 6.1 and 6.4.1 of the BielefeldLN. See also these nice slides on heuristic similarity search by Stuart M. Brown, NYU School of Medicine, New York. |
7 | Tue, 27/3/2012 | Heuristics for sequence alignment 2: FAST/FASTA, BLAST; | S&M 3.5.2, J&P 9.6 | Sec. 6.4.2 of the BielefeldLN. Here is the orginal article on BLAST: S.F. Altschul et al. Basic Local Alignment Search Tool, J. Mol. Biol. 215: 403-410 (1990). |
8 | Wed, 28/3/2012 | String similarity and distance 1: string similarity measures: formal definition of alignment, sim(s,t); percent similarity; longest common subsequence (LCS) | S&M 3.6.1, J&P 6.5 | |
9 | Tue, 3/4/2012 | String similarity and distance 2: string distance measures: metrics; Hamming distance, edit distance, q-gram distance; relationship between edit distance and optimal alignment score | S&M 3.6.1, J&P 6.4 | Here is the original article on (unit-cost) edit distance (Levenshtein distance): Vladimir Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics-Doklady, vol. 10, no. 8, 1966. And here is the one on the q-gram distance: Esko Ukkonen, Approximate string matching with q-grams and maximal matches, Theor. Comp. Sc. 92 (1992), pp. 191-211 (beginning of Section 2). |
10 | Wed, 4/4/2012 | Summary of first half: Pairwise sequence alignment (global, local; brute-force algorithm, DP-algorithms), Algorithm analysis, Scoring schemes: PAM matrices, Heuristics for sequence alignment (FASTA, BLAST, dotplots), String similarity and distance | ||
11 | Wed, 11/4/2012 | Introduction to phylogenetics: History of systematics, what is a phylogenetic tree/phylogeny, different types of phylogenetic trees, different types of input | BielefeldPhylo, ch. 1, 2.3 | |
12 | Tue, 24/4/2012 | Midterm exam (about lectures 1-10) | Midterm exam results here. | |
13+14 | Tue+Wed, 8+9/5/2012 | Introduction to graphs and trees; Discussion of midterm exam; discussion of projects |
BielefeldPhylo, ch. 2; List of projects here. |
|
15+16 | Tue+Wed, 15+16/5/2012 | More on trees: number of phylogenetic trees; distance-based data: ultrametrics, algorithm UPGMA | S&M 6.5, J&P 10.6,10.8 | BielefeldPhylo, ch. 6.1 and 6.2, Durbin-book ch. 7.3, Felsenstein-book ch. 11 (pp. 161-166). |
17 | Tue, 22/5/2012 | Character-based data: Perfect Phylogeny; Small Parsimony: Fitch' Algorithm | S&M 6.1,6.4, J&P 10.9, 10.10 | BielefeldPhylo, ch. 3.1, 3.2., 4.1, 4.2,
Durbin-book ch. 7.4. (If you want to read about NP-completeness, here is a chapter I once wrote: Chapter on NP-completeness. It is not part of this course! NP-completeness is out of scope of this course. This is only for whoever is interested.) |
18 | Wed, 23/5/2012 | Large Parsimony: exhaustive search, greedy sequential algorithm, Nearest Neighbour Interchange | J&P 10.11 | Felsenstein-book, ch. 4 |
18+19 | Tue + Wed, 30+31/5/2012 | Some very basic statistics: descriptive statistics vs. inferential statistics, sample vs. population; mean, median, percentiles, quartiles, variance, standard deviation; what is hypothesis testing, errors type I and type II, p-value | A. Aczel and J. Sounderpandian: Complete business statistics, 2008, M. Bramati: Calcolo delle probabilityà e statistica - Teoria ed esercizi, 1997. | |
20 | Tue, 5/6/2012 | Second written exam (about lectures 11 and 13-19) or: Final written exam (about the whole course) |
Final exam results here. | |
21 | Wed, 6/6/2012 | Discussion of projects | ||
22-24 | Wed, Thu, Fri, 11-12-13/7/2012 | Project presentations | Here is the complete program of the project presentations. |