Title of course: | Algorithms for Computational Biology (Institutional page) |
Lecturer: | Zsuzsanna Lipták |
email: |
zsuzsanna DOT liptak AT univr DOT it (Please put "Course Algorithms for Computational Biology" in the subject line.) |
Course times: | Thu 10.30 - 12.30 (aula L) Fri 13.30 - 16.30 (aula H) |
Student hours: |
see under "News" |
Office: | Ca' Vignal, 2, 1st floor, right corridor, stanza 1.79 |
News:
GOALS of the course:
CREDITS: 6 CFU, written exam, followed by oral exam. You are admitted to the oral exam only if you passed the written exam. For those who want this course to count for the future Masters in Medical Bioinformatics (Dip. Inf.): it can replace the second part (6 CFU) of the course "Fundamental Algorithms in Computational Biology" only if you do an additional assignment on a topic from algorithms in computational biology (to be chosen together with the lecturer).
LANGUAGE: Lectures are in English; however, questions can be asked in English or Italian. Written exams will be in Italian only, unless an English version is requested by the students; answers can be given in Italian or in English, main thing that they be legible.
A note on ATTENDANCE: As in most university courses, attendance of classes is not mandatory. All of what I teach in this course is completely standard and is contained in any algorithmically oriented bioinformatics book (see e.g. list on this page). In the exam, you will be asked to do things like compute an alignment of two strings, given a scoring function. You can learn this from any book or online course. However, I believe that one gains much more from attending a course than from only studying by oneself. University classes give the student the opportunity of following a live course given by a live lecturer. If nothing else, attending the course forces you to spend a certain amount of time each week with this course. No handouts or slide presentations can completely substitute a lecturer. So if you can make it, I would advise you to attend. If you can't, you have to rely on the notes of your colleagues and/or on books.
SYLLABUS:
SUGGESTED BOOKS:
EXAM: Written exam, followed by oral exam (only if you passed the written exam).
Disclaimer: Here I list the topics which we treated in the lectures, and put up slides if we have them, as well as pointers to the respective chapters in books, and possibly other material of interest. However, you will always need the class notes, too! (even if there were slides, as well)
In the table below, the following abbreviations are used (for full details of the books, see above):
No. | Date | Topic | Slides (if any) | Supporting material or chapters in books, comments |
---|---|---|---|---|
1 | Thu, 15/10/2015 | Organisation of course. Introduction: Computational Efficiency I (using the example of three algorithms for computing the n'th Fibonacci number). | Organisation (print version), Computational efficiency I (print version) |
J&P 2.6 (p.31 onwards), 2.7. See here for more on Fibonacci numbers and nature, and here for more on Fibonacci numbers and maths. |
2 | Fri, 16/10/2015 | Strings and sequences in biology. Pairwise sequence alignment. | Strings and sequences in biology (print version) | S&M 3.1, BielSeqAn ch. 1 |
3 | Thu, 22/10/2015 | Number of alignments of two sequences, recursive formula. | BielSeqAn ch. 5.2 | |
4 | Fri, 23/10/2015 | Strings and sequences in computer science, basic combinatorics on strings. | Strings and sequences in computer science (print version) | S&M 2.1, BielSeqAn 2.4 |
5 | Thu, 29/10/2015 | Needleman-Wunsch Dynamic Programming algorithm for global alignment. | S&M 3.2.1, J&P 6.6 | |
- | Fri, 30/10/2015 | lecture cancelled | ||
6 | Thu, 5/11/2015 | Needleman-Wunsch DP algorithm for global alignment continued: backtracing, space-saving variant. Smith-Waterman DP algorithm for local alignment. | S&M 3.2.2 | |
7 | Fri, 6/11/2015 | Computational Efficiency II: Introduction to algorithm analysis, O-notation, growth of functions, using the example of two algorithms for global alignment (DP algo and exhaustive algo). | Computational Efficiency II (print version) | S&M 2.3, J&P 2.8. Please read sections 4.1 and 4.2 of the CTcourse notes and solve all exercises in Section 4.4 (except ex. 3). It is also useful to read (and understand!) Section 2 (What is an algorithm) and Section 5 (Some basic maths). |
8 | Thu, 12/11/2015 | Semiglobal alignment. Multiple Sequence Alignment I: induced pairwise alignments, sum-of-pairs score. | S&M 3.2.3, S&M 3.4.1, BielSeqAn ch. 13.1-13.3 | |
9 | Fri, 13/11/2015 | Multiple Sequence Alignment II: DP-algorithm. | S&M 3.4.1, BielSeqAn ch. 13 (partially) | |
10 | Thu, 19/11/2015 | Search spaces. Multiple Sequence Alignment III: star alignment heuristic. | S&M 3.4.2 | |
11 | Fri, 20/11/2015 | Sequence similarity and distance: metric, edit distance, connection between optimal alignments and edit distance. | String distance (print version) | J&P 6.4, S&M 3.6.1 (partially) |
12 | Thu, 26/11/2015 | Hamming distance, LCS distance; star alignment II: analysis. | (same slides as last time) | |
- | Fri, 27/11/2015 | no lecture due to overlap with another course | ||
13 | Thu, 3/12/2015 | PAM scoring matrices I: motivation, mutation probability matrix, mutation in more than one time step | S&M 3.5.1; J&P 3.7 | |
14 | Fri, 4/12/2015 | PAM scoring matrices II: matrix multiplication, log-odds matrix. BLOSUM matrices (sketch only) Brief overview of projects, list see here. |
Here are two PAM mutation matrices, and two PAM scoring matrices (the ones handed out in the lecture). You can read the original article on PAM matrices (more exactly, book chapter), by Margret Dayhoff et al, 1978. It is very understandable! Here is the BLOSUM62 matrix. See also this nice little article by Sean Eddy about the BLOSUM matrices (S. Eddy, Where did the BLOSUM62 alignment score matrix come from? Nature Biotechnology 22, pp. 1035-1036, 2004). | |
15 | Thu, 10/12/2015 | Heuristic data base search: BLAST | Here is the orginal article on BLAST: S.F. Altschul et al. Basic Local Alignment Search Tool, J. Mol. Biol. 215: 403-410 (1990); and the one on BLAST2: S.F. Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Research, Vol. 25, No. 17: 3389-3402 (1997). There is a very detailed section on BLAST in the book by Mount (full ref. see above), within chapter 7, p. 300 onwards. BLAST is also described in Chapter 6.4.2 of the BielSeqAn. There are excellent help pages on the NCBI Blast page, among them on how to submit a request and how to read the results. |
|
16 | Fri, 11/12/2015 | BLAST (cont.) Summary of first part (sequence comparison, algorithm analysis). |
BLAST (print version) (only summary!) |
|
17 | Thu, 17/12/2015 | Phylogenetics Introduction: types of phylogenetic trees, types of input data, simple properties of trees | BielPhylo ch. 1 and 2, Felsenstein book (see above), ch. 1; S&M pp. 175-177; here is the handout of Haeckel's tree of life (source: Wikimedia Commons). | |
18 | Fri, 18/12/2015 | Number of phylogenetic trees, distance based data, ultrametric matrices and trees, UPGMA | BielPhylo ch. 2.4. Felsenstein ch. 3 (pp. 19-25): number of phylogenetic trees.
BielPhylo ch. 6 (partially). There is a very nice description of UPGMA and the molecular clock assumption in the book by Durbin et al., ch. 7.3, pp. 166-170; Felsenstein-book ch. 11 (p. 161-166). S&M ch. 6.5 (much more detailed than what we are doing!). Here is the article from Nature on the issues surrounding the phylogeny of whales which I handed out in class: Zhexi Luo: In search of the whales' sisters. Nature vol 404: p. 235-239, 2000. Please read this article. We will discuss it at the end of the course. |
|
19 | Thu, 7/1/2016 | UPGMA cont'ed. Character data. Compatibility. Perfect Phylogeny. |
Example on UPGMA: page 1 and page 2. Example: Is this a PP? |
BielPhylo ch. 3.1 and 3.2, Felsenstein Ch. 1. |
20 | Fri, 8/1/2016 | Perfect Phylogeny cont'ed. Parsimony. Most parsimonious tree. Small Parsimony: Fitch' algorithm. | BielPhylo ch. 4.1 and 4.2. | |
21 | Thu, 14/1/2016 | Heuristics for Maximum Parsimony: Greedy Sequential Addition Algorithm, Branch-and-Bound for Parsimony. | Phylogenetics Trees 2 (print version) - note that the slides for Phylogenetics Trees 1 are included in "Summary" slides. |
BielPhylo ch. 5.1 (different types of heuristics) and 5.2.3 (greedy sequential addition algorithm). For the branch-and-bound algorithm that we did, see here an excerpt from the Felsenstein book (pages 61 to 64). |
22 | Fri, 15/1/2016 | Discussion of article on whale phylogeny Summary of part on Phylogenetics. |
Phylogenetics Summary (print version) |
Article on controversy about whale phylogeny: Zhexi Luo: In search of the whales' sisters. Nature vol 404: p. 235-239, 2000. Here are two serious animations about whale evolution: from the Smithsonian Institute, Washington D.C.: animation, and a very sweet short one, from the Museum of New Zealand Te Papa Tongarewa: short animation. |