Algorithms for Computational Biology
(academic year 2015/2016)

Masters in Molecular and Medical Biotechnology

Title of course:Algorithms for Computational Biology
(Institutional page)
Lecturer:Zsuzsanna Lipták
email: zsuzsanna DOT liptak AT univr DOT it  
(Please put "Course Algorithms for Computational Biology" in the subject line.)
Course times: Thu 10.30 - 12.30 (aula L)
Fri 13.30 - 16.30 (aula H)
Student hours:
Thu 13.30-15.30 (during term time)
see under "News"
Office:Ca' Vignal, 2, 1st floor, right corridor, stanza 1.79


GOALS of the course:

  1. to learn about some basic problems and algorithms behind common bioinformatics applications (sequence alignment, sequence similarity, phylogenetics), and
  2. to get an idea of some basic computational issues (problem specification, efficiency of algorithms, limitations).

CREDITS: 6 CFU, written exam, followed by oral exam. You are admitted to the oral exam only if you passed the written exam. For those who want this course to count for the future Masters in Medical Bioinformatics (Dip. Inf.): it can replace the second part (6 CFU) of the course "Fundamental Algorithms in Computational Biology" only if you do an additional assignment on a topic from algorithms in computational biology (to be chosen together with the lecturer).

LANGUAGE: Lectures are in English; however, questions can be asked in English or Italian. Written exams will be in Italian only, unless an English version is requested by the students; answers can be given in Italian or in English, main thing that they be legible.

A note on ATTENDANCE: As in most university courses, attendance of classes is not mandatory. All of what I teach in this course is completely standard and is contained in any algorithmically oriented bioinformatics book (see e.g. list on this page). In the exam, you will be asked to do things like compute an alignment of two strings, given a scoring function. You can learn this from any book or online course. However, I believe that one gains much more from attending a course than from only studying by oneself. University classes give the student the opportunity of following a live course given by a live lecturer. If nothing else, attending the course forces you to spend a certain amount of time each week with this course. No handouts or slide presentations can completely substitute a lecturer. So if you can make it, I would advise you to attend. If you can't, you have to rely on the notes of your colleagues and/or on books.



EXAM: Written exam, followed by oral exam (only if you passed the written exam).


Disclaimer: Here I list the topics which we treated in the lectures, and put up slides if we have them, as well as pointers to the respective chapters in books, and possibly other material of interest. However, you will always need the class notes, too! (even if there were slides, as well)

In the table below, the following abbreviations are used (for full details of the books, see above):

No. Date Topic Slides (if any) Supporting material or chapters in books,
1 Thu, 15/10/2015 Organisation of course. Introduction: Computational Efficiency I (using the example of three algorithms for computing the n'th Fibonacci number). Organisation (print version),
Computational efficiency I (print version)
J&P 2.6 (p.31 onwards), 2.7.
See here for more on Fibonacci numbers and nature, and here for more on Fibonacci numbers and maths.
2 Fri, 16/10/2015 Strings and sequences in biology. Pairwise sequence alignment. Strings and sequences in biology (print version) S&M 3.1, BielSeqAn ch. 1
3 Thu, 22/10/2015 Number of alignments of two sequences, recursive formula.   BielSeqAn ch. 5.2
4 Fri, 23/10/2015 Strings and sequences in computer science, basic combinatorics on strings. Strings and sequences in computer science (print version) S&M 2.1, BielSeqAn 2.4
5 Thu, 29/10/2015 Needleman-Wunsch Dynamic Programming algorithm for global alignment.   S&M 3.2.1, J&P 6.6
- Fri, 30/10/2015 lecture cancelled    
6 Thu, 5/11/2015 Needleman-Wunsch DP algorithm for global alignment continued: backtracing, space-saving variant. Smith-Waterman DP algorithm for local alignment.   S&M 3.2.2
7 Fri, 6/11/2015 Computational Efficiency II: Introduction to algorithm analysis, O-notation, growth of functions, using the example of two algorithms for global alignment (DP algo and exhaustive algo). Computational Efficiency II (print version) S&M 2.3, J&P 2.8.
Please read sections 4.1 and 4.2 of the CTcourse notes and solve all exercises in Section 4.4 (except ex. 3). It is also useful to read (and understand!) Section 2 (What is an algorithm) and Section 5 (Some basic maths).
8 Thu, 12/11/2015 Semiglobal alignment. Multiple Sequence Alignment I: induced pairwise alignments, sum-of-pairs score.   S&M 3.2.3, S&M 3.4.1, BielSeqAn ch. 13.1-13.3
9 Fri, 13/11/2015 Multiple Sequence Alignment II: DP-algorithm.   S&M 3.4.1, BielSeqAn ch. 13 (partially)
10 Thu, 19/11/2015 Search spaces. Multiple Sequence Alignment III: star alignment heuristic.   S&M 3.4.2
11 Fri, 20/11/2015 Sequence similarity and distance: metric, edit distance, connection between optimal alignments and edit distance. String distance (print version) J&P 6.4, S&M 3.6.1 (partially)
12 Thu, 26/11/2015 Hamming distance, LCS distance; star alignment II: analysis. (same slides as last time)  
- Fri, 27/11/2015 no lecture due to overlap with another course    
13 Thu, 3/12/2015 PAM scoring matrices I: motivation, mutation probability matrix, mutation in more than one time step   S&M 3.5.1; J&P 3.7
14 Fri, 4/12/2015 PAM scoring matrices II: matrix multiplication, log-odds matrix. BLOSUM matrices (sketch only)
Brief overview of projects, list see here.
  Here are two PAM mutation matrices, and two PAM scoring matrices (the ones handed out in the lecture). You can read the original article on PAM matrices (more exactly, book chapter), by Margret Dayhoff et al, 1978. It is very understandable! Here is the BLOSUM62 matrix. See also this nice little article by Sean Eddy about the BLOSUM matrices (S. Eddy, Where did the BLOSUM62 alignment score matrix come from? Nature Biotechnology 22, pp. 1035-1036, 2004).
15 Thu, 10/12/2015 Heuristic data base search: BLAST   Here is the orginal article on BLAST: S.F. Altschul et al. Basic Local Alignment Search Tool, J. Mol. Biol. 215: 403-410 (1990); and the one on BLAST2: S.F. Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Research, Vol. 25, No. 17: 3389-3402 (1997).
There is a very detailed section on BLAST in the book by Mount (full ref. see above), within chapter 7, p. 300 onwards. BLAST is also described in Chapter 6.4.2 of the BielSeqAn.
There are excellent help pages on the NCBI Blast page, among them on how to submit a request and how to read the results.
16 Fri, 11/12/2015 BLAST (cont.)
Summary of first part (sequence comparison, algorithm analysis).
(print version)
(only summary!)
17 Thu, 17/12/2015 Phylogenetics Introduction: types of phylogenetic trees, types of input data, simple properties of trees   BielPhylo ch. 1 and 2, Felsenstein book (see above), ch. 1; S&M pp. 175-177; here is the handout of Haeckel's tree of life (source: Wikimedia Commons).
18 Fri, 18/12/2015 Number of phylogenetic trees, distance based data, ultrametric matrices and trees, UPGMA   BielPhylo ch. 2.4. Felsenstein ch. 3 (pp. 19-25): number of phylogenetic trees. BielPhylo ch. 6 (partially).
There is a very nice description of UPGMA and the molecular clock assumption in the book by Durbin et al., ch. 7.3, pp. 166-170; Felsenstein-book ch. 11 (p. 161-166). S&M ch. 6.5 (much more detailed than what we are doing!).
Here is the article from Nature on the issues surrounding the phylogeny of whales which I handed out in class: Zhexi Luo: In search of the whales' sisters. Nature vol 404: p. 235-239, 2000. Please read this article. We will discuss it at the end of the course.
19 Thu, 7/1/2016 UPGMA cont'ed.
Character data. Compatibility. Perfect Phylogeny.
Example on UPGMA: page 1 and page 2.
Example: Is this a PP?
BielPhylo ch. 3.1 and 3.2, Felsenstein Ch. 1.
20 Fri, 8/1/2016 Perfect Phylogeny cont'ed. Parsimony. Most parsimonious tree. Small Parsimony: Fitch' algorithm.   BielPhylo ch. 4.1 and 4.2.
21 Thu, 14/1/2016 Heuristics for Maximum Parsimony: Greedy Sequential Addition Algorithm, Branch-and-Bound for Parsimony. Phylogenetics Trees 2
(print version) - note that the slides for Phylogenetics Trees 1 are included in "Summary" slides.
BielPhylo ch. 5.1 (different types of heuristics) and 5.2.3 (greedy sequential addition algorithm). For the branch-and-bound algorithm that we did, see here an excerpt from the Felsenstein book (pages 61 to 64).
22 Fri, 15/1/2016 Discussion of article on whale phylogeny
Summary of part on Phylogenetics.
Phylogenetics Summary
(print version)
Article on controversy about whale phylogeny: Zhexi Luo: In search of the whales' sisters. Nature vol 404: p. 235-239, 2000. Here are two serious animations about whale evolution: from the Smithsonian Institute, Washington D.C.: animation, and a very sweet short one, from the Museum of New Zealand Te Papa Tongarewa: short animation.