Title of course:  Bioinformatics Algorithms (Module 2 of Fundamental Algorithms for Bioinformatics) 
Lecturer:  Zsuzsanna Lipták 
email: 
zsuzsanna DOT liptak AT univr DOT it 
Course times:  Thu 11.30  13.30 (aula I) (starts at 11:45) Fri 11.30  14.30 (aula C) (ends at 14:15) 
Student hours: 
Wed 10.0012.00 and by appointment 
Office:  Ca' Vignal, 2, 1st floor, right corridor, office 1.79 
News:
GOALS of the course: To learn about some of the basic problems and algorithms behind common bioinformatics applications (sequence alignment, sequence similarity, sequence assembly, phylogenetics).
CREDITS: 12 CFU, together with module 1 of this course (Algorithm Design). This module contributes half of the grade. It is gained via a written exam, followed by oral exam. You are admitted to the oral exam only if you passed the written exam.
If you are taking this course as a student of Molecular and Medical Biotechnology (LM9) ("mutuazione Algorithms for Computational Biology"), then you will get 6 CFU (credits) for this course, and will have a different exam to take. However, it still holds that you have to take a written and an oral exam.
SYLLABUS:
Here is an overview of the topics that will be covered. The topics may vary slightly.
SUGGESTED BOOKS:
Disclaimer: Here I list the topics which we covered in the lectures, and put up slides if there are some, as well as pointers to the respective chapters in books, and possibly other material of interest. However, you will always need the class notes, too! (even if there are slides)
When I include corrections of slides, this refers to the handed out version. The slides which you can download here have already been corrected. If you find any mistakes, please let me know!
No.  Date  Topic  Slides (if any)  Supporting material or chapters in books, comments 

1  Thu, 1/3/2018  Organisation of course. Introduction: Strings and sequences in Biology and in Computer Science.  Organisation (printable), Strings and Sequences in Biology (printable), Strings and Sequences in Computer Science (printable) 

2  Thu, 8/3/2018  Pairwise alignment I: Number of alignments: recursion, explicit exponential lower bound, exhaustive search. NeedlemanWunsch DPalgorithm for global alignment, analysis.  Pairwise Alignment (printable, part 1)  This and the following lectures follow pretty closely chapter 3 of the Setubal & Meidanis book. (The Ohlebusch book and the Bielefeld Lecture Notes on Sequence Analysis both use distance functions instead of similariy functions for their presentation of alignment algorithms.) Number of alignments: Bielefeld Lecture Notes on Sequence Analysis (see above), appendix B. 
3  Fri, 9/3/2018  Pairwise alignment II: Backtracing without and with traceback pointers. Spacesaving variant of NWalgorithm. Local alignment: SmithWaterman algorithm.  same as before; (printable, part 2)  global al.: Setubal & Meidanis, ch. 3.2.1, local al.: ch. 3.2.2 
4  Thu, 15/3/2018  Pairwise Alignment III: Semiglobal alignment. Affine gap functions in quadratic time and space.  Pairwise Alignment 2 (printable, part 1, printable, part 2)  semiglobal al.: Setubal & Meidanis, ch. 3.2.3 Correction of handout slides: on p.3 the score of the right alignment is 1 (not 3) 
5  Fri, 16/3/2018  Pairwise Alignment IV: Affine gap functions, cont'ed (example). Computing an optimal pairwise alignment in linear space via DivideandConquer.  Pairwise Alignment 3 (printable)  affine gap functions: Setubal & Meidanis, ch. 3.3.3 
6  Thu, 22/3/2018  Pairw. alignment IV cont'ed: Computing an optimal pairwise alignment in linear space, cont'ed. String distances I: edit distance. 
same as before  D&C algorithm for optimal al. in linear space: Bielefeld Lecture Notes on Sequence Analysis, ch. 5.4.; (a different algorithm for optimal al. in linear space: Setubal & Meidanis, ch. 3.3.1, and Ohlebusch ch. 8.1.2) String distances: Bielefeld Lecture Notes Seq. Analysis, ch. 3 
7  Fri, 23/3/2018  String distances II: metric, DP algorithm for computing the edit distance, Connection between edit distance and alignment score, LCSdistance, Hamming distance.  String distances (printable)  distance vs. similarity: Setubal & Meidanis ch. 3.6.1, Ohlebusch ch. 8.1.4 and 8.1.5 
8  Thu, 5/4/2018  String distances III: The qgram distance.  The qgram distance (printable)  qgram distance: Bielefeld Lecture Notes on Sequence Analysis, ch. 3.7 
9  Fri, 6/4/2018  De Bruijn sequences and de Bruijn graphs I: de Bruijn sequences over different alphabets, existence of de Bruijn sequences.  see class notes and slides for Lecture 22 (2) and the Compeau et al. article there  
10  Thu, 12/4/2018  De Bruijn sequences and de Bruijn graphs II: (full) de Bruijn graphs, construction of de Bruijn sequences via Euler tours.  see class notes and slides for Lecture 22 (2) and the Compeau et al. article there  
10a  Thu, 12/4/2018  Extra lecture: Computational Efficiency I.  Computational Efficiency 1 (printable)  
11  Fri, 13/4/2018  De Bruijn sequences and de Bruijn graphs III: de Bruijn subgraphs and sequences with qgram distance 0.  Bielefeld Lecture Notes on Sequence Analysis ch. 3.7  
12  Fri, 20/4/2018  Scoring matrices: PAM, BLOSUM.  Scoring Matrices (printable)  Setubal & Meidanis, ch. 3.5.1 Here are two PAM mutation matrices, and two PAM scoring matrices (the ones handed out in the lecture). You can read the original article on PAM matrices (more exactly, book chapter), by Margret Dayhoff et al, 1978. It is very understandably written! Here is the BLOSUM62 matrix. See also this nice little article by Sean Eddy about the BLOSUM matrices (S. Eddy, Where did the BLOSUM62 alignment score matrix come from? Nature Biotechnology 22, pp. 10351036, 2004). 
12a  Fri, 20/4/2018  Extra lecture: Computational Efficiency II.  Computational Efficiency 2 (printable)  
13  Thu, 3/5/2018  Multiple Sequence Alignment (MSA) I: definitions, projections, scoring, sumofpairs score, theorem about optimal projections and optimal MSAs.  Setubal & Meidanis, ch. 3.4.1, Ohlebusch, ch. 8.2  
14  Fri, 4/5/2018  MSA II: DPalgorithm, Star Alignment Algorithm  DPalgorithm: Setubal & Meidanis, ch. 3.4.1, Bielefeld Lecture Notes on Sequence Analysis: ch. 11.1, Ohlebusch 8.2 star alignment for similarity score: Setubal & Meidanis, ch. 3.4.2 

15  Thu, 10/5/2018  MSA III: star alignment as a 2approximation algorithm (for edit distance, and in general, for distance functions with triangle inequality)  star alignment as a 2approximation: Ohlebusch, ch. 8.2.2, Bielefeld Lecture Notes on Sequence Analysis, ch. 11.3  
16  Fri, 11/5/2018  MSA IV: progressive alignment  Bielefeld Lecture Notes on Sequence Analysis: Appendix H.1, Ohlebusch ch. 8.2.3  
17  Thu, 17/5/2018  Jens Stoye (Univ. Bielefeld): Phylogenetics I: Introduction. Characterbased methods. Definitions, perfect phylogeny (PP), Gusfield's algorithm for binary characters. Small Parsimony: Fitch' algorithm.  Bielefeld Phylogenetics Lecture Notes, ch. 2 (Introduction), ch. 3 (characters and states), ch. 4 (Perfect Phylogeny, Gusfield's algorithm), ch. 5.2 (Fitch' algorithm)  
18  Fri, 18/5/2018  Jens Stoye (Univ. Bielefeld): Phylogenetics II: distancebased methods. UPGMA, Neighbor Joining (NJ).  Bielefeld Phylogenetics Lecture Notes, ch. 7 (7.1, 7.3.1 and 7.3.4)  
19  Thu, 24/5/2018  Phylogenetics III: Number of rooted and unrooted trees. More on distance based methods: ultrametric matrices, 3point condition, and rooted trees; additive metrics, 4point conditon, and unrooted trees.  Phylogenetics 1 (printable)  same as previous lecture 
20  Fri, 25/5/2018  Phylogenetics IV: More on characterbased methods: homoplasies and the definitions of compatibility and perfect phylogeny (PP); parsimony problem: Large Parsimony and Small Parsimony; characterization of PP via number of statechanges.  Phylogenetics 2 (printable)  same as lecture 17 
21  Thu, 31/5/2018  Phylogenetics IV cont'ed: Algorithms for Large Parsimony: Greedy Sequential Addition Algorithm, BranchandBound Algorithm, Nearest Neighbor Interchange.  same as before  Large Parsimony (=Maximum Parsimony): Bielefeld Phylogenetics Lecture Notes, ch. 6.3 (Greedy Sequential Addition, NNI); for the branchandbound algorithm that we did, see here an excerpt from the Felsenstein book (pages 61 to 64). Correction of handout slides: Note that we did three algorithms for Large Parsimony (and not two, as it said on the handout slides!) 
22  Fri, 1/6/2018  Two more applications of qgrams: (1) Database search with BLAST. (2) Sequence Assembly with de Bruijn graphs. 
BLAST (summary only!) (printable) Fragment Assembly with de Bruijn graphs (printable) 
(1) Here is the orginal article on BLAST: S.F. Altschul et al. Basic Local Alignment Search Tool, J. Mol. Biol. 215: 403410 (1990); and the one on BLAST2: S.F. Altschul et al. Gapped BLAST and PSIBLAST: a new generation of protein database search programs, Nucleic Acids Research, Vol. 25, No. 17: 33893402 (1997).
There is a very detailed section on BLAST in the book by Mount (full ref. see above), within chapter 7, p. 300 onwards. BLAST is also described in Chapter 6.3 of the BielSeqAn Lecture Notes (see above). There are excellent help pages on the NCBI Blast page, among them on how to submit a request and how to read the results (version of Sept. 2016, always get the latest one!). (2) See this article: How to apply de Bruijn graphs to genome assembly by Phillip E.C. Compeau, Pavel A. Pevzner, and Glenn Tesler, Nature Biotechnology, vol. 29 no. 11 (Nov. 2011); there is also supplementary material for this article. 
23  Fri, 8/6/2018  Summary of course.  Study list for students of Medical Bioinformatics and for students of Molecular and Medical Biotechnology. 