Exam preparation (midterm exam).
The exam will take place on Wed 30 April, 2014, in Aula F, from 8.15 to 10.15. Please bring along a document of identification. Please arrive a bit ahead of time and make sure you have identification (with photo) with you.
Part I | |
Formal | alphabet, strings, prefixes, suffixes, substrings, subsequences; matrix, factorial, binomial coefficient, logarithms, sums |
Pairwise sequence alignment | Practical: Compute the score of a given alignment, for given scoring scheme;
compute sim(s,t) using the DP algorithm of Needleman-Wunsch; compute an optimal alignment/all optimal alignments
using the DP-table; the connection between the DP-table and alignments: be able to give the alignment represented
by a path in the table and vice versa; find all optimal local alignments using the DP-algorithm of Smith-Waterman;
semiglobal alignment (algorithm) Theoretical: Give the definition of the cell D(i,j) for global and for local alignment. Give the running times and space requirements of these algorithms. Why are these preferable to the brute-force solution? What are affine gap penalties? For which type of problem is which version of the algorithm appropriate? |
Multiple sequence alignment | Theoretical: Explain and analyse the DP algorithm. Is it feasible? Which heuristics do you know? What is the SP-score? Practical: Score a multiple alignment with the SP-score. Heuristics: star alignment, progressive alignment, apply these to a small example. |
Algorithm analysis | Theoretical: What are the two parameters of algorithms we analyse? What do O-classes measure? With respect to what? Which classes are feasible (manageable) and which aren't? Why? What are heuristics, why are they used? Practical: Put in order of O-classes a given set of functions (slowest growing first, fastest growing last). Compare two functions, which grows faster? Which is preferable for an algorithm's running time/space consumption? Say of certain functions whether they are polynomial, linear, quadratic, cubic, exponential, superexponential. |
Scoring matrices | Theoretical: Explain how the PAM scoring matrices are computed. Explain the biological motivation.
What is the underlying idea? What data are used? What does the number k mean in PAMk?
What do the entries represent? Interpret their values. Why do we use a "log-odds" matrix? What is the main difference between PAM and BLOSUM matrices? Practical: Use a given PAMk or BLUSUM-k matrix (to be supplied in the exam) to score an alignment. |