Course detail

Analysis of Biological Sequences

FEKT-MPA-ABS Acad. year: 2024/2025

The course provides the statistical foundations of sequence analysis and an overview of its core algorithms. Topics include background on probability, hidden Markov models, and multiple hypothesis testing. The algorithms covered include optimal pairwise local and global alignment, multiple alignment, gene finding, and phylogenetic trees.

Language of instruction

English

Number of ECTS credits

6

Mode of study

Not applicable.

Entry knowledge

The student should be able to explain the fundamental principles of genetics, should know the basic terms and laws of molecular biology, and should have a basic grounding in digital signal processing. In general, knowledge at the Bachelor's degree level is required.

Rules for evaluation and completion of the course

Up to 40 points from computer exercises (3 tests and 1 homework assignment).
Up to 60 points from the final written exam.
The exam verifies the student's orientation in advanced processing of biological sequences and the ability to design methods for sequence analysis and to apply operations on sequences.
Computer exercises are compulsory. Excused absences can be made up.

Aims

The aim of the course is to provide knowledge of advanced methods for the analysis of biological sequences based on deterministic as well as stochastic approaches. Applications cover pairwise alignment, gene finding and phylogenetic trees.
The student will be able to:
- describe basic methods of computer processing of symbolic sequences,
- explain the characteristics of DNA and protein evolution,
- describe the principles of methods for the construction and analysis of phylogenetic trees,
- discuss the advantages and disadvantages of the methods,
- explain the principle of numerical conversion of symbolic biological sequences.

Study aids

Not applicable.

Prerequisites and corequisites

Not applicable.

Basic literature

Amjesh, R. Bioinformatics for beginners. LAP LAMBERT Academic Publishing, 2019. ISBN 978-6200262851 (EN)
Durbin, R. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 2002. ISBN 978-0521629713 (EN)
Rosypal, S. Nový přehled biologie. Scientia, Praha 2003. ISBN 80-7183-268-5 (CS)
Srinivasa, K. G. Statistical Modelling and Machine Learning Principles for Bioinformatics Techniques, Tools, and Applications. Springer, 2020. ISBN 978-9811524448 (EN)

Recommended reading

Kejnovský, E., Hobza, R. Evoluční genomika, Elportál, Brno: Masarykova univerzita, 2006. ISSN 1802-128X (CS)
Pevzner, P. A. An Introduction to Bioinformatics Algorithms (Computational Molecular Biology). The MIT Press, 2004. ISBN 978-0262101066 (EN)

Classification of course in study plans

  • Programme MPA-BTB, Master's, 1st year of study, summer semester, compulsory

Type of course unit

Lecture

26 hours, optional

Syllabus

1. Probability concepts in basic molecular biology.
2. Classic and modern pairwise alignment algorithms.
3. Statistical significance of alignment scores and interpretation of an alignment algorithm's output.
4. Mechanism and the use of dynamic programming.
5. Implementation of the Needleman-Wunsch and Smith-Waterman algorithms.
6. Multiple alignment and phylogenetic reconstruction.
7. Evolutionary assumptions of different models and algorithms.
8. Likelihood approach to phylogenetic reconstruction.
9. Markov models and hidden Markov models (HMM) in the genomic context.
10. Essential algorithms for inference with HMMs.
11. HMMs for gene finding.
12. Other algorithms for gene finding.
13. Important algorithmic and statistical advances in bioinformatics that address biologically important questions.

Exercise in computer lab

26 hours, compulsory

Syllabus

1. Classical and Bayesian probability.
2. Pairwise alignment algorithms.
3. Computing alignment scores and interpreting an alignment algorithm's output.
4. Algorithms for dynamic programming.
5. Implementation of the Needleman-Wunsch and Smith-Waterman algorithms.
6. Multiple alignment.
7. Tracking sequence evolution.
8. Phylogenetic reconstruction.
9. Markov models in the genomic context.
10. Hidden Markov models in the genomic context.
11. HMMs for gene finding I.
12. HMMs for gene finding II.
13. Other algorithms for gene finding.

Project

13 hours, compulsory

Syllabus

Individual projects in the analysis of biological sequences.