Content of lectures:
1. Introduction to bioinformatics: characterization of the course, conditions for the course, definitions of bioinformatics, methodology, history, examples. DNA sequence as information, formats.
2. Biological databases: DNA and its structure, history. How to specify sequence similarity, substitution matrices. Sequence alignment, global and local alignment, construction of alignment, BLAST search, multiple alignment, tools and programs, use of multiple alignment. Identification of sequence motifs. Hidden Markov models.
3. Phylogenetic analysis: types of sequences used for phylogenetic analyses, description of a phylogenetic tree, rooted and unrooted trees, outgroup and ingroup. Editing of alignments for phylogenies. Informative and non-informative positions.
4. Phylogenetic analysis: taxon sampling and its influence on tree topology. Methods for phylogenetic tree construction: distance methods, character-based methods. How to deal with different rates of evolution. Robustness of the tree. Phylogenetic species trees and gene trees. Programs. Phylogenetic artifacts.
5. Classification of organisms, current view on the evolution and classification of eukaryotes. Endosymbiotic origin of eukaryotes and its effect on tree topologies. Chimeric structure of the eukaryotic genome.
6. Principles of import of nuclear-encoded proteins into eukaryotic organelles. Primary and secondary endosymbiosis and the origin of eukaryotic organelles. Prediction of protein targeting. Mapping of metabolic pathways. Mosaic origin of eukaryotic metabolic pathways.
7. Introduction to Python for Biologists 1: introduction to programming. How Python works. Reading, writing and filtering sequence data files. Counting GC content. Transcription and translation.
8. Introduction to Python for Biologists 2: using functions and modules. Modifying FASTQ files. Reading and analyzing BLAST results. Identifying contamination using BLAST. Pipelines.
9. High-throughput sequencing methods. Next-generation sequencing: pyrosequencing, Solexa, SOLiD. Third-generation sequencing: Pacific Biosciences and Oxford Nanopore sequencing. Advantages and pitfalls.
10. Genome evolution: historical concepts of genome evolution, evolutionary forces that shape the structure and content of genomes, changes in genomes related to the life history of organisms.
11. Genome sequencing: historical overview of genome sequencing. First organisms: bacteriophages (MS2, PhiX174), bacteria (Haemophilus influenzae, E. coli), first eukaryotic genomes, human genome. Scaling up DNA sequencing to tackle larger genomes (the Human Genome Project as a case study). Historical perspective and public versus private initiatives. Techniques used to perform large-scale sequencing. Genome sequencing of model organisms: which, how and why?
12. Human evolution, medical applications. Gene expression, microarrays, RNA-Seq, differential expression. Meta-omics: metabolic reconstruction, functional ecology.
13. Phylogenomic analyses: advantages and limitations, single-gene vs multi-gene phylogeny. Orthologues vs paralogues. Taxon sampling, strategies for constructing large trees. Special phylogenetic models. Evaluation of results.
14. High-performance computing and its use in biology. How high-performance computers work and a practical introduction to their effective use. MPI vs OpenMP. Why do we need high-performance computers? High memory vs computational capacity.
Content of practices:
Practical training in the methods covered by the lectures.
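As an illustration of the kind of exercise the Python lectures point to (counting GC content, transcription), a minimal sketch follows. The function names and the example sequence are hypothetical and are not taken from the course materials.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def transcribe(dna):
    """Return the mRNA matching the coding strand: every T becomes U."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    example = "ATGGCCATTTAA"  # made-up sequence, for illustration only
    print(f"GC content: {gc_content(example):.2f}")
    print(f"mRNA: {transcribe(example)}")
```

The sketch treats the input as the coding strand and ignores ambiguous bases such as N; real sequence files would usually be read with a dedicated parser.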