[ 665MBBTGEDU16 ] VU Genomic Data Analysis

Workload 6 ECTS
Education level B2 - Bachelor's programme 2nd year
Study areas (*)Biophysics
Responsible person Irene Tiemann-Boege
Hours per week 4 hpw
Coordinating university Johannes Kepler University Linz
Detailed information
Original study plan Bachelor's programme Molecular Biosciences 2018W
Objectives With the advent of the Human Genome Project, new tools and resources have become available that deeply impact the field of biology. The aim of this course is to introduce students to the tools and databases needed to analyze genomic information, which are key to any research project in biology. Additionally, the course offers a laboratory module that guides students through the steps for accessing biological databases and using them.
Goals

  1. To provide an introduction to genomic databases with a focus on the National Center for Biotechnology Information (NCBI), UCSC, and ENSEMBL
  2. To focus on the analysis of DNA and proteins
  3. To introduce the student to the analysis of genomes
  4. To combine theory and practice to help students solve common research problems in biology with the resources and information available in different online databases.
Subject
PART 1 – Introduction to genomics
1. The Human Genome Project

  • Definition of bioinformatics/genomics
  • The Human Genome Project - the start of genomics
  • Sequencing the human genome
  • Assembly: paired-end and shotgun sequencing
  • Main conclusions of the Human Genome Project

2. Genomic variation

  • Genomic variation
  • From SNPs to copy number variants and their evolution
  • HapMap project
  • Uses of SNPs

3. Genome projects / Comparative genomics

  • Methods to detect genomic variation
  • Sequencing Projects
  • Understanding a genome sequence
  • Structural features of a genome

4. Emerging sequencing technologies

  • New sequencing technologies (NGS)
  • Principles of next generation sequencing technologies
  • Commercial platforms
  • Uses of NGS
  • Individual genomes

5. Application of genomics

  • 3 case studies - why is genomics important?
  • Genetics perspective - cure diseases
  • Synthetic biology: build your own genome
  • Evolutionary biology: Where do we come from?
  • Commercializing genomics
  • Ethical aspects

PART 2 – Introduction to databases (computer lab based)
1. Accessing information about DNA, proteins, diseases, and literature using the NCBI database

  • Introduction to databases in general
  • Overview of the NCBI website
  • Accessing information: accession numbers, RefSeq, FASTA sequence, genome assembly
  • NCBI databases: Gene, CCDS, Taxonomy, Nucleotide, Protein
  • NCBI-based database to get information about genetic diseases: OMIM
  • NCBI-based database for literature search: PubMed

2. Polymorphisms, PCR, primer design and genotyping

  • Definition of SNP, allele, genotype, haplotype
  • dbSNP database and Hardy-Weinberg Equilibrium
  • What is PCR and how does it work?
  • Primer design with Primer3Plus
  • Design of restriction enzyme digests and genotyping assays using NEBcutter
  • Polymorphisms associated with cancer using the COSMIC database
  • Polymorphisms within a population using the gnomAD database

3. Genome Browsers and sequence alignments

  • Two genome browsers: UCSC, ENSEMBL
  • Definitions: homologs, paralogs, orthologs
  • BLAST – Basic local alignment search tool
  • How to use BLAST: pairwise and database sequence alignments
  • Scoring Matrices
  • How to interpret BLAST results
  • Primer-BLAST
  • Multiple sequence alignments using Clustal Omega

4. Protein analysis

  • Introduction to proteins and protein structure
  • Protein databases to get general information: UniProt, ExPASy (ProtParam)
  • 3D structure of proteins using the Protein Data Bank (PDB)
  • Analyze 3D protein structures using JSmol
  • Mapping genomic variants to protein sequence and structure with VarMap
  • The Human Protein Atlas

5. Revision and final report

  • A clinical report about a rare genetic variant/disease is used to practice again all the databases and tools discussed in the computer lab sessions. Students present their results in a ‘Final Report’.
Criteria for evaluation
  • 50% Final exam (short answer / multiple choice questions based on the material learned during the course and the lab modules)
  • 50% Lab work (results from computer lab)
    • 32% Quizzes: the exercises covered in computer lab sessions 1-4 are reviewed with short-answer / multiple-choice questions in a Moodle test
    • 18% Final report of computer lab session 5, due on a predefined date after the last class
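The weighting above can be illustrated with a short sketch. It assumes that each component is scored from 0 to 100 and that the components are combined linearly; the catalog states only the percentages, so the function name, the score scale, and the example scores are hypothetical.

```python
# Weights in percent of the overall result, taken from the criteria above.
WEIGHTS = {"final_exam": 50, "quizzes": 32, "final_report": 18}


def overall_score(scores):
    """Combine component scores (assumed to be 0-100 each) into one percentage."""
    # The lab work share (quizzes + final report) and the exam must add up to 100%.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student scores, for illustration only.
example = {"final_exam": 80, "quizzes": 90, "final_report": 70}
print(overall_score(example))
```

How the resulting percentage maps to a grade is not specified here and is left out of the sketch.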
Methods The course is taught in two parts. The first part focuses on the theoretical background of genomics, including topics in genetics, molecular biology, and biochemistry. The second part introduces the databases with step-by-step examples of how to retrieve different kinds of information. During the laboratory module, students solve a series of problems based on the material taught.
Language English
Study material
  1. Bioinformatics and Functional Genomics by Jonathan Pevsner (Wiley-Blackwell, 2nd edition 2009)
  2. A Primer of Genome Science by Greg Gibson and Spencer V. Muse (Sinauer Associates, 3rd edition 2008)
Changing subject? No
Further information Until term 2016S known as: 665GEDAGEDU11 VU Genomic Data Analysis
Earlier variants They also cover the requirements of the curriculum (from - to)
665GEDAGEDU11: VU Genomic Data Analysis (2011W-2016S)
On-site course
Maximum number of participants 25
Assignment procedure Assignment according to priority