[ 921INSYMSRK13 ] KV Multimedia Search and Retrieval
Workload | Education level | Study areas | Responsible person | Hours per week | Coordinating university |
4.5 ECTS | M1 - Master's programme, 1st year | Computer Science | Markus Schedl | 3 hpw | Johannes Kepler University Linz |
Detailed information |
Original study plan |
Master's programme Computer Science 2025W |
Learning Outcomes |
Competences |
Students understand and can apply basic methods of multimedia signal processing and analysis. They are able to design, implement, and evaluate multimedia search and retrieval systems.
|
Skills |
- Knowing, understanding, and applying various methods to extract features from texts [k3]
- Knowing, understanding, and applying various methods to extract features from audio/music [k3]
- Knowing, understanding, and applying various methods to extract features from images and videos [k3]
- Knowing, understanding, and applying various methods to fuse single-modal representations using early and late fusion approaches, in order to build multimodal ML-based systems [k3]
- Understanding various metrics that quantify accuracy aspects of a retrieval system, and analyzing them for different retrieval algorithms [k4]
- Given a predefined search/retrieval task on multimedia data: elaborating an approach to solve it using precomputed descriptors and datasets, building a prototype system that implements the devised approach, and evaluating the approach/system [k6]
|
Knowledge |
- Different media types and their "semantic gap"
- Text retrieval: data sources, key concepts, inverted index, Boolean retrieval model, vector space model, relevance feedback, latent semantic analysis, PageRank
- Audio/music retrieval: basics of audio signal processing, time- and frequency-domain features, content similarity, acoustic scales, bag-of-audio-words, hash tokens/fingerprinting, applications of music information retrieval, context-based similarity
- Image retrieval: low-level features (color, texture, shape), salient points/SIFT, bag-of-visual-words, semantic descriptors
- Video: compression, content-based video descriptors (e.g., color, texture, shape, motion), segmentation and summarization, deep learning-based image/video processing
- Multimedia data fusion: early fusion and late fusion
- Evaluation metrics for information retrieval systems, including precision, recall, F1 score, NDCG, MAP, beyond-accuracy aspects
|
|
Criteria for evaluation |
Written exam (oral exam possible in exceptional cases), practical exercise(s), reports, presentations
|
Methods |
Lectures, practical exercise(s)
|
Language |
English |
Study material |
Slides, scientific papers
|
Changing subject? |
No |
On-site course |
Maximum number of participants |
- |
Assignment procedure |
Direct assignment |