Machine Learning on Multimedia Data

Course semester	2nd semester
Course category	Elective
ECTS	7.5
Tutors	Th. Giannakopoulos, I. Maglogiannis

Goal

Upon successful completion of the course, the student will be able to:

Identify the opportunities and limitations of applying multimedia signal analysis and recognition techniques in various areas of modern life
Identify the particularities of individual problems, and select and adapt the appropriate multimedia signal analysis and recognition techniques to them
Plan the comparative evaluation of machine learning methods and recognize the capabilities and limitations of each method or technique, always taking into account the particularities of the multimedia data under analysis
Ultimately, design, build and evaluate systems for the segmentation, analysis, recognition and visualization of multimedia content

The course also targets the following general competencies:

Ability to organize and plan work and manage time effectively
Ability to communicate effectively (orally and in writing)
Ability to solve problems
Ability to develop critical thinking and capacity for critical approaches
Ability to work in a team
Ability to apply theoretical knowledge in practice
Ability to research
Ability to adapt methods and techniques to new situations and conditions

Signal and image analysis topics
Audio representations and feature extraction
Audio signal characterization: classification, segmentation, clustering, matching
Voice recognition
Introduction to image data, coding and representation, basic machine vision concepts
Image processing with machine learning: segmentation, edge detection, alignment, feature extraction, classification, search and retrieval
Video analysis: motion and flow analysis, temporal event recognition, video metadata and annotation, search and retrieval
Using deep learning for image and video classification, convolutional neural networks, visualization and understanding, transfer learning
Using temporal representation models for video analysis

Bibliography

Digital Image Processing, 4th Edition, by Rafael C. Gonzalez, Richard E. Woods
Computer Vision: Models, Learning, and Inference, 1st Edition, by Simon J. D. Prince
Theory and Applications of Digital Speech Processing, by Lawrence Rabiner
MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval, by Hyoung-Gook Kim, Nicolas Moreau, Thomas Sikora
Introduction to Audio Analysis: A MATLAB® Approach, by Theodoros Giannakopoulos, Aggelos Pikrakis
Fundamentals of Music Processing: Audio, Analysis, Algorithms, Applications, by Meinard Müller
Discrete-Time Speech Signal Processing: Principles and Practice, by Thomas F. Quatieri

