Student projects

Dictionary learning for sparse image representations

Sparse representation over redundant dictionaries is an emerging field that has produced very promising results in many image processing and analysis tasks. The main idea is to represent a given signal as a linear combination of a few basis functions (called atoms) drawn from a redundant dictionary, i.e., one with more atoms than signal dimensions. The dictionary can either be prespecified or be learned from a set of training signals. This project focuses on learning the basis functions of the dictionary. In particular, the goal is to study the basics of dictionary learning and to implement the K-SVD algorithm [1] and the Method of Optimal Directions (MOD) [2]. Optionally, the student may propose ways of reducing the computational complexity of dictionary learning.
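
To give a feel for the alternating scheme that both algorithms share, here is a minimal sketch. It is written in Python with NumPy purely for illustration, although the project itself lists MATLAB as a prerequisite. It assumes the usual formulation: the training signals are the columns of Y, and one looks for a dictionary D with unit-norm atoms and a coefficient matrix X with at most n_nonzero nonzeros per column such that Y - DX is small. The sparse coding step uses a simple orthogonal matching pursuit, which is one common choice and not necessarily the one the project would use. All function and parameter names (omp, mod_update, ksvd_update, learn_dictionary, n_nonzero) are hypothetical.

```python
# Illustrative sketch only: names, the OMP coder and the initialisation
# are assumptions, not part of the project statement.
import numpy as np


def normalize_columns(D):
    """Scale every nonzero column of D to unit Euclidean norm."""
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0  # leave all-zero columns untouched
    return D / norms


def omp(D, y, n_nonzero):
    """Greedy sparse coding: approximate y with at most n_nonzero atoms of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom helps any further
        support.append(k)
        # Refit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def mod_update(Y, X):
    """MOD update: least-squares dictionary D = Y X^+, then renormalise atoms."""
    return normalize_columns(Y @ np.linalg.pinv(X))


def ksvd_update(Y, D, X):
    """K-SVD update: refit each atom and its coefficients by a rank-1 SVD."""
    D, X = D.copy(), X.copy()
    for k in range(D.shape[1]):
        users = np.flatnonzero(X[k])  # signals that currently use atom k
        if users.size == 0:
            continue  # unused atoms are left unchanged in this sketch
        # Residual of those signals with atom k's own contribution added back.
        E = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, k] = U[:, 0]
        X[k, users] = s[0] * Vt[0]
    return D, X


def learn_dictionary(Y, n_atoms, n_nonzero, n_iter=10, method="ksvd", seed=0):
    """Alternate sparse coding (OMP) with a dictionary update (MOD or K-SVD)."""
    rng = np.random.default_rng(seed)
    # Initialise with randomly chosen training signals (needs n_atoms <= #signals).
    D = normalize_columns(Y[:, rng.choice(Y.shape[1], n_atoms, replace=False)])
    for _ in range(n_iter):
        X = np.column_stack([omp(D, y, n_nonzero) for y in Y.T])
        if method == "mod":
            D = mod_update(Y, X)
        else:
            D, X = ksvd_update(Y, D, X)
    # Final sparse codes for the learned dictionary.
    X = np.column_stack([omp(D, y, n_nonzero) for y in Y.T])
    return D, X
```

Calling learn_dictionary(Y, n_atoms=32, n_nonzero=3) on a matrix Y of vectorised image patches returns the dictionary and the sparse codes; the relative error norm(Y - D @ X) / norm(Y) is a simple quantity to monitor. The two updates differ in that MOD refits the whole dictionary in one least-squares step, whereas K-SVD revisits the atoms one at a time and updates the matching coefficients along with each atom.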

The basis functions learned in the first part will then be used for image classification tasks such as face recognition and handwritten digit recognition.

Contact: Effrosyni Kokiopoulou (e-mail: effrosyni.kokiopoulou@sam.math.ethz.ch)

Prerequisites: Linear algebra, MATLAB programming, basic image processing.

References:
[1] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation", IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311-4322, 2006.
[2] K. Engan, S. O. Aase, and J. Hakon Husoy, "Method of optimal directions for frame design", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1999.

Subspace modeling of image manifolds

Recent years have seen a dramatic growth in rich visual data sets. When images are represented lexicographically, i.e., with their pixels stacked into a single vector, they give rise to high-dimensional vectors because of the large number of pixels. Moreover, when an object or pattern undergoes motion or a geometric transformation, its images span a manifold in this high-dimensional space. A typical example of such a manifold is a video sequence of a moving object. Many applications need an appropriate metric for comparing two such manifolds. The goal of this project is to develop a subspace-based model of a manifold and use it to define a distance metric between manifolds. The resulting metric will then be evaluated in the context of video-based face recognition.
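
One simple way to make the idea concrete is to model each image set by a low-dimensional linear subspace and to compare two subspaces through their principal angles. The sketch below is written in Python with NumPy for illustration only. It assumes that each image set is stored as a matrix with one vectorised image per column, that the data are not mean-centred, and that the distance is the square root of the sum of squared sines of the principal angles. These are illustrative assumptions and not necessarily the model or metric the project would arrive at, and the function names (subspace_basis, principal_angles, subspace_distance) are hypothetical.

```python
# Illustrative sketch only: the subspace model and the distance below are
# assumed choices, not the project's prescribed method.
import numpy as np


def subspace_basis(images, dim):
    """Orthonormal basis (pixels x dim) for the dominant span of an image set.

    `images` holds one vectorised image per column; `dim` should not exceed
    the number of images or pixels.
    """
    U, _, _ = np.linalg.svd(images, full_matrices=False)
    return U[:, :dim]  # leading left singular vectors


def principal_angles(U1, U2):
    """Principal angles in radians, in ascending order, between two orthonormal bases."""
    # The singular values of U1^T U2 are the cosines of the principal angles.
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return np.arccos(np.clip(cosines, 0.0, 1.0))


def subspace_distance(U1, U2):
    """Square root of the sum of squared sines of the principal angles."""
    return float(np.sqrt(np.sum(np.sin(principal_angles(U1, U2)) ** 2)))


# Usage with random matrices standing in for two short video sequences
# (100 pixels per frame, 20 frames each).
rng = np.random.default_rng(0)
set_a = rng.standard_normal((100, 20))
set_b = rng.standard_normal((100, 20))
basis_a = subspace_basis(set_a, dim=5)
basis_b = subspace_basis(set_b, dim=5)
print(subspace_distance(basis_a, basis_b))
```

The distance is zero when the two subspaces coincide and reaches the square root of the subspace dimension when they are mutually orthogonal. Using only the smallest principal angle is another common choice. For recognition, a query sequence could then be assigned to the gallery sequence whose subspace is closest.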

Contact: Effrosyni Kokiopoulou (e-mail: effrosyni.kokiopoulou@sam.math.ethz.ch)

Prerequisites: Linear algebra, MATLAB programming, basic image processing.

References:
[1] O. Yamaguchi, K. Fukui, and K. Maeda, "Face recognition using temporal image sequence", IEEE Int. Conf. on Automatic Face and Gesture Recognition, pp. 318-323, 1998.
[2] K. Fukui and O. Yamaguchi, "Face recognition using multi-viewpoint patterns for robot vision", Int. Symp. on Robotics Research, 15:192-201, 2005.
[3] R. Wang, S. Shan, X. Chen, and W. Gao, "Manifold-Manifold Distance with Application to Face Recognition based on Image Set", IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2008.