Problem statement:

Recently, the rapid development of neural networks (NNs) in machine learning has led to many empirical successes in image and data processing, with NNs even outperforming humans in certain tasks. A substantial part of this success can be attributed to human machine learning experts, who are usually in charge of a wide variety of aspects of a machine learning problem, from analyzing the problem and the data to constructing an experimental setup and tuning hyperparameters. This is often experienced as a sophisticated process driven by a considerable amount of intuition and trial and error, in which many open questions remain. For instance, how do we design an optimal NN architecture? What information in the data does the NN exploit to arrive at a particular decision? Since this lack of transparency can be a major drawback, the development of methods for explaining and interpreting deep learning models has recently attracted increasing attention.

Deep learning, as an instance of general representation learning, is naturally connected to sparse signal representations and dictionary learning. While the development of new variants of deep neural networks has been driven largely by intuition, dictionary learning offers a sound theoretical formulation. Hence, there is a lot of interest in combining the two into a powerful yet more interpretable framework. Many problems, such as image and video restoration, benefit from the sparse representation model. The goal of such a model is to approximate a signal well using only a few elements from a (typically redundant) dictionary. Recently, the convolutional sparse coding (CSC) paradigm has been introduced [2]. CSC is a special case of the sparse representation model, built around a highly structured dictionary that is a union of banded and circulant matrices.
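The structure of the CSC dictionary described above can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions: the signal is one-dimensional, boundaries are handled cyclically, and the helper names (`circulant_from_filter`, `conv_dictionary`, `synthesize`) and all sizes are hypothetical and not taken from the cited works. The sketch builds the dictionary as a concatenation of banded circulant blocks, one per local filter, and synthesizes a signal from a sparse vector in two equivalent ways.

```python
import numpy as np

def circulant_from_filter(d, N):
    """N x N banded circulant matrix whose columns are cyclic shifts of the filter d."""
    col = np.zeros(N)
    col[:len(d)] = d
    return np.stack([np.roll(col, s) for s in range(N)], axis=1)

def conv_dictionary(filters, N):
    """Concatenate one circulant block per filter; the result has shape (N, N * m)."""
    return np.hstack([circulant_from_filter(d, N) for d in filters])

def synthesize(filters, maps):
    """The same signal written as a sum of circular convolutions d_i * gamma_i."""
    N = maps.shape[1]
    x = np.zeros(N)
    for d, g in zip(filters, maps):
        col = np.zeros(N)
        col[:len(d)] = d
        # Circular convolution computed through the FFT.
        x += np.real(np.fft.ifft(np.fft.fft(col) * np.fft.fft(g)))
    return x

# Hypothetical sizes: signal length N, filter length n, number of filters m.
rng = np.random.default_rng(0)
N, n, m = 32, 5, 2
filters = rng.standard_normal((m, n))

# Sparse feature maps: only three nonzero coefficients in total.
maps = np.zeros((m, N))
maps[0, 3] = 1.5
maps[0, 30] = 2.0   # this atom wraps around the signal boundary
maps[1, 20] = -0.7

D = conv_dictionary(filters, N)
gamma = maps.reshape(-1)   # all shifts of filter 0, then all shifts of filter 1
x = D @ gamma              # sparse synthesis: x = D * gamma
```

Here `D @ gamma` and `synthesize(filters, maps)` produce the same signal, which shows why multiplying by this structured dictionary amounts to summing a few shifted copies of small local filters.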

Figure 1: Decomposition of an image from MNIST in terms of two multilayer convolutional dictionaries. Two local convolutional atoms (bottom row) are combined to create molecules at the second level, which are then combined to create the global atom (the digit 6) (see [3]).

Figure 2: A high-level overview of algorithm unrolling. Given (a) an iterative algorithm, (b) a corresponding deep network can be generated by cascading the algorithm’s iterations [4].

An emerging technique called algorithm unrolling, or unfolding, promises to eliminate many interpretability issues by providing a concrete and systematic connection between deep neural networks and the iterative algorithms that are widely used in signal processing. Unrolling methods have recently attracted enormous attention, and the field is growing rapidly in both theoretical investigations and practical applications. However, many aspects of this approach are yet to be explored, both theoretically and in terms of practical design. Because this approach allows neural network architectures to be analyzed and suggests how to build new ones in a systematic fashion, and because different solvers for the features lead to different architectures, different sparse coding and optimization techniques should be analyzed and compared. The student will be guided by supervisors from the research group GAIM, whose research expertise lies in representation learning, deep learning, and sparse coding.
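To make the unrolling idea described above more concrete, the following NumPy sketch unrolls one particular solver. The choice of solver is an assumption made purely for illustration: the iterative soft-thresholding algorithm (ISTA) applied to the l1-regularized problem min 0.5 * ||x - D gamma||^2 + lam * ||gamma||_1. The proposal itself does not prescribe a solver, and the function names and problem sizes below are hypothetical. Each ISTA iteration is rewritten as one layer with its own matrices `W_e`, `S` and threshold `theta`; in a learned variant these per-layer parameters would be trained from data, whereas here they are only initialized from the dictionary.

```python
import numpy as np

def soft_threshold(z, theta):
    """Elementwise shrinkage; plays the role of the layer nonlinearity."""
    return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)

def ista(x, D, lam, n_iter):
    """Plain ISTA for min 0.5 * ||x - D g||^2 + lam * ||g||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    gamma = np.zeros(D.shape[1])
    for _ in range(n_iter):
        gamma = soft_threshold(gamma - D.T @ (D @ gamma - x) / L, lam / L)
    return gamma

def init_layers_from_ista(D, lam, n_layers):
    """One (W_e, S, theta) triple per layer, initialized to reproduce ISTA exactly."""
    L = np.linalg.norm(D, 2) ** 2
    W_e = D.T / L
    S = np.eye(D.shape[1]) - D.T @ D / L
    return [(W_e.copy(), S.copy(), lam / L) for _ in range(n_layers)]

def unrolled_ista(x, layers):
    """Feed-forward pass: each iteration has become a layer with its own parameters."""
    gamma = np.zeros(layers[0][1].shape[0])
    for W_e, S, theta in layers:
        gamma = soft_threshold(W_e @ x + S @ gamma, theta)
    return gamma

# Hypothetical toy problem: random dictionary with unit-norm atoms, 3-sparse code.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
gamma_true = np.zeros(50)
gamma_true[[3, 17, 41]] = [1.0, -2.0, 1.5]
x = D @ gamma_true

lam, depth = 0.1, 10
gamma_ista = ista(x, D, lam, depth)
gamma_net = unrolled_ista(x, init_layers_from_ista(D, lam, depth))
```

With this initialization the unrolled network and `depth` iterations of ISTA give the same output. Choosing a different solver would change the form of each layer, which is why different solvers lead to different architectures.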

Figure 2: Multimodal medical data (BraTS 2020 Challenge).

Figure 3: Hyperspectral image collection.

Objective:

The ability to interpret the learning process is receiving an increasing amount of attention, and there is therefore a strong need to extend the interpretable framework that sparse representations offer. The goal of the thesis is to build on recent work in multi-scale convolutional sparse coding. Firstly, the student should study and understand the theory behind representation learning, deep learning, and deep unrolling in particular. To this end, a 3-day minicourse will be organized at the start of the thesis to familiarize the student with the topic and to put him/her on the right track. Secondly, concrete solvers for the features should be chosen, and the performance of the resulting architectures should be compared both among themselves and against some of the more conventional tools for image processing tasks. Practical applications will be chosen in agreement with the student, based on his/her interests. Possible applications are large-scale hyperspectral data processing in remote sensing and multimodal data analysis in art investigation or medical image processing (see Fig. 2 and 3). This implies that the existing deep unrolling and convolutional sparse coding models should be adapted to hyperspectral and multimodal data. Existing code and literature will be made available to the student.

References:

- M. Elad, D. Simon, A. Aberdam, Another step toward demystifying deep neural networks, 2020.
- V. Papyan, Y. Romano, M. Elad, Convolutional neural networks analyzed via convolutional sparse coding, 2017.
- V. Papyan, Y. Romano, J. Sulam, M. Elad, Theoretical foundations of deep learning via sparse representations, 2018.
- V. Monga, Y. Li, Y. C. Eldar, Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing, 2021.