Semi-supervised learning of local image features

Supervision: Michał Tyszkiewicz, Sina Honari

Our recent paper proposes DISK, a new, powerful approach to finding features that can be reliably matched across different images (see also a visualization as sparse optical flow). Because it is built on principles of reinforcement learning, it is very flexible: one merely needs to define a reward for each proposed feature match, and the algorithm learns to maximize it. The idea behind this project is to formulate a semi-supervised reward for images with no ground-truth correspondences.

One idea is to extract features on triplets of images A, B, C and reward matches that are consistent across all three pairs A↔B, A↔C, B↔C: if a feature in A is matched to one in B, both should be matched to the same feature in C. Although with no ground truth we can’t supervise any single match, the consistency of matches across three images is unlikely to be spurious. Nevertheless, there are some trivial solutions which will need to be ruled out. We hope that the algorithm discovers semantically meaningful features; for instance, given images of humans, that it learns to match different body parts. Aside from its scientific side, the project also has an engineering side: some adaptations to the algorithm will be necessary to keep the problem tractable. Other ideas for formulating un- and semi-supervised objectives for local features are also welcome.
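
To make the triplet idea concrete, here is a minimal sketch of a cycle-consistency reward in plain Python. It is only an illustration under stated assumptions, not the DISK implementation: descriptors are given as lists of floats, matching uses mutual nearest neighbours (a simplification swapped in for DISK's own matching procedure), and the function names and the reward values `good` and `bad` are hypothetical placeholders.

```python
from typing import Dict, Sequence, Tuple

Descriptor = Sequence[float]


def sq_dist(x: Descriptor, y: Descriptor) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((p - q) ** 2 for p, q in zip(x, y))


def nearest(query: Descriptor, candidates: Sequence[Descriptor]) -> int:
    """Index of the candidate closest to the query (first one wins on ties)."""
    return min(range(len(candidates)), key=lambda k: sq_dist(query, candidates[k]))


def mutual_nn(xs: Sequence[Descriptor], ys: Sequence[Descriptor]) -> Dict[int, int]:
    """Mutual nearest-neighbour matches {index in xs: index in ys}.

    Assumption: a stand-in for whatever matching procedure is actually used.
    """
    if not xs or not ys:
        return {}
    matches = {}
    for i, x in enumerate(xs):
        j = nearest(x, ys)
        if nearest(ys[j], xs) == i:  # keep the pair only if it is mutual
            matches[i] = j
    return matches


def cycle_rewards(
    desc_a: Sequence[Descriptor],
    desc_b: Sequence[Descriptor],
    desc_c: Sequence[Descriptor],
    good: float = 1.0,   # hypothetical reward for a consistent match
    bad: float = -0.25,  # hypothetical penalty for an inconsistent match
) -> Dict[Tuple[int, int], float]:
    """Reward each A<->B match by whether A<->C and B<->C agree with it."""
    ab = mutual_nn(desc_a, desc_b)
    ac = mutual_nn(desc_a, desc_c)
    bc = mutual_nn(desc_b, desc_c)
    rewards = {}
    for i, j in ab.items():
        # Consistent if feature i in A and feature j in B share a match in C.
        consistent = i in ac and j in bc and ac[i] == bc[j]
        rewards[(i, j)] = good if consistent else bad
    return rewards
```

Note that this reward alone does not exclude the trivial solutions mentioned above; it only shows how consistency over a triplet can be turned into a per-match signal without ground truth.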

Familiarity with Python and machine learning is necessary. Unlike the project on human tracking with DISK, here the student will likely need to become familiar with the RL aspect of the algorithm.