Abstract:
The recent success in computer vision has largely been attributed to deep learning models trained on large labeled datasets. Many of these datasets have been labeled by humans, but the labeling process can be time-consuming, and in many applications it may require expertise that is costly to acquire. To address this limitation, research focus and effort have shifted toward unsupervised learning algorithms that can exploit the ever-increasing quantities of unlabeled data. Self-supervised learning (SSL), in particular, is a family of algorithms that aims to learn rich data representations from unlabeled samples, and it achieves results comparable to fully supervised methods on common benchmarks for image classification and segmentation. The idea behind SSL methods is to learn broad features from signals that exist in unlabeled data; in other words, to acquire general information and knowledge and store it as neural network features that serve as prior knowledge for subsequent downstream supervised tasks (classification, segmentation, regression, etc.).
There are two types of SSL methods. First, self-prediction methods predict deliberately omitted parts of the data from the remaining parts, as in jigsaw puzzle solving. Second, contrastive learning methods exploit similarities and dissimilarities, or simply relations, among data samples to form a classification problem, as in SimCLR (a simple framework for contrastive learning of representations). Contrastive learning methods have proven effective as representation learners for natural image classification. Nevertheless, extending such algorithms to other application domains comes with challenges, and we identify certain limitations in these approaches. In this talk, we will therefore focus on contrastive learning methods and how to apply them in several computer vision applications. We also discuss the challenges and limitations we identified and how we address them in Project 29.
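To make the contrastive idea concrete, the following is a minimal NumPy sketch of an NT-Xent-style loss of the kind used in SimCLR: each sample and its augmented view form a positive pair, and all other samples in the batch act as negatives. The toy batch size, embedding dimension, and noise level are illustrative assumptions, not values from the talk.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Contrastive loss over two augmented views z1, z2 of the same batch.

    z1, z2: arrays of shape (batch, dim); row i of z1 and row i of z2 form
    a positive pair, and every other row in the batch acts as a negative.
    """
    batch = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)              # (2*batch, dim)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize rows
    sim = z @ z.T / temperature                       # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    # Index of each sample's positive partner: row i pairs with row i + batch.
    pos = np.concatenate([np.arange(batch) + batch, np.arange(batch)])
    # Cross-entropy of the positive against all other samples in the batch.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * batch), pos].mean()

rng = np.random.default_rng(0)
z1 = rng.normal(size=(4, 8))                          # hypothetical embeddings
loss_random = nt_xent_loss(z1, rng.normal(size=(4, 8)))
loss_aligned = nt_xent_loss(z1, z1 + 0.01 * rng.normal(size=(4, 8)))
# Near-identical views of the same samples should yield a lower loss
# than unrelated random embeddings.
print(loss_aligned < loss_random)
```

In a real SimCLR pipeline, the two views come from stochastic image augmentations and the embeddings from an encoder plus projection head; minimizing this loss pulls positive pairs together and pushes negatives apart on the unit hypersphere.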
This talk will take place in person at SCIoI.