Object localization in images is a key problem in a wide range of application domains, including critical settings such as self-driving vehicles and healthcare. However, most effective solutions for object localization follow the standard object detection and semantic segmentation frameworks, meaning that they require large amounts of annotated data for training. Although various heuristics and tools can now assist and enhance human annotators, manual annotation remains a labor-intensive and expensive process. Moreover, perception models trained on annotations enter a cycle of dependence, requiring additional annotations for every new object class to detect or new external condition to cover, e.g. indoor/outdoor scenes, different times of day, or weather conditions. Such models struggle to deal with our open, complex world, which evolves continuously.
Recent works have shown exciting prospects of avoiding annotations altogether by (1) leveraging self-supervised features, (2) building self-supervised object-centric objectives and (3) combining different modalities. In this context, we propose a half-day tutorial providing in-depth coverage of different approaches to performing, and building upon, object localization with no human supervision.
08:30 - 09:10 - Setting the stage: Visual objects in scene understanding by Patrick Pérez
09:10 - 10:00 - Exploiting self-supervised features: unsupervised object localization by Oriane Siméoni
10:00 - 10:20 - Break
10:20 - 11:10 - Self-supervised learning integrating object-aware priors by Thomas Kipf
11:10 - 12:00 - Discovering objects with multi-modal signals by Weidi Xie
12:00 - 12:10 - Closing Remarks
For details, please contact Oriane Siméoni.
Last updated: 28th of March 2023