Agenda

PhD defense Yasser Benigmim: Domain Adaptation in the Era of Foundation Models

Friday 12 December, 2025, at 13:30 (Paris time) at Télécom Paris

Télécom Paris, 19 place Marguerite Perey, F-91120 Palaiseau, amphi Estaunié, and by videoconference

Jury

  • Ismail Ben Ayed, Professor, ETS Montréal, Canada (Rapporteur)
  • Yuki Asano, Professor, University of Technology Nuremberg, Germany (Rapporteur)
  • Renaud Marlet, Research Director (HDR), Ecole Nationale des Ponts et Chaussées, France (Examiner)
  • Camille Couprie, Research Scientist, Facebook AI Research, France (Examiner)
  • Karteek Alahari, Research Director (HDR), Inria, Grenoble Alpes University, France (Examiner)
  • Stéphane Lathuilière, Research Scientist (HDR), Inria, Grenoble Alpes University, France (Thesis Director)
  • Vicky Kalogeiton, Professor (HDR), Ecole Polytechnique, France (Thesis Co-Supervisor)

Guests:

  • Slim Essid, Senior Scientist (HDR), NVIDIA (Thesis Co-Supervisor, Guest)
  • Raoul de Charette, Research Director (HDR), Inria (Guest)

Abstract

Deep learning has revolutionized computer vision, yet its reliance on massive labeled datasets creates a significant bottleneck for semantic segmentation. This challenge is further compounded by "domain shift," which occurs when a model encounters data drawn from a different distribution than the one it was trained on, leading to poor generalization in real-world environments.

In this presentation, I will address how Foundation Models (FMs) can be tailored to solve these adaptation challenges under resource constraints through three key contributions. I will first present a method for One-shot Unsupervised Domain Adaptation that personalizes text-to-image diffusion models to generate diverse, style-consistent training data from a single target image. Next, I will introduce a framework where multiple FMs (CLIP, LLMs, Diffusion Models, and SAM) collaborate to achieve domain generalization by automating the generation of high-quality pseudo-labels. Finally, I will discuss FLOSS, a training-free strategy for open-vocabulary segmentation that optimizes CLIP performance by automatically identifying specific "class-expert" text templates.