PhD defense of Khalid Oublal: From Signals to Structures: Advances in Explainable Representations for Sequential Generative Models
Télécom Paris, 19 place Marguerite Perey, F-91120 Palaiseau, Amphi 3, and by videoconference
Jury
- Francesco Locatello, Assistant Professor, ISTA, Google Research, Austria (Reviewer)
- Yann Traonmilin, Research Scientist CNRS (HDR), Institut de Mathématiques de Bordeaux, France (Reviewer)
- Jesse Read, Professor (HDR), École polytechnique, France (Examiner)
- François Roueff, Professor (HDR), Télécom Paris, France (Thesis Supervisor)
- Saïd Ladjal, Professor (HDR), Télécom Paris, France (Thesis Co-Supervisor)
- Emmanuel Le Borgne, Senior Research Scientist, TotalEnergies, France (Thesis Co-Supervisor, guest)
- David Benhaiem, Senior Research Scientist, TotalEnergies, France (Thesis Co-Supervisor, guest)
Abstract
At the heart of intelligence, whether biological or artificial, lies the ability to transform raw signals into structured, meaningful, and interpretable representations. Human cognition continuously organizes sensory streams into complex patterns that reveal regularities, causal relationships, and abstract concepts. Modern machine learning pursues a similar goal through generative models, whose purpose is to uncover the hidden processes shaping observed data.
Yet in practice, learned representations often struggle to isolate the true underlying factors or to fully preserve causal and temporal structure across different contexts. This stems from a fundamental issue: without appropriate structural constraints, multiple latent explanations can reproduce the same observations equally well. This dissertation, "From Signals to Structures", explores how structural biases, such as sparsity, invariance, support factorization, and generator compositionality, can effectively guide sequential models toward reliable and interpretable latent representations.
The thesis develops several contributions built on these principles. DIOSC disentangles correlated latent factors by enforcing independence-of-support constraints, facilitating out-of-distribution generalization. TABVAE captures instantaneous causal relationships through a temporal attention bottleneck and relies on sparsity to do so effectively. Extensions based on pretrained autoencoders and diffusion models further enhance latent causal graph recovery and representation quality. DiLOS exploits sparsity and compositional structure to separate sources when some latent factors are only partially observed. Finally, TimeSAE provides faithful post-hoc interpretability by estimating causal effects in latent space and aligning learned representations with human-understandable concepts. To evaluate these approaches, we introduce eHabitat, a dataset and benchmark for sequential representation learning and source separation, contributing to CO₂ reduction and energy sustainability.