PhD defense of Iyad Walwil: Efficiently Solving Saddle Point Problems
Télécom Paris, 19 place Marguerite Perey, F-91120 Palaiseau, amphi 3, and by videoconference
Full title: Efficiently Solving Saddle Point Problems: Smoothed Duality Gap-Based Stopping Criterion for Convex Settings and a Nonconvex Coordinate Descent Algorithm
Jury
- Antonin Chambolle, Research Director, CNRS & Université Paris Dauphine-PSL, France — Reviewer
- Franck Iutzeler, Professor, Institut Mathématique de Toulouse, France — Reviewer
- Andrea Simonetto, Research Professor, ENSTA, France — Examiner
- Panagiotis Patrinos, Associate Professor, KU Leuven, Belgium — Examiner
- Aude Rondepierre, Professor, INSA Toulouse, France — Examiner
- Olivier Fercoq, Professor, Télécom Paris, France — Thesis Supervisor
Abstract
Mathematical optimization lies at the heart of a vast range of scientific and engineering disciplines, from machine learning, artificial intelligence, and data science to transportation networks, resource allocation, and game theory, among many others. Regardless of the domain, the underlying questions are ultimately formulated as optimization problems, with the aim of solving them efficiently, most often through a suitable algorithm.
While this thesis does not focus on a specific real-world application, it addresses a deeper and more fundamental topic: algorithms. In particular, we study primal-dual algorithms for solving saddle point problems, with the overarching goal of making such algorithms more efficient in practice. Two elements are especially critical to the performance of any algorithm: the stopping criterion that decides when to halt, and the update steps that drive progress toward a solution. Accordingly, this work is divided into two main parts, each devoted to one of these key aspects.
In the first part, we reduce the running time of primal-dual algorithms for convex optimization problems with affine equality constraints by improving their stopping criteria, so that the algorithm can be terminated earlier, after fewer iterations. We study the relations between four stopping criteria and show under which conditions they accurately detect optimal solutions: the uncomputable "optimality gap and feasibility error", and the computable "Karush-Kuhn-Tucker (KKT) error", "projected duality gap", and "smoothed duality gap".
Assuming metric sub-regularity or a quadratic error bound, we establish that all of the computable criteria provide practical upper bounds for the optimality gap and approximate it effectively. Furthermore, we establish comparability between some of the computable criteria under certain conditions. Numerical experiments on basis pursuit and on quadratic programs with and without non-negative weights corroborate these findings and show that the smoothed duality gap is more widely applicable than the other criteria.
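As a rough, self-contained illustration (not taken from the thesis), the sketch below evaluates one of the computable criteria, the KKT error, for an equality-constrained quadratic program. The toy instance and all names are hypothetical; the criterion shown is simply the stationarity residual of the Lagrangian plus the feasibility error.

```python
import numpy as np

# Hypothetical toy instance: minimize 0.5*||x||^2 subject to A x = b.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 20))
b = rng.standard_normal(5)

def kkt_error(x, y):
    """Computable criterion: stationarity residual of the Lagrangian plus feasibility error."""
    stationarity = np.linalg.norm(x + A.T @ y)  # gradient of 0.5*||x||^2 is x
    feasibility = np.linalg.norm(A @ x - b)
    return stationarity + feasibility

# Closed-form saddle point of this particular problem, used only as a reference.
y_star = -np.linalg.solve(A @ A.T, b)
x_star = -A.T @ y_star

print(kkt_error(np.zeros(20), np.zeros(5)))  # large at an arbitrary point
print(kkt_error(x_star, y_star))             # ~0 at the saddle point
```

Because such a quantity vanishes at a saddle point and is computable along the iterates, it can serve as a stopping rule inside a primal-dual loop.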
In the second part, we introduce two novel primal-dual algorithms for addressing nonconvex, nonconcave, and nonsmooth saddle point problems characterized by the weak Minty Variational Inequality (MVI) assumption. The first algorithm, Nonconvex-Nonconcave Primal-Dual Hybrid Gradient (NC-PDHG), extends the well-known Primal-Dual Hybrid Gradient (PDHG) method to this challenging problem class. The second algorithm, Nonconvex-Nonconcave Stochastic Primal-Dual Hybrid Gradient (NC-SPDHG), incorporates a randomly extrapolated primal-dual coordinate descent approach, extending the Stochastic Primal-Dual Hybrid Gradient (SPDHG) algorithm.
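For context, both new methods build on the classical convex-concave PDHG iteration, which alternates proximal steps on the primal and dual variables with an extrapolation step. The minimal sketch below shows the standard method on a bilinearly coupled problem min_x max_y f(x) + <Ax, y> - g*(y); it is not the NC-PDHG or NC-SPDHG variants introduced in the thesis, and the step-size rule and toy example are illustrative assumptions.

```python
import numpy as np

def pdhg(prox_f, prox_g_conj, A, x0, y0, n_iter=1000):
    """Standard convex-concave PDHG for min_x max_y f(x) + <A x, y> - g*(y).

    prox_f(v, tau) and prox_g_conj(v, sigma) are the proximal operators of f and g*.
    The step sizes satisfy tau * sigma * ||A||^2 <= 1, the usual PDHG condition.
    """
    L = np.linalg.norm(A, 2)
    tau = sigma = 0.99 / L
    x, y = x0.copy(), y0.copy()
    x_bar = x.copy()
    for _ in range(n_iter):
        y = prox_g_conj(y + sigma * (A @ x_bar), sigma)  # dual proximal ascent step
        x_new = prox_f(x - tau * (A.T @ y), tau)         # primal proximal descent step
        x_bar = 2 * x_new - x                            # extrapolation
        x = x_new
    return x, y

# Example: least-norm problem min 0.5*||x||^2 s.t. A x = b,
# i.e. g*(y) = <b, y>, so prox_g_conj(v, sigma) = v - sigma * b.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 20))
b = rng.standard_normal(5)
x, y = pdhg(lambda v, t: v / (1 + t),   # prox of 0.5*||x||^2
            lambda v, s: v - s * b,     # prox of the linear function <b, .>
            A, np.zeros(20), np.zeros(5))
print(np.linalg.norm(A @ x - b))        # feasibility error, should be small
```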
To our knowledge, designing a coordinate-based algorithm to solve nonconvex-nonconcave saddle point problems is unprecedented, and proving its convergence posed significant difficulties. This challenge motivated us to utilize PEPit, a Python-based tool for computer-assisted worst-case analysis of first-order optimization methods. By integrating PEPit with automated Lyapunov function techniques, we successfully derived the NC-SPDHG algorithm.
Both methods are effective under a mild condition on the weak MVI parameter, achieving convergence with constant step sizes that adapt to the structure of the problem. Numerical experiments on sigmoid regression with squared loss and on perceptron-regression problems validate our theoretical findings, show the efficiency of our methods compared to existing state-of-the-art algorithms, and exhibit linear convergence. Additionally, we conduct a convex-concave least-squares experiment to show that NC-SPDHG performs competitively with SAGA, a leading algorithm in the smooth convex setting.