Source author record

Matthieu Kowalski

Matthieu Kowalski appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Machine Learning, math.OC, Artificial Intelligence, Computation and Language, Computer Vision, Cryptography and Security, Discrete Mathematics, math.NA, Methodology, Neurons and Cognition

Catalog footprint

What is connected

5 works

10 topics

4 close collaborators

Preprint, 2026, arXiv

Tuning without Peeking: Provable Generalization Bounds and Robust LLM Post-Training

Gradient-based optimization is the workhorse of deep learning, offering efficient and scalable training via backpropagation. However, exposing gradients during training can leak sensitive information about the underlying data, raising privacy and security concerns such as susceptibility to data poisoning attacks. In contrast, black-box optimization methods, which treat the model as an opaque function and rely solely on function evaluations to guide optimization, offer a promising alternative in scenarios where data access is restricted, adversarial risks are high, or overfitting is a concern. This paper introduces BBoxER, an evolutionary black-box method for LLM post-training that induces an information bottleneck via implicit compression of the training data. Leveraging the tractability of information flow, we provide non-vacuous generalization bounds and strong theoretical guarantees for privacy and for robustness to data poisoning and extraction attacks. In experiments with LLMs, we demonstrate empirically that black-box optimization methods, despite the scalability and computational challenges inherent to black-box approaches, are able to learn, showing how a few iterations of BBoxER improve performance, generalize well on a benchmark of reasoning datasets, and are robust to membership inference attacks. This positions BBoxER as an attractive add-on on top of gradient-based optimization, offering suitability for deployment in restricted or privacy-sensitive environments while also providing non-vacuous generalization guarantees.

Preprint, 2022, arXiv

Understanding approximate and unrolled dictionary learning for pattern recovery

Dictionary learning consists of finding a sparse representation from noisy data and is a common way to encode data-driven prior knowledge on signals. Alternating minimization (AM) is standard for the underlying optimization, where gradient descent steps alternate with sparse coding procedures. The major drawback of this method is its prohibitive computational cost, making it impractical on large real-world data sets. This work studies an approximate formulation of dictionary learning based on unrolling and compares it to alternating minimization to find the best trade-off between speed and precision. We analyze the asymptotic behavior and convergence rate of gradient estimates in both methods. We show that unrolling performs better on the support of the inner problem solution and during the first iterations. Finally, we apply unrolling to pattern learning in magnetoencephalography (MEG) with the help of a stochastic algorithm and compare the performance to a state-of-the-art method.

Preprint, 2020, arXiv

MIP and Set Covering approaches for Sparse Approximation

The Sparse Approximation problem asks to find a solution $x$ such that $\|y - Hx\| < \alpha$, for a given norm $\|\cdot\|$, minimizing the size of the support $\|x\|_0 := \#\{j \mid x_j \neq 0\}$. We present valid inequalities for Mixed Integer Programming (MIP) formulations for this problem and we show that these families are sufficient to describe the set of feasible supports. This leads to a reformulation of the problem as an Integer Programming (IP) model which in turn represents a Minimum Set Covering formulation, thus yielding many families of valid inequalities which may be used to strengthen the models. We propose algorithms to solve sparse approximation problems, including a branch-and-cut algorithm for the MIP, a two-stage algorithm to tackle the set covering IP, and a heuristic approach based on Local Branching type constraints. These methods are compared in computational experiments with the goal of testing their practical potential.
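For orientation, the problem stated in this abstract can be written as an optimization problem. The second formulation is a common big-M way of turning support minimization into a MIP and is shown only as an illustration; the dimension $n$, the binary variables $b_j$ and the bound $M$ on $|x_j|$ are assumptions and may not match the formulations studied in the paper.

```latex
% Sparse approximation as stated in the abstract
\min_{x \in \mathbb{R}^n} \ \|x\|_0
\quad \text{s.t.} \quad \|y - Hx\| < \alpha

% An illustrative big-M MIP (assumed, not taken from the paper):
% b_j = 1 allows x_j to be nonzero, b_j = 0 forces x_j = 0
\min_{x \in \mathbb{R}^n,\ b \in \{0,1\}^n} \ \sum_{j=1}^{n} b_j
\quad \text{s.t.} \quad \|y - Hx\| < \alpha,
\qquad -M b_j \le x_j \le M b_j, \quad j = 1, \dots, n
```

In the big-M form, the objective counts the selected indices, so at an optimum $\sum_j b_j$ equals $\|x\|_0$ provided $M$ is a valid bound on the entries of some optimal $x$.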

Preprint, 2016, arXiv

Convex Optimization approach to signals with fast varying instantaneous frequency

Motivated by the limitations of analyzing oscillatory signals composed of multiple components with fast-varying instantaneous frequency, we approach the time-frequency analysis problem by optimization. Based on the proposed adaptive harmonic model, the time-frequency representation of a signal is obtained by directly minimizing a functional that involves a few properties an "ideal time-frequency representation" should satisfy, for example, signal reconstruction and a concentrated time-frequency representation. FISTA (Fast Iterative Shrinkage-Thresholding Algorithm) is applied to achieve an efficient numerical approximation of the minimizer of the functional. We call the algorithm Time-frequency bY COnvex OptimizatioN (Tycoon). The numerical results confirm the potential of the Tycoon algorithm.
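As a rough illustration of the FISTA iteration mentioned in this abstract, the following is a minimal sketch on a generic l1-regularized least-squares problem, minimizing 0.5 * ||Ax - y||^2 + lam * ||x||_1. This is not the Tycoon functional, which the abstract does not specify; the function names `soft_threshold` and `fista_lasso` and all parameters are hypothetical and chosen only for this example.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def fista_lasso(A, y, lam, n_iter=200):
    """FISTA sketch for min_x 0.5 * ||A x - y||^2 + lam * ||x||_1.

    Illustrative stand-in problem, not the functional from the paper.
    """
    # Lipschitz constant of the gradient of the smooth term: ||A||_2^2
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    z = x.copy()  # extrapolated point
    t = 1.0       # momentum parameter
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)                      # gradient at z
        x_new = soft_threshold(z - grad / L, lam / L)  # proximal gradient step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x


if __name__ == "__main__":
    # With A = I the minimizer is the soft-thresholded data.
    x_hat = fista_lasso(np.eye(3), np.array([3.0, -0.5, 1.5]), lam=1.0)
    print(x_hat)
```

The fixed step size 1/L avoids a line search at the cost of computing the spectral norm of A once; for large operators a backtracking variant would usually be preferred.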

Preprint, 2016, arXiv

Social-sparsity brain decoders: faster spatial sparsity

Spatially-sparse predictors are good models for brain decoding: they give accurate predictions and their weight maps are interpretable as they focus on a small number of regions. However, the state of the art, based on total variation or graph-net, is computationally costly. Here we introduce sparsity in the local neighborhood of each voxel with social-sparsity, a structured shrinkage operator. We find that, on brain imaging classification problems, social-sparsity performs almost as well as total-variation models and better than graph-net, for a fraction of the computational cost. It also very clearly outlines predictive regions. We give details of the model and the algorithm.
