Source author record

René Vidal

René Vidal appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision, Machine Learning, math.OC, Artificial Intelligence, Computational Geometry, Computation and Language, Cryptography and Security, eess.IV, Graphics, Information Theory, math.DG, math.DS, math.IT, math.NA, Numerical Analysis, physics.med-ph

Catalog footprint

What is connected

15 works

16 topics

4 close collaborators

preprint, 2026, arXiv

REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

Large language models (LLMs) achieve strong performance across many tasks but remain vulnerable to hallucinations, motivating the need for realistic adversarial prompts that elicit such failures. We formulate hallucination elicitation as a constrained optimization problem, where the goal is to find semantically coherent adversarial prompts that are equivalent to benign user prompts. Existing methods remain limited: discrete prompt-based attacks preserve semantic equivalence and coherence but search only over a limited set of prompt variations, while continuous latent-space attacks explore a richer space but often decode into prompts that are no longer valid rephrasings. To address these limitations, we propose REALISTA, a realistic latent-space attack framework. REALISTA constructs an input-dependent dictionary of valid editing directions, each corresponding to a semantically equivalent and coherent rephrasing, and optimizes continuous combinations of these directions in latent space. This design combines the optimization flexibility of continuous attacks with the semantic realism of discrete rephrasing-based attacks. Experiments demonstrate that REALISTA achieves superior or comparable performance to state-of-the-art realistic attacks on open-source LLMs and, crucially, succeeds in attacking large reasoning models under free-form response settings, where prior realistic attacks fail. Code is available at https://github.com/Buyun-Liang/REALISTA.

preprint, 2022, arXiv

Analysis and Extensions of Adversarial Training for Video Classification

Adversarial training (AT) is a simple yet effective defense against adversarial attacks on image classification systems; it is based on augmenting the training set with attacks that maximize the loss. However, the effectiveness of AT as a defense for video classification has not been thoroughly studied. Our first contribution is to show that generating optimal attacks for video requires carefully tuning the attack parameters, especially the step size. Notably, we show that the optimal step size varies linearly with the attack budget. Our second contribution is to show that using a smaller (sub-optimal) attack budget at training time leads to more robust performance at test time. Based on these findings, we propose three defenses against attacks with variable attack budgets. The first one, Adaptive AT, is a technique where the attack budget is drawn from a distribution that is adapted as training iterations proceed. The second, Curriculum AT, is a technique where the attack budget is increased as training iterations proceed. The third, Generative AT, further couples AT with a denoising generative adversarial network to boost robust performance. Experiments on the UCF101 dataset demonstrate that the proposed methods improve adversarial robustness against multiple attack types.
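A minimal sketch of the two scheduling ideas in this abstract: a budget that grows with the training iteration (as in Curriculum AT) and a step size that scales linearly with the budget. The linear ramp, the function names and the ratio 0.25 are illustrative assumptions; the abstract does not give the actual schedule or constant.

```python
def curriculum_budget(iteration, total_iterations, eps_max):
    """Attack budget that grows linearly from 0 to eps_max over training.

    The linear ramp is an assumption; the abstract only says the budget
    is increased as training iterations proceed.
    """
    if total_iterations <= 1:
        return eps_max
    frac = min(1.0, iteration / (total_iterations - 1))
    return eps_max * frac


def attack_step_size(eps, ratio=0.25):
    """Step size proportional to the attack budget (ratio is hypothetical)."""
    return ratio * eps


if __name__ == "__main__":
    for it in (0, 50, 99):
        eps = curriculum_budget(it, 100, 8 / 255)
        print(it, eps, attack_step_size(eps))
```

In a training loop, the budget returned for the current iteration would be passed to whatever attack generates the adversarial examples for that batch.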

preprint, 2022, arXiv

ARCS: Accurate Rotation and Correspondence Search

This paper is about the old Wahba problem in its more general form, which we call "simultaneous rotation and correspondence search". In this generalization we need to find a rotation that best aligns two partially overlapping $3$D point sets, of sizes $m$ and $n$ respectively with $m\geq n$. We first propose a solver, $\texttt{ARCS}$, that i) assumes noiseless point sets in general position, ii) requires only $2$ inliers, iii) uses $O(m\log m)$ time and $O(m)$ space, and iv) can successfully solve the problem even with, e.g., $m,n\approx 10^6$ in about $0.1$ seconds. We next robustify $\texttt{ARCS}$ to noise, for which we approximately solve consensus maximization problems using ideas from robust subspace learning and interval stabbing. Thirdly, we refine the approximately found consensus set by a Riemannian subgradient descent approach over the space of unit quaternions, which we show converges globally to an $\varepsilon$-stationary point in $O(\varepsilon^{-4})$ iterations, or locally to the ground-truth at a linear rate in the absence of noise. We combine these algorithms into $\texttt{ARCS+}$, to simultaneously search for rotations and correspondences. Experiments show that $\texttt{ARCS+}$ achieves state-of-the-art performance on large-scale datasets with more than $10^6$ points with a $10^4$-fold speedup over alternative methods. https://github.com/liangzu/ARCS
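The abstract mentions interval stabbing as one ingredient of the robust consensus step. The sketch below is the generic textbook sweep for that subproblem (find a point covered by the largest number of closed intervals), not the ARCS solver itself; how the intervals are built from point pairs is not described here, and the function name is hypothetical.

```python
def max_stabbing(intervals):
    """Return (count, point): a point lying in the most closed intervals [lo, hi].

    Runs in O(m log m) time for m intervals because of the sort.
    """
    events = []
    for lo, hi in intervals:
        events.append((lo, 0))  # 0 = interval opens; sorts before a close at the same coordinate
        events.append((hi, 1))  # 1 = interval closes
    events.sort()

    best_count, best_point, active = 0, None, 0
    for coord, kind in events:
        if kind == 0:
            active += 1
            if active > best_count:
                best_count, best_point = active, coord
        else:
            active -= 1
    return best_count, best_point


if __name__ == "__main__":
    print(max_stabbing([(0, 3), (1, 4), (2, 5), (6, 7)]))
```

Processing openings before closings at equal coordinates means intervals that merely touch at an endpoint are counted as overlapping there.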

preprint, 2022, arXiv

Implicit Bias of Projected Subgradient Method Gives Provable Robust Recovery of Subspaces of Unknown Codimension

Robust subspace recovery (RSR) is a fundamental problem in robust representation learning. Here we focus on a recently proposed RSR method termed Dual Principal Component Pursuit (DPCP), which aims to recover a basis of the orthogonal complement of the subspace and is amenable to handling subspaces of high relative dimension. Prior work has shown that DPCP can provably recover the correct subspace in the presence of outliers, as long as the true dimension of the subspace is known. We show that DPCP can provably solve RSR problems in the unknown subspace dimension regime, as long as the orthogonality constraints adopted in previous DPCP formulations are relaxed and random initialization is used instead of a spectral one. Namely, we propose a very simple algorithm based on running multiple instances of a projected sub-gradient descent method (PSGM), with each problem instance seeking to find one vector in the null space of the subspace. We theoretically prove that under mild conditions this approach will succeed with high probability. In particular, we show that 1) all of the problem instances will converge to a vector in the null space of the subspace and 2) the ensemble of problem instance solutions will be sufficiently diverse to fully span the null space of the subspace, thus also revealing its true unknown codimension. We provide empirical results that corroborate our theoretical results and showcase the remarkable implicit rank regularization behavior of the PSGM algorithm that allows us to perform RSR without being aware of the subspace dimension.
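To make the PSGM idea concrete, here is a minimal sketch under stated assumptions: each instance minimizes $\sum_i |x_i^\top b|$ over unit vectors $b$ from a random start, using a subgradient step followed by projection back onto the sphere. The geometrically decaying step size, its constants, the toy data (a plane in 3D plus a few random outliers) and all function names are illustrative choices, not taken from the paper.

```python
import math
import random


def psgm_normal(points, seed, mu0=2e-3, decay=0.98, iters=600):
    """One PSGM instance: seek a unit vector b with small sum_i |x_i . b|."""
    rng = random.Random(seed)
    dim = len(points[0])
    b = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random initialization
    norm = math.sqrt(sum(v * v for v in b))
    b = [v / norm for v in b]
    mu = mu0
    for _ in range(iters):
        # Subgradient of sum_i |x_i . b| is sum_i sign(x_i . b) x_i.
        g = [0.0] * dim
        for x in points:
            d = sum(xi * bi for xi, bi in zip(x, b))
            s = (d > 0) - (d < 0)
            if s:
                for k in range(dim):
                    g[k] += s * x[k]
        b = [bi - mu * gi for bi, gi in zip(b, g)]
        norm = math.sqrt(sum(v * v for v in b))  # project back onto the unit sphere
        b = [v / norm for v in b]
        mu *= decay  # assumed geometric step-size decay
    return b


def demo_points(n_in=200, n_out=10, seed=0):
    """Toy data: inliers on the unit circle in the plane z = 0, plus random outliers."""
    rng = random.Random(seed)
    pts = [[math.cos(2 * math.pi * i / n_in), math.sin(2 * math.pi * i / n_in), 0.0]
           for i in range(n_in)]
    for _ in range(n_out):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        pts.append([c / n for c in v])
    return pts


if __name__ == "__main__":
    data = demo_points()
    for seed in (1, 2, 3):
        print(seed, psgm_normal(data, seed))
```

In this toy setting the codimension is 1, so every instance should end up close to the same normal direction up to sign; with higher codimension, the rank of the stacked solutions would play the role of the codimension estimate described in the abstract.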

preprint, 2022, arXiv

Lens free holographic imaging for urinary tract infection screening

Urinary tract infections (UTIs) are a common condition that can lead to serious complications including kidney injury, altered mental status, sepsis, and death. Laboratory tests such as urinalysis and urine culture are the mainstays of UTI diagnosis, whereby a urine specimen is collected and processed to reveal its cellular and chemical composition. This process requires precise specimen collection, handling infectious human waste, controlled urine storage, and timely transportation to modern laboratory equipment for analysis. Holographic lens free imaging (LFI) can measure large volumes of urine via a simple and compact optical setup, potentially enabling automatic urine analysis at the patient bedside. We introduce an LFI system capable of resolving important urine clinical biomarkers such as red blood cells, white blood cells, crystals, casts, and E. coli in urine phantoms. This approach is sensitive to the particulate concentrations relevant for detecting several clinical urine abnormalities such as hematuria, pyuria, and bacteriuria. We show that bacteria concentrations across eight orders of magnitude can be estimated by analyzing LFI measurements. LFI measurements of blood cell concentrations are relatively insensitive to changes in bacteria concentrations of over seven orders of magnitude. Lastly, LFI reveals clear differences between UTI-positive and UTI-negative urine from human patients. Together, these results show promise for LFI as a tool for urine screening, potentially offering early, point-of-care detection of UTI and other pathological processes.

preprint, 2022, arXiv

Reverse Engineering $\ell_p$ attacks: A block-sparse optimization approach with recovery guarantees

Deep neural network-based classifiers have been shown to be vulnerable to imperceptible perturbations to their input, such as $\ell_p$-bounded norm adversarial attacks. This has motivated the development of many defense methods, which are then broken by new attacks, and so on. This paper focuses on a different but related problem of reverse engineering adversarial attacks. Specifically, given an attacked signal, we study conditions under which one can determine the type of attack ($\ell_1$, $\ell_2$ or $\ell_\infty$) and recover the clean signal. We pose this problem as a block-sparse recovery problem, where both the signal and the attack are assumed to lie in a union of subspaces that includes one subspace per class and one subspace per attack type. We derive geometric conditions on the subspaces under which any attacked signal can be decomposed as the sum of a clean signal plus an attack. In addition, by determining the subspaces that contain the signal and the attack, we can also classify the signal and determine the attack type. Experiments on digit and face classification demonstrate the effectiveness of the proposed approach.

preprint, 2022, arXiv

Towards Understanding The Semidefinite Relaxations of Truncated Least-Squares in Robust Rotation Search

The rotation search problem aims to find a 3D rotation that best aligns a given number of point pairs. To induce robustness against outliers for rotation search, prior work considers truncated least-squares (TLS), which is a non-convex optimization problem, and its semidefinite relaxation (SDR) as a tractable alternative. Whether this SDR is theoretically tight in the presence of noise, outliers, or both has remained largely unexplored. We derive conditions that characterize the tightness of this SDR, showing that the tightness depends on the noise level, the truncation parameters of TLS, and the outlier distribution (random or clustered). In particular, we give a short proof for the tightness in the noiseless and outlier-free case, as opposed to the lengthy analysis of prior work.

preprint, 2020, arXiv

Finding the Sparsest Vectors in a Subspace: Theory, Algorithms, and Applications

The problem of finding the sparsest vector (direction) in a low dimensional subspace can be considered as a homogeneous variant of the sparse recovery problem, which finds applications in robust subspace recovery, dictionary learning, sparse blind deconvolution, and many other problems in signal processing and machine learning. However, in contrast to the classical sparse recovery problem, the most natural formulation for finding the sparsest vector in a subspace is usually nonconvex. In this paper, we overview recent advances on global nonconvex optimization theory for solving this problem, ranging from geometric analysis of its optimization landscapes, to efficient optimization algorithms for solving the associated nonconvex optimization problem, to applications in machine intelligence, representation learning, and imaging sciences. Finally, we conclude this review by pointing out several interesting open problems for future research.

preprint, 2020, arXiv

On the Regularization Properties of Structured Dropout

Dropout and its extensions (e.g., DropBlock and DropConnect) are popular heuristics for training neural networks, which have been shown to improve generalization performance in practice. However, a theoretical understanding of their optimization and regularization properties remains elusive. Recent work shows that in the case of single hidden-layer linear networks, Dropout is a stochastic gradient descent method for minimizing a regularized loss, and that the regularizer induces solutions that are low-rank and balanced. In this work we show that for single hidden-layer linear networks, DropBlock induces spectral k-support norm regularization, and promotes solutions that are low-rank and have factors with equal norm. We also show that the global minimizer for DropBlock can be computed in closed form, and that DropConnect is equivalent to Dropout. We then show that some of these results can be extended to a general class of Dropout strategies, and, with some assumptions, to deep non-linear networks when Dropout is applied to the last layer. We verify our theoretical claims and assumptions experimentally with commonly used network architectures.

preprint, 2016, arXiv

Car Segmentation and Pose Estimation using 3D Object Models

Image segmentation and 3D pose estimation are two key cogs in any algorithm for scene understanding. However, state-of-the-art CRF-based models for image segmentation rely mostly on 2D object models to construct top-down high-order potentials. In this paper, we propose new top-down potentials for image segmentation and pose estimation based on the shape and volume of a 3D object model. We show that these complex top-down potentials can be easily decomposed into standard forms for efficient inference in both the segmentation and pose estimation tasks. Experiments on a car dataset show that knowledge of segmentation helps perform pose estimation better and vice versa.

preprint, 2016, arXiv

Realization Theory of Stochastic Jump-Markov Linear Systems

In this paper, we present a complete stochastic realization theory for stochastic jump-linear systems. We present necessary and sufficient conditions for the existence of a realization, along with a characterization of minimality in terms of reachability and observability. We also formulate a realization algorithm and argue that minimality can be checked algorithmically. The main tool for solving the stochastic realization problem for jump-linear systems is the formulation and solution of a stochastic realization problem for a general class of bilinear systems with non-white-noise inputs. The solution to this generalized stochastic bilinear realization problem is based on the theory of formal power series. Stochastic jump-linear systems represent a special case of generalized stochastic bilinear systems.

preprint, 2012, arXiv

Riemannian Consensus for Manifolds with Bounded Curvature

Consensus algorithms are popular distributed algorithms for computing aggregate quantities, such as averages, in ad-hoc wireless networks. However, existing algorithms mostly address the case where the measurements lie in a Euclidean space. In this work we propose Riemannian consensus, a natural extension of the existing averaging consensus algorithm to the case of Riemannian manifolds. Unlike previous generalizations, our algorithm is intrinsic and, in principle, can be applied to any complete Riemannian manifold. We characterize our algorithm by giving sufficient convergence conditions on Riemannian manifolds with bounded curvature and we analyze the differences that arise with respect to the classical Euclidean case. We test the proposed algorithms on synthetic data sampled from manifolds such as the space of rotations, the sphere and the Grassmann manifold.

preprint, 2011, arXiv

Hypothesize and Bound: A Computational Focus of Attention Mechanism for Simultaneous 3D Shape Reconstruction, Pose Estimation and Classification from a Single 2D Image

This article presents a mathematical framework to simultaneously tackle the problems of 3D reconstruction, pose estimation and object classification, from a single 2D image. In sharp contrast with state-of-the-art methods that rely primarily on 2D information and solve each of these three problems separately or iteratively, we propose a mathematical framework that incorporates prior "knowledge" about the 3D shapes of different object classes and solves these problems jointly and simultaneously, using a hypothesize-and-bound (H&B) algorithm. In the proposed H&B algorithm one hypothesis is defined for each possible pair [object class, object pose], and the algorithm selects the hypothesis H that maximizes a function L(H) encoding how well each hypothesis "explains" the input image. To find this maximum efficiently, the function L(H) is not evaluated exactly for each hypothesis H, but rather upper and lower bounds for it are computed at a much lower cost. In order to obtain bounds for L(H) that are tight yet inexpensive to compute, we extend the theory of shapes described in [14] to handle projections of shapes. This extension allows us to define a probabilistic relationship between the prior knowledge given in 3D and the 2D input image. This relationship is derived from first principles and is proven to be the only relationship having the properties that we intuitively expect from a "projection." In addition to the efficiency and optimality characteristics of H&B algorithms, the proposed framework has the desirable property of integrating information in the 2D image with information in the 3D prior to estimate the optimal reconstruction. While this article focuses primarily on the problem mentioned above, we believe that the theory presented herein has multiple other potential applications.

preprint, 2011, arXiv

Hypothesize and Bound: A Computational Focus of Attention Mechanism for Simultaneous N-D Segmentation, Pose Estimation and Classification Using Shape Priors

Given the ever-increasing bandwidth of the visual information available to many intelligent systems, it is becoming essential to endow them with a sense of what is worthy of their attention and what can be safely disregarded. This article presents a general mathematical framework to efficiently allocate the available computational resources to process the parts of the input that are relevant to solve a given perceptual problem. By this we mean to find the hypothesis H (i.e., the state of the world) that maximizes a function L(H), representing how well each hypothesis "explains" the input. Given the large bandwidth of the sensory input, fully evaluating L(H) for each hypothesis H is computationally infeasible (e.g., because it would imply checking a large number of pixels). To address this problem we propose a mathematical framework with two key ingredients. The first one is a Bounding Mechanism (BM) to compute lower and upper bounds of L(H), for a given computational budget. These bounds are much cheaper to compute than L(H) itself, can be refined at any time by increasing the budget allocated to a hypothesis, and are frequently enough to discard a hypothesis. To compute these bounds, we develop a novel theory of shapes and shape priors. The second ingredient is a Focus of Attention Mechanism (FoAM) to select which hypothesis' bounds should be refined next, with the goal of discarding non-optimal hypotheses with the least amount of computation. The proposed framework: 1) is very efficient since most hypotheses are discarded with minimal computation; 2) is parallelizable; 3) is guaranteed to find the globally optimal hypothesis; and 4) has a running time that depends on the problem at hand, not on the bandwidth of the input. We instantiate the proposed framework for the problem of simultaneously estimating the class, pose, and a noiseless version of a 2D shape in a 2D image.
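The interplay of bounds and attention can be illustrated with a toy sketch. It assumes, purely for illustration, that L(H) is a sum of per-pixel terms in [0, term_max], so that evaluating only some terms yields a lower bound (the partial sum) and an upper bound (partial sum plus term_max for each unevaluated term). The refinement rule used here, always refining the hypothesis with the largest upper bound, is a generic best-first choice and is not claimed to be the paper's BM or FoAM; all names are hypothetical.

```python
def select_best_hypothesis(terms, term_max=1.0):
    """Find argmax_h sum(terms[h]) while evaluating as few terms as possible.

    terms[h] lists the per-pixel scores of hypothesis h, each in [0, term_max].
    Returns (best_index, number_of_terms_evaluated).
    """
    n = len(terms)
    done = [0] * n                                  # terms evaluated so far per hypothesis
    lower = [0.0] * n                               # partial sums (lower bounds)
    upper = [term_max * len(t) for t in terms]      # optimistic upper bounds
    work = 0
    while True:
        best = max(range(n), key=lambda h: upper[h])
        # Stop once the leader's lower bound dominates every other upper bound.
        if all(lower[best] >= upper[h] for h in range(n) if h != best):
            return best, work
        # Otherwise spend one unit of computation refining the leader's bounds.
        lower[best] += terms[best][done[best]]
        done[best] += 1
        upper[best] = lower[best] + term_max * (len(terms[best]) - done[best])
        work += 1


if __name__ == "__main__":
    scores = [[0.1] * 20, [0.9] * 20, [0.5] * 20]
    print(select_best_hypothesis(scores))
```

Weak hypotheses are discarded after only a few term evaluations because their upper bounds quickly fall below the leader's, which mirrors the efficiency argument in the abstract.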

preprint, 2011, arXiv

On The Convergence of Gradient Descent for Finding the Riemannian Center of Mass

We study the problem of finding the global Riemannian center of mass of a set of data points on a Riemannian manifold. Specifically, we investigate the convergence of constant step-size gradient descent algorithms for solving this problem. The challenge is that often the underlying cost function is neither globally differentiable nor convex, and despite this one would like to have guaranteed convergence to the global minimizer. After some necessary preparations we state a conjecture which we argue is the best (in a sense described) convergence condition one can hope for. The conjecture specifies conditions on the spread of the data points, step-size range, and the location of the initial condition (i.e., the region of convergence) of the algorithm. These conditions depend on the topology and the curvature of the manifold and can be conveniently described in terms of the injectivity radius and the sectional curvatures of the manifold. For manifolds of constant nonnegative curvature (e.g., the sphere and the rotation group in $\mathbb{R}^{3}$) we show that the conjecture holds true (we do this by proving and using a comparison theorem which seems to be of a different nature from the standard comparison theorems in Riemannian geometry). For manifolds of arbitrary curvature we prove convergence results which are weaker than the conjectured one (but still superior to the available results). We also briefly study the effect of the configuration of the data points on the speed of convergence.
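As a concrete instance of constant step-size gradient descent for the Riemannian center of mass, the sketch below works on the unit sphere. It uses the standard update $x \leftarrow \exp_x\big(t \cdot \tfrac{1}{N}\sum_i \log_x p_i\big)$, which is gradient descent on $\tfrac{1}{2N}\sum_i d(x, p_i)^2$. The step size 1, the iteration count and the assumption that all points lie in a small ball (so no point is near-antipodal to the iterate) are illustrative choices and are not the conditions analyzed in the paper.

```python
import math


def log_map(x, p):
    """Tangent vector at x pointing to p, with length equal to the geodesic distance."""
    d = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, p))))
    theta = math.acos(d)
    if theta < 1e-12:
        return [0.0] * len(x)
    scale = theta / math.sin(theta)  # assumes p is not antipodal to x
    return [scale * (b - d * a) for a, b in zip(x, p)]


def exp_map(x, v):
    """Point reached from x by following the geodesic with initial velocity v."""
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        return list(x)
    y = [math.cos(n) * a + math.sin(n) * c / n for a, c in zip(x, v)]
    r = math.sqrt(sum(c * c for c in y))  # renormalize against round-off drift
    return [c / r for c in y]


def riemannian_mean(points, x0, step=1.0, iters=50):
    """Constant step-size gradient descent for the center of mass on the sphere."""
    x = list(x0)
    for _ in range(iters):
        logs = [log_map(x, p) for p in points]
        mean_log = [sum(l[k] for l in logs) / len(points) for k in range(len(x))]
        x = exp_map(x, [step * c for c in mean_log])
    return x


if __name__ == "__main__":
    pts = [[math.sin(0.3) * math.cos(k * math.pi / 2),
            math.sin(0.3) * math.sin(k * math.pi / 2),
            math.cos(0.3)] for k in range(4)]
    print(riemannian_mean(pts, pts[0]))
```

The four demo points are placed symmetrically around the north pole, so by symmetry their center of mass is the pole itself, which gives a simple sanity check for the iteration.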
