Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
18 works
0 followers
13 topics
4 close collaborators

Published work

18 published items

preprint · 2026 · arXiv

Asymptotic Universal Alignment: A New Alignment Framework via Test-Time Scaling

Aligning large language models (LLMs) to serve users with heterogeneous and potentially conflicting preferences is a central challenge for personalized and trustworthy AI. We formalize an ideal notion of universal alignment through test-time scaling: for each prompt, the model produces $k\ge 1$ candidate responses and a user selects their preferred one. We introduce $(k,f(k))$-robust alignment, which requires the $k$-output model to have win rate $f(k)$ against any other single-output model, and asymptotic universal alignment (U-alignment), which requires $f(k)\to 1$ as $k\to\infty$. Our main result characterizes the optimal convergence rate: there exists a family of single-output policies whose $k$-sample product policies achieve U-alignment at rate $f(k)=\frac{k}{k+1}$, and no method can achieve a faster rate in general. We show that popular post-training methods, including Nash learning from human feedback (NLHF), can fundamentally underutilize the benefits of test-time scaling. Even though NLHF is optimal for $k=1$, sampling from the resulting (often deterministic) policy cannot guarantee win rates above $\tfrac{1}{2}$ except for an arbitrarily small slack. This stems from a lack of output diversity: existing alignment methods can collapse to a single majority-preferred response, making additional samples redundant. In contrast, our approach preserves output diversity and achieves the optimal test-time scaling rate. In particular, we propose a family of symmetric multi-player alignment games and prove that any symmetric Nash equilibrium policy of the $(k+1)$-player alignment game achieves the optimal $(k,\frac{k}{k+1})$-robust alignment. Finally, we provide theoretical convergence guarantees for self-play learning dynamics in these games and extend the framework to opponents that also generate multiple responses.
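
As a sanity check on the $\frac{k}{k+1}$ rate (an illustration of the symmetric case only, not the paper's proof): suppose all $k+1$ responses, the $k$ samples plus the single opponent response, are drawn i.i.d. from the same policy, and the user ranks them strictly with no ties. By exchangeability each response is equally likely to be the favorite, so

$$\Pr[\text{the favorite is one of the } k \text{ samples}] = \frac{k}{k+1}, \qquad 1 - \frac{k}{k+1} = \frac{1}{k+1} \to 0 \text{ as } k \to \infty.$$

For example, $k=1$ gives $\tfrac{1}{2}$ and $k=9$ gives $\tfrac{9}{10}$.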

preprint · 2026 · arXiv

Smooth Operator: Smooth Verifiable Reward Activates Spatial Reasoning Ability of Vision-Language Model

Vision-Language Models (VLMs) face a critical bottleneck in achieving precise numerical prediction for 3D scene understanding. Traditional reinforcement learning (RL) approaches, primarily based on relative ranking, often suffer from severe reward sparsity and gradient instability, failing to effectively exploit the verifiable signals provided by 3D physical constraints. Notably, in standard GRPO frameworks, relative normalization causes "near-miss" samples (characterized by small but non-zero errors) to suffer from advantage collapse. This leads to a severe data utilization bottleneck where valuable boundary samples are discarded during optimization. To address this, we introduce the Smooth Numerical Reward Activation (SNRA) operator and the Absolute-Preserving GRPO (AP-GRPO) framework. SNRA employs a dynamically parameterized Sigmoid function to transform raw feedback into a dense, continuous reward continuum. Concurrently, AP-GRPO integrates absolute scalar gradients to mitigate the numerical information loss inherent in conventional relative-ranking mechanisms. By leveraging this approach, we constructed Numerical3D-50k, a dataset comprising 50,000 verifiable 3D subtasks. Empirical results indicate that AP-GRPO achieves performance parity with large-scale supervised methods while maintaining higher data efficiency, effectively activating latent 3D reasoning in VLMs without requiring architectural modifications.
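
One way to see the sparsity issue described above is to compare group-normalized advantages under a thresholded reward and under a smooth, sigmoid-shaped one. The sketch below is illustrative only: the tolerance `tol`, the scale `scale`, the form `2 * sigmoid(-|e| / scale)`, and the sample errors are all assumptions made here, not the paper's SNRA operator or AP-GRPO objective.

```python
import math
from statistics import mean, pstdev


def binary_reward(error, tol=0.05):
    """Sparse reward: 1 only if the numeric error is within a (hypothetical) tolerance."""
    return 1.0 if abs(error) <= tol else 0.0


def smooth_reward(error, scale=0.2):
    """Dense sigmoid-shaped reward: equals 1 at zero error and decays smoothly.

    The specific form is an illustrative assumption, not the paper's SNRA.
    """
    return 2.0 / (1.0 + math.exp(abs(error) / scale))


def group_advantages(rewards, eps=1e-8):
    """GRPO-style group-relative advantage: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A group of sampled answers whose errors all miss the tolerance (made-up values).
errors = [0.08, 0.12, 0.30, 0.90]
sparse = group_advantages([binary_reward(e) for e in errors])
dense = group_advantages([smooth_reward(e) for e in errors])

print("sparse advantages:", sparse)  # every entry is zero: no learning signal
print("dense advantages: ", dense)   # ordered by how close each answer was
```

With the thresholded reward, every sample in the group scores zero, so the normalized advantages vanish and the near misses contribute nothing. The smooth reward keeps the ordering among them.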

preprint · 2025 · arXiv

A Unified Approach to Submodular Maximization Under Noise

We consider the problem of maximizing a submodular function with access to a noisy value oracle for the function instead of an exact value oracle. Similar to prior work, we assume that the noisy oracle is persistent in that multiple calls to the oracle for a specific set always return the same value. In this model, Hassidim and Singer (2017) design a $(1-1/e)$-approximation algorithm for monotone submodular maximization subject to a cardinality constraint, and Huang et al. (2022) design a $(1-1/e)/2$-approximation algorithm for monotone submodular maximization subject to an arbitrary matroid constraint. In this paper, we design a meta-algorithm that allows us to take any "robust" algorithm for exact submodular maximization as a black box and transform it into an algorithm for the noisy setting while retaining the approximation guarantee. By using the meta-algorithm with the measured continuous greedy algorithm, we obtain a $(1-1/e)$-approximation (resp. $1/e$-approximation) for monotone (resp. non-monotone) submodular maximization subject to a matroid constraint under noise. Furthermore, by using the meta-algorithm with the double greedy algorithm, we obtain a $1/2$-approximation for unconstrained (non-monotone) submodular maximization under noise.

preprint · 2022 · arXiv

Compute- and Data-Intensive Networks: The Key to the Metaverse

The worlds of computing, communication, and storage have for a long time been treated separately, and even the recent trends of cloud computing, distributed computing, and mobile edge computing have not fundamentally changed the role of networks, still designed to move data between end users and pre-determined computation nodes, without true optimization of the end-to-end compute-communication process. However, the emergence of Metaverse applications, where users consume multimedia experiences that result from the real-time combination of distributed live sources and stored digital assets, has changed the requirements for, and possibilities of, systems that provide distributed caching, computation, and communication. We argue that the real-time interactive nature and high demands on data storage, streaming rates, and processing power of Metaverse applications will accelerate the merging of the cloud into the network, leading to highly-distributed tightly-integrated compute- and data-intensive networks becoming universal compute platforms for next-generation digital experiences. In this paper, we first describe the requirements of Metaverse applications and associated supporting infrastructure, including relevant use cases. We then outline a comprehensive cloud network flow mathematical framework, designed for the end-to-end optimization and control of such systems, and show numerical results illustrating its promising role for the efficient operation of Metaverse-ready networks.

preprint · 2022 · arXiv

Computing Simple Mechanisms: Lift-and-Round over Marginal Reduced Forms

We study revenue maximization in multi-item multi-bidder auctions under the natural item-independence assumption - a classical problem in Multi-Dimensional Bayesian Mechanism Design. One of the biggest challenges in this area is developing algorithms to compute (approximately) optimal mechanisms that are not brute-force in the size of the bidder type space, which is usually exponential in the number of items in multi-item auctions. Unfortunately, such algorithms were only known for basic settings of our problem when bidders have unit-demand [CHMS10,CMS15] or additive valuations [Yao15]. In this paper, we significantly improve the previous results and design the first algorithm that runs in time polynomial in the number of items and the number of bidders to compute mechanisms that are $O(1)$-approximations to the optimal revenue when bidders have XOS valuations, resolving an open problem raised in [CM16,CZ17]. Moreover, the computed mechanism has a simple structure: It is either a posted price mechanism or a two-part tariff mechanism. As a corollary of our result, we show how to compute an approximately optimal and simple mechanism efficiently using only sample access to the bidders' value distributions. Our algorithm builds on two innovations that allow us to search over the space of mechanisms efficiently: (i) a new type of succinct representation of mechanisms - the marginal reduced forms, and (ii) a novel Lift-and-Round procedure that concavifies the problem.

preprint · 2022 · arXiv

Dynamic Control of Data-Intensive Services over Edge Computing Networks

Next-generation distributed computing networks (e.g., edge and fog computing) enable the efficient delivery of delay-sensitive, compute-intensive applications by facilitating access to computation resources in close proximity to end users. Many of these applications (e.g., augmented/virtual reality) are also data-intensive: in addition to user-specific (live) data streams, they require access to (static) digital objects (e.g., image database) to complete the required processing tasks. When required objects are not available at the servers hosting the associated service functions, they must be fetched from other edge locations, incurring additional communication cost and latency. In such settings, overall service delivery performance shall benefit from jointly optimized decisions around (i) routing paths and processing locations for live data streams, together with (ii) cache selection and distribution paths for associated digital objects. In this paper, we address the problem of dynamic control of data-intensive services over edge cloud networks. We characterize the network stability region and design the first throughput-optimal control policy that coordinates processing and routing decisions for both live and static data-streams. Numerical results demonstrate the superior performance (e.g., throughput, delay, and resource consumption) obtained via the novel multi-pipeline flow control mechanism of the proposed policy, compared with state-of-the-art algorithms that lack integrated stream processing and data distribution control.

preprint · 2022 · arXiv

Is Selling Complete Information (Approximately) Optimal?

We study the problem of selling information to a data-buyer who faces a decision problem under uncertainty. We consider the classic Bayesian decision-theoretic model pioneered by [Blackwell, 1951, 1953]. Initially, the data buyer has only partial information about the payoff-relevant state of the world. A data seller offers additional information about the state of the world. The information is revealed through signaling schemes, also referred to as experiments. In the single-agent setting, any mechanism can be represented as a menu of experiments. [Bergemann et al., 2018] present a complete characterization of the revenue-optimal mechanism in a binary state and binary action environment. By contrast, no characterization is known for the case with more actions. In this paper, we consider more general environments and study arguably the simplest mechanism, which only sells the fully informative experiment. In the environment with binary state and $m\geq 3$ actions, we provide an $O(m)$-approximation to the optimal revenue by selling only the fully informative experiment and show that the approximation ratio is tight up to an absolute constant factor. An important corollary of our lower bound is that the size of the optimal menu must grow at least linearly in the number of available actions, so no universal upper bound exists for the size of the optimal menu in the general single-dimensional setting. For multi-dimensional environments, we prove that even in arguably the simplest matching utility environment with 3 states and 3 actions, the ratio between the optimal revenue and the revenue by selling only the fully informative experiment can grow immediately to a polynomial of the number of agent types. Nonetheless, if the distribution is uniform, we show that selling only the fully informative experiment is indeed the optimal mechanism.

preprint · 2022 · arXiv

Joint Compute-Caching-Communication Control for Online Data-Intensive Service Delivery

Emerging Metaverse applications, designed to deliver highly interactive and immersive experiences that seamlessly blend physical reality and digital virtuality, are accelerating the need for distributed compute platforms with unprecedented storage, computation, and communication requirements. To this end, the integrated evolution of next-generation networks (e.g., 5G and beyond) and distributed cloud technologies (e.g., fog and mobile edge computing) has emerged as a promising paradigm to address the interaction- and resource-intensive nature of Metaverse applications. In this paper, we focus on the design of control policies for the joint orchestration of compute, caching, and communication (3C) resources in next-generation distributed cloud networks for the efficient delivery of Metaverse applications that require the real-time aggregation, processing, and distribution of multiple live media streams and pre-stored digital assets. We describe Metaverse applications via directed acyclic graphs able to model the combination of real-time stream-processing and content distribution pipelines. We design the first throughput-optimal control policy that coordinates joint decisions around (i) routing paths and processing locations for live data streams, together with (ii) cache selection and distribution paths for associated data objects. We then extend the proposed solution to include a max-throughput database placement policy and two efficient replacement policies. In addition, we characterize the network stability regions for all studied scenarios. Numerical results demonstrate the superior performance obtained via the novel multi-pipeline flow control and 3C resource orchestration mechanisms of the proposed policy, compared with state-of-the-art algorithms that lack full 3C integrated control.

preprint · 2022 · arXiv

Mobile Edge Computing Network Control: Tradeoff Between Delay and Cost

As mobile edge computing (MEC) finds widespread use for relieving the computational burden of compute- and interaction-intensive applications on end user devices, understanding the resulting delay and cost performance is drawing significant attention. While most existing works focus on single-task offloading in single-hop MEC networks, next-generation applications (e.g., industrial automation, augmented/virtual reality) require advanced models and algorithms for dynamic configuration of multi-task services over multi-hop MEC networks. In this work, we leverage recent advances in dynamic cloud network control to provide a comprehensive study of the performance of multi-hop MEC networks, addressing the key problems of multi-task offloading, timely packet scheduling, and joint computation and communication resource allocation. We present a fully distributed algorithm based on Lyapunov control theory that achieves throughput-optimal performance with delay and cost guarantees. Simulation results validate our theoretical analysis and provide insightful guidelines on the interplay between communication and computation resources in MEC networks.

preprint · 2022 · arXiv

On Multi-Dimensional Gains from Trade Maximization

We study gains from trade in multi-dimensional two-sided markets. Specifically, we focus on a setting with $n$ heterogeneous items, where each item is owned by a different seller $i$, and there is a constrained-additive buyer with feasibility constraint $\mathcal{F}$. Multi-dimensional settings in one-sided markets, e.g. where a seller owns multiple heterogeneous items but also is the mechanism designer, are well-understood. In addition, single-dimensional settings in two-sided markets, e.g. where a buyer and seller each seek or own a single item, are also well-understood. Multi-dimensional two-sided markets, however, encapsulate the major challenges of both lines of work: optimizing the sale of heterogeneous items, ensuring incentive-compatibility among both sides of the market, and enforcing budget balance. We present, to the best of our knowledge, the first worst-case approximation guarantee for gains from trade in a multi-dimensional two-sided market. Our first result provides an $O(\log (1/r))$-approximation to the first-best gains from trade for a broad class of downward-closed feasibility constraints (such as matroid, matching, knapsack, or the intersection of these). Here $r$ is the minimum probability over all items that a buyer's value for the item exceeds the seller's cost. Our second result removes the dependence on $r$ and provides an unconditional $O(\log n)$-approximation to the second-best gains from trade. We extend both results to a general constrained-additive buyer, losing another $O(\log n)$-factor en route.

preprint · 2022 · arXiv

Optimal Cloud Network Control with Strict Latency Constraints

The timely delivery of resource-intensive and latency-sensitive services (e.g., industrial automation, augmented reality) over distributed computing networks (e.g., mobile edge computing) is drawing increasing attention. Motivated by the insufficiency of average delay performance guarantees provided by existing studies, we focus on the critical goal of delivering next generation real-time services ahead of corresponding deadlines on a per-packet basis, while minimizing overall cloud network resource cost. We introduce a novel queuing system that is able to track data packets' lifetime and formalize the optimal cloud network control problem with strict deadline constraints. After illustrating the main challenges in delivering packets to their destinations before getting dropped due to lifetime expiry, we construct an equivalent formulation, where relaxed flow conservation allows leveraging Lyapunov optimization to derive a provably near-optimal fully distributed algorithm for the original problem. Numerical results validate the theoretical analysis and show the superior performance of the proposed control policy compared with state-of-the-art cloud network control.

preprint · 2022 · arXiv

Optimal Multicast Service Chain Control: Packet Processing, Routing, and Duplication

Distributed computing (cloud) networks, e.g., mobile edge computing (MEC), are playing an increasingly important role in the efficient hosting, running, and delivery of real-time stream-processing applications such as industrial automation, immersive video, and augmented reality. While such applications require timely processing of real-time streams that are simultaneously useful for multiple users/devices, existing technologies lack efficient mechanisms to handle their increasingly multicast nature, leading to unnecessary traffic redundancy and associated network congestion. In this paper, we address the design of distributed packet processing, routing, and duplication policies for optimal control of multicast stream-processing services. We present a characterization of the enlarged capacity region that results from efficient packet duplication, and design the first fully distributed multicast traffic management policy that stabilizes any input rate in the interior of the capacity region while minimizing overall operational cost. Numerical results demonstrate the effectiveness of the proposed policy to achieve throughput- and cost-optimal delivery of stream-processing services over distributed computing networks.

preprint · 2022 · arXiv

Recommender Systems meet Mechanism Design

Machine learning has developed a variety of tools for learning and representing high-dimensional distributions with structure. Recent years have also seen big advances in designing multi-item mechanisms. Akin to overfitting, however, these mechanisms can be extremely sensitive to the Bayesian prior that they target, which becomes problematic when that prior is only approximately known. At the same time, even if access to the exact Bayesian prior is given, it is known that optimal or even approximately optimal multi-item mechanisms run into sample, computational, representation and communication intractability barriers. We consider a natural class of multi-item mechanism design problems with very large numbers of items, but where the bidders' value distributions can be well-approximated by a topic model akin to those used in recommendation systems with very large numbers of possible recommendations. We propose a mechanism design framework for this setting, building on a recent robustification framework by Brustle et al., which disentangles the statistical challenge of estimating a multi-dimensional prior from the task of designing a good mechanism for it, and robustifies the performance of the latter against the estimation error of the former. We provide an extension of this framework appropriate for our setting, which allows us to exploit the expressive power of topic models to reduce the effective dimensionality of the mechanism design problem and remove the dependence of its computational, communication and representation complexity on the number of items.

preprint · 2022 · arXiv

Tight Last-Iterate Convergence of the Extragradient and the Optimistic Gradient Descent-Ascent Algorithm for Constrained Monotone Variational Inequalities

The monotone variational inequality is a central problem in mathematical programming that unifies and generalizes many important settings such as smooth convex optimization, two-player zero-sum games, convex-concave saddle point problems, etc. The extragradient algorithm by Korpelevich [1976] and the optimistic gradient descent-ascent algorithm by Popov [1980] are arguably the two most classical and popular methods for solving monotone variational inequalities. Despite their long histories, the following major problem remains open. What is the last-iterate convergence rate of the extragradient algorithm or the optimistic gradient descent-ascent algorithm for monotone and Lipschitz variational inequalities with constraints? We resolve this open problem by showing that both the extragradient algorithm and the optimistic gradient descent-ascent algorithm have a tight $O\left(\frac{1}{\sqrt{T}}\right)$ last-iterate convergence rate for arbitrary convex feasible sets, which matches the lower bound by Golowich et al. [2020a,b]. Our rate is measured in terms of the standard gap function. At the core of our results lies a non-standard performance measure, the tangent residual, which can be viewed as an adaptation of the norm of the operator that takes the local constraints into account. We use the tangent residual (or a slight variation of the tangent residual) as the potential function in our analysis of the extragradient algorithm (or the optimistic gradient descent-ascent algorithm) and prove that it is non-increasing between two consecutive iterates.
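
For readers unfamiliar with the method, here is a minimal sketch of the classical projected extragradient update on a toy constrained saddle-point problem, $\min_x \max_y xy$ over the box $[-1,1]^2$. The toy problem, the step size `eta`, the starting point, and the step count are assumptions chosen for illustration. The sketch does not compute the tangent residual and is not meant to reproduce the $O(1/\sqrt{T})$ analysis; this particular instance converges much faster than the worst case.

```python
import math


def operator(z):
    """F(x, y) = (dL/dx, -dL/dy) for L(x, y) = x * y; monotone and 1-Lipschitz."""
    x, y = z
    return (y, -x)


def project(z, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^2."""
    return tuple(min(hi, max(lo, c)) for c in z)


def extragradient(z0, eta=0.1, steps=2000):
    """Projected extragradient: an extrapolation step, then an update
    from the original point using the operator at the extrapolated point."""
    z = z0
    for _ in range(steps):
        g = operator(z)
        z_half = project((z[0] - eta * g[0], z[1] - eta * g[1]))
        g_half = operator(z_half)
        z = project((z[0] - eta * g_half[0], z[1] - eta * g_half[1]))
    return z


z_last = extragradient((1.0, 0.5))
dist = math.hypot(*z_last)  # distance of the last iterate from the solution (0, 0)
print("last iterate:", z_last, "distance to solution:", dist)
```

The key design point is that the final update reuses the original point `z` but evaluates the operator at `z_half`. That look-ahead is what lets the last iterate approach the solution on this bilinear instance.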

preprint · 2022 · arXiv

Ultra-Reliable Distributed Cloud Network Control with End-to-End Latency Constraints

We are entering a rapidly unfolding future driven by the delivery of real-time computation services, such as industrial automation and augmented reality, collectively referred to as AgI services, over highly distributed cloud/edge computing networks. The interaction-intensive nature of AgI services is accelerating the need for networking solutions that provide strict latency guarantees. In contrast to most existing studies that can only characterize average delay performance, we focus on the critical goal of delivering AgI services ahead of corresponding deadlines on a per-packet basis, while minimizing overall cloud network operational cost. To this end, we design a novel queuing system able to track data packets' lifetime and formalize the delay-constrained least-cost dynamic network control problem. To address this challenging problem, we first study the setting with average capacity (or resource budget) constraints, for which we characterize the delay-constrained stability region and design a near-optimal control policy leveraging Lyapunov optimization theory on an equivalent virtual network. Guided by the same principle, we tackle the peak capacity constrained scenario by developing the reliable cloud network control (RCNC) algorithm, which employs a two-way optimization method to make actual and virtual network flow solutions converge in an iterative manner. Extensive numerical results show the superior performance of the proposed control policy compared with the state-of-the-art cloud network control algorithm, and the value of guaranteeing strict end-to-end deadlines for the delivery of next-generation AgI services.

preprint · 2020 · arXiv

Multi-Item Mechanisms without Item-Independence: Learnability via Robustness

We study the sample complexity of learning revenue-optimal multi-item auctions. We obtain the first set of positive results that go beyond the standard but unrealistic setting of item-independence. In particular, we consider settings where bidders' valuations are drawn from correlated distributions that can be captured by Markov Random Fields or Bayesian Networks, two of the most prominent graphical models. We establish parametrized sample complexity bounds for learning an up-to-$\varepsilon$ optimal mechanism in both models, which scale polynomially in the size of the model, i.e., the number of items and bidders, and only exponentially in the natural complexity measure of the model, namely either the largest in-degree (for Bayesian Networks) or the size of the largest hyper-edge (for Markov Random Fields). We obtain our learnability results through a novel and modular framework that involves first proving a robustness theorem. We show that, given only "approximate distributions" for bidder valuations, we can learn a mechanism whose revenue is nearly optimal simultaneously for all "true distributions" that are close to the ones we were given in Prokhorov distance. Thus, to learn a good mechanism, it suffices to learn approximate distributions. When item values are independent, learning in Prokhorov distance is immediate, hence our framework directly implies the main result of Gonczarowski and Weinberg. When item values are sampled from more general graphical models, we combine our robustness theorem with novel sample complexity results for learning Markov Random Fields or Bayesian Networks in Prokhorov distance, which may be of independent interest. Finally, in the single-item case, our robustness result can be strengthened to hold under an even weaker distribution distance, the Lévy distance.

preprint · 2020 · arXiv

Simple Mechanisms for Profit Maximization in Multi-item Auctions

We study a classical Bayesian mechanism design problem where a seller is selling multiple items to multiple buyers. We consider the case where the seller has costs to produce the items, and these costs are private information to the seller. How can the seller design a mechanism to maximize her profit? Two well-studied problems, revenue maximization in multi-item auctions and signaling in ad auctions, are special cases of our problem. We show that there exists a simple mechanism whose profit is at least $\frac{1}{44}$ of the optimal profit for multiple buyers with matroid-rank valuation functions. When there is a single buyer, the approximation factor is $11$ for general constrained-additive valuations and $6$ for additive valuations. Our result holds even when the seller's costs are correlated across items. We introduce a new class of mechanisms called permit-selling mechanisms. For the single-buyer case, these mechanisms are quite simple and have two stages. For each item $j$, we create a separate permit that allows the buyer to purchase the item in the second stage. In the first stage, we sell the permits without revealing any information about the costs. In the second stage, the seller reveals all the costs, and the buyer can buy item $j$ by paying the item price if the buyer has purchased the permit for item $j$ in the first stage. We show that either selling the permits separately or as a grand bundle suffices to achieve the constant factor approximation to the optimal profit (6 for additive and 11 for constrained-additive valuations). For multiple buyers, we sell the permits sequentially and obtain the constant factor approximation. Our proof is enabled by constructing a benchmark for the optimal profit by combining a novel dual solution with the existing ex-ante relaxation technique.

preprint · 2020 · arXiv

Third-Party Data Providers Ruin Simple Mechanisms

Motivated by the growing prominence of third-party data providers in online marketplaces, this paper studies the impact of the presence of third-party data providers on mechanism design. When no data provider is present, it has been shown that simple mechanisms are "good enough": they can achieve a constant fraction of the revenue of optimal mechanisms. The results in this paper demonstrate that this is no longer true in the presence of a third-party data provider who can provide the bidder with a signal that is correlated with the item type. Specifically, even with a single seller, a single bidder, and a single item of uncertain type for sale, the strategies of pricing each item-type separately (the analog of item pricing for multi-item auctions) and bundling all item-types under a single price (the analog of grand bundling) can both simultaneously be a logarithmic factor worse than the optimal revenue. Further, in the presence of a data provider, item-type partitioning mechanisms (a more general class of mechanisms that divide item-types into disjoint groups and offer prices for each group) still cannot achieve within a $\log \log$ factor of the optimal revenue. Thus, our results highlight that the presence of a data provider forces the use of more complicated mechanisms in order to achieve a constant fraction of the optimal revenue.