Source author record

Haifeng Xu

Haifeng Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Science and Game Theory, Machine Learning, Artificial Intelligence, Cryptography and Security, math.CA, math.NT, Biological Physics, Cell Behavior, cs.CY, Information Retrieval, Multiagent Systems, physics.atom-ph, Social and Information Networks

Catalog footprint

What is connected

21 works

13 topics

4 close collaborators

preprint, 2026, arXiv

An AI-guided mechanotyping instrument for fully automated oocyte quality assessment

The mechanical properties of oocytes are regarded as important indicators of their developmental potential. During fertilization, deviations from the normal mechanical range can hinder sperm penetration, ultimately reducing fertilization efficiency and compromising embryo quality. However, current methods for measuring oocyte mechanics often suffer from serious cellular damage, low automation levels, and large measurement errors. To address these limitations, we developed an AI-guided micronewton-scale mechanical measurement system for safe and automated oocyte quality assessment. The system integrates voice interaction with automated experimental workflows to control a magnetically actuated microgripper, which applies defined loading forces to induce micron-scale compressive deformation of the oocyte. Combined with AI-assisted object detection and image segmentation algorithms, the system captures cellular deformation in real time, enabling precise calculation of the oocyte's compressive modulus. This measurement system enables automated, quantitative, and non-destructive evaluation of oocyte mechanical properties, providing an effective approach for oocyte quality screening in in vitro fertilization (IVF) and other assisted reproductive technologies (ART).

preprint, 2023, arXiv

Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards

Incrementality, which is used to measure the causal effect of showing an ad to a potential customer (e.g. a user in an internet platform) versus not, is a central object for advertisers in online advertising platforms. This paper investigates the problem of how an advertiser can learn to optimize the bidding sequence in an online manner \emph{without} knowing the incrementality parameters in advance. We formulate the offline version of this problem as a specially structured episodic Markov Decision Process (MDP) and then, for its online learning counterpart, propose a novel reinforcement learning (RL) algorithm with regret at most $\widetilde{O}(H^2\sqrt{T})$, which depends on the number of rounds $H$ and number of episodes $T$, but does not depend on the number of actions (i.e., possible bids). A fundamental difference between our learning problem and standard RL problems is that the realized reward feedback from conversion incrementality is \emph{mixed} and \emph{delayed}. To handle this difficulty, we propose and analyze a novel pairwise moment-matching algorithm to learn the conversion incrementality, which we believe is of independent interest.

preprint, 2023, arXiv

Learning in Online Principal-Agent Interactions: The Power of Menus

We study a ubiquitous learning challenge in online principal-agent problems during which the principal learns the agent's private information from the agent's revealed preferences in historical interactions. This paradigm includes important special cases such as pricing and contract design, which have been widely studied in recent literature. However, existing work considers the case where the principal can only choose a single strategy at every round to interact with the agent and then observe the agent's revealed preference through their actions. In this paper, we extend this line of study to allow the principal to offer a menu of strategies to the agent and learn additionally from observing the agent's selection from the menu. We provide a thorough investigation of several online principal-agent problem settings and characterize their sample complexities, accompanied by the corresponding algorithms we have developed. We instantiate this paradigm to several important design problems, including Stackelberg (security) games, contract design, and information design. Finally, we also explore the connection between our findings and existing results about online learning in Stackelberg games, and we offer a solution that can overcome a key hard instance of Peng et al. (2019).

preprint, 2022, arXiv

Algorithmic Information Design in Multi-Player Games: Possibility and Limits in Singleton Congestion

Most algorithmic studies on multi-agent information design so far have focused on the restricted situation with no inter-agent externalities; a few exceptions investigated truly strategic games such as zero-sum games and second-price auctions but have all focused only on optimal public signaling. This paper initiates the algorithmic information design of both \emph{public} and \emph{private} signaling in a fundamental class of games with negative externalities, i.e., singleton congestion games, with wide application in today's digital economy, machine scheduling, routing, etc. For both public and private signaling, we show that the optimal information design can be efficiently computed when the number of resources is a constant. To our knowledge, this is the first set of efficient \emph{exact} algorithms for information design in succinctly representable many-player games. Our results hinge on novel techniques such as developing certain "reduced forms" to compactly characterize equilibria in public signaling or to represent players' marginal beliefs in private signaling. When there are many resources, we show computational intractability results. To overcome the issue of multiple equilibria, here we introduce a new notion of equilibrium-\emph{oblivious} hardness, which rules out any possibility of computing a good signaling scheme, irrespective of the equilibrium selection rule.

preprint, 2022, arXiv

Learning from a Learning User for Optimal Recommendations

In real-world recommendation problems, especially those with a formidably large item space, users have to gradually learn to estimate the utility of any fresh recommendations from their experience about previously consumed items. This in turn affects their interaction dynamics with the system and can invalidate previous algorithms built on the omniscient user assumption. In this paper, we formalize a model to capture such "learning users" and design an efficient system-side learning solution, coined Noise-Robust Active Ellipsoid Search (RAES), to confront the challenges brought by the non-stationary feedback from such a learning user. Interestingly, we prove that the regret of RAES deteriorates gracefully as the convergence rate of user learning becomes worse, until reaching linear regret when the user's learning fails to converge. Experiments on synthetic datasets demonstrate the strength of RAES for such a contemporaneous system-user learning problem. Our study provides a novel perspective on modeling the feedback loop in recommendation problems.

preprint, 2022, arXiv

Multi-Channel Bayesian Persuasion

The celebrated Bayesian persuasion model considers strategic communication between an informed agent (the sender) and uninformed decision makers (the receivers). The current rapidly-growing literature mostly assumes a dichotomy: either the sender is powerful enough to communicate separately with each receiver (a.k.a. private persuasion), or she cannot communicate separately at all (a.k.a. public persuasion). We study a model that smoothly interpolates between the two, by considering a natural multi-channel communication structure in which each receiver observes a subset of the sender's communication channels. This captures, e.g., receivers on a network, where information spillover is almost inevitable. We completely characterize when one communication structure is better for the sender than another, in the sense of yielding higher optimal expected utility universally over all prior distributions and utility functions. The characterization is based on a simple pairwise relation among receivers: one receiver information-dominates another if he observes at least the same channels. We prove that a communication structure $M_1$ is (weakly) better than $M_2$ if and only if every information-dominating pair of receivers in $M_1$ is also such in $M_2$. We also provide an additive FPTAS for the optimal sender's signaling scheme when the number of states is constant and the graph of information-dominating pairs is a directed forest. Finally, we prove that finding an optimal signaling scheme under multi-channel persuasion is, generally, computationally harder than under both public and private persuasion.

preprint, 2022, arXiv

Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification

We study bandit algorithms under data poisoning attacks in a bounded reward setting. We consider a strong attacker model in which the attacker can observe both the selected actions and their corresponding rewards and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret $O(\log T)$ can be forced to suffer a regret $\Omega(T)$ with an expected amount of contamination $O(\log T)$. This amount of contamination is also necessary, as we prove that there exists an $O(\log T)$ regret bandit algorithm, specifically the classical UCB, that requires $\Omega(\log T)$ amount of contamination to suffer regret $\Omega(T)$. To combat such attacks, our second main contribution is to propose verification based mechanisms, which use limited verification to access a limited number of uncontaminated rewards. In particular, for the case of unlimited verifications, we show that with $O(\log T)$ expected number of verifications, a simple modified version of the ETC type bandit algorithm can restore the order optimal $O(\log T)$ regret irrespective of the amount of contamination used by the attacker. We also provide a UCB-like verification scheme, called Secure-UCB, that also enjoys full recovery from any attacks, also with $O(\log T)$ expected number of verifications. To derive a matching lower bound on the number of verifications, we prove that for any order-optimal bandit algorithm, this number of verifications $\Omega(\log T)$ is necessary to recover the order-optimal regret. On the other hand, when the number of verifications is bounded above by a budget $B$, we propose a novel algorithm, Secure-BARBAR, which provably achieves $O(\min\{C,T/\sqrt{B} \})$ regret with high probability against weak attackers where $C$ is the total amount of contamination by the attacker, which breaks the known $\Omega(C)$ lower bound of the non-verified setting if $C$ is large.

preprint, 2022, arXiv

Understanding the Limits of Poisoning Attacks in Episodic Reinforcement Learning

To understand the security threats to reinforcement learning (RL) algorithms, this paper studies poisoning attacks to manipulate \emph{any} order-optimal learning algorithm towards a targeted policy in episodic RL and examines the potential damage of two natural types of poisoning attacks, i.e., the manipulation of \emph{reward} and \emph{action}. We discover that the effect of attacks crucially depends on whether the rewards are bounded or unbounded. In bounded reward settings, we show that only reward manipulation or only action manipulation cannot guarantee a successful attack. However, by combining reward and action manipulation, the adversary can manipulate any order-optimal learning algorithm to follow any targeted policy with $\tilde{\Theta}(\sqrt{T})$ total attack cost, which is order-optimal, without any knowledge of the underlying MDP. In contrast, in unbounded reward settings, we show that reward manipulation attacks are sufficient for an adversary to successfully manipulate any order-optimal learning algorithm to follow any targeted policy using $\tilde{O}(\sqrt{T})$ amount of contamination. Our results reveal useful insights about what can or cannot be achieved by poisoning attacks, and are set to spur more works on the design of robust RL algorithms.

preprint, 2022, arXiv

When Are Linear Stochastic Bandits Attackable?

We study adversarial attacks on linear stochastic bandits: by manipulating the rewards, an adversary aims to control the behaviour of the bandit algorithm. Perhaps surprisingly, we first show that some attack goals can never be achieved. This is in sharp contrast to context-free stochastic bandits, and is intrinsically due to the correlation among arms in linear stochastic bandits. Motivated by this finding, this paper studies the attackability of a $k$-armed linear bandit environment. We first provide a complete necessity and sufficiency characterization of attackability based on the geometry of the arms' context vectors. We then propose a two-stage attack method against LinUCB and Robust Phase Elimination. The method first determines whether the given environment is attackable; if it is, the method poisons the rewards to force the algorithm to pull a target arm a linear number of times using only a sublinear cost. Numerical experiments further validate the effectiveness and cost-efficiency of the proposed attack method.

preprint, 2020, arXiv

Collapsing Bandits and Their Application to Public Health Interventions

We propose and study Collapsing Bandits, a new restless multi-armed bandit (RMAB) setting in which each arm follows a binary-state Markovian process with a special structure: when an arm is played, the state is fully observed, thus "collapsing" any uncertainty, but when an arm is passive, no observation is made, thus allowing uncertainty to evolve. The goal is to keep as many arms in the "good" state as possible by planning a limited budget of actions per round. Such Collapsing Bandits are natural models for many healthcare domains in which workers must simultaneously monitor patients and deliver interventions in a way that maximizes the health of their patient cohort. Our main contributions are as follows: (i) Building on the Whittle index technique for RMABs, we derive conditions under which the Collapsing Bandits problem is indexable. Our derivation hinges on novel conditions that characterize when the optimal policies may take the form of either "forward" or "reverse" threshold policies. (ii) We exploit the optimality of threshold policies to build fast algorithms for computing the Whittle index, including a closed-form. (iii) We evaluate our algorithm on several data distributions including data from a real-world healthcare task in which a worker must monitor and deliver interventions to maximize their patients' adherence to tuberculosis medication. Our algorithm achieves a 3-order-of-magnitude speedup compared to state-of-the-art RMAB techniques while achieving similar performance.
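The "collapsing" structure described in this abstract can be illustrated with a small belief-tracking sketch. It is not the paper's algorithm and computes no Whittle index; it only shows how the belief about one arm evolves when the arm is passive and how it collapses when the arm is played. The class name `Arm`, the fields `p01` and `p11`, and the numeric values are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    # Hypothetical transition probabilities of the two-state chain.
    p01: float     # P(bad -> good) in one round
    p11: float     # P(good -> good) in one round
    belief: float  # current probability that the arm is in the good state

    def passive_step(self) -> None:
        """No observation is made: push the belief through the Markov chain."""
        self.belief = self.belief * self.p11 + (1.0 - self.belief) * self.p01

    def active_step(self, observed_good: bool) -> None:
        """Playing the arm reveals its state, collapsing the belief to 0 or 1."""
        self.belief = 1.0 if observed_good else 0.0


arm = Arm(p01=0.2, p11=0.9, belief=1.0)
arm.passive_step()   # belief becomes 0.9
arm.passive_step()   # belief becomes 0.9 * 0.9 + 0.1 * 0.2 = 0.83
after_two_passive = arm.belief
arm.active_step(observed_good=False)  # belief collapses to 0.0
```

Left passive for many rounds, the belief drifts towards the stationary value p01 / (1 - p11 + p01), which is why an arm that has not been observed for a long time carries the most uncertainty.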

preprint, 2020, arXiv

Computing Equilibria of Prediction Markets via Persuasion

We study the computation of equilibria in prediction markets in perhaps the most fundamental special case with two players and three trading opportunities. To do so, we show equivalence of prediction market equilibria with those of a simpler signaling game with commitment introduced by Kong and Schoenebeck (2018). We then extend their results by giving computationally efficient algorithms for additional parameter regimes. Our approach leverages a new connection between prediction markets and Bayesian persuasion, which also reveals interesting conceptual insights.

preprint, 2020, arXiv

Selling Information Through Consulting

We consider a monopoly information holder selling information to a budget-constrained decision maker, who may benefit from the seller's information. The decision maker has a utility function that depends on his action and an uncertain state of the world. The seller and the buyer each observe a private signal regarding the state of the world, which may be correlated with each other. The seller's goal is to sell her private information to the buyer and extract maximum possible revenue, subject to the buyer's budget constraints. We consider three different settings with increasing generality, i.e., the seller's signal and the buyer's signal can be independent, correlated, or follow a general distribution accessed through a black-box sampling oracle. For each setting, we design information selling mechanisms which are both optimal and simple in the sense that they can be naturally interpreted, have succinct representations, and can be efficiently computed. Notably, though the optimal mechanism exhibits slightly increasing complexity as the setting becomes more general, all our mechanisms share the same format of acting as a consultant who recommends the best action to the buyer but uses different and carefully designed payment rules for different settings. Each of our optimal mechanisms can be easily computed by solving a single polynomial-size linear program. This significantly simplifies exponential-size LPs solved by the Ellipsoid method in the previous work, which computes the optimal mechanisms in the same setting but without budget limit. Such simplification is enabled by our new characterizations of the optimal mechanism in the (more realistic) budget-constrained setting.

preprint, 2019, arXiv

Quantum dynamics of atomic Rydberg excitation in strong laser fields

Neutral atoms have been observed to survive intense laser pulses in high Rydberg states with surprisingly large probability. Only with this Rydberg-state excitation (RSE) included is the picture of intense-laser-atom interaction complete. Various mechanisms have been proposed to explain the underlying physics. However, none of them can explain all the features observed in experiments and in time-dependent Schrödinger equation (TDSE) simulations. Here we propose a fully quantum-mechanical model based on the strong-field approximation (SFA). It reproduces well the intensity dependence of RSE obtained by the TDSE, which exhibits a series of modulated peaks. They are due to recapture of the liberated electron and the fact that the pertinent probability strongly depends on the position and the parity of the Rydberg state. We also present measurements of RSE in xenon at 800 nm, which display the peak structure consistent with the calculations.

preprint, 2016, arXiv

Algorithmic Bayesian Persuasion

Persuasion, defined as the act of exploiting an informational advantage in order to affect the decisions of others, is ubiquitous. Indeed, persuasive communication has been estimated to account for almost a third of all economic activity in the US. This paper examines persuasion through a computational lens, focusing on what is perhaps the most basic and fundamental model in this space: the celebrated Bayesian persuasion model of Kamenica and Gentzkow. Here there are two players, a sender and a receiver. The receiver must take one of a number of actions with a-priori unknown payoff, and the sender has access to additional information regarding the payoffs. The sender can commit to revealing a noisy signal regarding the realization of the payoffs of various actions, and would like to do so as to maximize her own payoff assuming a perfectly rational receiver. We examine the sender's optimization task in three of the most natural input models for this problem, and essentially pin down its computational complexity in each. When the payoff distributions of the different actions are i.i.d. and given explicitly, we exhibit a polynomial-time (exact) algorithm, and a "simple" $(1-1/e)$-approximation algorithm. Our optimal scheme for the i.i.d. setting involves an analogy to auction theory, and makes use of Border's characterization of the space of reduced-forms for single-item auctions. When action payoffs are independent but non-identical with marginal distributions given explicitly, we show that it is #P-hard to compute the optimal expected sender utility. Finally, we consider a general (possibly correlated) joint distribution of action payoffs presented by a black box sampling oracle, and exhibit a fully polynomial-time approximation scheme (FPTAS) with a bi-criteria guarantee. We show that this result is the best possible in the black-box model for information-theoretic reasons.
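To make the sender's commitment power concrete, here is a minimal sketch of the textbook two-state, two-action instance of Bayesian persuasion. It is not one of the algorithms from this paper. It assumes the sender always prefers the receiver to act, the receiver acts only when the posterior probability of the good state is at least `threshold`, and ties are broken in the sender's favour. The function name and parameters are hypothetical.

```python
from typing import Tuple


def optimal_binary_signal(prior: float, threshold: float) -> Tuple[float, float]:
    """Return (q, p_act) for the two-state, two-action example.

    The sender recommends "act" with probability 1 in the good state and
    with probability q in the bad state. q is the largest value for which
    the posterior after "act" still reaches the receiver's threshold.
    Assumes 0 < prior < 1 and 0 < threshold <= 1.
    """
    if prior >= threshold:
        # The receiver already acts on the prior; revealing nothing is optimal.
        return 1.0, 1.0
    # Solve prior / (prior + (1 - prior) * q) == threshold for q.
    q = prior * (1.0 - threshold) / (threshold * (1.0 - prior))
    p_act = prior + (1.0 - prior) * q  # simplifies to prior / threshold
    return q, p_act


q, p_act = optimal_binary_signal(prior=0.3, threshold=0.5)
posterior_after_act = 0.3 / (0.3 + 0.7 * q)
```

With a prior of 0.3 and a threshold of 0.5, the receiver would never act without information, yet this signaling scheme makes the receiver act with probability 0.6.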

preprint, 2016, arXiv

The largest cycles consist by the quadratic residues and Fermat primes

This paper studies the largest cycles formed by the quadratic residues modulo prime numbers. We give some formulae for the maximum length of the cycles. In particular, we give the formula for Fermat prime moduli.

preprint, 2016, arXiv

Using Social Networks to Aid Homeless Shelters: Dynamic Influence Maximization under Uncertainty - An Extended Version

This paper presents HEALER, a software agent that recommends sequential intervention plans for use by homeless shelters, who organize these interventions to raise awareness about HIV among homeless youth. HEALER's sequential plans (built using knowledge of social networks of homeless youth) choose intervention participants strategically to maximize influence spread, while reasoning about uncertainties in the network. While previous work presents influence maximizing techniques to choose intervention participants, they do not address three real-world issues: (i) they completely fail to scale up to real-world sizes; (ii) they do not handle deviations in execution of intervention plans; (iii) constructing real-world social networks is an expensive process. HEALER handles these issues via four major contributions: (i) HEALER casts this influence maximization problem as a POMDP and solves it using a novel planner which scales up to previously unsolvable real-world sizes; (ii) HEALER allows shelter officials to modify its recommendations, and updates its future plans in a deviation-tolerant manner; (iii) HEALER constructs social networks of homeless youth at low cost, using a Facebook application. Finally, (iv) we show hardness results for the problem that HEALER solves. HEALER will be deployed in the real world in early Spring 2016 and is currently undergoing testing at a homeless shelter.

preprint, 2015, arXiv

Periodicity related to a sieve method of producing primes

In this paper we consider a slightly different sieve method from Eratosthenes' to get primes. We find the periodicity and mirror symmetry of the pattern.

preprint, 2015, arXiv

Security Games with Information Leakage: Modeling and Computation

Most models of Stackelberg security games assume that the attacker only knows the defender's mixed strategy, but is not able to observe (even partially) the instantiated pure strategy. Such partial observation of the deployed pure strategy, an issue we refer to as information leakage, is a significant concern in practical applications. While previous research on patrolling games has considered the attacker's real-time surveillance, our setting, and therefore our models and techniques, are fundamentally different. More specifically, after describing the information leakage model, we start with an LP formulation to compute the defender's optimal strategy in the presence of leakage. Perhaps surprisingly, we show that a key subproblem to solve this LP (more precisely, the defender oracle) is NP-hard even for the simplest of security game models. We then approach the problem from three possible directions: efficient algorithms for restricted cases, approximation algorithms, and heuristic algorithms for sampling that improve upon the status quo. Our experiments confirm the necessity of handling information leakage and the advantage of our algorithms.

preprint, 2014, arXiv

Note on the Hölder norm estimate of the function $x\sin(1/x)$

In this paper, we prove the following inequality: for any $x, y>0$, there holds $$\big|x\sin\frac{1}{x} - y\sin\frac{1}{y} \big| \leq \sqrt{2|x - y|}.$$
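A numerical spot check of this inequality is sketched below. It proves nothing; it samples random pairs and reports the largest observed ratio of the left-hand side to the right-hand side, which should never exceed 1 if the inequality holds. The helper names, the sampling range, and the sample count are arbitrary choices for illustration.

```python
import math
import random


def f(x: float) -> float:
    """The function x * sin(1/x) for x > 0."""
    return x * math.sin(1.0 / x)


def worst_ratio(samples: int = 100_000, seed: int = 0) -> float:
    """Largest observed |f(x) - f(y)| / sqrt(2|x - y|) over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        # Log-uniform sampling puts many points near 0, where f oscillates fastest.
        x = 10 ** rng.uniform(-4, 1)
        y = 10 ** rng.uniform(-4, 1)
        if x == y:
            continue
        ratio = abs(f(x) - f(y)) / math.sqrt(2 * abs(x - y))
        worst = max(worst, ratio)
    return worst


observed = worst_ratio(samples=20_000)
```

A value of `observed` above 1 would be a counterexample, so the check is a quick sanity test of the statement rather than a replacement for the proof.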

preprint, 2014, arXiv

The limit of the m-norms of a class of symmetric matrices and its applications

We consider a special symmetric matrix and obtain a formula similar to the one obtained from Weyl's criterion. Some applications of the formula are given: we give a new way to calculate the integral of $\ln\Gamma(x)$ on $[0,1]$, and we claim that the matrices in one class are not Hadamard matrices.
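The integral mentioned in this abstract has a classical closed form, $\int_0^1 \ln\Gamma(x)\,dx = \tfrac{1}{2}\ln(2\pi)$ (Raabe's formula). The sketch below checks that value numerically with a plain midpoint rule. It does not reproduce the matrix-based derivation from the paper, and the function name and grid size are assumptions made for illustration.

```python
import math


def integral_log_gamma(n: int = 100_000) -> float:
    """Midpoint-rule estimate of the integral of ln Gamma(x) over [0, 1].

    The integrand has an integrable logarithmic singularity at x = 0;
    midpoints never touch the endpoint, so the sum stays finite.
    """
    h = 1.0 / n
    return h * sum(math.lgamma((k + 0.5) * h) for k in range(n))


estimate = integral_log_gamma()
closed_form = 0.5 * math.log(2.0 * math.pi)  # about 0.9189
```

The two values agree to several decimal places; the remaining gap comes from the slow convergence of the midpoint rule near the singularity at 0.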

preprint, 2013, arXiv

The connection between the Basel problem and a special integral

By using Fubini's theorem or Tonelli's theorem, we find that the value of the zeta function at 2 is equal to a special integral. Furthermore, we find that this special integral is twice another special integral. By using this fact we obtain the relationship between Genocchi numbers and Bernoulli numbers. We also obtain some results about Bernoulli polynomials.
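The abstract does not say which integral it uses, so the sketch below relies on a standard identity instead, as an assumption: $\zeta(2) = \sum_{n\ge 1} 1/n^2 = \int_0^1 \frac{-\ln(1-x)}{x}\,dx = \pi^2/6$, where exchanging the sum and the integral is justified by Tonelli's theorem because every term is non-negative. The code compares a partial sum and a midpoint-rule estimate of the integral with $\pi^2/6$. The function names and sizes are illustrative.

```python
import math


def partial_sum(n_terms: int) -> float:
    """Partial sum of the Basel series: sum of 1/k^2 for k = 1..n_terms."""
    return sum(1.0 / (k * k) for k in range(1, n_terms + 1))


def log_integral(n_cells: int) -> float:
    """Midpoint-rule estimate of the integral of -ln(1 - x)/x over [0, 1]."""
    h = 1.0 / n_cells
    total = 0.0
    for k in range(n_cells):
        x = (k + 0.5) * h
        # log1p(-x) evaluates ln(1 - x) accurately for small x.
        total += -math.log1p(-x) / x
    return h * total


target = math.pi ** 2 / 6.0
series_value = partial_sum(100_000)
integral_value = log_integral(100_000)
```

Both estimates approach $\pi^2/6 \approx 1.6449$ from below, the series because its tail is positive and the midpoint rule because it underweights the logarithmic singularity at $x = 1$.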
