Source author record

Zhenliang Zhang

Zhenliang Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Theory, math.IT, Machine Learning, math.OC, Artificial Intelligence, math.NT, Social and Information Networks, Applications, Computation and Language, Data Structures and Algorithms, Discrete Mathematics, math.DS, Multiagent Systems

Catalog footprint

What is connected

18 works

13 topics

4 close collaborators

preprint · 2026 · arXiv

SCOUT: Active Information Foraging for Long-Text Understanding with Decoupled Epistemic States

Long-Text Understanding (LTU) at million-token scale requires balancing reasoning fidelity with computational efficiency. Frontier long-context LLMs can process million-token contexts end-to-end, but they suffer from high token consumption and attention dilution. In parallel, specialized LTU agents often sacrifice fidelity through task-agnostic abstractions like graph construction or indexing. We identify a key insight for LTU: query-relevant information is typically sparse relative to the full document, so effective reasoning should rely on a query-sufficient subset rather than the entire context. To address this, we propose SCOUT, a new paradigm for LTU that shifts from passive processing to active information foraging. It treats the document as an explorable environment and answers from a compact, provenance-grounded epistemic state. Guided by state-level gap diagnosis, SCOUT adaptively alternates between coarse-to-fine exploration and anchored state updates that progressively contract its epistemic state toward query sufficiency. Experiments show that SCOUT matches state-of-the-art proprietary models while reducing token consumption by up to 8x. Moreover, SCOUT remains stable as context length scales, substantially alleviating the practical cost-performance trade-off.

preprint · 2026 · arXiv

TeachAnything: A Multimodal Crowdsourcing Platform for Training Embodied AI Agents in Symmetrical Reality

Symmetrical Reality (SR) is emerging as a future trend for human-agent coexistence, placing higher demands on agents to acquire human-like intelligence. It calls for richer and more diverse human guidance. We introduce a three-stage demonstration paradigm integrating multimodal demonstration signals. Building on this paradigm, we developed TeachAnything, a cloud-based, crowdsourcing-oriented demonstration platform with physics simulation capable of collecting diverse demonstration data across varied scenes, tasks, and embodiments. By unifying virtual and physical interactions through both methodological design and physics simulation, the system serves as a practical foundation for developing embodied agents aligned with Symmetrical Reality.

preprint · 2022 · arXiv

A One-bit, Comparison-Based Gradient Estimator

We study zeroth-order optimization for convex functions where we further assume that function evaluations are unavailable. Instead, one only has access to a $\textit{comparison oracle}$, which given two points $x$ and $y$ returns a single bit of information indicating which point has larger function value, $f(x)$ or $f(y)$. By treating the gradient as an unknown signal to be recovered, we show how one can use tools from one-bit compressed sensing to construct a robust and reliable estimator of the normalized gradient. We then propose an algorithm, coined SCOBO, that uses this estimator within a gradient descent scheme. We show that when $f(x)$ has some low dimensional structure that can be exploited, SCOBO outperforms the state-of-the-art in terms of query complexity. Our theoretical claims are verified by extensive numerical experiments.
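The comparison-oracle setting can be illustrated with a minimal sketch. It is not the SCOBO estimator from the abstract: it swaps in a plain averaging estimator (sum the random directions, each weighted by its one-bit answer) in place of sparse one-bit compressed-sensing recovery. The function names, the query count, the sampling radius and the quadratic test function are all assumptions chosen for illustration.

```python
import math
import random

def comparison_oracle(f, x, y):
    """Return +1 if f(y) > f(x), else -1. Only this single bit is exposed."""
    return 1 if f(y) > f(x) else -1

def estimate_gradient_direction(f, x, num_queries=2000, radius=1e-3, seed=0):
    """Illustrative averaging estimator (an assumption, not SCOBO itself).

    For a small radius, the oracle bit approximates sign(<grad f(x), z>),
    and for Gaussian z the average of bit * z points along the gradient.
    Returns a unit vector.
    """
    rng = random.Random(seed)
    d = len(x)
    acc = [0.0] * d
    for _ in range(num_queries):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y = [xi + radius * zi for xi, zi in zip(x, z)]
        bit = comparison_oracle(f, x, y)
        for i in range(d):
            acc[i] += bit * z[i]
    norm = math.sqrt(sum(a * a for a in acc))
    return [a / norm for a in acc]

def quadratic(x):
    """Hypothetical convex test function with gradient 2x."""
    return sum(xi * xi for xi in x)

x0 = [1.0, -2.0, 0.5, 3.0, 0.0]
g_hat = estimate_gradient_direction(quadratic, x0)
true_norm = math.sqrt(sum((2 * xi) ** 2 for xi in x0))
g_true = [2 * xi / true_norm for xi in x0]
cosine = sum(a * b for a, b in zip(g_hat, g_true))
print(f"cosine similarity to the true normalized gradient: {cosine:.3f}")
```

Because the oracle never reveals function values, only the direction of the gradient is recoverable, which is why the estimate is compared against the normalized gradient.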

preprint · 2020 · arXiv

A General Framework for Bounding Approximate Dynamic Programming Schemes

For years, there has been interest in approximation methods for solving dynamic programming problems, because of the inherent complexity in computing optimal solutions characterized by Bellman's principle of optimality. A wide range of approximate dynamic programming (ADP) methods now exists. It is of great interest to guarantee that the performance of an ADP scheme be at least some known fraction, say $β$, of optimal. This paper introduces a general approach to bounding the performance of ADP methods, in this sense, in the stochastic setting. The approach is based on new results for bounding greedy solutions in string optimization problems, where one has to choose a string (ordered set) of actions to maximize an objective function. This bounding technique is inspired by submodularity theory, but submodularity is not required for establishing bounds. Instead, the bounding is based on quantifying certain notions of curvature of string functions; the smaller the curvatures the better the bound. The key insight is that any ADP scheme is a greedy scheme for some surrogate string objective function that coincides in its optimal solution and value with those of the original optimal control problem. The ADP scheme then yields to the bounding technique mentioned above, and the curvatures of the surrogate objective determine the value $β$ of the bound. The surrogate objective and its curvatures depend on the specific ADP.

preprint · 2019 · arXiv

Inhomogeneous Diophantine approximation over fields of formal power series

We prove a sharp analogue of Minkowski's inhomogeneous approximation theorem over fields of power series $\mathbb{F}_q((T^{-1}))$. Furthermore, we study the approximation to a given point $\underline{y}$ in $\mathbb{F}_q((T^{-1}))^2$ by the $SL_2(\mathbb{F}_q[T])$-orbit of a given point $\underline{x}$ in $\mathbb{F}_q((T^{-1}))^2$.

preprint · 2016 · arXiv

Influential Node Detection in Implicit Social Networks using Multi-task Gaussian Copula Models

Influential node detection is a central research topic in social network analysis. Many existing methods rely on the assumption that the network structure is completely known a priori. However, in many applications, network structure is unavailable to explain the underlying information diffusion phenomenon. To address the challenge of information diffusion analysis with incomplete knowledge of network structure, we develop a multi-task low-rank linear influence model. By exploiting the relationships between contagions, our approach can simultaneously predict the volume (i.e., time series prediction) for each contagion (or topic) and automatically identify the most influential nodes for each contagion. The proposed model is validated using synthetic data and an ISIS Twitter dataset. In addition to improving the volume prediction performance significantly, we show that the proposed approach can reliably infer the most influential users for specific contagions.

preprint · 2016 · arXiv

Performance Bounds for the $k$-Batch Greedy Strategy in Optimization Problems with Curvature

The $k$-batch greedy strategy is an approximation algorithm for optimization problems where the optimal solution is hard to obtain. Starting with the empty set, the $k$-batch greedy strategy repeatedly adds to the current solution set the batch of $k$ elements with the largest gain in the objective function while satisfying the constraints. In this paper, we bound the performance of the $k$-batch greedy strategy with respect to the optimal strategy by defining the total curvature $α_k$. We show that when the objective function is nondecreasing and submodular, the $k$-batch greedy strategy satisfies a harmonic bound $1/(1+α_k)$ for a general matroid constraint and an exponential bound $\left(1-(1-α_k/{t})^t\right)/α_k$ for a uniform matroid constraint, where $k$ divides the cardinality of the maximal set in the general matroid, $t=K/k$ is an integer, and $K$ is the rank of the uniform matroid. We also compare the performance of the $k$-batch greedy strategy with that of the $k_1$-batch greedy strategy when $k_1$ divides $k$. Specifically, we prove that when the objective function is nondecreasing and submodular, the $k$-batch greedy strategy has better harmonic and exponential bounds in terms of the total curvature. Finally, we illustrate our results by considering a task-assignment problem.
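The batch selection described above can be sketched as follows, under a uniform matroid constraint (pick at most K items). The set-coverage objective, the toy instance and all function names are assumptions for illustration, and the curvature value passed to the bound is a placeholder, not one computed from the instance.

```python
from itertools import combinations

def coverage(sets, chosen):
    """Number of ground elements covered (a nondecreasing submodular objective)."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)

def k_batch_greedy(sets, K, k):
    """Pick K indices in K // k rounds, each adding the best batch of k unused indices."""
    assert K % k == 0, "this sketch assumes k divides K"
    chosen = []
    for _ in range(K // k):
        remaining = [i for i in range(len(sets)) if i not in chosen]
        # Exhaustive search over batches; ties go to the first batch found.
        best = max(combinations(remaining, k),
                   key=lambda batch: coverage(sets, chosen + list(batch)))
        chosen.extend(best)
    return chosen

def harmonic_bound(alpha):
    """1 / (1 + alpha), the bound quoted for a general matroid constraint."""
    return 1.0 / (1.0 + alpha)

def exponential_bound(alpha, t):
    """(1 - (1 - alpha / t) ** t) / alpha, the bound quoted for a uniform matroid."""
    return (1.0 - (1.0 - alpha / t) ** t) / alpha

# Hypothetical instance: set 0 is the largest, but sets 1 and 2 together cover more.
sets = [{1, 2, 3, 4, 5, 6}, {1, 2, 3, 7, 8}, {4, 5, 6, 9, 10}]
one_batch = k_batch_greedy(sets, K=2, k=1)
two_batch = k_batch_greedy(sets, K=2, k=2)
print(one_batch, coverage(sets, one_batch))
print(two_batch, coverage(sets, two_batch))
print(exponential_bound(alpha=1.0, t=2))  # placeholder curvature value
```

On this instance the 1-batch strategy commits to the largest single set first, while the 2-batch strategy evaluates pairs jointly. The exhaustive batch search grows combinatorially with k, which is the price of the larger batch.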

preprint · 2015 · arXiv

Multifractal analysis of the divergence points of Birkhoff averages in $β$-dynamical systems

This paper is aimed at a detailed study of the multifractal analysis of the so-called divergence points in the system of $β$-expansions. More precisely, let $([0,1),T_β)$ be the $β$-dynamical system for a general $β>1$ and $ψ:[0,1]\mapsto\mathbb{R}$ be a continuous function. Denote by $\textsf{A}(ψ,x)$ all the accumulation points of $\Big\{\frac{1}{n}\sum_{j=0}^{n-1}ψ(T^jx): n\ge 1\Big\}$. The Hausdorff dimensions of the sets $$\Big\{x:\textsf{A}(ψ,x)\supset[a,b]\Big\},\ \ \Big\{x:\textsf{A}(ψ,x)=[a,b]\Big\}, \ \Big\{x:\textsf{A}(ψ,x)\subset[a,b]\Big\}$$ i.e., the points for which the Birkhoff averages of $ψ$ do not exist but behave in a certain prescribed way, are determined completely for any continuous function $ψ$.

preprint · 2015 · arXiv

String Submodular Functions with Curvature Constraints

The problem of objectively choosing a string of actions to optimize an objective function that is string submodular has been considered in [1]. There it is shown that the greedy strategy, consisting of a string of actions that only locally maximizes the step-wise gain in the objective function, achieves at least a $(1-e^{-1})$-approximation to the optimal strategy. This paper improves this approximation by introducing additional constraints on curvatures, namely, total backward curvature, total forward curvature, and elemental forward curvature. We show that if the objective function has total backward curvature $σ$, then the greedy strategy achieves at least a $\frac{1}{σ}(1-e^{-σ})$-approximation of the optimal strategy. If the objective function has total forward curvature $ε$, then the greedy strategy achieves at least a $(1-ε)$-approximation of the optimal strategy. Moreover, we consider a generalization of the diminishing-return property by defining the elemental forward curvature. We also consider the problem of maximizing the objective function subject to a general string-matroid constraint. We investigate applications of string submodular functions with curvature constraints.
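As a small numerical illustration of the two guarantees quoted above, the following sketch evaluates them at a few sample curvature values. The function names and the sample values are assumptions; only the two formulas come from the abstract.

```python
import math

def backward_curvature_bound(sigma):
    """(1 / sigma) * (1 - exp(-sigma)) for total backward curvature sigma > 0."""
    return (1.0 - math.exp(-sigma)) / sigma

def forward_curvature_bound(eps):
    """(1 - eps) for total forward curvature eps."""
    return 1.0 - eps

# Sample curvature values chosen only for illustration.
for sigma in (1.0, 0.5, 0.1):
    print(f"sigma = {sigma}: greedy achieves at least {backward_curvature_bound(sigma):.4f} of optimal")
for eps in (0.5, 0.25):
    print(f"eps = {eps}: greedy achieves at least {forward_curvature_bound(eps):.4f} of optimal")
```

At $σ = 1$ the first expression equals $1 - e^{-1}$, the baseline guarantee, and it increases toward 1 as $σ$ shrinks, which is the sense in which a curvature constraint improves the approximation.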

preprint · 2015 · arXiv

Subspace selection for projection maximization with matroid constraints

Suppose that there is a ground set which consists of a large number of vectors in a Hilbert space. Consider the problem of selecting a subset of the ground set such that the projection of a vector of interest onto the subspace spanned by the vectors in the chosen subset reaches the maximum norm. This problem is generally NP-hard, and alternative approximation algorithms such as forward regression and orthogonal matching pursuit have been proposed as heuristic approaches. In this paper, we investigate bounds on the performance of these algorithms by introducing the notions of elemental curvatures. More specifically, we derive lower bounds, as functions of these elemental curvatures, for performance of the aforementioned algorithms with respect to that of the optimal solution under uniform and non-uniform matroid constraints, respectively. We show that if the elements in the ground set are mutually orthogonal, then these algorithms are optimal when the matroid is uniform and they achieve at least $1/2$-approximations of the optimal solution when the matroid is non-uniform.

preprint · 2013 · arXiv

Hypothesis Testing in Feedforward Networks with Broadcast Failures

Consider a countably infinite set of nodes, which sequentially make decisions between two given hypotheses. Each node takes a measurement of the underlying truth, observes the decisions from some immediate predecessors, and makes a decision between the given hypotheses. We consider two classes of broadcast failures: 1) each node broadcasts a decision to the other nodes, subject to random erasure in the form of a binary erasure channel; 2) each node broadcasts a randomly flipped decision to the other nodes in the form of a binary symmetric channel. We are interested in whether there exists a decision strategy consisting of a sequence of likelihood ratio tests such that the node decisions converge in probability to the underlying truth. In both cases, we show that if each node only learns from a bounded number of immediate predecessors, then there does not exist a decision strategy such that the decisions converge in probability to the underlying truth. However, in case 1, we show that if each node learns from an unboundedly growing number of predecessors, then the decisions converge in probability to the underlying truth, even when the erasure probabilities converge to 1. We also derive the convergence rate of the error probability. In case 2, we show that if each node learns from all of its previous predecessors, then the decisions converge in probability to the underlying truth when the flipping probabilities of the binary symmetric channels are bounded away from 1/2. In the case where the flipping probabilities converge to 1/2, we derive a necessary condition on the convergence rate of the flipping probabilities such that the decisions still converge to the underlying truth. We also explicitly characterize the relationship between the convergence rate of the error probability and the convergence rate of the flipping probabilities.

preprint · 2012 · arXiv

Detection Performance in Balanced Binary Relay Trees with Node and Link Failures

We study the distributed detection problem in the context of a balanced binary relay tree, where the leaves of the tree correspond to $N$ identical and independent sensors generating binary messages. The root of the tree is a fusion center making an overall decision. Every other node is a relay node that aggregates the messages received from its child nodes into a new message and sends it up toward the fusion center. We derive upper and lower bounds for the total error probability $P_N$ as explicit functions of $N$ in the case where nodes and links fail with certain probabilities. These characterize the asymptotic decay rate of the total error probability as $N$ goes to infinity. Naturally, this decay rate is not larger than that in the non-failure case, which is $\sqrt N$. However, we derive an explicit necessary and sufficient condition on the decay rate of the local failure probabilities $p_k$ (combination of node and link failure probabilities at each level) such that the decay rate of the total error probability in the failure case is the same as that of the non-failure case. More precisely, we show that $\log P_N^{-1}=Θ(\sqrt N)$ if and only if $\log p_k^{-1}=Ω(2^{k/2})$.

preprint · 2012 · arXiv

Detection Performance of M-ary Relay Trees with Non-binary Message Alphabets

We study the detection performance of $M$-ary relay trees, where only the leaves of the tree represent sensors making measurements. The root of the tree represents the fusion center which makes an overall detection decision. Each of the other nodes is a relay node which aggregates $M$ messages sent by its child nodes into a new compressed message and sends the message to its parent node. Building on previous work on the detection performance of $M$-ary relay trees with binary messages, in this paper we study the case of non-binary relay message alphabets. We characterize the exponent of the error probability with respect to the message alphabet size $\mathcal D$, showing how the detection performance increases with $\mathcal D$. Our method involves reducing a tree with non-binary relay messages into an equivalent higher-degree tree with only binary messages.

preprint · 2012 · arXiv

Error Probability Bounds for M-ary Relay Trees

We study the detection error probabilities associated with an M-ary relay tree, where the leaves of the tree correspond to identical and independent sensors. Only these leaves are sensors. The root of the tree represents a fusion center that makes the overall detection decision. Each of the other nodes in the tree is a relay node that combines M summarized messages from its immediate child nodes to form a single output message using the majority dominance rule. We derive tight upper and lower bounds for the Type I and II error probabilities at the fusion center as explicit functions of the number of sensors in the case of binary message alphabets. These bounds characterize how fast the error probabilities converge to 0 with respect to the number of sensors.
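The level-by-level behavior of a majority-type relay rule can be sketched with an exact recursion. This is an illustration under stated assumptions, not the paper's derivation: M is taken to be odd so that ties cannot occur, the messages entering a relay are treated as conditionally independent, and the same recursion is applied separately to the Type I and Type II error probabilities. Function names and the numbers used are hypothetical.

```python
from math import comb

def majority_error(p, M):
    """Probability that more than half of M independent binary messages are wrong,
    when each is wrong with probability p (M is assumed odd)."""
    return sum(comb(M, j) * p**j * (1 - p)**(M - j)
               for j in range(M // 2 + 1, M + 1))

def error_by_level(p_leaf, M, height):
    """Error probability at each level, from the leaves (index 0) up to the root."""
    history = [p_leaf]
    for _ in range(height):
        history.append(majority_error(history[-1], M))
    return history

M, height = 3, 4                     # a tree with M ** height = 81 sensors
type_one = error_by_level(0.10, M, height)
type_two = error_by_level(0.20, M, height)
for level, (a, b) in enumerate(zip(type_one, type_two)):
    print(f"level {level}: Type I = {a:.3e}, Type II = {b:.3e}")
```

With leaf error probabilities below 1/2, each majority step shrinks both error probabilities, so the values at the root decay as the number of sensors grows.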

preprint · 2012 · arXiv

Learning in Hierarchical Social Networks

We study a social network consisting of agents organized as a hierarchical M-ary rooted tree, common in enterprise and military organizational structures. The goal is to aggregate information to solve a binary hypothesis testing problem. Each agent at a leaf of the tree, and only such an agent, makes a direct measurement of the underlying true hypothesis. The leaf agent then makes a decision and sends it to its supervising agent, at the next level of the tree. Each supervising agent aggregates the decisions from the M members of its group, produces a summary message, and sends it to its supervisor at the next level, and so on. Ultimately, the agent at the root of the tree makes an overall decision. We derive upper and lower bounds for the Type I and II error probabilities associated with this decision with respect to the number of leaf agents, which in turn characterize the convergence rates of the Type I, Type II, and total error probabilities. We also provide a message-passing scheme involving non-binary message alphabets and characterize the exponent of the error probability with respect to the message alphabet size.

preprint · 2012 · arXiv

Submodularity and Optimality of Fusion Rules in Balanced Binary Relay Trees

We study the distributed detection problem in a balanced binary relay tree, where the leaves of the tree are sensors generating binary messages. The root of the tree is a fusion center that makes the overall decision. Every other node in the tree is a fusion node that fuses two binary messages from its child nodes into a new binary message and sends it to the parent node at the next level. We assume that the fusion nodes at the same level use the same fusion rule. We call a string of fusion rules used at different levels a fusion strategy. We consider the problem of finding a fusion strategy that maximizes the reduction in the total error probability between the sensors and the fusion center. We formulate this problem as a deterministic dynamic program and express the solution in terms of Bellman's equations. We introduce the notion of string-submodularity and show that the reduction in the total error probability is a string-submodular function. Consequently, we show that the greedy strategy, which only maximizes the level-wise reduction in the total error probability, is within a factor of the optimal strategy in terms of reduction in the total error probability.

preprint · 2011 · arXiv

Error Probability Bounds for Balanced Binary Relay Trees

We study the detection error probability associated with a balanced binary relay tree, where the leaves of the tree correspond to $N$ identical and independent detectors. The root of the tree represents a fusion center that makes the overall detection decision. Each of the other nodes in the tree is a relay node that combines two binary messages to form a single output binary message. In this way, the information from the detectors is aggregated into the fusion center via the intermediate relay nodes. In this context, we describe the evolution of Type I and Type II error probabilities of the binary data as it propagates from the leaves towards the root. Tight upper and lower bounds for the total error probability at the fusion center as functions of $N$ are derived. These characterize how fast the total error probability converges to 0 with respect to $N$, even if the individual sensors have error probabilities that converge to 1/2.
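The evolution of the error pair up a binary relay tree can be sketched for the two simplest ways of combining two binary messages, AND and OR. The abstract does not state which fusion rule the relays use, so the level-wise choice below (take whichever rule gives the smaller sum of the two error probabilities) is an assumption made only for illustration, as are the function names and starting values. Here alpha is the probability of deciding 1 under hypothesis 0 and beta the probability of deciding 0 under hypothesis 1, with child messages treated as conditionally independent.

```python
def fuse_and(alpha, beta):
    """Relay outputs 1 only if both children say 1."""
    return alpha * alpha, 1.0 - (1.0 - beta) ** 2

def fuse_or(alpha, beta):
    """Relay outputs 1 if at least one child says 1."""
    return 1.0 - (1.0 - alpha) ** 2, beta * beta

def propagate(alpha, beta, levels):
    """Track (alpha, beta) from the leaves to the root, choosing at each level
    the rule with the smaller alpha + beta (an illustrative choice)."""
    path = [(alpha, beta)]
    for _ in range(levels):
        candidates = [fuse_and(alpha, beta), fuse_or(alpha, beta)]
        alpha, beta = min(candidates, key=lambda pair: pair[0] + pair[1])
        path.append((alpha, beta))
    return path

path = propagate(alpha=0.2, beta=0.2, levels=4)   # 2 ** 4 = 16 detectors
totals = [a + b for a, b in path]
for level, (a, b) in enumerate(path):
    print(f"level {level}: Type I = {a:.4f}, Type II = {b:.4f}, sum = {a + b:.4f}")
```

A single AND or OR step trades one error type against the other, so the sum does not drop at every level. The reduction comes from alternating the rules as the pair becomes unbalanced.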

preprint · 2011 · arXiv

Error Probability Bounds for Binary Relay Trees with Crummy Sensors

We study the detection error probability associated with balanced binary relay trees, in which sensor nodes fail with some probability. We consider N identical and independent crummy sensors, represented by leaf nodes of the tree. The root of the tree represents the fusion center, which makes the final decision between two hypotheses. Every other node is a relay node, which fuses at most two binary messages into one binary message and forwards the new message to its parent node. We derive tight upper and lower bounds for the total error probability at the fusion center as functions of N and characterize how fast the total error probability converges to 0 with respect to N. We show that the convergence of the total error probability is sub-linear, with the same decay exponent as that in a balanced binary relay tree without sensor failures. We also show that the total error probability converges to 0, even if the individual sensors have total error probabilities that converge to 1/2 and failure probabilities that converge to 1, provided that the convergence rates are sufficiently slow.
