Researcher profile

Rui Hu

Rui Hu contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
14 works
0 followers
14 topics
4 close collaborators

Published work

14 published items

preprint · 2026 · arXiv

AgentOrchestra: Orchestrating Multi-Agent Intelligence with the Tool-Environment-Agent (TEA) Protocol

Recent advances in LLM-based agent systems have shown promise in tackling complex, long-horizon tasks. However, existing LLM-based agent protocols (e.g., A2A and MCP) under-specify cross-entity lifecycle and context management, version tracking, and ad-hoc environment integration, which in turn encourages fixed, monolithic agent compositions and brittle glue code. To address these limitations, we introduce the Tool-Environment-Agent (TEA) protocol, a unified abstraction that models environments, agents, and tools as first-class resources with explicit lifecycles and versioned interfaces. TEA provides a principled foundation for end-to-end lifecycle and version management, and for associating each run with its context and outputs across components, improving traceability and reproducibility. Moreover, TEA enables continual self-evolution of agent-associated components through a closed feedback loop, producing improved versions while supporting version selection and rollback. Building on TEA, we present AgentOrchestra, a hierarchical multi-agent framework in which a central planner orchestrates specialized sub-agents for web navigation, data analysis, and file operations, and supports continual adaptation by dynamically instantiating, retrieving, and refining tools online during execution. We evaluate AgentOrchestra on three challenging benchmarks, where it consistently outperforms strong baselines and achieves 89.04% on GAIA, establishing state-of-the-art performance to the best of our knowledge. Overall, our results provide evidence that TEA and hierarchical orchestration improve scalability and generality in multi-agent systems.

preprint · 2022 · arXiv

Data-based analysis of Forward-Backward Asymmetry in $B^\pm \to K^\pm K^\mp K^\pm$

An analysis of the Forward-Backward Asymmetry (FBA) in the decay $B^\pm \to K^\pm K^\mp K^\pm$ is carried out based on the LHCb data. It is found that the large FBA observed for the invariant mass of the $K^+ K^-$ pair around 1.5 GeV can be explained by the interference of the amplitudes between the resonances with even and odd spins, where the former can be the spin-0 $f_0(1500)$ resonance plus a non-resonant $s$-wave, while the latter is a spin-1 resonance which is probably $ρ^0(1450)$. The analysis shows the existence of the decay channel $B^\pm\to ρ^0(1450) K^\pm$, with $C\!P$ asymmetry of $A_{CP}(B^\pm\to ρ^0(1450) K^\pm)=(-3.4\pm3.0)\%$. This is in contradiction with the conclusion of BaBar in Phys. Rev. D 85, 112010, according to which the analysis showed no signal of the spin-1 resonance $ρ^0(1450)$. We suggest that our experimental colleagues perform a closer analysis of this channel. We also suggest measuring the FBAs (as well as the FB-$C\!P$As) in other three-body decay channels of beauty and charmed mesons, as this is helpful for resonance analysis.
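
As a hedged illustration of why interference between even-spin and odd-spin amplitudes produces a forward-backward asymmetry, consider a simplified two-wave model. This model is an assumption made here for illustration and is not the fit used in the paper: a spin-0 amplitude $a_S$ and a spin-1 amplitude $a_P\cos\theta$, where $\theta$ is the helicity angle of the $K^+K^-$ pair and both amplitudes are treated as constant within the mass bin. Under these assumptions the angular distribution and the asymmetry are

$$\frac{d\Gamma}{d\cos\theta} \propto |a_S + a_P\cos\theta|^2 = |a_S|^2 + 2\,\mathrm{Re}(a_S a_P^*)\cos\theta + |a_P|^2\cos^2\theta,$$

$$A_{FB} \equiv \frac{N(\cos\theta>0) - N(\cos\theta<0)}{N(\cos\theta>0) + N(\cos\theta<0)} = \frac{\mathrm{Re}(a_S a_P^*)}{|a_S|^2 + |a_P|^2/3}.$$

Only the interference term is odd in $\cos\theta$, so in this sketch $A_{FB}$ vanishes if either wave is absent; a large asymmetry therefore signals that both an even-spin and an odd-spin contribution are present.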

preprint · 2022 · arXiv

FasterX: Real-Time Object Detection Based on Edge GPUs for UAV Applications

Real-time object detection on Unmanned Aerial Vehicles (UAVs) is a challenging issue due to the limited computing resources of edge GPU devices as Internet of Things (IoT) nodes. To solve this problem, in this paper, we propose a novel lightweight deep learning architecture named FasterX, based on the YOLOX model, for real-time object detection on edge GPUs. First, we design an effective and lightweight PixSF head to replace the original head of YOLOX to better detect small objects, which can be further embedded in the depthwise separable convolution (DS Conv) to achieve a lighter head. Then, a slimmer structure in the Neck layer, termed SlimFPN, is developed to reduce the parameters of the network, which is a trade-off between accuracy and speed. Furthermore, we embed an attention module in the Head layer to improve the feature extraction effect of the prediction head. Meanwhile, we also improve the label assignment strategy and loss function to alleviate category imbalance and box optimization problems of the UAV dataset. Finally, auxiliary heads are presented for online distillation to improve the ability of position embedding and feature extraction in the PixSF head. The performance of our lightweight models is validated experimentally on the NVIDIA Jetson NX and Jetson Nano GPU embedded platforms. Extensive experiments show that FasterX models achieve a better trade-off between accuracy and latency on the VisDrone2021 dataset compared to state-of-the-art models.

preprint · 2022 · arXiv

Fractional Besov Trace/Extension Type Inequalities via the Caffarelli-Silvestre extension

Let $u(\cdot,\cdot)$ be the Caffarelli-Silvestre extension of $f.$ The first goal of this article is to establish the fractional trace type inequalities involving the Caffarelli-Silvestre extension $u(\cdot,\cdot)$ of $f.$ In doing so, firstly, we establish the fractional Sobolev/logarithmic Sobolev/Hardy trace inequalities in terms of $\nabla_{(x,t)}u(x,t).$ Then, we prove the fractional anisotropic Sobolev/logarithmic Sobolev/Hardy trace inequalities in terms of $\partial_{t} u(x,t)$ or $(-Δ)^{-γ/2}u(x,t)$ only. Moreover, based on an estimate of the Fourier transform of the Caffarelli-Silvestre extension kernel and the sharp affine weighted $L^p$ Sobolev inequality, we prove that the $\dot{H}^{-β/2}(\mathbb{R}^n)$ norm of $f(x)$ can be controlled by the product of the weighted $L^p$-affine energy and the weighted $L^p$-norm of $\partial_{t} u(x,t).$ The second goal of this article is to characterize non-negative measures $μ$ on $\mathbb{R}^{n+1}_+$ such that the embeddings $$\|u(\cdot,\cdot)\|_{L^{q_0,p_0}_μ(\mathbb{R}^{n+1})}\lesssim \|f\|_{\dot{Λ}^{p,q}_β(\mathbb{R}^n)}$$ hold for some $p_0$ and $q_0$ depending on $p$ and $q$ which are classified in three different cases: (1) $p=q\in (n/(n+β),1];$ (2) $(p,q)\in (1,n/β)\times (1,\infty);$ (3) $(p,q)\in (1,n/β)\times\{\infty\}.$ For case (1), the embeddings can be characterized in terms of an analytic condition of the variational capacity minimizing function, the iso-capacitary inequality of open balls, and other weak type inequalities. For cases (2) and (3), the embeddings are characterized by the iso-capacitary inequality for fractional Besov capacity of open sets.

preprint · 2021 · arXiv

PolyTransform: Deep Polygon Transformer for Instance Segmentation

In this paper, we propose PolyTransform, a novel instance segmentation algorithm that produces precise, geometry-preserving masks by combining the strengths of prevailing segmentation approaches and modern polygon-based methods. In particular, we first exploit a segmentation network to generate instance masks. We then convert the masks into a set of polygons that are then fed to a deforming network that transforms the polygons such that they better fit the object boundaries. Our experiments on the challenging Cityscapes dataset show that our PolyTransform significantly improves the performance of the backbone instance segmentation network and ranks 1st on the Cityscapes test-set leaderboard. We also show impressive gains in the interactive annotation setting. We release the code at https://github.com/uber-research/PolyTransform.

preprint · 2020 · arXiv

Certified Robustness of Graph Classification against Topology Attack with Randomized Smoothing

Graph classification has practical applications in diverse fields. Recent studies show that graph-based machine learning models are especially vulnerable to adversarial perturbations due to the non-i.i.d. nature of graph data. By adding or deleting a small number of edges in the graph, adversaries could greatly change the graph label predicted by a graph classification model. In this work, we propose to build a smoothed graph classification model with a certified robustness guarantee. We have proven that the resulting graph classification model would output the same prediction for a graph under $l_0$-bounded adversarial perturbation. We also evaluate the effectiveness of our approach on a graph convolutional network (GCN)-based multi-class graph classification model.
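
The smoothing idea in this abstract can be sketched as a majority vote over randomly perturbed copies of the input graph. The following is a minimal illustration under stated assumptions, not the paper's construction: the function names, the independent edge-flip noise with probability `flip_prob`, and the toy `density_classifier` are all hypothetical, and the sketch only estimates the smoothed prediction by Monte Carlo sampling. It does not compute a certified $l_0$ radius, which requires the paper's analysis.

```python
import random
from collections import Counter


def perturb_edges(edges, num_nodes, flip_prob, rng):
    """Flip each possible undirected edge independently with probability flip_prob."""
    noisy = set()
    for u in range(num_nodes):
        for v in range(u + 1, num_nodes):
            present = (u, v) in edges
            if rng.random() < flip_prob:
                present = not present  # add a missing edge or delete an existing one
            if present:
                noisy.add((u, v))
    return noisy


def smoothed_predict(base_classifier, edges, num_nodes,
                     flip_prob=0.1, num_samples=200, seed=0):
    """Return the majority label over noisy copies and its empirical vote share."""
    rng = random.Random(seed)
    # Normalize edges to (min, max) tuples so lookups are orientation-independent.
    edges = {tuple(sorted(e)) for e in edges}
    votes = Counter(
        base_classifier(perturb_edges(edges, num_nodes, flip_prob, rng), num_nodes)
        for _ in range(num_samples)
    )
    label, count = votes.most_common(1)[0]
    return label, count / num_samples


def density_classifier(edges, num_nodes):
    """Toy base classifier (hypothetical): 'dense' if over half of all possible edges exist."""
    possible = num_nodes * (num_nodes - 1) // 2
    return "dense" if len(edges) > possible / 2 else "sparse"


if __name__ == "__main__":
    complete_graph = {(u, v) for u in range(6) for v in range(u + 1, 6)}
    print(smoothed_predict(density_classifier, complete_graph, num_nodes=6))
```

In a setup like this, a vote share well above one half is what makes a robustness certificate possible: intuitively, a few adversarial edge changes can shift the noisy-graph distribution only slightly, so they cannot overturn a large majority.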

preprint · 2020 · arXiv

Concentrated Differentially Private and Utility Preserving Federated Learning

Federated learning is a machine learning setting where a set of edge devices collaboratively train a model under the orchestration of a central server without sharing their local data. At each communication round of federated learning, edge devices perform multiple steps of stochastic gradient descent with their local data and then upload the computation results to the server for model update. During this process, the challenge of privacy leakage arises due to the information exchange between edge devices and the server when the server is not fully trusted. While some previous privacy-preserving mechanisms could readily be used for federated learning, they usually come at a high cost on convergence of the algorithm and utility of the learned model. In this paper, we develop a federated learning approach that addresses the privacy challenge without much degradation on model utility through a combination of local gradient perturbation, secure aggregation, and zero-concentrated differential privacy (zCDP). We provide a tight end-to-end privacy guarantee of our approach and analyze its theoretical convergence rates. Through extensive numerical experiments on real-world datasets, we demonstrate the effectiveness of our proposed method and show its superior trade-off between privacy and model utility.
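
To make the ingredients named in this abstract concrete, here is a minimal sketch of local gradient perturbation with zCDP accounting. It rests on standard facts about the Gaussian mechanism: noise of standard deviation σ on a query with L2 sensitivity Δ satisfies ρ-zCDP with ρ = Δ²/(2σ²), ρ values add under composition, and ρ-zCDP implies (ρ + 2√(ρ ln(1/δ)), δ)-DP. The function names and parameter values are assumptions for illustration, and the sketch does not reproduce the paper's secure aggregation step or its end-to-end privacy analysis.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def perturb_gradient(grad, clip_norm, sigma, rng):
    """Clip the gradient, then add independent Gaussian noise to each coordinate."""
    clipped = clip_gradient(grad, clip_norm)
    return [g + rng.gauss(0.0, sigma) for g in clipped]


def zcdp_per_step(sensitivity, sigma):
    """zCDP parameter rho of one Gaussian-mechanism release."""
    return sensitivity ** 2 / (2.0 * sigma ** 2)


def compose_zcdp(rhos):
    """zCDP composes additively across releases."""
    return sum(rhos)


def zcdp_to_dp(rho, delta):
    """Convert rho-zCDP into an (epsilon, delta)-DP guarantee; returns epsilon."""
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical settings: the L2 sensitivity is passed in explicitly because
    # it depends on the adjacency notion and on how gradients are averaged.
    clip_norm, sigma, steps, delta = 1.0, 10.0, 100, 1e-5
    noisy = perturb_gradient([3.0, 4.0], clip_norm, sigma, rng)
    rho_total = compose_zcdp([zcdp_per_step(clip_norm, sigma)] * steps)
    print("noisy gradient:", noisy)
    print("total rho:", rho_total, "epsilon:", zcdp_to_dp(rho_total, delta))
```

Tracking the privacy loss in ρ and converting to (ε, δ) only once at the end is the usual reason to prefer zCDP for iterative training, since it tends to give a tighter bound over many rounds than composing (ε, δ) guarantees step by step.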

preprint · 2020 · arXiv

Conditional Entropy Coding for Efficient Video Compression

We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames. Unlike prior learning-based approaches, we reduce complexity by not performing any form of explicit transformations between frames and assume each frame is encoded with an independent state-of-the-art deep image compressor. We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs while being much faster and easier to implement. We then propose a novel internal learning extension on top of this architecture that brings an additional 10% bitrate savings without trading off decoding speed. Importantly, we show that our approach outperforms H.265 and other deep learning baselines in MS-SSIM on higher bitrate UVG video, and against all video codecs on lower framerates, while being thousands of times faster in decoding than deep models utilizing an autoregressive entropy model.

preprint · 2020 · arXiv

Development of a Data-driven Turbulence Model for 3D Thermal Stratification Simulation during Reactor Transients

SAM, a plant-level system analysis tool for advanced reactors (SFR, LFR, MSR/FHR), is under development at Argonne. As a modern system code, SAM aims to improve the predictions of 3D flows relevant to reactor safety during transient conditions. In order to fulfill this goal, one approach is to implement modeling of turbulent flow in SAM through establishing an embedded surrogate model for Reynolds stress/turbulence viscosity based on machine learning techniques. The proposed approach is based on an assumption that there exists a functional dependency relationship between local flow features and local turbulence viscosity or Reynolds stress. There have been very limited studies performed to validate this assumption. This paper documents a case study to examine the assumption in a scenario of potential reactor applications. The work doesn't aim to theoretically validate the assumption, but to practically validate it within the limited application domain. From the methodological point of view, the approach used in this paper could be classified as the so-called Type I machine learning (ML) approach, where a scale separation assumption is proposed claiming that conservation equations and closure relations are scale separable, for which the turbulence models are local rather than global. The CFD case studied in this work is a 3D transient thermal stratification tank flow problem performed using a Reynolds-averaged Navier-Stokes turbulence model in the STAR-CCM+ code. Flow information of all geometric points in all timesteps is collected as training data and test data.

preprint · 2020 · arXiv

Differentially Private Federated Learning for Resource-Constrained Internet of Things

With the proliferation of smart devices having built-in sensors, Internet connectivity, and programmable computation capability in the era of Internet of things (IoT), tremendous data is being generated at the network edge. Federated learning is capable of analyzing the large amount of data from a distributed set of smart devices without requiring them to upload their data to a central place. However, the commonly-used federated learning algorithm is based on stochastic gradient descent (SGD) and not suitable for resource-constrained IoT environments due to its high communication resource requirement. Moreover, the privacy of sensitive data on smart devices has become a key concern and needs to be protected rigorously. This paper proposes a novel federated learning framework called DP-PASGD for training a machine learning model efficiently from the data stored across resource-constrained smart devices in IoT while guaranteeing differential privacy. The optimal schematic design of DP-PASGD that maximizes the learning performance while satisfying the limits on resource cost and privacy loss is formulated as an optimization problem, and an approximate solution method based on the convergence analysis of DP-PASGD is developed to solve the optimization problem efficiently. Numerical results based on real-world datasets verify the effectiveness of the proposed DP-PASGD scheme.

preprint · 2020 · arXiv

Learning Lane Graph Representations for Motion Forecasting

We propose a motion forecasting model that exploits a novel structured map representation as well as actor-map interactions. Instead of encoding vectorized maps as raster images, we construct a lane graph from raw map data to explicitly preserve the map structure. To capture the complex topology and long-range dependencies of the lane graph, we propose LaneGCN, which extends graph convolutions with multiple adjacency matrices and along-lane dilation. To capture the complex interactions between actors and maps, we exploit a fusion network consisting of four types of interactions: actor-to-lane, lane-to-lane, lane-to-actor and actor-to-actor. Powered by LaneGCN and actor-map interactions, our model is able to predict accurate and realistic multi-modal trajectories. Our approach significantly outperforms the state-of-the-art on the large-scale Argoverse motion forecasting benchmark.

preprint · 2020 · arXiv

PnPNet: End-to-End Perception and Prediction with Tracking in the Loop

We tackle the problem of joint perception and motion forecasting in the context of self-driving vehicles. Towards this goal we propose PnPNet, an end-to-end model that takes as input sequential sensor data, and outputs at each time step object tracks and their future trajectories. The key component is a novel tracking module that generates object tracks online from detections and exploits trajectory level features for motion forecasting. Specifically, the object tracks get updated at each time step by solving both the data association problem and the trajectory estimation problem. Importantly, the whole model is end-to-end trainable and benefits from joint optimization of all tasks. We validate PnPNet on two large-scale driving datasets, and show significant improvements over the state-of-the-art with better occlusion recovery and more accurate future prediction.

preprint · 2020 · arXiv

Trading Data For Learning: Incentive Mechanism For On-Device Federated Learning

Federated Learning rests on the notion of training a global model in a distributed manner across various devices. Under this setting, users' devices perform computations on their own data and then share the results with the cloud server to update the global model. A fundamental issue in such systems is to effectively incentivize user participation. The users suffer from privacy leakage of their local data during the federated model training process. Without well-designed incentives, self-interested users will be unwilling to participate in federated learning tasks and contribute their private data. To bridge this gap, in this paper, we adopt game theory to design an effective incentive mechanism, which selects users that are most likely to provide reliable data and compensates for their costs of privacy leakage. We formulate our problem as a two-stage Stackelberg game and solve for the game's equilibrium. The effectiveness of the proposed mechanism is demonstrated by extensive simulations.

preprint · 2019 · arXiv

DP-ADMM: ADMM-based Distributed Learning with Differential Privacy

Alternating Direction Method of Multipliers (ADMM) is a widely used tool for machine learning in distributed settings, where a machine learning model is trained over distributed data sources through an interactive process of local computation and message passing. Such an iterative process could cause privacy concerns of data owners. The goal of this paper is to provide differential privacy for ADMM-based distributed machine learning. Prior approaches on differentially private ADMM exhibit low utility under high privacy guarantee and often assume the objective functions of the learning problems to be smooth and strongly convex. To address these concerns, we propose a novel differentially private ADMM-based distributed learning algorithm called DP-ADMM, which combines an approximate augmented Lagrangian function with time-varying Gaussian noise addition in the iterative process to achieve higher utility for general objective functions under the same differential privacy guarantee. We also apply the moments accountant method to bound the end-to-end privacy loss. The theoretical analysis shows that DP-ADMM can be applied to a wider class of distributed learning problems, is provably convergent, and offers an explicit utility-privacy tradeoff. To our knowledge, this is the first paper to provide explicit convergence and utility properties for differentially private ADMM-based distributed learning algorithms. The evaluation results demonstrate that our approach can achieve good convergence and model accuracy under high end-to-end differential privacy guarantee.