Source author record

Yan Zheng

Yan Zheng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning; Artificial Intelligence; Computation and Language; Computational Geometry; Computer Vision; cond-mat.supr-con; Cryptography and Security; Databases; Distributed, Parallel, and Cluster Computing; eess.SY; Human-Computer Interaction; math.CT; math.PR; math.RT; Networking and Internet Architecture; Robotics; Systems and Control

Catalog footprint

What is connected

22 works

17 topics

4 close collaborators


preprint, 2026, arXiv

ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching

Existing imitation learning methods enable robots to interact autonomously with the physical environment. However, contact-rich manipulation tasks remain a significant challenge due to complex contact dynamics that demand high-precision force feedback and control. Although recent efforts have attempted to integrate force/torque sensing into policies, how to build a simple yet effective framework that achieves robust generalization under multimodal observations remains an open question. In this paper, we propose ForceFlow, a force-aware reactive framework built upon flow matching. For contact-stage policy design, we investigate force signal fusion mechanisms and adopt an asymmetric multimodal fusion architecture that treats force as a global regulatory signal, combined with a joint prediction paradigm that enhances the policy's understanding of instantaneous force and historical information, thereby achieving deep coupling between force and motion. For task-level hierarchical decomposition, we divide manipulation into a vision-dominant approach stage (VLM-based pointing for target localization) and a touch-dominant interaction stage (force-driven contact execution), with a Vision-to-Force (V2F) handover mechanism that explicitly decouples spatial generalization from contact regulation. Experimental results across six real-world contact-rich tasks demonstrate that ForceFlow achieves a 37% success rate improvement over the strong baseline ForceVLA while maintaining significantly lower cost. Moreover, ForceFlow exhibits accurate force signal prediction and demonstrates superior performance in contact force self-regulation and zero-shot out-of-distribution (OOD) generalization.

preprint, 2026, arXiv

TabKDE: Simple and Scalable Tabular Data Generation with Kernel Density Estimates

Tabular data generation considers a large table with multiple columns -- each column composed of numerical, categorical, or sometimes ordinal values. The goal is to produce new rows for the table that replicate the distribution of rows from the original data -- without just copying those initial rows. The last 4 years have seen enormous progress on this problem, mostly using computationally expensive methods that employ one-hot encoding, VAEs, and diffusion. This paper describes a new approach to the problem of tabular data generation. By employing copula transformations and modeling the distribution as a kernel density estimate, we can nearly match the accuracy and leakage-avoidance achievements of the previous methods, but with almost no training time. Our method is very scalable, and can be run on data sets orders of magnitude larger than prior state-of-the-art on a simple laptop. Moreover, because we employ kernel density estimates, we can store the model as a coreset of the original data -- we believe the first for generative modeling -- and as a result, require significantly less space as well. Our code is available here: https://github.com/tabkde/tabkde-main
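The combination of a copula transformation with a kernel density estimate can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked in the abstract). It assumes numeric columns only, a rank-based Gaussian copula transform, and a fixed bandwidth. The function names `to_gaussian_scores`, `from_gaussian_score` and `sample_rows` and the default `bandwidth=0.1` are hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def to_gaussian_scores(column):
    """Rank-transform one numeric column to standard-normal scores."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n stays strictly inside (0, 1), so inv_cdf is defined.
        scores[i] = _STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def from_gaussian_score(z, sorted_column):
    """Map a normal score back to the data scale via the empirical quantile."""
    n = len(sorted_column)
    u = _STD_NORMAL.cdf(z)
    idx = min(n - 1, max(0, int(u * n)))
    return sorted_column[idx]


def sample_rows(table, n_samples, bandwidth=0.1, seed=0):
    """Draw synthetic rows from a Gaussian KDE fitted in copula space.

    Assumption: every column is numeric; categorical and ordinal
    columns are outside the scope of this sketch.
    """
    rng = random.Random(seed)
    columns = list(zip(*table))
    scores = [to_gaussian_scores(col) for col in columns]
    sorted_cols = [sorted(col) for col in columns]
    out = []
    for _ in range(n_samples):
        i = rng.randrange(len(table))  # pick a kernel centre (an original row)
        row = []
        for j in range(len(columns)):
            # Sampling from a Gaussian KDE: centre plus Gaussian noise.
            z = scores[j][i] + rng.gauss(0.0, bandwidth)
            row.append(from_gaussian_score(z, sorted_cols[j]))
        out.append(tuple(row))
    return out
```

There is no training step in this sketch: fitting amounts to sorting each column. Because the inverse transform here looks values up in the original column, each sampled value already occurs in that column, while the combination of values across columns can be new. A fuller treatment would interpolate between quantiles.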

preprint, 2025, arXiv

Enhancing Foundation Models in Transaction Understanding with LLM-based Sentence Embeddings

The ubiquity of payment networks generates vast transactional data encoding rich consumer and merchant behavioral patterns. Recent foundation models for transaction analysis process tabular data sequentially but rely on index-based representations for categorical merchant fields, causing substantial semantic information loss by converting rich textual data into discrete tokens. While Large Language Models (LLMs) can address this limitation through superior semantic understanding, their computational overhead challenges real-time financial deployment. We introduce a hybrid framework that uses LLM-generated embeddings as semantic initializations for lightweight transaction models, balancing interpretability with operational efficiency. Our approach employs multi-source data fusion to enrich merchant categorical fields and a one-word constraint principle for consistent embedding generation across LLM architectures. We systematically address data quality through noise filtering and context-aware enrichment. Experiments on large-scale transaction datasets demonstrate significant performance improvements across multiple transaction understanding tasks.

preprint, 2022, arXiv

Embedding Compression with Hashing for Efficient Representation Learning in Large-Scale Graph

Graph neural networks (GNNs) are deep learning models designed specifically for graph data, and they typically rely on node features as the input to the first layer. When applying such a type of network on the graph without node features, one can extract simple graph-based node features (e.g., number of degrees) or learn the input node representations (i.e., embeddings) when training the network. While the latter approach, which trains node embeddings, more likely leads to better performance, the number of parameters associated with the embeddings grows linearly with the number of nodes. It is therefore impractical to train the input node embeddings together with GNNs within graphics processing unit (GPU) memory in an end-to-end fashion when dealing with industrial-scale graph data. Inspired by the embedding compression methods developed for natural language processing (NLP) tasks, we develop a node embedding compression method where each node is compactly represented with a bit vector instead of a floating-point vector. The parameters utilized in the compression method can be trained together with GNNs. We show that the proposed node embedding compression method achieves superior performance compared to the alternatives.

preprint, 2022, arXiv

Energy Savings When Migrating Workloads to the Cloud

In the cloud environment, data centers are managed efficiently by cloud service providers (CSPs) in terms of energy consumption. Consequently, migrating workloads to clouds can result in lower energy consumption. This paper demonstrates that Lift-and-Shift migration with optimal selection of cloud instances can provide significant energy savings, and explains how large the savings are and where they come from. Additionally, an analysis of how energy consumption varies when Auto-Scaling is deployed shows that further energy savings are expected even without refactoring applications. All the conclusions and analyses are based on real data collected by Cloudamize Inc. from May 2016 to August 2016 over 40,000 machines across approximately 300 data centers.

preprint, 2022, arXiv

GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis

Despite achieving superior performance in human-level control problems, deep reinforcement learning (DRL), unlike humans, lacks high-order intelligence (e.g., logic deduction and reuse), so it is less effective than humans at learning and generalizing in complex problems. Previous works attempt to directly synthesize a white-box logic program as the DRL policy, manifesting logic-driven behaviors. However, most synthesis methods are built on imperative or declarative programming, and each has a distinct limitation. The former ignores the cause-effect logic during synthesis, resulting in low generalizability across tasks. The latter is strictly proof-based, thus failing to synthesize programs with complex hierarchical logic. In this paper, we combine the two paradigms and propose a novel Generalizable Logic Synthesis (GALOIS) framework to synthesize hierarchical and strict cause-effect logic programs. GALOIS leverages the program sketch and defines a new sketch-based hybrid program language for guiding the synthesis. Based on that, GALOIS proposes a sketch-based program synthesis method to automatically generate white-box programs with generalizable and interpretable cause-effect logic. Extensive evaluations on various decision-making tasks with complex logic demonstrate the superiority of GALOIS over mainstream baselines regarding asymptotic performance, generalizability, and knowledge reusability across different environments.

preprint, 2022, arXiv

HIFI-Net: A Novel Network for Enhancement to Underwater Images

A novel network for enhancing underwater images is proposed in this paper. It contains a Reinforcement Fusion Module for Haar wavelet images (RFM-Haar) based on the Reinforcement Fusion Unit (RFU), which is used to fuse an original image with some important information within it. Fusion is performed to achieve better enhancement. As this network makes "Haar Images into Fusion Images", it is called HIFI-Net. The experimental results show that the proposed HIFI-Net performs best among many state-of-the-art methods on three datasets, on three standard metrics and a new metric.

preprint, 2022, arXiv

HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation

Discrete-continuous hybrid action space is a natural setting in many practical problems, such as robot control and game AI. However, most previous Reinforcement Learning (RL) works only demonstrate success in control with either a discrete or a continuous action space, and seldom take the hybrid action space into account. One naive way to address hybrid action RL is to convert the hybrid action space into a unified homogeneous action space by discretization or continualization, so that conventional RL algorithms can be applied. However, this ignores the underlying structure of the hybrid action space and also induces scalability issues and additional approximation difficulties, thus leading to degenerated results. In this paper, we propose Hybrid Action Representation (HyAR) to learn a compact and decodable latent representation space for the original hybrid action space. HyAR constructs the latent space and embeds the dependence between discrete action and continuous parameter via an embedding table and a conditional Variational Auto-Encoder (VAE). To further improve the effectiveness, the action representation is trained to be semantically smooth through unsupervised environmental dynamics prediction. Finally, the agent learns its policy with conventional DRL algorithms in the learned representation space and interacts with the environment by decoding the hybrid action embeddings to the original action space. We evaluate HyAR in a variety of environments with discrete-continuous action space. The results demonstrate the superiority of HyAR when compared with previous baselines, especially for high-dimensional action spaces.

preprint, 2022, arXiv

Learning-From-Disagreement: A Model Comparison and Visual Analytics Framework

With the fast-growing number of classification models being produced every day, numerous model interpretation and comparison solutions have also been introduced. For example, LIME and SHAP can interpret what input features contribute more to a classifier's output predictions. Different numerical metrics (e.g., accuracy) can be used to easily compare two classifiers. However, few works can interpret the contribution of a data feature to a classifier in comparison with its contribution to another classifier. This comparative interpretation can help to disclose the fundamental difference between two classifiers, select classifiers in different feature conditions, and better ensemble two classifiers. To accomplish it, we propose a learning-from-disagreement (LFD) framework to visually compare two classification models. Specifically, LFD identifies data instances with disagreed predictions from two compared classifiers and trains a discriminator to learn from the disagreed instances. As the two classifiers' training features may not be available, we train the discriminator through a set of meta-features proposed based on certain hypotheses of the classifiers to probe their behaviors. Interpreting the trained discriminator with the SHAP values of different meta-features, we provide actionable insights into the compared classifiers. Also, we introduce multiple metrics to profile the importance of meta-features from different perspectives. With these metrics, one can easily identify meta-features with the most complementary behaviors in two classifiers, and use them to better ensemble the classifiers. We focus on binary classification models in the financial services and advertising industry to demonstrate the efficacy of our proposed framework and visualizations.

preprint, 2022, arXiv

PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations

Deep Reinforcement Learning (DRL) has been a promising solution to many complex decision-making problems. Nevertheless, the notorious weakness in generalization among environments prevents widespread application of DRL agents in real-world scenarios. Although advances have been made recently, most prior works assume sufficient online interaction on training environments, which can be costly in practical cases. To this end, we focus on an offline-training-online-adaptation setting, in which the agent first learns from offline experiences collected in environments with different dynamics and then performs online policy adaptation in environments with new dynamics. In this paper, we propose Policy Adaptation with Decoupled Representations (PAnDR) for fast policy adaptation. In the offline training phase, the environment representation and policy representation are learned through contrastive learning and policy recovery, respectively. The representations are further refined by mutual information optimization to make them more decoupled and complete. With learned representations, a Policy-Dynamics Value Function (PDVF) [Raileanu et al., 2020] network is trained to approximate the values for different combinations of policies and environments from offline experiences. In the online adaptation phase, with the environment context inferred from a few experiences collected in new environments, the policy is optimized by gradient ascent with respect to the PDVF. Our experiments show that PAnDR outperforms existing algorithms in several representative policy adaptation problems.

preprint, 2022, arXiv

Relational Representation Learning in Visually-Rich Documents

Relational understanding is critical for a number of visually-rich document (VRD) understanding tasks. Through multi-modal pre-training, recent studies provide comprehensive contextual representations and exploit them as prior knowledge for downstream tasks. In spite of their impressive results, we observe that the widespread relational hints (e.g., the relation of key/value fields on receipts) built upon contextual knowledge have not yet been exploited. To close this gap, we propose DocReL, a Document Relational Representation Learning framework. The major challenge of DocReL is rooted in the variety of relations. From the simplest pairwise relation to the complex global structure, it is infeasible to conduct supervised training because the definition of a relation varies and even conflicts across tasks. To deal with the unpredictable definition of relations, we propose a novel contrastive learning task named Relational Consistency Modeling (RCM), which harnesses the fact that existing relations should be consistent in differently augmented positive views. RCM provides relational representations that are more compatible with the urgent need of downstream tasks, even without any knowledge about the exact definition of relation. DocReL achieves better performance on a wide variety of VRD relational understanding tasks, including table structure recognition, key information extraction and reading order detection.

preprint, 2022, arXiv

Revealing Reliable Signatures by Learning Top-Rank Pairs

Signature verification, as a crucial practical documentation analysis task, has been continuously studied by researchers in the machine learning and pattern recognition fields. In specific scenarios like confirming financial documents and legal instruments, ensuring the absolute reliability of signatures is of top priority. In this work, we propose a new method to learn "top-rank pairs" for writer-independent offline signature verification tasks. With this scheme, it is possible to maximize the number of absolutely reliable signatures. More precisely, our method to learn top-rank pairs aims at pushing positive samples beyond negative samples, after pairing each of them with a genuine reference signature. In the experiments, the BHSig-B and BHSig-H datasets are used for evaluation, on which the proposed model achieves overwhelmingly better pos@top (the ratio of absolute top positive samples to all of the positive samples) while showing encouraging performance on both Area Under the Curve (AUC) and accuracy.

preprint, 2020, arXiv

Continuous Multiagent Control using Collective Behavior Entropy for Large-Scale Home Energy Management

With the increasing popularity of electric vehicles, distributed energy generation and storage facilities in smart grid systems, efficient Demand-Side Management (DSM) is urgently needed for energy savings and peak load reduction. Traditional DSM works that focus on optimizing the energy activities of a single household cannot scale up to large-scale home energy management problems. Multi-agent Deep Reinforcement Learning (MA-DRL) offers a potential way to solve the problem of scalability, where modern homes interact to reduce consumers' energy consumption while striking a balance between energy cost and peak load reduction. However, it is difficult to solve such an environment because of its non-stationarity, and existing MA-DRL approaches cannot effectively give incentives for expected group behavior. In this paper, we propose a collective MA-DRL algorithm with continuous action space to provide fine-grained control on a large-scale microgrid. To mitigate the non-stationarity of the microgrid environment, a novel predictive model is proposed to measure the collective market behavior. In addition, a collective behavior entropy is introduced to reduce the high peak loads incurred by the collective behaviors of all householders in the smart grid. Empirical results show that our approach significantly outperforms the state-of-the-art methods regarding power cost reduction and daily peak load optimization.

preprint, 2020, arXiv

Diverse Behavior Is What Game AI Needs: Generating Varied Human-Like Playing Styles Using Evolutionary Multi-Objective Deep Reinforcement Learning

This paper has been withdrawn.

preprint, 2020, arXiv

Exponential mixing for the fractional Magneto-Hydrodynamic equations with degenerate stochastic forcing

We establish the existence, uniqueness and exponential attraction properties of an invariant measure for the MHD equations with degenerate stochastic forcing acting only in the magnetic equation. The central challenge is to establish time asymptotic smoothing properties of the associated Markovian semigroup corresponding to this system. Towards this aim we take full advantage of the characteristics of the advective structure to discover a novel Hörmander-type condition which only allows for several noises in the magnetic direction.

preprint, 2020, arXiv

KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge

Reinforcement learning agents usually learn from scratch, which requires a large number of interactions with the environment. This is quite different from the learning process of humans. When faced with a new task, humans naturally have common sense and use prior knowledge to derive an initial policy and guide the learning process afterwards. Although the prior knowledge may not be fully applicable to the new task, the learning process is significantly sped up, since the initial policy ensures a quick start of learning and intermediate guidance helps avoid unnecessary exploration. Taking this inspiration, we propose the knowledge guided policy network (KoGuN), a novel framework that combines suboptimal human prior knowledge with reinforcement learning. Our framework consists of a fuzzy rule controller to represent human knowledge and a refine module to fine-tune suboptimal prior knowledge. The proposed framework is end-to-end and can be combined with existing policy-based reinforcement learning algorithms. We conduct experiments on both discrete and continuous control tasks. The empirical results show that our approach, which combines suboptimal human knowledge and RL, significantly improves the learning efficiency of flat RL algorithms, even with very low-performance human prior knowledge.

preprint, 2020, arXiv

Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning

Adversarial attacks against conventional Deep Learning (DL) systems and algorithms have been widely studied, and various defenses were proposed. However, the possibility and feasibility of such attacks against Deep Reinforcement Learning (DRL) are less explored. As DRL has achieved great success in various complex tasks, designing effective adversarial attacks is an indispensable prerequisite towards building robust DRL algorithms. In this paper, we introduce two novel adversarial attack techniques to stealthily and efficiently attack the DRL agents. These two techniques enable an adversary to inject adversarial samples in a minimal set of critical moments while causing the most severe damage to the agent. The first technique is the critical point attack: the adversary builds a model to predict the future environmental states and agent's actions, assesses the damage of each possible attack strategy, and selects the optimal one. The second technique is the antagonist attack: the adversary automatically learns a domain-agnostic model to discover the critical moments of attacking the agent in an episode. Experimental results demonstrate the effectiveness of our techniques. Specifically, to successfully attack the DRL agent, our critical point technique only requires 1 (TORCS) or 2 (Atari Pong and Breakout) steps, and the antagonist technique needs fewer than 5 steps (4 MuJoCo tasks), which are significant improvements over state-of-the-art methods.

preprint, 2016, arXiv

Homotopy cartesian diagrams in n-angulated categories

It has been proved by Bergh and Thaule that the higher mapping cone axiom is equivalent to the higher octahedral axiom for n-angulated categories. In this note, we use homotopy cartesian diagrams to give several new equivalent statements of the higher mapping cone axiom, which are applied to explain the higher octahedral axiom.

preprint, 2015, arXiv

Geometric Inference on Kernel Density Estimates

We show that geometric inference of a point cloud can be calculated by examining its kernel density estimate with a Gaussian kernel. This allows one to consider kernel density estimates, which are robust to spatial noise, subsampling, and approximate computation in comparison to raw point sets. This is achieved by examining the sublevel sets of the kernel distance, which isomorphically map to superlevel sets of the kernel density estimate. We prove new properties about the kernel distance, demonstrating stability results and allowing it to inherit reconstruction results from recent advances in distance-based topological reconstruction. Moreover, we provide an algorithm to estimate its topology using weighted Vietoris-Rips complexes.
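The correspondence between sublevel sets of the kernel distance and superlevel sets of the kernel density estimate can be checked numerically. The sketch below assumes an unnormalized Gaussian kernel K(p, q) = exp(-||p - q||^2 / (2 sigma^2)) and the usual squared kernel distance kappa(P, P) + kappa(x, x) - 2 kappa(P, x); the paper's exact normalization may differ, and the function names are illustrative only.

```python
import math


def gaussian_kernel(p, q, sigma=1.0):
    """Unnormalized Gaussian kernel between two points (assumed form)."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def kde(points, x, sigma=1.0):
    """Kernel density estimate of the point set at x (average kernel value)."""
    return sum(gaussian_kernel(p, x, sigma) for p in points) / len(points)


def kernel_distance(points, x, sigma=1.0):
    """Kernel distance between the point set and a single query point x."""
    n = len(points)
    # Self-similarity of the point set; constant with respect to x.
    k_pp = sum(gaussian_kernel(p, q, sigma) for p in points for q in points) / (n * n)
    k_xx = gaussian_kernel(x, x, sigma)  # equals 1 for this kernel
    squared = k_pp + k_xx - 2.0 * kde(points, x, sigma)
    return math.sqrt(max(squared, 0.0))  # guard against rounding below zero
```

Under these assumptions the squared kernel distance equals a constant minus twice the density estimate, so a query point with a higher density always has a smaller kernel distance. That monotone relationship is what lets thresholds on one function be translated into thresholds on the other.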

preprint, 2015, arXiv

Restoration of tetragonal $C_4$ symmetry coexistent with filamentary superconductivity in the pressure induced intermediate phase in the iron-based superconductor Ba$_{1-x}$K$_x$Fe$_2$As$_2$

The hole-doped Fe-based superconductors Ba$_{1-x}$A$_x$Fe$_2$As$_2$ (where A=Na or K) show a particularly rich phase diagram. It was observed that an intermediate re-entrant tetragonal phase forms within the orthorhombic antiferromagnetically-ordered stripe-type spin density wave state above the superconducting transition [S. Avci et al., Nature Comm. 5, 3845 (2014), A. E. Böhmer et al., arXiv:1412.7038v2]. A similar intermediate phase was reported to appear if pressure is applied to underdoped Ba$_{1-x}$K$_x$Fe$_2$As$_2$ [E. Hassinger et al., Phys. Rev. B 86, 140502(R) (2012)]. Here we report data on the electrical resistivity, Hall effect, specific heat, and the thermoelectric Nernst and Seebeck coefficients measured on a Ba$_{0.85}$K$_{0.15}$Fe$_2$As$_2$ single crystal under pressure up to 5.5 GPa. The data reveal a coexistence of the intermediate phase with filamentary superconductivity. The Nernst coefficient shows a large signature of nematic order that coincides with the stripe-type spin density wave state up to optimal pressure. In the pressure-induced intermediate phase the nematic order is removed, thus confirming that its nature is a re-entrant tetragonal phase.

preprint, 2015, arXiv

Subsampling in Smoothed Range Spaces

We consider smoothed versions of geometric range spaces, so an element of the ground set (e.g. a point) can be contained in a range with a non-binary value in $[0,1]$. Similar notions have been considered for kernels; we extend them to more general types of ranges. We then consider approximations of these range spaces through $\varepsilon $-nets and $\varepsilon $-samples (aka $\varepsilon$-approximations). We characterize when size bounds for $\varepsilon $-samples on kernels can be extended to these more general smoothed range spaces. We also describe new generalizations for $\varepsilon $-nets to these range spaces and show when results from binary range spaces can carry over to these smoothed ones.
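For orientation, one natural way to write the $\varepsilon$-sample condition for a smoothed range space is given below. The notation is an assumption made for illustration and may differ from the paper's: $P$ is the ground set, $\mathcal{A}$ the family of smoothed ranges, and $v_A : P \to [0,1]$ the non-binary value a range $A$ assigns to each point.

```latex
% Assumed notation: S \subseteq P is an \varepsilon-sample of (P, \mathcal{A}) if
\[
  \max_{A \in \mathcal{A}}
  \left|
    \frac{1}{|P|} \sum_{p \in P} v_A(p)
    \;-\;
    \frac{1}{|S|} \sum_{s \in S} v_A(s)
  \right|
  \;\le\; \varepsilon .
\]
% When every v_A takes values only in \{0, 1\}, this reduces to the
% classical \varepsilon-sample condition for binary range spaces.
```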

preprint, 2014, arXiv

High-Pressure Evolution of the Specific Heat of a Strongly Underdoped Ba(Fe0.963Co0.037)As2 Iron-Based Superconductor

We report specific-heat experiments under the influence of high pressure on a strongly underdoped Co-substituted BaFe2As2 single crystal. This allows us to study the phase diagram of this iron pnictide superconductor with a bulk thermodynamic method and pressure as a clean control parameter. The data show large specific-heat anomalies at the superconducting transition temperature, which proves the bulk nature of pressure-induced superconductivity. The transitions in the specific heat are sharper than in resistivity, which demonstrates the necessity of employing bulk thermodynamic methods to explore the exact phase diagram of pressure-induced Fe-based superconductors. The Tc at optimal pressure and the superconducting condensation energy are found to be larger than in optimally Co-doped samples at ambient pressure, which we attribute to a weak pair breaking effect of the Co ions.
