Researcher profile

Kai Ye

Kai Ye contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2026arXiv

Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning

Recent advances in large language models (LLMs) have increasingly relied on reinforcement learning (RL) to improve their reasoning capabilities. Three types of approaches have been widely adopted: The first relies on a deep neural network to estimate the value function of the learning policy in order to reduce the variance of the policy gradient. However, estimating and maintaining such a value network incurs substantial computational and memory overhead. The second avoids training a value network by approximating the value function using sample averages. However, it samples a large number of reasoning traces per prompt for accurate value function approximation, making it computationally expensive. The third samples only a single reasoning trajectory per prompt, which reduces computational cost but suffers from poor sample efficiency. This paper focuses on a practical, resource-constrained setting in which only a small number of reasoning traces can be sampled per prompt, while low-variance gradient estimation remains essential for high-quality policy learning. To address this challenge, we bring classical nonparametric statistical methods, which are both computationally and statistically efficient, to LLM reasoning. We employ kernel smoothing as a concrete example for value function estimation and the subsequent policy optimization. Numerical and theoretical results demonstrate that our proposal achieves accurate value and gradient estimation, leading to improved policy optimization.

preprint2024arXiv

Uncertainty Regularized Evidential Regression

The Evidential Regression Network (ERN) represents a novel approach that integrates deep learning with Dempster-Shafer's theory to predict a target and quantify the associated uncertainty. Guided by the underlying theory, specific activation functions must be employed to enforce non-negative values, which is a constraint that compromises model performance by limiting its ability to learn from all samples. This paper provides a theoretical analysis of this limitation and introduces an improvement to overcome it. Initially, we define the region where the models can't effectively learn from the samples. Following this, we thoroughly analyze the ERN and investigate this constraint. Leveraging the insights from our analysis, we address the limitation by introducing a novel regularization term that empowers the ERN to learn from the whole training set. Our extensive experiments substantiate our theoretical findings and demonstrate the effectiveness of the proposed solution.

preprint2022arXiv

Cross Language Image Matching for Weakly Supervised Semantic Segmentation

It has been widely known that CAM (Class Activation Map) usually only activates discriminative object regions and falsely includes lots of object-related backgrounds. As only a fixed set of image-level object labels are available to the WSSS (weakly supervised semantic segmentation) model, it could be very difficult to suppress those diverse background regions consisting of open set objects. In this paper, we propose a novel Cross Language Image Matching (CLIMS) framework, based on the recently introduced Contrastive Language-Image Pre-training (CLIP) model, for WSSS. The core idea of our framework is to introduce natural language supervision to activate more complete object regions and suppress closely-related open background regions. In particular, we design object, background region and text label matching losses to guide the model to excite more reasonable object regions for CAM of each category. In addition, we design a co-occurring background suppression loss to prevent the model from activating closely-related background regions, with a predefined set of class-related background text descriptions. These designs enable the proposed CLIMS to generate a more complete and compact activation map for the target objects. Extensive experiments on PASCAL VOC2012 dataset show that our CLIMS significantly outperforms the previous state-of-the-art methods.

preprint2022arXiv

Multi-Robot Active Mapping via Neural Bipartite Graph Matching

We study the problem of multi-robot active mapping, which aims for complete scene map construction in minimum time steps. The key to this problem lies in the goal position estimation to enable more efficient robot movements. Previous approaches either choose the frontier as the goal position via a myopic solution that hinders the time efficiency, or maximize the long-term value via reinforcement learning to directly regress the goal position, but does not guarantee the complete map construction. In this paper, we propose a novel algorithm, namely NeuralCoMapping, which takes advantage of both approaches. We reduce the problem to bipartite graph matching, which establishes the node correspondences between two graphs, denoting robots and frontiers. We introduce a multiplex graph neural network (mGNN) that learns the neural distance to fill the affinity matrix for more effective graph matching. We optimize the mGNN with a differentiable linear assignment layer by maximizing the long-term values that favor time efficiency and map completeness via reinforcement learning. We compare our algorithm with several state-of-the-art multi-robot active mapping approaches and adapted reinforcement-learning baselines. Experimental results demonstrate the superior performance and exceptional generalization ability of our algorithm on various indoor scenes and unseen number of robots, when only trained with 9 indoor scenes.

preprint2021arXiv

Manifold Interpolation for Large-Scale Multi-Objective Optimization via Generative Adversarial Networks

Large-scale multiobjective optimization problems (LSMOPs) are characterized as involving hundreds or even thousands of decision variables and multiple conflicting objectives. An excellent algorithm for solving LSMOPs should find Pareto-optimal solutions with diversity and escape from local optima in the large-scale search space. Previous research has shown that these optimal solutions are uniformly distributed on the manifold structure in the low-dimensional space. However, traditional evolutionary algorithms for solving LSMOPs have some deficiencies in dealing with this structural manifold, resulting in poor diversity, local optima, and inefficient searches. In this work, a generative adversarial network (GAN)-based manifold interpolation framework is proposed to learn the manifold and generate high-quality solutions on this manifold, thereby improving the performance of evolutionary algorithms. We compare the proposed algorithm with several state-of-the-art algorithms on large-scale multiobjective benchmark functions. Experimental results have demonstrated the significant improvements achieved by this framework in solving LSMOPs.

preprint2020arXiv

Homotopic Convex Transformation: A New Landscape Smoothing Method for the Traveling Salesman Problem

This paper proposes a novel landscape smoothing method for the symmetric Traveling Salesman Problem (TSP). We first define the Homotopic Convex (HC) transformation of a TSP as a convex combination of a well-constructed simple TSP and the original TSP. The simple TSP, called the convex-hull TSP, is constructed by transforming a known local or global optimum. We observe that controlled by the coefficient of the convex combination, with local or global optimum, (i) the landscape of the HC transformed TSP is smoothed in terms that its number of local optima is reduced compared to the original TSP; (ii) the fitness distance correlation of the HC transformed TSP is increased. Further, we observe that the smoothing effect of the HC transformation depends highly on the quality of the used optimum. A high-quality optimum leads to a better smoothing effect than a low-quality optimum. We then propose an iterative algorithmic framework in which the proposed HC transformation is combined within a heuristic TSP solver. It works as an escaping scheme from local optima aiming to improve the global search ability of the combined heuristic. Case studies using the 3-Opt and the Lin-Kernighan local search as the heuristic solver show that the resultant algorithms significantly outperform their counterparts and two other smoothing-based TSP heuristic solvers on most of the test instances with up to 20,000 cities.

preprint2019arXiv

Active Galactic Nuclei with Ultra-fast Outflows Monitoring Project: The Broad-line Region of Mrk 79 as a Disk Wind

We developed a spectroscopic monitoring project to investigate the kinematics of the broad-line region (BLR) in active galactic nuclei (AGN) with ultra-fast outflows (UFOs). Mrk~79 is a radio-quiet AGN with UFOs and warm absorbers, had been monitored by three reverberation mapping (RM) campaigns, but its BLR kinematics is not understood yet. In this paper, we report the results from a new RM-campaign of Mrk~79, which was undertaken by Lijiang 2.4-m telescope. Mrk~79 is seeming to come out the faint state, the mean flux approximates a magnitude fainter than historical record. We successfully measured the lags of the broad emission lines including H$β~\lambda4861$, H$γ~\lambda4340$, He II $\lambda4686$ and He I $\lambda5876$ with respect to the varying AGN continuum. Based on the broad H$β~\lambda4861$ line, we measured black hole (BH) mass of $M_{\bullet}=5.13^{+1.57}_{-1.55}\times10^{7}M_{\odot}$, estimated accretion rates of ${\dot{M}_{\bullet}}=(0.05\pm0.02)~L_{\rm Edd}~c^{-2}$, indicating that Mrk~79 is a sub-Eddington accretor. We found that Mrk~79 deviates from the canonical Radius$-$Luminosity relationship. The marginal blueshift of the broad He II $\lambda4686$ line detected from rms spectrum indicates outflow of high-ionization gas. The velocity-resolved lag profiles of the broad H$γ~\lambda4340$, H$β~\lambda4861$, and He I $\lambda5876$ lines show similar signatures that the largest lag occurs in the red wing of the lines then the lag decreases to both sides. These signatures should suggest that the BLR of Keplerian motion probably exists the outflow gas motion. All findings including UFOs, warm absorbers, and the kinematics of high- and low-ionization BLR, may provide an indirect evidence that the BLR of Mrk~79 probably originates from disk wind.