Researcher profile

Xinlei Wang

Xinlei Wang contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
6 topics
4 close collaborators

Published work

7 published items

preprint, 2026, arXiv

EngiAgent: Fully Connected Coordination of LLM Agents for Solving Open-ended Engineering Problems with Feasible Solutions

Engineering problem solving is central to real-world decision-making, requiring mathematical formulations that not only represent complex problems but also produce feasible solutions under data and physical constraints. Unlike mathematical problem solving, which operates on predefined formulations, engineering tasks demand open-ended analysis, feasibility-driven modeling, and iterative refinement. Although large language models (LLMs) have shown strong capabilities in reasoning and code generation, they often fail to ensure feasibility, which limits their applicability to engineering problem solving. To address this challenge, we propose EngiAgent, a multi-agent system with a fully connected coordinator that simulates expert workflows through specialized agents for problem analysis, modeling, verification, solving, and solution evaluation. The fully connected coordinator enables flexible feedback routing, overcoming the rigidity of prior pipeline-based reflection methods and ensuring feasibility at every stage of the process. This design not only improves robustness to diverse failure cases such as data extraction errors, constraint inconsistencies, and solver failures, but also enhances the overall quality of problem solving. Empirical results across four representative domains demonstrate that EngiAgent achieves substantial improvements in feasibility compared to prior approaches, establishing a new paradigm for feasibility-oriented engineering problem solving with LLMs. Our source code and data are available at https://github.com/AI4Engi/EngiAgent.

preprint, 2025, arXiv

Empirical Bayes Method for Large Scale Multiple Testing with Heteroscedastic Errors

In this paper, we address the normal mean inference problem, which involves testing multiple means of normal random variables with heteroscedastic variances. Most existing empirical Bayes methods for this setting are developed under restrictive assumptions, such as the scaled inverse-chi-squared prior for variances and unimodality for the non-null mean distribution. However, when either of these assumptions is violated, these methods often fail to control the false discovery rate (FDR) at the target level or suffer from a substantial loss of power. To overcome these limitations, we propose a new empirical Bayes method, gg-Mix, which assumes only independence between the normal means and variances, without imposing any structural restrictions on their distributions. We thoroughly evaluate the FDR control and power of gg-Mix through extensive numerical studies and demonstrate its superior performance compared to existing methods. Finally, we apply gg-Mix to three real data examples to further illustrate the practical advantages of our approach.

preprint, 2022, arXiv

Empirical Likelihood Inference for Area under the ROC Curve using Ranked Set Samples

The area under a receiver operating characteristic curve (AUC) is a useful tool to assess the performance of continuous-scale diagnostic tests on binary classification. In this article, we propose an empirical likelihood (EL) method to construct confidence intervals for the AUC from data collected by ranked set sampling (RSS). The proposed EL-based method enables inferences without assumptions required in existing nonparametric methods and takes advantage of the sampling efficiency of RSS. We show that for both balanced and unbalanced RSS, the EL-based point estimate is the Mann-Whitney statistic, and confidence intervals can be obtained from a scaled chi-square distribution. Simulation studies and two case studies on diabetes and chronic kidney disease data suggest that using the proposed method and RSS enables more efficient inference on the AUC.
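The abstract above states that the EL-based point estimate of the AUC is the Mann-Whitney statistic. A minimal sketch of that statistic follows, assuming two plain arrays of test scores; the function name `mann_whitney_auc` and the sample values are illustrative, and the sketch neither models the ranked set sampling design nor constructs the empirical likelihood confidence interval.

```python
import numpy as np

def mann_whitney_auc(diseased, healthy):
    """Mann-Whitney estimate of the AUC: P(X > Y) + 0.5 * P(X = Y).

    `diseased` and `healthy` are hypothetical score arrays for the two
    groups; higher scores are assumed to indicate disease.
    """
    x = np.asarray(diseased, dtype=float)[:, None]  # shape (m, 1)
    y = np.asarray(healthy, dtype=float)[None, :]   # shape (1, n)
    # Compare every diseased score with every healthy score; ties count half.
    return float(np.mean((x > y) + 0.5 * (x == y)))

if __name__ == "__main__":
    print(mann_whitney_auc([3, 4, 5], [1, 2, 3]))
```

Broadcasting compares all m x n pairs at once, which is simple but uses O(mn) memory; a rank-based computation would scale better for large samples.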

preprint, 2022, arXiv

Multiple Instance Neural Networks Based on Sparse Attention for Cancer Detection using T-cell Receptor Sequences

Early detection of cancers has been much explored due to its paramount importance in biomedical fields. Among the different types of data used to answer this biological question, studies based on T cell receptors (TCRs) have recently come under the spotlight due to the growing appreciation of the role of the host immune system in tumor biology. However, the one-to-many correspondence between a patient and multiple TCR sequences hinders researchers from simply adopting classical statistical/machine learning methods. Recent attempts have modeled this type of data in the context of multiple instance learning (MIL). Despite the novel application of MIL to cancer detection using TCR sequences and the demonstrated adequate performance in several tumor types, there is still room for improvement, especially for certain cancer types. Furthermore, explainable neural network models have not been fully investigated for this application. In this article, we propose multiple instance neural networks based on sparse attention (MINN-SA) to enhance both performance and explainability in cancer detection. The sparse attention structure drops out uninformative instances in each bag, achieving both interpretability and better predictive performance in combination with the skip connection. Our experiments show that MINN-SA yields the highest area under the ROC curve (AUC) scores on average, measured across 10 different types of cancers, compared to existing MIL approaches. Moreover, we observe from the estimated attentions that MINN-SA can identify the TCRs that are specific for tumor antigens in the same T cell repertoire.
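To illustrate how a sparse attention layer can drop uninformative instances from a bag, here is a minimal sketch that uses sparsemax as the sparse normalizer. The abstract does not name the specific attention function used in MINN-SA, so sparsemax is a stand-in assumption, and `sparse_attention_pool`, the weight vector `w`, and the toy bag are hypothetical.

```python
import numpy as np

def sparsemax(z):
    """Project scores onto the probability simplex; low scores get exactly 0."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    k = np.arange(1, z.size + 1)
    cumsum = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cumsum      # which sorted entries stay nonzero
    k_max = k[support][-1]                   # size of the support
    tau = (cumsum[k_max - 1] - 1) / k_max    # threshold subtracted from scores
    return np.maximum(z - tau, 0.0)

def sparse_attention_pool(instances, w):
    """Pool a bag of instance embeddings with sparse attention weights.

    `instances` has shape (n_instances, dim); `w` is a hypothetical
    scoring vector of shape (dim,) standing in for a learned layer.
    """
    scores = instances @ w
    attn = sparsemax(scores)                 # zero weight = instance dropped
    return attn @ instances, attn

if __name__ == "__main__":
    bag = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
    pooled, attn = sparse_attention_pool(bag, np.array([0.5, 0.4]))
    print(attn, pooled)
```

Unlike softmax, which assigns every instance a strictly positive weight, sparsemax can assign exact zeros, so the nonzero attention weights directly mark which instances contributed to the bag-level representation.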

preprint, 2021, arXiv

Super-Resolution Perception for Industrial Sensor Data

In this paper, we present the problem formulation and methodology framework of Super-Resolution Perception (SRP) on industrial sensor data. Industrial intelligence relies on high-quality industrial sensor data for system control, diagnosis, fault detection, identification, and monitoring. However, the provision of high-quality data may be expensive in some cases. We propose a novel machine learning problem, the SRP problem, defined as reconstructing high-quality data from unsatisfactory sensor data in industrial systems. Advanced generative models are then proposed to solve the SRP problem. This technology makes it possible to empower existing industrial facilities without upgrading existing sensors or deploying additional sensors. We first mathematically formulate the SRP problem under the Maximum a Posteriori (MAP) estimation framework. A case study is then presented, which performs SRP on smart meter data. A network, namely SRPNet, is proposed to generate high-frequency load data from low-frequency data. We further employ a novel recognition-based loss and a relativistic adversarial loss to constrain the reconstruction of waveforms explicitly. Experiments demonstrate that our SRP model can reconstruct high-frequency data effectively. Moreover, the reconstructed high-frequency data can lead to better appliance monitoring results without changing the monitoring appliances.

preprint, 2020, arXiv

Hierarchical Optimization Time Integration for CFL-rate MPM Stepping

We propose Hierarchical Optimization Time Integration (HOT) for efficient implicit time-stepping of the Material Point Method (MPM) irrespective of simulated materials and conditions. HOT is an MPM-specialized hierarchical optimization algorithm that solves nonlinear time step problems for large-scale MPM systems near the CFL-limit. HOT provides convergent simulations "out-of-the-box" across widely varying materials and computational resolutions without parameter tuning. As an implicit MPM time stepper accelerated by a custom-designed Galerkin multigrid wrapped in a quasi-Newton solver, HOT is both highly parallelizable and robustly convergent. As we show in our analysis, HOT maintains consistent and efficient performance even as we grow stiffness, increase deformation, and vary materials over a wide range of finite strain, elastodynamic and plastic examples. Through careful benchmark ablation studies, we compare the effectiveness of HOT against seemingly plausible alternative combinations of MPM with standard multigrid and other Newton-Krylov models. We show how these alternative designs result in severe issues and poor performance. In contrast, HOT outperforms the existing state-of-the-art, heavily optimized implicit MPM codes with an up to 10x performance speedup across a wide range of challenging benchmark test simulations.

preprint, 2020, arXiv

Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry

In autonomous driving, monocular sequences contain a great deal of information. Monocular depth estimation, camera ego-motion estimation, and optical flow estimation in consecutive frames have recently attracted considerable attention. By analyzing the tasks above, pixels in the middle frame are modeled into three parts: the rigid region, the non-rigid region, and the occluded region. In joint unsupervised training of depth and pose, we can segment the occluded region explicitly. The occlusion information is used in unsupervised learning of depth, pose, and optical flow, as the image reconstructed by depth-pose and optical flow will be invalid in occluded regions. A less-than-mean mask is designed to further exclude the mismatched pixels affected by motion or illumination change in the training of the depth and pose networks. This method is also used to exclude some trivial mismatched pixels in the training of the optical flow network. Maximum normalization is proposed for the depth smoothness term to restrain depth degradation in textureless regions. In the occluded region, as depth and camera motion can provide more reliable motion estimation, they can be used to guide unsupervised learning of optical flow. Our experiments on the KITTI dataset demonstrate that the model based on three regions, with full and explicit segmentation of the occlusion region, the rigid region, and the non-rigid region and corresponding unsupervised losses, can improve performance on the three tasks significantly. The source code is available at: https://github.com/guangmingw/DOPlearning.
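One plausible reading of the less-than-mean mask is sketched below: keep only the pixels whose reconstruction error is below the mean error of the image, and average the loss over those pixels. This is an assumption based on the name alone; the abstract does not give the exact definition, and `less_than_mean_mask`, `masked_mean_loss`, and the toy error map are hypothetical.

```python
import numpy as np

def less_than_mean_mask(photometric_error):
    """Boolean mask that keeps pixels whose error is below the image mean.

    Assumed interpretation: pixels with above-average error are treated as
    mismatched (for example, due to motion or illumination change).
    """
    err = np.asarray(photometric_error, dtype=float)
    return err < err.mean()

def masked_mean_loss(photometric_error):
    """Average the error over the kept pixels only."""
    err = np.asarray(photometric_error, dtype=float)
    mask = less_than_mean_mask(err)
    # If no pixel is below the mean (constant error map), return 0.0.
    return float(err[mask].mean()) if mask.any() else 0.0

if __name__ == "__main__":
    error_map = np.array([[0.1, 0.2], [0.1, 2.0]])
    print(less_than_mean_mask(error_map))
    print(masked_mean_loss(error_map))
```

In this toy map the single large error raises the mean to 0.6, so only that pixel is excluded and the loss is averaged over the remaining three.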