Researcher profile

Wenbo Chen

Wenbo Chen contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
4 topics
4 close collaborators

Published work

3 published items

Preprint, 2026, arXiv

Rethinking Evaluation for LLM Hallucination Detection: A Desiderata, A New RAG-based Benchmark, New Insights

Hallucination, broadly referring to unfaithful, fabricated, or inconsistent content generated by LLMs, has wide-ranging implications. Therefore, a large body of effort has been devoted to detecting LLM hallucinations, as well as designing benchmark datasets for evaluating these detectors. In this work, we first establish a desiderata of properties for hallucination detection benchmarks (HDBs) to exhibit for effective evaluation. A critical look at existing HDBs through the lens of our desiderata reveals that none of them exhibits all the properties. We identify two largest gaps: (1) RAG-based grounded benchmarks with long context are severely lacking (partly because length impedes human annotation); and (2) Existing benchmarks do not make available realistic label noise for stress-testing detectors although real-world use-cases often grapple with label noise due to human or automated/weak annotation. To close these gaps, we build and open-source a new RAG-based HDB called TRIVIA+ that underwent a rigorous human annotation process. Notably, our benchmark exhibits all desirable properties including (1) TRIVIA+ contains samples with the longest context in the literature; and (2) we design and share four sets of noisy labels with different, both sample-dependent and sample-independent, noise schemes. Finally, we perform experiments on RAG-based HDBs, including our TRIVIA+, using popular SOTA detectors that reveal new insights: (i) ample room remains for current detectors to reach the performance ceiling on RAG-based HDBs, (ii) the basic LLM-as-a-Judge baseline performs competitively, and (iii) label noise hinders detection performance. We expect that our findings, along with our proposed benchmark, will motivate and foster needed research on hallucination detection for RAG-based tasks.

Preprint, 2022, arXiv

Risk-Aware Control and Optimization for High-Renewable Power Grids

The transition of the electrical power grid from fossil fuels to renewable sources of energy raises fundamental challenges to the market-clearing algorithms that drive its operations. Indeed, the increased stochasticity in load and the volatility of renewable energy sources have led to significant increases in prediction errors, affecting the reliability and efficiency of existing deterministic optimization models. The RAMC project was initiated to investigate how to move from this deterministic setting into a risk-aware framework where uncertainty is quantified explicitly and incorporated in the market-clearing optimizations. Risk-aware market-clearing raises challenges on its own, primarily from a computational standpoint. This paper reviews how RAMC approaches risk-aware market clearing and presents some of its innovations in uncertainty quantification, optimization, and machine learning. Experimental results on real networks are presented.

Preprint, 2022, arXiv

Spurious currents suppression by accurate difference schemes in multiphase lattice Boltzmann method

Spurious currents, which are often observed near a curved interface in the multiphase simulations by diffuse interface methods, are unphysical phenomena and usually damage the computational accuracy and stability. In this paper, the origination and suppression of spurious currents are investigated by using the multiphase lattice Boltzmann method driven by chemical potential. Both the difference error and insufficient isotropy of discrete gradient operator give rise to the directional deviations of nonideal force and then originate the spurious currents. Nevertheless, the high-order finite difference produces far more accurate results than the high-order isotropic difference. We compare several finite difference schemes which have different formal accuracy and resolution. When a large proportional coefficient is used, the transition region is narrow and steep, and the resolution of finite difference indicates the computational accuracy more exactly than the formal accuracy. On the contrary, for a small proportional coefficient, the transition region is wide and gentle, and the formal accuracy of finite difference indicates the computational accuracy better than the resolution. Furthermore, numerical simulations show that the spurious currents calculated in the 3D situation are highly consistent with those in 2D simulations; especially, the two-phase coexistence densities calculated by the high-order accuracy finite difference are in excellent agreement with the theoretical predictions of the Maxwell equal-area construction till the reduced temperature 0.2.