Researcher profile

Yuhao Zhang

Yuhao Zhang contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
16 works
0 followers
10 topics
4 close collaborators

Published work

16 published items

preprint, 2026, arXiv

Robust Uncertainty Quantification for Factual Generation of Large Language Models

The rapid advancement of large language model (LLM) technology has facilitated its integration into various domains of professional and daily life. However, the persistent challenge of LLM hallucination has emerged as a critical limitation, significantly compromising the reliability and trustworthiness of AI-generated content. This challenge has garnered significant attention within the scientific community, prompting extensive research efforts in hallucination detection and mitigation strategies. Current methodological frameworks reveal a critical limitation: traditional uncertainty quantification approaches demonstrate effectiveness primarily within conventional question-answering paradigms, yet exhibit notable deficiencies when confronted with non-canonical or adversarial questioning strategies. This performance gap raises substantial concerns regarding the dependability of LLM responses in real-world applications requiring robust critical thinking capabilities. This study aims to fill this gap by proposing an uncertainty quantification scenario for the task of generating text with multiple facts. We have constructed a set of trap questions containing fake names. Based on this scenario, we propose a novel and robust uncertainty quantification method (RU). A series of experiments has been conducted to verify its effectiveness. The results show that the constructed set of trap questions performs excellently. Moreover, when compared with the baseline methods on four different models, our proposed method performs strongly, with an average increase of 0.1-0.2 in ROC AUC values over the best-performing baseline method, providing new insights and methods for addressing the hallucination issue of LLMs.

preprint, 2025, arXiv

Ga$_2$O$_3$ TCAD Mobility Parameter Calibration using Simulation Augmented Machine Learning with Physics Informed Neural Network

In this paper, we demonstrate the feasibility of performing automatic Technology Computer Aided Design (TCAD) parameter calibration and extraction using machine learning, with the machine trained solely by TCAD simulation data. The methodology is validated using experimental data. Schottky Barrier Diodes (SBDs) with different effective anode workfunction (WF) are fabricated with emerging ultra-wide bandgap material, Gallium Oxide (Ga2O3), and are measured at various temperatures (T). Their current voltage curves are used for automatic Ga2O3 Philips Unified Mobility (PhuMob) model parameters calibration. Five critical PhuMob parameters were calibrated. The machine consists of an autoencoder and a neural network and is trained solely by TCAD simulation data with variations in WF, T, and the five PhuMob parameters (seven variations in total). Then, Ga2O3 PhuMob parameters are extracted from the noisy experimental curves. Subsequent TCAD simulation using the extracted parameters shows that the quality of the parameters is as good as an expert's calibration at the pre-turned on regime, but not in the on state regime. By using a simple physics-informed neural network, the machine performs as well as the human expert in all regimes.

preprint, 2025, arXiv

Sub-Ensemble Correlations as a Covariance Geometry

The conventional practice of spatially resolved detection in diffusion-coupled thermal atomic vapors implicitly treats localized responses as mutually independent. However, in this study, it is shown that observable correlations are governed by the intrinsic spatiotemporal covariance of a global spin-fluctuation field, such that spatial separation specifies only overlapping statistical projections rather than independent physical components. A unified field-theoretic description is established in which sub-ensembles are defined as measurement-induced statistical projections of a single stochastic field. Within this formulation, sub-ensemble correlations are determined by the covariance operator, inducing a natural geometry in which statistical independence corresponds to orthogonality of the measurement functionals. For collective spin fluctuations described by a diffusion-relaxation Ornstein-Uhlenbeck stochastic field, the covariance spectrum admits only a finite set of fluctuation modes in a bounded domain, imposing an intrinsic, field-level limit on the number of statistically distinguishable sub-ensembles. The loss of sub-ensemble independence is formalized through the notion of spatial sampling overlap, which quantifies the unavoidable statistical coupling arising from shared access to common low-order fluctuation modes. While multi-channel atomic magnetometry provides a concrete physical setting in which these constraints become explicit, the framework applies generically to diffusion-coupled stochastic fields.

preprint, 2022, arXiv

Improved Bounds for Fractional Online Matching Problems

Online bipartite matching with one-sided arrival and its variants have been extensively studied since the seminal work of Karp, Vazirani, and Vazirani (STOC 1990). Motivated by real-life applications with dynamic market structures, e.g. ride-sharing, two generalizations of the classical one-sided arrival model are proposed to allow non-bipartite graphs and to allow all vertices to arrive online. Namely, online matching with general vertex arrival is introduced by Wang and Wong (ICALP 2015), and fully online matching is introduced by Huang et al. (JACM 2020). In this paper, we study the fractional versions of the two models. We improve three out of the four state-of-the-art upper and lower bounds of the two models. For fully online matching, we design a $0.6$-competitive algorithm and prove no algorithm can be $0.613$-competitive. For online matching with general vertex arrival, we prove no algorithm can be $0.584$-competitive. Moreover, we give an arguably more intuitive algorithm for the general vertex arrival model, compared to the algorithm of Wang and Wong, while attaining the same competitive ratio of $0.526$.

preprint, 2022, arXiv

Overwatch: Learning Patterns in Code Edit Sequences

Integrated Development Environments (IDEs) provide tool support to automate many source code editing tasks. Traditionally, IDEs use only the spatial context, i.e., the location where the developer is editing, to generate candidate edit recommendations. However, spatial context alone is often not sufficient to confidently predict the developer's next edit, and thus IDEs generate many suggestions at a location. Therefore, IDEs generally do not actively offer suggestions and instead, the developer is usually required to click on a specific icon or menu and then select from a large list of potential suggestions. As a consequence, developers often miss the opportunity to use the tool support because they are not aware it exists or forget to use it. To better understand common patterns in developer behavior and produce better edit recommendations, we can additionally use the temporal context, i.e., the edits that a developer was recently performing. To enable edit recommendations based on temporal context, we present Overwatch, a novel technique for learning edit sequence patterns from traces of developers' edits performed in an IDE. Our experiments show that Overwatch has 78% precision and that Overwatch not only completed edits when developers missed the opportunity to use the IDE tool support but also predicted new edits that have no tool support in the IDE.

preprint, 2022, arXiv

Robustar: Interactive Toolbox Supporting Precise Data Annotation for Robust Vision Learning

We introduce the initial release of our software Robustar, which aims to improve the robustness of vision classification machine learning models through a data-driven perspective. Building upon the recent understanding that a machine learning model's lack of robustness stems from its tendency to learn spurious features, we aim to solve this problem at its root, from the data perspective, by removing the spurious features from the data before training. In particular, we introduce software that helps users better prepare the data for training image classification models by allowing them to annotate the spurious features at the pixel level of images. To facilitate this process, our software also leverages recent advances to help identify potential images and pixels worthy of attention and to continue the training with newly annotated data. Our software is hosted at the GitHub Repository https://github.com/HaohanWang/Robustar.

preprint, 2022, arXiv

Survivable Network Design Revisited: Group-Connectivity

In the classical survivable network design problem (SNDP), we are given an undirected graph $G=(V,E)$ with costs on edges and a connectivity requirement $k(s,t)$ for each pair of vertices. The goal is to find a minimum-cost subgraph $H\subseteq G$ such that every pair $(s,t)$ is connected by $k(s,t)$ edge or (openly) vertex disjoint paths, abbreviated as EC-SNDP and VC-SNDP, respectively. The seminal result of Jain [FOCS'98, Combinatorica'01] gives a $2$-approximation algorithm for EC-SNDP, and a decade later, an $O(k^3\log n)$-approximation algorithm for VC-SNDP, where $k$ is the largest connectivity requirement, was discovered by Chuzhoy and Khanna [FOCS'09, Theory Comput.'12]. While there is rich literature on point-to-point settings of SNDP, the viable case of connectivity between subsets is still relatively poorly understood. This paper concerns the generalization of SNDP into the subset-to-subset setting, namely Group EC-SNDP. We develop the framework, which yields the first non-trivial (true) approximation algorithm for Group EC-SNDP. Previously, only a bicriteria approximation algorithm was known for Group EC-SNDP [Chalermsook, Grandoni, and Laekhanukit, SODA'15], and a true approximation algorithm was known only for the single-source variant with connectivity requirement $k(S,T)\in\{0,1,2\}$ [Gupta, Krishnaswamy, and Ravi, SODA'10; Khandekar, Kortsarz, and Nutov, FSTTCS'09 and Theor. Comput. Sci.'12].

preprint, 2021, arXiv

Do Syntax Trees Help Pre-trained Transformers Extract Information?

Much recent work suggests that incorporating syntax information from dependency trees can improve task-specific transformer models. However, the effect of incorporating dependency tree information into pre-trained transformer models (e.g., BERT) remains unclear, especially given recent studies highlighting how these models implicitly encode syntax. In this work, we systematically study the utility of incorporating dependency trees into pre-trained transformers on three representative information extraction tasks: semantic role labeling (SRL), named entity recognition, and relation extraction. We propose and investigate two distinct strategies for incorporating dependency structure: a late fusion approach, which applies a graph neural network on the output of a transformer, and a joint fusion approach, which infuses syntax structure into the transformer attention layers. These strategies are representative of prior work, but we introduce additional model design elements that are necessary for obtaining improved performance. Our empirical analysis demonstrates that these syntax-infused transformers obtain state-of-the-art results on SRL and relation extraction tasks. However, our analysis also reveals a critical shortcoming of these models: we find that their performance gains are highly contingent on the availability of human-annotated dependency parses, which raises important questions regarding the viability of syntax-augmented transformers in real-world applications.

preprint, 2020, arXiv

A Simple 1-1/e Approximation for Oblivious Bipartite Matching

We study the oblivious matching problem, which aims at finding a maximum matching on a graph with an unknown edge set. Any algorithm for the problem specifies an ordering of the vertex pairs. The matching is then produced by probing the pairs following the ordering, and including a pair if both of its vertices are unmatched and there exists an edge between them. The unweighted (Chan et al. (SICOMP 2018)) and the vertex-weighted (Chan et al. (TALG 2018)) versions of the problem are well studied. In this paper, we consider the edge-weighted oblivious matching problem on bipartite graphs, which generalizes the stochastic bipartite matching problem. Very recently, Gamlath et al. (SODA 2019) studied the stochastic bipartite matching problem, and proposed a (1-1/e)-approximate algorithm. We give a very simple algorithm adapted from the Ranking algorithm by Karp et al. (STOC 1990), and show that it achieves the same (1-1/e) approximation ratio for the oblivious matching problem on bipartite graphs.
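
The probing model described in this abstract can be illustrated with a minimal Python sketch. It shows only the generic probe step (walk a fixed ordering of vertex pairs and keep a pair when both endpoints are free and the edge exists); it is not the paper's Ranking-based algorithm. The function name `probe_matching`, the example graph, and the probe orders are hypothetical and chosen for illustration.

```python
def probe_matching(pair_order, edges):
    """Probe vertex pairs in the given order.

    A pair is added to the matching if both endpoints are still
    unmatched and the pair is an actual edge of the hidden graph.
    """
    edge_set = {frozenset(e) for e in edges}  # hidden edge set (undirected)
    matched = set()
    matching = []
    for u, v in pair_order:
        if u in matched or v in matched:
            continue  # at least one endpoint is already taken
        if frozenset((u, v)) in edge_set:
            matching.append((u, v))
            matched.update((u, v))
    return matching


# Hypothetical example: the path a - b - c - d.
edges = [("a", "b"), ("b", "c"), ("c", "d")]

# Probing the middle edge first blocks both outer edges.
bad_order = [("b", "c"), ("a", "b"), ("c", "d"), ("a", "d")]
# Probing the outer edges first yields a perfect matching.
good_order = [("a", "b"), ("c", "d"), ("b", "c"), ("a", "d")]

print(probe_matching(bad_order, edges))
print(probe_matching(good_order, edges))
```

The two orders give matchings of different sizes on the same graph, which is why the choice of ordering is the entire design question in this setting.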

preprint, 2020, arXiv

Biomedical and Clinical English Model Packages in the Stanza Python NLP Library

We introduce biomedical and clinical English model packages for the Stanza Python NLP library. These packages offer accurate syntactic analysis and named entity recognition capabilities for biomedical and clinical text, by combining Stanza's fully neural architecture with a wide variety of open datasets as well as large-scale unsupervised biomedical and clinical text data. We show via extensive experiments that our packages achieve syntactic analysis and named entity recognition performance that is on par with or surpasses state-of-the-art results. We further show that these models do not compromise speed compared to existing toolkits when GPU acceleration is available, and are made easy to download and use with Stanza's Python interface. A demonstration of our packages is available at: http://stanza.run/bio.

preprint, 2020, arXiv

Fully Online Matching II: Beating Ranking and Water-filling

Karp, Vazirani, and Vazirani (STOC 1990) initiated the study of online bipartite matching, which has held a central role in online algorithms ever since. Of particular importance are the Ranking algorithm for integral matching and the Water-filling algorithm for fractional matching. Most algorithms in the literature can be viewed as adaptations of these two in the corresponding models. Recently, Huang et al. (STOC 2018, SODA 2019) introduced a more general model called fully online matching, which considers general graphs and allows all vertices to arrive online. They also generalized Ranking and Water-filling to fully online matching and gave some tight analysis: Ranking is $\Omega \approx 0.567$-competitive on bipartite graphs where the $\Omega$-constant satisfies $\Omega e^{\Omega} = 1$, and Water-filling is $2-\sqrt{2} \approx 0.585$-competitive on general graphs. We propose fully online matching algorithms strictly better than Ranking and Water-filling. For integral matching on bipartite graphs, we build on the online primal dual analysis of Ranking and Water-filling to design a $0.569$-competitive hybrid algorithm called Balanced Ranking. To our knowledge, it is the first integral algorithm in the online matching literature that successfully integrates ideas from Water-filling. For fractional matching on general graphs, we give a $0.592$-competitive algorithm called Eager Water-filling, which may match a vertex on its arrival. By contrast, the original Water-filling algorithm always matches vertices at their deadlines. Our result for fractional matching further shows a separation between fully online matching and the general vertex arrival model by Wang and Wong (ICALP 2015), due to an upper bound of $0.5914$ in the latter model by Buchbinder, Segev, and Tkach (ESA 2017).
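
The two baseline constants quoted in this abstract can be checked numerically. The sketch below solves the defining equation of the Omega constant (the root of x·e^x = 1) by bisection and evaluates 2 − √2. The function name `omega_constant` and the tolerance are assumptions made for this illustration, not anything taken from the paper.

```python
import math


def omega_constant(tol=1e-12):
    """Solve x * exp(x) = 1 on [0, 1] by bisection.

    x * exp(x) is increasing on this interval, equals 0 at x = 0 and
    e > 1 at x = 1, so the root is unique and bracketed.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * math.exp(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


omega = omega_constant()           # ratio of Ranking (bipartite, fully online)
water_filling = 2 - math.sqrt(2)   # ratio of Water-filling (general graphs)

print(f"Omega       = {omega:.6f}")
print(f"2 - sqrt(2) = {water_filling:.6f}")
```

Both values sit just below the improved ratios reported in the abstract (0.569 for Balanced Ranking and 0.592 for Eager Water-filling).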

preprint, 2020, arXiv

Learning Architectures from an Extended Search Space for Language Modeling

Neural architecture search (NAS) has advanced significantly in recent years but most NAS systems restrict search to learning architectures of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach to learn both intra-cell and inter-cell architectures (call it ESS). For a better search result, we design a joint learning method to perform intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it outperforms a strong baseline significantly on the PTB and WikiText data, with a new state-of-the-art on PTB. Moreover, the learned architectures show good transferability to other systems. E.g., they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.

preprint, 2020, arXiv

Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports

Neural abstractive summarization models are able to generate summaries which have high overlap with human references. However, existing models are not optimized for factual correctness, a critical metric in real-world applications. In this work, we develop a general framework where we evaluate the factual correctness of a generated summary by fact-checking it automatically against its reference using an information extraction module. We further propose a training strategy which optimizes a neural summarization model with a factual correctness reward via reinforcement learning. We apply the proposed method to the summarization of radiology reports, where factual correctness is a key requirement. On two separate datasets collected from hospitals, we show via both automatic and human evaluation that the proposed approach substantially improves the factual correctness and overall quality of outputs over a competitive neural summarization system, producing radiology summaries that approach the quality of human-authored ones.

preprint, 2020, arXiv

Robustness to Programmable String Transformations via Augmented Abstract Training

Deep neural networks for natural language processing tasks are vulnerable to adversarial input perturbations. In this paper, we present a versatile language for programmatically specifying string transformations -- e.g., insertions, deletions, substitutions, swaps, etc. -- that are relevant to the task at hand. We then present an approach to adversarially training models that are robust to such user-defined string transformations. Our approach combines the advantages of search-based techniques for adversarial training with abstraction-based techniques. Specifically, we show how to decompose a set of user-defined string transformations into two component specifications, one that benefits from search and another from abstraction. We use our technique to train models on the AG and SST2 datasets and show that the resulting models are robust to combinations of user-defined transformations mimicking spelling mistakes and other meaning-preserving transformations.

preprint, 2020, arXiv

Stanza: A Python Natural Language Processing Toolkit for Many Human Languages

We introduce Stanza, an open-source Python natural language processing toolkit supporting 66 human languages. Compared to existing widely used toolkits, Stanza features a language-agnostic fully neural pipeline for text analysis, including tokenization, multi-word token expansion, lemmatization, part-of-speech and morphological feature tagging, dependency parsing, and named entity recognition. We have trained Stanza on a total of 112 datasets, including the Universal Dependencies treebanks and other multilingual corpora, and show that the same neural architecture generalizes well and achieves competitive performance on all languages tested. Additionally, Stanza includes a native Python interface to the widely used Java Stanford CoreNLP software, which further extends its functionality to cover other tasks such as coreference resolution and relation extraction. Source code, documentation, and pretrained models for 66 languages are available at https://stanfordnlp.github.io/stanza.

preprint, 2020, arXiv

Towards a Better Understanding of Randomized Greedy Matching

There is a long history of studying randomized greedy matching algorithms since the work by Dyer and Frieze (RSA 1991). We follow this trend and consider the problem formulated in the oblivious setting, in which the algorithm makes (random) decisions that are essentially oblivious to the input graph. We revisit the Modified Randomized Greedy (MRG) algorithm by Aronson et al. (RSA 1995) that is proved to be $(0.5+\varepsilon)$-approximate. In particular, we study a weaker version of the algorithm named Random Decision Order (RDO) that, in each step, randomly picks an unmatched vertex and matches it to an arbitrary neighbor if one exists. We prove the RDO algorithm is $0.639$-approximate and $0.531$-approximate for bipartite graphs and general graphs respectively. As a corollary, we substantially improve the approximation ratio of MRG. Furthermore, we generalize the RDO algorithm to the edge-weighted case and prove that it achieves a $0.501$ approximation ratio. This result solves the open question by Chan et al. (SICOMP 2018) about the existence of an algorithm that beats greedy in this setting. As a corollary, it also solves the open questions by Gamlath et al. (SODA 2019) in the stochastic setting.
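
A minimal Python sketch of the Random Decision Order step described in this abstract follows, for the unweighted case only. Two modelling choices are assumptions made here rather than details from the paper: picking a random unmatched vertex at each step is simulated by walking a uniformly random permutation and skipping matched vertices, and the "arbitrary neighbor" is taken to be the first unmatched neighbor in the adjacency list. The function name `random_decision_order` and the example graph are hypothetical.

```python
import random


def random_decision_order(adj, rng=None):
    """Return a matching as a dict mapping each matched vertex to its mate.

    adj maps each vertex to a list of its neighbors (undirected graph).
    """
    rng = rng or random.Random()
    order = list(adj)
    rng.shuffle(order)  # random decision order over the vertices
    mate = {}
    for u in order:
        if u in mate:
            continue  # u was matched earlier as someone's neighbor
        for v in adj[u]:
            if v not in mate:
                # Arbitrary choice: first unmatched neighbor (assumption).
                mate[u] = v
                mate[v] = u
                break
    return mate


# Hypothetical example: the path a - b - c - d.
adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}
mate = random_decision_order(adj, rng=random.Random(0))
print(mate)
```

Whatever the random order, the result is a maximal matching: a vertex left unmatched had no unmatched neighbor when it was processed, and matches are never undone.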