Source author record

Yan Han

Yan Han appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

eess.AS Machine Learning Sound Computer Vision eess.SP quant-ph Computation and Language eess.IV gr-qc hep-ph hep-th cond-mat.mtrl-sci hep-ex Multiagent Systems physics.atom-ph physics.chem-ph Quantitative Methods

Catalog footprint

What is connected

22works

17topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Rethinking the Value of Multi-Agent Workflow: A Strong Single Agent Baseline

Recent advances in LLM-based multi-agent systems (MAS) show that workflows composed of multiple LLM agents with distinct roles, tools, and communication patterns can outperform single-LLM baselines on complex tasks. However, most frameworks are homogeneous, where all agents share the same base LLM and differ only in prompts, tools, and positions in the workflow. This raises the question of whether such workflows can be simulated by a single agent through multi-turn conversations. We investigate this across seven benchmarks spanning coding, mathematics, general question answering, domain-specific reasoning, and real-world planning and tool use. Our results show that a single agent can reach the performance of homogeneous workflows with an efficiency advantage from KV cache reuse, and can even match the performance of an automatically optimized heterogeneous workflow. Building on this finding, we propose \textbf{OneFlow}, an algorithm that automatically tailors workflows for single-agent execution, reducing inference costs compared to existing automatic multi-agent design frameworks without trading off accuracy. These results position the single-LLM implementation of multi-agent workflows as a strong baseline for MAS research. We also note that single-LLM methods cannot capture heterogeneous workflows due to the lack of KV cache sharing across different LLMs, highlighting future opportunities in developing \textit{truly} heterogeneous multi-agent systems.

preprint2024arXiv

Large-scale data extraction from the UNOS organ donor documents

In this paper we focus on three major task: 1) discussing our methods: Our method captures a portion of the data in DCD flowsheets, kidney perfusion data, and Flowsheet data captured peri-organ recovery surgery. 2) demonstrating the result: We built a comprehensive, analyzable database from 2022 OPTN data. This dataset is by far larger than any previously available even in this preliminary phase; and 3) proving that our methods can be extended to all the past OPTN data and future data. The scope of our study is all Organ Procurement and Transplantation Network (OPTN) data of the USA organ donors since 2008. The data was not analyzable in a large scale in the past because it was captured in PDF documents known as ``Attachments'', whereby every donor's information was recorded into dozens of PDF documents in heterogeneous formats. To make the data analyzable, one needs to convert the content inside these PDFs to an analyzable data format, such as a standard SQL database. In this paper we will focus on 2022 OPTN data, which consists of $\approx 400,000$ PDF documents spanning millions of pages. The entire OPTN data covers 15 years (2008--20022). This paper assumes that readers are familiar with the content of the OPTN data.

preprint2022arXiv

Knowledge-Augmented Contrastive Learning for Abnormality Classification and Localization in Chest X-rays with Radiomics using a Feedback Loop

Building a highly accurate predictive model for classification and localization of abnormalities in chest X-rays usually requires a large number of manually annotated labels and pixel regions (bounding boxes) of abnormalities. However, it is expensive to acquire such annotations, especially the bounding boxes. Recently, contrastive learning has shown strong promise in leveraging unlabeled natural images to produce highly generalizable and discriminative features. However, extending its power to the medical image domain is under-explored and highly non-trivial, since medical images are much less amendable to data augmentations. In contrast, their prior knowledge, as well as radiomic features, is often crucial. To bridge this gap, we propose an end-to-end semi-supervised knowledge-augmented contrastive learning framework, that simultaneously performs disease classification and localization tasks. The key knob of our framework is a unique positive sampling approach tailored for the medical images, by seamlessly integrating radiomic features as a knowledge augmentation. Specifically, we first apply an image encoder to classify the chest X-rays and to generate the image features. We next leverage Grad-CAM to highlight the crucial (abnormal) regions for chest X-rays (even when unannotated), from which we extract radiomic features. The radiomic features are then passed through another dedicated encoder to act as the positive sample for the image features generated from the same chest X-ray. In this way, our framework constitutes a feedback loop for image and radiomic modality features to mutually reinforce each other. Their contrasting yields knowledge-augmented representations that are both robust and interpretable. Extensive experiments on the NIH Chest X-ray dataset demonstrate that our approach outperforms existing baselines in both classification and localization tasks.

preprint2022arXiv

Learning Deep Optimal Embeddings with Sinkhorn Divergences

Deep Metric Learning algorithms aim to learn an efficient embedding space to preserve the similarity relationships among the input data. Whilst these algorithms have achieved significant performance gains across a wide plethora of tasks, they have also failed to consider and increase comprehensive similarity constraints; thus learning a sub-optimal metric in the embedding space. Moreover, up until now; there have been few studies with respect to their performance in the presence of noisy labels. Here, we address the concern of learning a discriminative deep embedding space by designing a novel, yet effective Deep Class-wise Discrepancy Loss (DCDL) function that segregates the underlying similarity distributions (thus introducing class-wise discrepancy) of the embedding points between each and every class. Our empirical results across three standard image classification datasets and two fine-grained image recognition datasets in the presence and absence of noise clearly demonstrate the need for incorporating such class-wise similarity relationships along with traditional algorithms while learning a discriminative embedding space.

preprint2022arXiv

Pneumonia Detection on Chest X-ray using Radiomic Features and Contrastive Learning

Chest X-ray becomes one of the most common medical diagnoses due to its noninvasiveness. The number of chest X-ray images has skyrocketed, but reading chest X-rays still have been manually performed by radiologists, which creates huge burnouts and delays. Traditionally, radiomics, as a subfield of radiology that can extract a large number of quantitative features from medical images, demonstrates its potential to facilitate medical imaging diagnosis before the deep learning era. With the rise of deep learning, the explainability of deep neural networks on chest X-ray diagnosis remains opaque. In this study, we proposed a novel framework that leverages radiomics features and contrastive learning to detect pneumonia in chest X-ray. Experiments on the RSNA Pneumonia Detection Challenge dataset show that our model achieves superior results to several state-of-the-art models (> 10% in F1-score) and increases the model's interpretability.

preprint2020arXiv

Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese

We report upon the results of a research and prototype building project \emph{Worldly~OCR} dedicated to developing new, more accurate image-to-text conversion software for several languages and writing systems. These include the cursive scripts Farsi and Pashto, and Latin cursive scripts. We also describe approaches geared towards Traditional Chinese, which is non-cursive, but features an extremely large character set of 65,000 characters. Our methodology is based on Machine Learning, especially Deep Learning, and Data Science, and is directed towards vast quantities of original documents, exceeding a billion pages. The target audience of this paper is a general audience with interest in Digital Humanities or in retrieval of accurate full-text and metadata from digital images.

preprint2020arXiv

Generating EEG features from Acoustic features

In this paper we demonstrate predicting electroencephalograpgy (EEG) features from acoustic features using recurrent neural network (RNN) based regression model and generative adversarial network (GAN). We predict various types of EEG features from acoustic features. We compare our results with the previously studied problem on speech synthesis using EEG and our results demonstrate that EEG features can be generated from acoustic features with lower root mean square error (RMSE), normalized RMSE values compared to generating acoustic features from EEG features (ie: speech synthesis using EEG) when tested using the same data sets.

preprint2020arXiv

Robust End-to-End Speaker Verification Using EEG

In this paper we demonstrate that performance of a speaker verification system can be improved by concatenating electroencephalography (EEG) signal features with speech signal features or only using EEG signal features. We use state-of-the-art end-to-end deep learning model for performing speaker verification and we demonstrate our results for noisy speech. Our results indicate that EEG signals can improve the robustness of speaker verification systems, especially in noiser environment.

preprint2020arXiv

Speech Recognition With No Speech Or With Noisy Speech Beyond English

In this paper we demonstrate continuous noisy speech recognition using connectionist temporal classification (CTC) model on limited Chinese vocabulary using electroencephalography (EEG) features with no speech signal as input and we further demonstrate single CTC model based continuous noisy speech recognition on limited joint English and Chinese vocabulary using EEG features with no speech signal as input. We demonstrate our results using various EEG feature sets recently introduced in [1] as well as we propose a new deep learning architecture in this paper which can perform continuous speech recognition using raw EEG signals on limited joint English and Chinese vocabulary.

preprint2020arXiv

Speech Synthesis using EEG

In this paper we demonstrate speech synthesis using different electroencephalography (EEG) feature sets recently introduced in [1]. We make use of a recurrent neural network (RNN) regression model to predict acoustic features directly from EEG features. We demonstrate our results using EEG features recorded in parallel with spoken speech as well as using EEG recorded in parallel with listening utterances. We provide EEG based speech synthesis results for four subjects in this paper and our results demonstrate the feasibility of synthesizing speech directly from EEG features.

preprint2020arXiv

Spoken Speech Enhancement using EEG

In this paper we demonstrate spoken speech enhancement using electroencephalography (EEG) signals using a generative adversarial network (GAN) based model, gated recurrent unit (GRU) regression based model, temporal convolutional network (TCN) regression model and finally using a mixed TCN GRU regression model. We compare our EEG based speech enhancement results with traditional log minimum mean-square error (MMSE) speech enhancement algorithm and our proposed methods demonstrate significant improvement in speech enhancement quality compared to the traditional method. Our overall results demonstrate that EEG features can be used to clean speech recorded in presence of background noise. To the best of our knowledge this is the first time a spoken speech enhancement is demonstrated using EEG features recorded in parallel with spoken speech.

preprint2020arXiv

State-of-the-art Speech Recognition using EEG and Towards Decoding of Speech Spectrum From EEG

In this paper we first demonstrate continuous noisy speech recognition using electroencephalography (EEG) signals on English vocabulary using different types of state of the art end-to-end automatic speech recognition (ASR) models, we further provide results obtained using EEG data recorded under different experimental conditions. We finally demonstrate decoding of speech spectrum from EEG signals using a long short term memory (LSTM) based regression model and Generative Adversarial Network (GAN) based model. Our results demonstrate the feasibility of using EEG signals for continuous noisy speech recognition under different experimental conditions and we provide preliminary results for synthesis of speech from EEG features.

preprint2020arXiv

Voice Activity Detection in presence of background noise using EEG

In this paper we demonstrate that performance of voice activity detection (VAD) system operating in presence of background noise can be improved by concatenating acoustic input features with electroencephalography (EEG) features. We also demonstrate that VAD using only EEG features shows better performance than VAD using only acoustic features in presence of background noise. We implemented a recurrent neural network (RNN) based VAD system and we demonstrate our results for two different data sets recorded in presence of different noise conditions in this paper. We finally demonstrate the ability to predict whether a person wish to continue speaking a sentence or not from EEG features.

preprint2015arXiv

Layer-dependent surface potential of phosphorene and anisotropic/layer-dependent charge transfer in phosphorene-gold hybrid system

The surface potential and the efficiency of interfacial charge transfer are extremely important for designing future semiconductor devices based on the emerging two-dimensional (2D) phosphorene. Here, we directly measured the strongly layer-dependent surface potential of mono- and few-layer phosphorene on gold, which confirms with the reported theoretical prediction. At the same time, we used an optical way - photoluminescence (PL) spectroscopy to probe the charge transfer in phosphorene-gold hybrid system. We firstly observed highly anisotropic and layer-dependent PL quenching in the phosphorene-gold hybrid system, which is attributed to the highly anisotropic/layer-dependent interfacial charge transfer.

preprint2015arXiv

Robust fermionic-mode entanglement of a nanoelectronic system in non-Markovian environments

A maximal steady-state fermionic entanglement of a nanoelectronic system is generated in finite temperature non-Markovian environments. The fermionic entanglement dynamics is presented by connecting the exact solution of the system with an appropriate definition of fermionic entanglement. We prove that the two understandings of the dissipationless non-Markovian dynamics, namely the bound state and the modified Laplace transformation are completely equivalent. For comparison, the steady-state entanglement is also studied in the wide-band limit and Born-Markovian approximation. When the environments have a finite band structure, we find that the system presents various kinds of relaxation processes. The final states can be: thermal or thermal-like states, quantum memory states and oscillating quantum memory states. Our study provide an analytical way to explore the non-Markovian entanglement dynamics of identical fermions in a realistic setting, i.e., finite temperature reservoirs with a cutoff spectrum.

preprint2014arXiv

Nonlinearity enhancement in optomechnical system

The nonlinearity is an important feature in the field of optomechanics. Employing atomic coherence, we put forward a scheme to enhance the nonlinearity of the cavity optomechanical system. The effective Hamiltonian is derived, which shows that the nonlinear strength can be enhanced by increasing the number of atoms at certain range of parameters. We also numerically study the nonlinearity enhancement beyond the effective Hamiltonian. Furthermore, we investigate the potential usage of the nonlinearity in performing quantum nondemolition (QND) measurement of the bosonic modes. Our results show that the present system exhibits synchronization, and the nonlinear effects provide us an effective method in performing QND.

preprint2012arXiv

On black hole spectroscopy via adiabatic invariance

In this paper, we obtain the black hole spectroscopy by combining the black hole property of adiabaticity and the oscillating velocity of the black hole horizon. This velocity is obtained in the tunneling framework. In particular, we declare, if requiring canonical invariance, the adiabatic invariant quantity should be of the covariant form $I_{\textrm{adia}}=\oint p_idq_i$. Using it, the horizon area of a Schwarzschild black hole is quantized independent of the choice of coordinates, with an equally spaced spectroscopy always given by $Δ\mathcal{A}=8πl_p^2$ in the Schwarzschild and Painlevé coordinates.

preprint2011arXiv

$W_H/Z_H$ production associated with a T-odd (anti)quark at the LHC in NLO QCD

In the framework of the littlest Higgs model with T parity, we study the $W_H/Z_H$ production in association with a T-odd (anti)quark of the first two generations at the CERN Large Hadron Collider up to the QCD next-to-leading order. The kinematic distributions of final decay products and the theoretical dependence of the cross section on the factorization/renormalization scale are discussed. We apply three schemes in considering the QCD NLO contributions and find that the QCD NLO corrections by adopting the (II) and (III) subtraction schemes can keep the convergence of the perturbative QCD description and reduce the scale uncertainty of the leading order cross section. By using these two subtraction schemes, the QCD NLO corrections to the $W_H(Z_H) q_-$ production process enhance the leading order cross section with a K-factor in the range of $1.00 \sim 1.43$.

preprint2011arXiv

Entanglement of nanomechanical oscillators and two-mode fields induced by atomic coherence

We propose a scheme via three-level cascade atoms to entangle two optomechanical oscillators as well as two-mode fields. We show that two movable mirrors and two-mode fields can be entangled even for bad cavity limit. We also study entanglement of the output two-mode fields in frequency domain. The results show that the frequency of the mirror oscillation and the injected atomic coherence affect the output entanglement of the two-mode fields.

preprint2011arXiv

QCD NLO predictions to $W$-pair production in association with a massive (anti)bottom-jet at the LHC

The $W$-pair production in association with a massive (anti)bottom jet is not only an important background to a number of interesting processes, such as the single top production associated with a W boson, but also a potential background to new physics searches. We present the calculations of the total and differential cross sections for the $W^+W^-+ b(\bar{b})$ jet productions at the LHC up to the QCD next-to-leading order (NLO). Our results by adopting the QCD NLO contribution collection scheme-I show that the K factors can be 1.66 and 1.21 with the inclusive and exclusive two-jet event selection schemes respectively, when we set $m_H=120 GeV$, $μ=m_W+m_b/2$ and take the constraints of $p_{T,b(\bar{b})}>25 GeV$, $|y_{b(\bar{b})}|<2.5$ for $b(\bar{b})$ jet. We find that the stabilization of the theoretical prediction for the integrated cross section for the $pp \to W^+W^-b(\bar b)+X$ up to the QCD NLO requires a veto on a second isolated hard jet and the inclusion of the QCD NLO contribution from the $W^+W^-b\bar{b}(bb, \bar{b}\bar{b})$ production with the final two $b(\bar{b})$ quarks being merged as one jet.

preprint2010arXiv

Quantum corrections and black hole spectroscopy

In the work \cite{BRM,RBE}, black hole spectroscopy has been successfully reproduced in the tunneling picture. As a result, the derived entropy spectrum of black hole in different gravity (including Einstein's gravity, Einstein-Gauss-Bonnet gravity and Hořava-Lifshitz gravity) are all evenly spaced, sharing the same forms as $S_n=n$, where physical process is only confined in the semiclassical framework. However, the real physical picture should go beyond the semiclassical approximation. In this case, the physical quantities would undergo higher-order quantum corrections, whose effect on different gravity shares in different forms. Motivated by these facts, in this paper we aim to observe how quantum corrections affect black hole spectroscopy in different gravity. The result shows that, in the presence of higher-order quantum corrections, black hole spectroscopy in different gravity still shares the same form as $S_n=n$, further confirming the entropy quantum is universal in the sense that it is not only independent of black hole parameters, but also independent of higher-order quantum corrections. This is a desiring result for the forthcoming quantum gravity theory.

preprint2009arXiv

Modulation of Field Emission Resonance on photodetachment of negative ions on surface

The interaction between the field emission resonance states and the photodetached electron in an electric field is studied by semiclassical theory. An analytical expression of the photodetachment cross section is derived in the framework. It is found that the Stark shifted image state modulates the photodetachment cross section by adding irregular staircase or smooth oscillation in the spectrum. When the photodetached electron is trapped in Stark shifted image potential well, the detachment spectrum displays an irregular staircase structure which corresponds to the modified Rydberg series. While the photodetached electron is not bound by the surface potential well, the cross secton contains only a smooth oscillation due to the reflection of electronic wave by the field or the surface.

Yan Han

What is connected

Connect this record

See the researcher in context

Building this map preview

22 published item(s)

Rethinking the Value of Multi-Agent Workflow: A Strong Single Agent Baseline

Large-scale data extraction from the UNOS organ donor documents

Knowledge-Augmented Contrastive Learning for Abnormality Classification and Localization in Chest X-rays with Radiomics using a Feedback Loop

Learning Deep Optimal Embeddings with Sinkhorn Divergences

Pneumonia Detection on Chest X-ray using Radiomic Features and Contrastive Learning

Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese

Generating EEG features from Acoustic features

Robust End-to-End Speaker Verification Using EEG

Speech Recognition With No Speech Or With Noisy Speech Beyond English

Speech Synthesis using EEG

Spoken Speech Enhancement using EEG

State-of-the-art Speech Recognition using EEG and Towards Decoding of Speech Spectrum From EEG

Voice Activity Detection in presence of background noise using EEG

Layer-dependent surface potential of phosphorene and anisotropic/layer-dependent charge transfer in phosphorene-gold hybrid system

Robust fermionic-mode entanglement of a nanoelectronic system in non-Markovian environments

Nonlinearity enhancement in optomechnical system

On black hole spectroscopy via adiabatic invariance

$W_H/Z_H$ production associated with a T-odd (anti)quark at the LHC in NLO QCD

Entanglement of nanomechanical oscillators and two-mode fields induced by atomic coherence

QCD NLO predictions to $W$-pair production in association with a massive (anti)bottom-jet at the LHC

Quantum corrections and black hole spectroscopy

Modulation of Field Emission Resonance on photodetachment of negative ions on surface