Source author record

Hua Xu

Hua Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computation and Language, quant-ph, Machine Learning, Artificial Intelligence, cond-mat.supr-con, Computer Vision, cond-mat.mes-hall, cond-mat.mtrl-sci, cond-mat.str-el, Information Retrieval, math.CO, math.PR, Multimedia, physics.optics

Catalog footprint

What is connected

16 works

14 topics

4 close collaborators

preprint, 2026, arXiv

Foundation Models to Unlock Real-World Evidence from Nationwide Medical Claims

Evidence derived from large-scale real-world data (RWD) is increasingly informing regulatory evaluation and healthcare decision-making. Administrative claims provide population-scale, longitudinal records of healthcare utilization, expenditure, and detailed coding of diagnoses, procedures, and medications, yet their potential as a substrate for healthcare foundation models remains largely unexplored. Here we present ReClaim, a generative transformer trained from scratch on 43.8 billion medical events from more than 200 million enrollees in the MarketScan claims data spanning 2008-2022. ReClaim models longitudinal trajectories across diagnoses, procedures, medications, and expenditure, and was scaled to 140 million, 700 million, and 1.7 billion parameters. Across over 1,000 disease-onset prediction tasks, ReClaim achieved a mean AUC of 75.6%, substantially outperforming disease-specific LightGBM (66.3%) and the transformer-based Delphi model (69.4%), with the largest gains for rare diseases. These advantages held across retrospective and prospective evaluations and in external validation on two independent datasets. Performance improved monotonically with scale, and post-training added 13.8 percentage points over pre-training alone. Beyond disease prediction, ReClaim captured financial outcomes and improved real-world evidence (RWE) analyses: for healthcare expenditure forecasting it increased explained variance from 0.28 to 0.37 relative to LightGBM, and in a target trial emulation it reduced systematic bias by 72% on average relative to Delphi. Together, these results establish administrative claims as a scalable substrate for healthcare foundation models and show that learned representations generalize across time periods and data sources, supporting disease surveillance, expenditure forecasting, and RWE generation.

preprint, 2022, arXiv

An Open Natural Language Processing Development Framework for EHR-based Clinical Research: A case demonstration using the National COVID Cohort Collaborative (N3C)

Despite recent advances in clinical natural language processing (NLP), the clinical and translational research community has been reluctant to adopt NLP models because of their limited transparency, interpretability, and usability. In this study, we proposed an open natural language processing development framework. We evaluated it through the implementation of NLP algorithms for the National COVID Cohort Collaborative (N3C). Based on the interest in information extraction from COVID-19 related clinical notes, our work includes 1) an open data annotation process using COVID-19 signs and symptoms as the use case, 2) a community-driven ruleset composing platform, and 3) a synthetic text data generation workflow to generate texts for information extraction tasks without involving human subjects. The corpora were derived from texts from three different institutions (Mayo Clinic, University of Kentucky, University of Minnesota). The gold standard annotations were tested with a single institution's (Mayo) ruleset. This resulted in F-scores of 0.876, 0.706, and 0.694 for the Mayo, Minnesota, and Kentucky test datasets, respectively. The study, a consortium effort of the N3C NLP subgroup, demonstrates the feasibility of creating a federated NLP algorithm development and benchmarking platform to enhance multi-institution clinical NLP study and adoption. Although we use COVID-19 as a use case in this effort, our framework is general enough to be applied to other domains of interest in clinical NLP.

preprint, 2022, arXiv

Consistent Representation Learning for Continual Relation Extraction

Continual relation extraction (CRE) aims to continuously train a model on data with new relations while avoiding forgetting old ones. Previous work has shown that storing a few typical samples of old relations and replaying them when learning new relations can effectively avoid forgetting. However, these memory-based methods tend to overfit the memory samples and perform poorly on imbalanced datasets. To solve these challenges, a consistent representation learning method is proposed, which maintains the stability of the relation embedding by adopting contrastive learning and knowledge distillation when replaying memory. Specifically, supervised contrastive learning based on a memory bank is first used to train each new task so that the model can effectively learn the relation representation. Then, contrastive replay is conducted on the samples in memory, and memory knowledge distillation makes the model retain the knowledge of historical relations, preventing catastrophic forgetting of old tasks. The proposed method can better learn consistent representations to alleviate forgetting effectively. Extensive experiments on the FewRel and TACRED datasets show that our method significantly outperforms state-of-the-art baselines and yields strong robustness on imbalanced datasets.
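To make the "supervised contrastive learning" step in the abstract above more concrete, here is a minimal sketch of a generic supervised contrastive loss in plain Python. It is not the authors' implementation: the function names, the temperature value, and the toy 2-D embeddings are all assumptions made for illustration, and the memory bank, replay, and distillation stages are left out.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (hypothetical helper, not the paper's code).

    For each anchor, same-label samples are positives and every other
    sample is part of the softmax denominator.
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        m = max(sims.values())  # log-sum-exp with max subtraction for stability
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy relation embeddings: two tight clusters (illustrative values only).
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(supervised_contrastive_loss(toy, [0, 0, 1, 1]))  # labels match the clusters
print(supervised_contrastive_loss(toy, [0, 1, 0, 1]))  # labels cut across clusters
```

The loss is small when samples sharing a relation label sit close together and large when they do not, which is the property a replay stage can use to keep old relation embeddings stable.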

preprint, 2022, arXiv

Continual Machine Reading Comprehension via Uncertainty-aware Fixed Memory and Adversarial Domain Adaptation

Continual Machine Reading Comprehension aims to incrementally learn from a continuous data stream across time without access to previously seen data, which is crucial for the development of real-world MRC systems. However, it is a great challenge to learn a new domain incrementally without catastrophically forgetting previous knowledge. In this paper, MA-MRC, a continual MRC model with uncertainty-aware fixed Memory and Adversarial domain adaptation, is proposed. In MA-MRC, a fixed-size memory stores a small number of samples from previous domain data and is updated with an uncertainty-aware strategy when new domain data arrives. For incremental learning, MA-MRC not only keeps a stable understanding by learning from both memory and new domain data, but also makes full use of the domain adaptation relationship between them through an adversarial learning strategy. The experimental results show that MA-MRC is superior to strong baselines and has a substantial incremental learning ability without catastrophic forgetting under two different continual MRC settings.

preprint, 2022, arXiv

M-SENA: An Integrated Platform for Multimodal Sentiment Analysis

M-SENA is an open-source platform for Multimodal Sentiment Analysis (MSA). It aims to facilitate advanced research by providing flexible toolkits, reliable benchmarks, and intuitive demonstrations. The platform features a fully modular video sentiment analysis framework consisting of data management, feature extraction, model training, and result analysis modules. In this paper, we first illustrate the overall architecture of the M-SENA platform and then introduce features of the core modules. Reliable baseline results of different modality features and MSA benchmarks are also reported. Moreover, we use model evaluation and analysis tools provided by M-SENA to present intermediate representation visualization, on-the-fly instance test, and generalization ability test results. The source code of the platform is publicly available at https://github.com/thuiar/M-SENA.

preprint, 2022, arXiv

Momentum-dependent oscillator strength crossover of excitons and plasmons in two-dimensional PtSe2

The 1T-phase layered PtX2 chalcogenides have attracted widespread interest due to their thickness-dependent metal-semiconductor transition driven by strong interlayer coupling. While the ground state properties of this paradigmatic material system have been widely explored, its fundamental excitation spectrum remains poorly understood. Here we combine first-principles calculations with momentum (q) resolved electron energy loss spectroscopy (q-EELS) to study the collective excitations in 1T-PtSe2 from the monolayer limit to the bulk. At finite momentum transfer all the spectra are dominated by two distinct interband plasmons that disperse to higher energy with increasing q. Interestingly, the absence of long-range screening in the two-dimensional (2D) limit inhibits the formation of long-wavelength plasmons. Consequently, in the small-q limit, excitations in monolayer PtSe2 are exclusively of excitonic nature, and the loss spectrum coincides with the optical spectrum. Our work unravels the excited state spectrum of layered 1T-PtSe2 and establishes the qualitatively different momentum dependence of excitons and plasmons in 2D materials.

preprint, 2022, arXiv

RGB Image Classification with Quantum Convolutional Ansaetze

With the rapid growth of qubit numbers and coherence times in quantum hardware technology, implementing shallow neural networks on the so-called Noisy Intermediate-Scale Quantum (NISQ) devices has attracted a lot of interest. Many quantum (convolutional) circuit ansaetze have been proposed for grayscale image classification tasks with promising empirical results. However, when applying these ansaetze to RGB images, the intra-channel information that is useful for vision tasks is not extracted effectively. In this paper, we propose two types of quantum circuit ansaetze to simulate convolution operations on RGB images, which differ in how inter-channel and intra-channel information are extracted. To the best of our knowledge, this is the first work on a quantum convolutional circuit that deals with RGB images effectively, with a higher test accuracy compared to purely classical CNNs. We also investigate the relationship between the size of the quantum circuit ansatz and the learnability of the hybrid quantum-classical convolutional neural network. Through experiments based on the CIFAR-10 and MNIST datasets, we demonstrate that a larger quantum circuit ansatz improves predictive performance in multiclass classification tasks, providing useful insights for near-term quantum algorithm development.

preprint, 2021, arXiv

Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis

Representation learning is a significant and challenging task in multimodal learning. Effective modality representations should capture two kinds of characteristics: consistency and difference. Because of the unified multimodal annotation, existing methods are restricted in capturing differentiated information. However, additional unimodal annotations are costly in time and labor. In this paper, we design a label generation module based on a self-supervised learning strategy to acquire independent unimodal supervisions. Then, we jointly train the multimodal and unimodal tasks to learn the consistency and difference, respectively. Moreover, during the training stage, we design a weight-adjustment strategy to balance the learning progress among different subtasks, guiding the subtasks to focus on samples with a larger difference between modality supervisions. Last, we conduct extensive experiments on three public multimodal baseline datasets. The experimental results validate the reliability and stability of auto-generated unimodal supervisions. On the MOSI and MOSEI datasets, our method surpasses the current state-of-the-art methods. On the SIMS dataset, our method achieves performance comparable to that of human-annotated unimodal labels. The full code is available at https://github.com/thuiar/Self-MM.

preprint, 2021, arXiv

Performance of Superconducting Quantum Computing Chips under Different Architecture Design

Existing and near-term quantum computers can only perform two-qubit gates between physically connected qubits. Research has been done on compilers to rewrite quantum programs to match hardware constraints. However, the quantum processor architecture, in particular the qubit connectivity and topology, has not been discussed enough, although it potentially has a huge impact on the performance of quantum algorithms. We perform a quantitative and comprehensive study of quantum processor performance under different qubit connectivities and topologies. We select ten representative design models with different connectivities and topologies from the quantum architecture design space and benchmark their performance by running a set of standard quantum algorithms. It is shown that a high-performance architecture almost always comes with a high-connectivity design, while the topology shows a weak influence on the performance in our experiment. Different quantum algorithms show different dependence on quantum chip connectivity and topology. This work provides quantum computing researchers with a systematic approach to evaluating their processor designs.
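As a rough illustration of what "qubit connectivity" means in the abstract above, the sketch below builds four toy coupling graphs and reports their average degree and mean shortest-path distance between qubits. The four layouts, the function names, and the use of mean graph distance as a proxy for routing overhead are assumptions chosen for this example. They are not the ten design models or the benchmark used in the paper.

```python
from collections import deque
from itertools import combinations

def line(n):
    """Nearest-neighbour chain of n qubits."""
    return {(i, i + 1) for i in range(n - 1)}

def ring(n):
    """Chain with the two ends coupled."""
    return line(n) | {(0, n - 1)}

def grid(rows, cols):
    """Square lattice with couplings to the right and downward neighbours."""
    edges = set()
    for r in range(rows):
        for c in range(cols):
            q = r * cols + c
            if c + 1 < cols:
                edges.add((q, q + 1))
            if r + 1 < rows:
                edges.add((q, q + cols))
    return edges

def full(n):
    """All-to-all coupling."""
    return set(combinations(range(n), 2))

def stats(n, edges):
    """Return (average degree, mean shortest-path distance) of a connected graph."""
    adj = {q: set() for q in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    avg_degree = 2 * len(edges) / n
    total = 0
    for src in range(n):  # breadth-first search from every qubit
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return avg_degree, total / (n * (n - 1))

for name, edges in [("line", line(9)), ("ring", ring(9)),
                    ("grid 3x3", grid(3, 3)), ("full", full(9))]:
    print(name, stats(9, edges))
```

A two-qubit gate between qubits at graph distance d needs the qubits to be brought adjacent first, so a lower mean distance loosely suggests less routing work for a compiler. This is only a heuristic and does not replace running real algorithm benchmarks.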

preprint, 2021, arXiv

TEXTOIR: An Integrated and Visualized Platform for Text Open Intent Recognition

TEXTOIR is the first integrated and visualized platform for text open intent recognition. It is composed of two main modules: open intent detection and open intent discovery. Each module integrates most of the state-of-the-art algorithms and benchmark intent datasets. It also contains an overall framework connecting the two modules in a pipeline scheme. In addition, this platform has visualized tools for data and model management, training, evaluation and analysis of the performance from different aspects. TEXTOIR provides useful toolkits and convenient visualized interfaces for each sub-module (Toolkit code: https://github.com/thuiar/TEXTOIR), and designs a framework to implement a complete process to both identify known intents and discover open intents (Demo code: https://github.com/thuiar/TEXTOIR-DEMO).

preprint, 2020, arXiv

A Post-processing Method for Detecting Unknown Intent of Dialogue System via Pre-trained Deep Neural Network Classifier

With the maturity and popularity of dialogue systems, detecting users' unknown intents has become an important task. It is also one of the most challenging tasks, since we can hardly get examples, prior knowledge, or the exact number of unknown intents. In this paper, we propose SofterMax and deep novelty detection (SMDN), a simple yet effective post-processing method for detecting unknown intent in dialogue systems based on pre-trained deep neural network classifiers. Our method can be flexibly applied on top of any deep neural network classifier without changing the model architecture. We calibrate the confidence of the softmax outputs to compute the calibrated confidence score (i.e., SofterMax) and use it to calculate the decision boundary for unknown intent detection. Furthermore, we feed the feature representations learned by the deep neural networks into a traditional novelty detection algorithm to detect unknown intents from different perspectives. Finally, we combine the methods above to perform the joint prediction. Our method classifies examples that differ from known intents as unknown and does not require any examples or prior knowledge of them. We have conducted extensive experiments on three benchmark dialogue datasets. The results show that our method can yield significant improvements compared with the state-of-the-art baselines.
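The idea of calibrating softmax confidence and thresholding it to flag unknown intents can be sketched as follows. This is not the authors' SMDN code: it swaps in plain temperature scaling as the calibration step, and the function names, the temperature of 2.0, the threshold of 0.5, and the example logits and labels are all assumptions for illustration. The novelty-detection branch and the joint prediction are omitted.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; T > 1 flattens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def predict_intent(logits, labels, temperature=2.0, threshold=0.5):
    """Return the top label, or "unknown" when the softened confidence is below threshold."""
    probs = softmax_with_temperature(logits, temperature)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best] if probs[best] >= threshold else "unknown"

# Hypothetical intent labels and classifier logits.
labels = ["book_flight", "play_music", "get_weather"]
print(predict_intent([6.0, 0.5, 0.2], labels))  # one logit clearly dominates
print(predict_intent([1.2, 1.0, 0.9], labels))  # no logit stands out
```

Because this runs purely on the classifier's output logits, it can sit on top of an already trained model without touching its architecture, which matches the post-processing framing in the abstract.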

preprint, 2020, arXiv

Robustly Pre-trained Neural Model for Direct Temporal Relation Extraction

Background: Identifying relationships between clinical events and temporal expressions is a key challenge in meaningfully analyzing clinical text for use in advanced AI applications. While previous studies exist, the state-of-the-art performance has significant room for improvement. Methods: We studied several variants of BERT (Bidirectional Encoder Representations from Transformers), some involving clinical domain customization and the others involving improved architecture and/or training strategies. We evaluated these methods using a direct temporal relations dataset, which is a semantically focused subset of the 2012 i2b2 temporal relations challenge dataset. Results: Our results show that RoBERTa, which employs better pre-training strategies including a 10x larger corpus, improved the overall F measure by 0.0864 absolute (on the 1.00 scale), reducing the error rate by 24% relative to the previous state-of-the-art performance achieved with an SVM (support vector machine) model. Conclusion: Modern contextual language modeling neural networks, pre-trained on a large corpus, achieve impressive performance even on highly nuanced clinical temporal relation tasks.
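The two figures in the Results sentence, an absolute gain of 0.0864 and a 24% error reduction, can be related by a short calculation. This is an inference, not a number reported in the abstract: it assumes that "error rate" means $1 - F$ and that 24% is a rounded value, so the baseline scores below are approximate.

```latex
% Assumption: error = 1 - F; F_old and F_new are not stated in the abstract.
\begin{align*}
\frac{F_{\text{new}} - F_{\text{old}}}{1 - F_{\text{old}}} &\approx 0.24,
\qquad F_{\text{new}} - F_{\text{old}} = 0.0864 \\
1 - F_{\text{old}} &\approx \frac{0.0864}{0.24} = 0.36
\;\Rightarrow\; F_{\text{old}} \approx 0.64 \\
F_{\text{new}} &\approx 0.64 + 0.0864 = 0.7264
\end{align*}
```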

preprint, 2012, arXiv

On the Limiting Shape of Young Tableaux Associated With Inhomogeneous Random Words

The limiting shape of the random Young diagrams associated with an inhomogeneous random word is identified as a multidimensional Brownian functional. This functional is identical in law to the spectrum of a random matrix. The Poissonized word problem is also briefly studied, and the asymptotic behavior of the shape is analyzed.

preprint, 2010, arXiv

Active switching in metamaterials using polarization control of light

We demonstrate on-demand control of localized surface plasmons in metamaterials by means of incident light polarization. An asymmetric mode, selectively excited by s-polarized light, interferes destructively with a bright element, thereby allowing the incident light to propagate at a fairly low loss, corresponding to electromagnetically induced transparency (EIT) in an atomic system. In contrast, a symmetric mode, excited by p-polarized light, directly couples with the incident light, which is analogous to the switch-off of EIT. The light polarization-dependent excitation of asymmetric and symmetric plasmon modes holds potential for active switching applications of plasmon hybridization.

preprint, 2009, arXiv

Phase-sensitive Harmonic Measurements of Microwave Nonlinearities in Cuprate Thin Films

Investigations of the intrinsic electromagnetic nonlinearity of superconductors give insight into the fundamental physics of these materials. Phase-sensitive third-order harmonic voltage data $\tilde{u}_{3f}=|u_{3f}|\exp(iϕ_{3f})$ are acquired with a near-field microwave microscope on homogeneous YBa$_2$Cu$_3$O$_{7-δ}$ (YBCO) thin films in a temperature range close to the critical temperature $T_c$. As temperature is increased from below $T_c$, the harmonic magnitude exhibits a maximum, while the phase, $π/2$ in the superconducting state, goes through a minimum. It is found that samples with doping ranging from near optimal ($δ=0.16$) to underdoped ($δ=0.47$) exhibit different behavior in terms of both the harmonic magnitude and phase. In optimally doped samples, the harmonic magnitude reaches its maximum at a temperature $T_M$ slightly lower than the temperature $T_m$ of the phase minimum and drops into the noise floor as soon as $T_m$ is exceeded. In underdoped samples, $T_M$ is shifted toward lower temperatures with respect to $T_m$, and the harmonic voltage magnitude decreases more slowly with temperature than in optimally doped samples. A field-based analytical model of $\tilde{u}_{3f}$ is presented, where the nonlinear behavior is introduced as corrections to the low-field, linear-response complex conductivity. The model reproduces the low-temperature regime where the $σ_2$ nonlinearity dominates, in agreement with published theoretical and experimental results. Additionally, the model identifies $T_m$ as the temperature where the order parameter relaxation time becomes comparable to the microwave probing period and reproduces the experimental data semi-quantitatively.

preprint, 2008, arXiv

Dynamical scaling of $YBa_2Cu_3O_{7-δ}$ thin film conductivity in zero field

We study dynamic fluctuation effects in $YBa_2Cu_3O_{7-δ}$ thin films in zero field around $T_c$ through frequency-dependent microwave conductivity measurements at different powers. The length scales probed in the experiments are varied systematically, allowing us to analyze data that are not affected by the finite thickness of the films and to observe single-parameter scaling. DC current-voltage characteristics have also been measured to independently probe fluctuations in the same samples. The combination of DC and microwave measurements allows us to precisely determine critical parameters. Our results give a dynamical scaling exponent $z=1.55\pm0.15$, which is consistent with model-E dynamics.
