Source author record

Jin Cao

Jin Cao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence; Machine Learning; Computation and Language; Data Structures and Algorithms; cond-mat.mtrl-sci; Databases; math.CO; physics.atom-ph; Computation; Computational Complexity; Computational Engineering, Finance, and Science; cond-mat.mes-hall; cond-mat.quant-gas; math.AG

Catalog footprint

What is connected

15 works

14 topics

4 close collaborators


preprint · 2026 · arXiv

Wearable-informed generative digital avatars predict task-conditioned post-stroke locomotion

Dynamic prediction of locomotor capacity after stroke could enable more individualized rehabilitation, yet current assessments largely provide static impairment scores and do not indicate whether patients can perform specific tasks such as slope walking or stair climbing. Here, we present a wearable-informed data-physics hybrid generative framework that reconstructs a stroke survivor's locomotor control from wearable inertial sensing and predicts task-conditioned post-stroke locomotion in new environments. From a single 20 m level-ground walking trial recorded by five IMUs, the framework personalizes a physics-based digital avatar using a healthy-motion prior and hybrid imitation learning, generating dynamically feasible, patient-specific movements for inclined walking and stair negotiation. Across 11 stroke inpatients, predicted postures reached 82.2% similarity for slopes and 69.9% for stairs, substantially exceeding a physics-only baseline. In a multicentre pilot randomized study (n = 21; 28 days), access to scenario-specific locomotion predictions to support task selection and difficulty titration was associated with larger gains in Fugl-Meyer lower-extremity scores than standard care (mean change 6.0 vs 3.7 points; $p < 0.05$). These results suggest that wearable-informed generative digital avatars may augment individualized gait rehabilitation planning and provide a pathway toward dynamically personalized post-stroke motor recovery strategies.

preprint · 2025 · arXiv

Intrinsic nonlinear valley Nernst effect

We investigate the intrinsic nonlinear valley Nernst effect, which induces a transverse valley current via a second-order thermoelectric response to a longitudinal temperature gradient. The effect arises from the Berry connection polarizability dipole of valley electrons and is permissible in both inversion-symmetric and inversion-asymmetric materials. We demonstrate that the response tensor is connected to the intrinsic nonlinear valley Hall conductivity through a generalized Mott relation, with the two being directly proportional at low temperatures, scaled by the Lorenz number. We elucidate the symmetry constraints governing this effect and develop a theory for its nonlocal measurement, revealing a nonlocal second-harmonic signal with a distinct $ρ^2$ scaling. This signal comprises two scaling terms, with their ratio corresponding to the square of the thermopower normalized by the Lorenz number. Key characteristics are demonstrated using a tilted Dirac model and first-principles calculations on bilayer WTe$_2$. Possible extrinsic contributions and alternative experimental detection methods, e.g., by valley pumping and by nonreciprocal directional dichroism, are discussed. These findings underscore the significance of band quantum geometry on electron dynamics and establish a theoretical foundation for nonlinear valley caloritronics.

preprint · 2022 · arXiv

Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems

We present results from a large-scale experiment on pretraining encoders with non-embedding parameter counts ranging from 700M to 9.3B, their subsequent distillation into smaller models ranging from 17M to 170M parameters, and their application to the Natural Language Understanding (NLU) component of a virtual assistant system. Though we train using 70% spoken-form data, our teacher models perform comparably to XLM-R and mT5 when evaluated on the written-form Cross-lingual Natural Language Inference (XNLI) corpus. We perform a second stage of pretraining on our teacher models using in-domain data from our system, improving error rates by 3.86% relative for intent classification and 7.01% relative for slot filling. We find that even a 170M-parameter model distilled from our Stage 2 teacher model has 2.88% better intent classification and 7.69% better slot filling error rates when compared to the 2.3B-parameter teacher trained only on public data (Stage 1), emphasizing the importance of in-domain data for pretraining. When evaluated offline using labeled NLU data, our 17M-parameter Stage 2 distilled model outperforms XLM-R Base (85M params) and DistilBERT (42M params) by 4.23% and 6.14%, respectively. Finally, we present results from a full virtual assistant experimentation platform, where we find that models trained using our pretraining and distillation pipeline outperform models distilled from 85M-parameter teachers by 3.74% to 4.91% on an automatic measurement of full-system user dissatisfaction.

preprint · 2022 · arXiv

DAME: Domain Adaptation for Matching Entities

Entity matching (EM) identifies data records that refer to the same real-world entity. Despite the effort in the past years to improve the performance in EM, the existing methods still require a huge amount of labeled data in each domain during the training phase. These methods treat each domain individually, and capture the specific signals for each dataset in EM, and this leads to overfitting on just one dataset. The knowledge that is learned from one dataset is not utilized to better understand the EM task in order to make predictions on the unseen datasets with fewer labeled samples. In this paper, we propose a new domain adaptation-based method that transfers the task knowledge from multiple source domains to a target domain. Our method presents a new setting for EM where the objective is to capture the task-specific knowledge from pretraining our model using multiple source domains, then testing our model on a target domain. We study the zero-shot learning case on the target domain, and demonstrate that our method learns the EM task and transfers knowledge to the target domain. We extensively study fine-tuning our model on the target dataset from multiple domains, and demonstrate that our model generalizes better than state-of-the-art methods in EM.

preprint · 2022 · arXiv

Instilling Type Knowledge in Language Models via Multi-Task QA

Understanding human language often necessitates understanding entities and their place in a taxonomy of knowledge -- their types. Previous methods to learn entity types rely on training classifiers on datasets with coarse, noisy, and incomplete labels. We introduce a method to instill fine-grained type knowledge in language models with text-to-text pre-training on type-centric questions leveraging knowledge base documents and knowledge graphs. We create the WikiWiki dataset: entities and passages from 10M Wikipedia articles linked to the Wikidata knowledge graph with 41K types. Models trained on WikiWiki achieve state-of-the-art performance in zero-shot dialog state tracking benchmarks, accurately infer entity types in Wikipedia articles, and can discover new types deemed useful by human judges.

preprint · 2022 · arXiv

Resonant control of elastic collisions between $^{23}$Na$^{40}$K molecules and $^{40}$K atoms

We have demonstrated the resonant control of the elastic scattering cross sections in the vicinity of Feshbach resonances between $^{23}$Na$^{40}$K molecules and $^{40}$K atoms by studying the thermalization between them. The elastic scattering cross sections vary by more than two orders of magnitude close to the resonance, and can be well described by an asymmetric Fano profile. The parameters that characterize the magnetically tunable s-wave scattering length are determined from the elastic scattering cross sections. The observation of resonantly controlled elastic scattering cross sections opens up the possibility to study strongly interacting atom-molecule mixtures and improve our understanding of the complex atom-molecule Feshbach resonances.

preprint · 2021 · arXiv

Evidence for association of triatomic molecule in ultracold $^{23}$Na$^{40}$K and $^{40}$K mixture

Ultracold assembly of diatomic molecules has enabled great advances in controlled chemistry, ultracold chemical physics, and quantum simulation with molecules. Extending the ultracold association to triatomic molecules will offer many new research opportunities and challenges in these fields. A possible approach is to form triatomic molecules in a mixture of ultracold atoms and diatomic molecules by employing the Feshbach resonance between them. Although ultracold atom-diatomic-molecule Feshbach resonances have been observed recently, utilizing these resonances to form triatomic molecules remains challenging. Here we report evidence of the association of triatomic molecules near the Feshbach resonances between $^{23}$Na$^{40}$K molecules in the rovibrational ground state and $^{40}$K atoms. We apply a radio-frequency pulse to drive the free-bound transition and monitor the loss of $^{23}$Na$^{40}$K molecules. The association of triatomic molecules manifests itself as an additional loss feature in the radio-frequency spectra, which can be distinguished from the atomic loss feature. The binding energy of the triatomic molecule is estimated from the measurement. Our work helps in understanding the complex ultracold atom-molecule Feshbach resonance and may open up an avenue towards the preparation and control of ultracold triatomic molecules.

preprint · 2021 · arXiv

Zero-shot Generalization in Dialog State Tracking through Generative Question Answering

Dialog State Tracking (DST), an integral part of modern dialog systems, aims to track user preferences and constraints (slots) in task-oriented dialogs. In real-world settings with constantly changing services, DST systems must generalize to new domains and unseen slot types. Existing methods for DST do not generalize well to new slot names and many require known ontologies of slot types and values for inference. We introduce a novel ontology-free framework that supports natural language queries for unseen constraints and slots in multi-domain task-oriented dialogs. Our approach is based on generative question-answering using a conditional language model pre-trained on substantive English sentences. Our model improves joint goal accuracy in zero-shot domain adaptation settings by up to 9% (absolute) over the previous state-of-the-art on the MultiWOZ 2.1 dataset.

preprint · 2020 · arXiv

A Fast Randomized Algorithm for Finding the Maximal Common Subsequences

Finding the common subsequences of $L$ strings has many applications in bioinformatics, computational linguistics, and information retrieval. A well-known result states that finding a Longest Common Subsequence (LCS) for $L$ strings is NP-hard, with known exact algorithms taking time exponential in $L$. In this paper, we develop a randomized algorithm, referred to as Random-MCS, for finding a random instance of a Maximal Common Subsequence (MCS) of multiple strings. A common subsequence is maximal if inserting any character into the subsequence no longer yields a common subsequence. A special case of MCS is LCS, where the length is the longest. We show that the complexity of our algorithm is linear in $L$, and it is therefore suitable for large $L$. Furthermore, we study the occurrence probability for a single instance of MCS and demonstrate via both theoretical and experimental studies that the longest subsequence from multiple runs of Random-MCS often yields a solution to LCS.
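The definition of maximality in this abstract can be made concrete with a small brute-force sketch. This is not the Random-MCS algorithm from the paper, whose details are not given here, and it does not have linear complexity in $L$. The function names (`is_subsequence`, `extensions`, `is_maximal`, `naive_random_mcs`) are hypothetical and chosen for illustration only.

```python
import random


def is_subsequence(s, t):
    """True if s can be obtained from t by deleting zero or more characters."""
    it = iter(t)
    return all(ch in it for ch in s)


def is_common(s, strings):
    """True if s is a subsequence of every string in the collection."""
    return all(is_subsequence(s, t) for t in strings)


def extensions(s, strings):
    """All one-character insertions into s that are still common subsequences."""
    # Only characters present in every string can extend a common subsequence.
    alphabet = set(strings[0]).intersection(*map(set, strings[1:]))
    found = set()
    for i in range(len(s) + 1):
        for ch in alphabet:
            candidate = s[:i] + ch + s[i:]
            if is_common(candidate, strings):
                found.add(candidate)
    return sorted(found)


def is_maximal(s, strings):
    """Maximal: s is common and no single insertion keeps it common."""
    return is_common(s, strings) and not extensions(s, strings)


def naive_random_mcs(strings, rng):
    """Illustrative only: grow a common subsequence by random valid insertions
    until none remains. The result is maximal by construction."""
    s = ""
    while True:
        options = extensions(s, strings)
        if not options:
            return s
        s = rng.choice(options)


strings = ["xab", "abx"]
print(is_maximal("x", strings))   # True: maximal, although shorter than the LCS "ab"
print(is_maximal("a", strings))   # False: "ab" is still a common subsequence
print(naive_random_mcs(strings, random.Random(0)))
```

The pair `"xab"` and `"abx"` shows why a maximal common subsequence need not be a longest one: `"x"` cannot be extended, yet `"ab"` is longer. Repeating a randomized search and keeping the longest result is the idea the abstract describes for approaching LCS.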

preprint · 2020 · arXiv

A Lightweight Algorithm to Uncover Deep Relationships in Data Tables

Much of the data we collect today is in tabular form, with rows as records and columns as attributes associated with each record. Understanding the structural relationships in tabular data can greatly facilitate the data science process. Traditionally, much of this relational information is stored in the table schema and maintained by its creators, usually domain experts. In this paper, we develop automated methods to uncover deep relationships in a single data table without expert or domain knowledge. Our method can decompose a data table into layers of smaller tables, revealing its deep structure. The key to our approach is a computationally lightweight forward addition algorithm that recursively extracts the functional dependencies between table columns and scales to tables with many columns. With our solution, data scientists will be provided with automatically generated, data-driven insights when exploring new data sets.
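The building block mentioned in this abstract, a functional dependency between table columns, can be illustrated with a minimal sketch. It only tests whether a dependency holds in a given table; it is not the forward addition algorithm from the paper, which is not described here. The function names and the sample table are hypothetical.

```python
def holds(rows, lhs, rhs):
    """True if the columns in lhs functionally determine column rhs,
    i.e. rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = row[rhs]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


def single_column_dependencies(rows):
    """All pairs (a, b) of distinct columns such that a determines b."""
    columns = list(rows[0])
    return [(a, b) for a in columns for b in columns
            if a != b and holds(rows, [a], b)]


# Hypothetical sample table: each row is a record, each key a column.
rows = [
    {"zip": "10001", "city": "New York", "state": "NY"},
    {"zip": "10002", "city": "New York", "state": "NY"},
    {"zip": "60601", "city": "Chicago", "state": "IL"},
    {"zip": "10001", "city": "New York", "state": "NY"},
]

print(holds(rows, ["zip"], "city"))   # True: each zip maps to one city
print(holds(rows, ["city"], "zip"))   # False: New York has two zips
print(single_column_dependencies(rows))
```

A dependency such as `zip` determining `city` is what allows a table to be split into smaller tables without losing information, which is the kind of layered decomposition the abstract refers to.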

preprint · 2020 · arXiv

Experimental evidence of monolayer AlB$_2$ with symmetry-protected Dirac cones

Monolayer AlB$_2$ is composed of two atomic layers: honeycomb borophene and triangular aluminum. In contrast with the bulk phase, monolayer AlB$_2$ is predicted to be a superconductor with a high critical temperature. Here, we demonstrate that monolayer AlB$_2$ can be synthesized on Al(111) via molecular beam epitaxy. Our theoretical calculations revealed that the monolayer AlB$_2$ hosts several Dirac cones along the $Γ$--M and $Γ$--K directions; these Dirac cones are protected by crystal symmetries and are thus resistant to external perturbations. The extraordinary electronic structure of the monolayer AlB$_2$ was confirmed via angle-resolved photoemission spectroscopy measurements. These results are likely to stimulate further research interest to explore the exotic properties arising from the interplay of Dirac fermions and superconductivity in two-dimensional materials.

preprint · 2016 · arXiv

Torsion and divisibility for reciprocity sheaves and 0-cycles with modulus

The notion of modulus is a striking feature of Rosenlicht-Serre's theory of generalized Jacobian varieties of curves. It was carried over to algebraic cycles on general varieties by Bloch-Esnault, Park, Rülling, Krishna-Levine. Recently, Kerz-Saito introduced a notion of Chow group of $0$-cycles with modulus in connection with geometric class field theory with wild ramification for varieties over finite fields. We study the non-homotopy invariant part of the Chow group of $0$-cycles with modulus and show their torsion and divisibility properties. Modulus is being brought to sheaf theory by Kahn-Saito-Yamazaki in their attempt to construct a generalization of Voevodsky-Suslin-Friedlander's theory of homotopy invariant presheaves with transfers. We prove parallel results about torsion and divisibility properties for them.

preprint · 2012 · arXiv

Cycles and Paths Embedded in Varietal Hypercubes

The varietal hypercube $VQ_n$ is a variant of the hypercube $Q_n$ and has better properties than $Q_n$ with the same number of edges and vertices. This paper shows that every edge of $VQ_n$ is contained in cycles of every length from 4 to $2^n$ except 5, and every pair of vertices with distance $d$ is connected by paths of every length from $d$ to $2^n-1$ except 2 and 4 if $d=1$.

preprint · 2012 · arXiv

Transitivity of Varietal Hypercube Networks

The varietal hypercube $VQ_n$ is a variant of the hypercube $Q_n$ and has better properties than $Q_n$ with the same number of edges and vertices. This paper proves that $VQ_n$ is vertex-transitive. This property shows that when $VQ_n$ is used to model an interconnection network, it is highly symmetrical and obviously superior to other variants of the hypercube such as the crossed cube.

preprint · 2011 · arXiv

Distinct counting with a self-learning bitmap

Counting the number of distinct elements (cardinality) in a dataset is a fundamental problem in database management. In recent years, due to many of its modern applications, there has been significant interest in addressing the distinct counting problem in a data stream setting, where each incoming data item can be seen only once and cannot be stored for long periods of time. Many probabilistic approaches based on either sampling or sketching, which require only limited computing and memory resources, have been proposed in the computer science literature. However, the performance of these methods is not scale-invariant, in the sense that their relative root mean square estimation errors (RRMSE) depend on the unknown cardinalities. This is not desirable in many applications where cardinalities can be very dynamic or inhomogeneous and many cardinalities need to be estimated. In this paper, we develop a novel approach, called self-learning bitmap (S-bitmap), that is scale-invariant for cardinalities in a specified range. S-bitmap uses a binary vector whose entries are updated from 0 to 1 by an adaptive sampling process for inferring the unknown cardinality, where the sampling rates are reduced sequentially as more and more entries change from 0 to 1. We prove rigorously that the S-bitmap estimate is not only unbiased but also scale-invariant. We demonstrate that to achieve a small RRMSE value of $ε$ or less, our approach requires significantly less memory and uses a similar or smaller number of operations than state-of-the-art methods for many cardinality scales common in practice. Both simulation and experimental studies are reported.
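The mechanism described in this abstract, a bit vector filled by sampling at rates that decrease as more bits are set, can be sketched as follows. This is an illustration under stated assumptions and not the S-bitmap as specified in the paper: the class name `AdaptiveBitmap`, the geometric rate schedule in the usage example, and the waiting-time estimator are all assumptions, and the schedule is not tuned for scale invariance.

```python
import hashlib


class AdaptiveBitmap:
    """Bitmap sketch with a caller-supplied, nonincreasing sampling schedule.

    rates[k] is the sampling rate applied while exactly k bits are set
    (hypothetical parameterization, not taken from the paper)."""

    def __init__(self, m, rates):
        if len(rates) != m:
            raise ValueError("need one rate per fill level")
        if not all(0 < r <= 1 for r in rates):
            raise ValueError("rates must lie in (0, 1]")
        if not all(a >= b for a, b in zip(rates, rates[1:])):
            raise ValueError("rates must be nonincreasing")
        self.m = m
        self.rates = rates
        self.bits = [False] * m
        self.filled = 0

    def _hash(self, item):
        # Deterministic hashing, so a repeated item always gets the same
        # bucket and the same uniform value u in [0, 1).
        digest = hashlib.sha256(repr(item).encode()).digest()
        bucket = int.from_bytes(digest[:8], "big") % self.m
        u = int.from_bytes(digest[8:16], "big") / 2**64
        return bucket, u

    def add(self, item):
        bucket, u = self._hash(item)
        # Set the bit only if the bucket is empty and the item passes the
        # current sampling rate. Because rates never increase, an item that
        # was rejected once is rejected on every later occurrence as well.
        if not self.bits[bucket] and u < self.rates[self.filled]:
            self.bits[bucket] = True
            self.filled += 1

    def estimate(self):
        # Assumed estimator: with k bits set, a new distinct item sets a bit
        # with probability q_k = (1 - k/m) * rates[k], so the expected number
        # of distinct items needed per step is 1 / q_k. Sum over steps taken.
        return sum(1.0 / ((1 - k / self.m) * self.rates[k])
                   for k in range(self.filled))


m = 256
sketch = AdaptiveBitmap(m, [0.99 ** k for k in range(m)])
for _ in range(2):              # feed the same 1000 items twice
    for item in range(1000):
        sketch.add(item)
print(sketch.filled, round(sketch.estimate()))
```

The second pass over the same items leaves the sketch unchanged, which is the property that makes it count distinct elements and not total arrivals.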
