Source author record

Tianyi Li

Tianyi Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

20 works

26 topics

4 close collaborators

preprint · 2026 · arXiv

Reliable AI Needs to Externalize Implicit Knowledge: A Human-AI Collaboration Perspective

This position paper argues that reliable AI requires infrastructure for human validation of implicit knowledge. AI learns from both explicit knowledge (papers, documentation, structured databases) and implicit knowledge (reasoning patterns, debugging processes, intermediate steps). Implicit knowledge remains unexternalized because documentation cost exceeds perceived value -- yet AI learns from it indiscriminately, acquiring both beneficial patterns and harmful biases. Current reliability methods can only verify explicit knowledge against sources, creating a fundamental gap: the most valuable AI capabilities (reasoning, judgment, intuition) are precisely those we cannot verify. We propose Knowledge Objects (KOs) -- structured artifacts that externalize implicit knowledge into forms humans can inspect, verify, and endorse. KOs transform verification economics: what was previously too costly to verify becomes feasible, enabling accumulated human validation to improve reliability over time.

preprint · 2025 · arXiv

Efficient Quantum Simulation of Non-Adiabatic Molecular Dynamics with Precise Electronic Structure

In the study of non-adiabatic chemical processes such as photocatalysis and photosynthesis, non-adiabatic molecular dynamics (NAMD) is an indispensable theoretical tool, which requires precise potential energy surfaces (PESs) of ground and excited states. Quantum computing offers promising potential for calculating PESs that are intractable for classical computers. However, its realistic application poses significant challenges to the development of quantum algorithms that are sufficiently general to enable efficient and precise PES calculations across chemical systems with diverse properties, as well as to seamlessly adapt existing NAMD theories to quantum computing. In this work, we introduce a quantum-adapted extension to the Landau-Zener-Surface-Hopping (LZSH) NAMD. This extension incorporates curvature-driven hopping corrections that protect the population evolution while maintaining the efficiency gained from avoiding the computation of non-adiabatic couplings (NACs), as well as preserving the trajectory independence that enables parallelization. Furthermore, to ensure the high-precision PESs required for surface hopping dynamics, we develop a sub-microhartree-accurate PES calculation protocol. This protocol supports active space selection, enables parallel acceleration either on quantum or classical clusters, and demonstrates adaptability to diverse chemical systems - including the charged H3+ ion and the C2H4 molecule, a prototypical multi-reference benchmark. This work paves the way for practical application of quantum computing in NAMD, showcasing the potential of parallel simulation on quantum-classical heterogeneous clusters for ab-initio computational chemistry.

preprint · 2022 · arXiv

ClusterEA: Scalable Entity Alignment with Stochastic Training and Normalized Mini-batch Similarities

Entity alignment (EA) aims at finding equivalent entities in different knowledge graphs (KGs). Embedding-based approaches have dominated the EA task in recent years. Those methods face problems that come from the geometric properties of embedding vectors, including hubness and isolation. To solve these geometric problems, many normalization approaches have been adopted for EA. However, the increasing scale of KGs renders it hard for EA models to adopt the normalization processes, thus limiting their usage in real-world applications. To tackle this challenge, we present ClusterEA, a general framework that is capable of scaling up EA models and enhancing their results by leveraging normalization methods on mini-batches with a high entity equivalent rate. ClusterEA contains three components to align entities between large-scale KGs, including stochastic training, ClusterSampler, and SparseFusion. It first trains a large-scale Siamese GNN for EA in a stochastic fashion to produce entity embeddings. Based on the embeddings, a novel ClusterSampler strategy is proposed for sampling highly overlapped mini-batches. Finally, ClusterEA incorporates SparseFusion, which normalizes local and global similarity and then fuses all similarity matrices to obtain the final similarity matrix. Extensive experiments with real-life datasets on EA benchmarks offer insight into the proposed framework, and suggest that it is capable of outperforming the state-of-the-art scalable EA framework by up to 8 times in terms of Hits@1.
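
As a rough illustration of the Hits@1 metric cited above, the sketch below scores a small similarity matrix in plain Python. The function name `hits_at_1`, the toy matrix, and the convention that row i's true counterpart sits in column i are illustrative assumptions, not details taken from the ClusterEA paper.

```python
from typing import Sequence


def hits_at_1(similarity: Sequence[Sequence[float]]) -> float:
    """Fraction of source entities whose top-scoring candidate is correct.

    Assumption for this sketch: row i is a source entity and its true
    counterpart is the target entity in column i.
    """
    if not similarity:
        return 0.0
    correct = 0
    for i, row in enumerate(similarity):
        # Index of the highest-scoring target; ties resolve to the lowest index.
        best = max(range(len(row)), key=lambda j: row[j])
        if best == i:
            correct += 1
    return correct / len(similarity)


# Hypothetical 3x3 similarity matrix (made-up numbers, not real results).
toy = [
    [0.9, 0.1, 0.0],
    [0.2, 0.3, 0.7],
    [0.1, 0.2, 0.8],
]
score = hits_at_1(toy)
```

In this toy matrix, rows 0 and 2 rank their own column first while row 1 does not, so two of three entities count as hits.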

preprint · 2022 · arXiv

Cross-lingual Inference with A Chinese Entailment Graph

Predicate entailment detection is a crucial task for question-answering from text, where previous work has explored unsupervised learning of entailment graphs from typed open relation triples. In this paper, we present the first pipeline for building Chinese entailment graphs, which involves a novel high-recall open relation extraction (ORE) method and the first Chinese fine-grained entity typing dataset under the FIGER type ontology. Through experiments on the Levy-Holt dataset, we verify the strength of our Chinese entailment graph, and reveal the cross-lingual complementarity: on the parallel Levy-Holt dataset, an ensemble of Chinese and English entailment graphs outperforms both monolingual graphs, and raises unsupervised SOTA by 4.7 AUC points.

preprint · 2022 · arXiv

Edge Augmentation on Disconnected Graphs via Eigenvalue Elevation

The graph-theoretical task of determining the most likely inter-community edges based on disconnected subgraphs' intra-community connectivity is proposed. An algorithm is developed for this edge augmentation task, based on elevating the zero eigenvalues of the graph's spectrum. Upper bounds for the eigenvalue elevation amplitude and for the corresponding augmented edge density are derived and validated with simulations on random graphs. The algorithm works consistently across synthetic and real networks, yielding desirable performance at connecting graph components. Edge augmentation reverse-engineers graph partition under different community detection methods (Girvan-Newman method, greedy modularity maximization, label propagation, Louvain method, and fluid community), in most cases producing inter-community edges at >50% frequency.

preprint · 2022 · arXiv

Event Detection Explorer: An Interactive Tool for Event Detection Exploration

Event Detection (ED) is an important task in natural language processing. In the past few years, many datasets have been introduced for advancing ED machine learning models. However, most of these datasets are under-explored because not many tools are available for people to study events, trigger words, and event mention instances systematically and efficiently. In this paper, we present an interactive and easy-to-use tool, namely ED Explorer, for ED dataset and model exploration. ED Explorer consists of an interactive web application, an API, and an NLP toolkit, which can help both domain experts and non-experts to better understand the ED task. We use ED Explorer to analyze a recently proposed large-scale ED dataset (referred to as MAVEN), and discover several underlying problems, including sparsity, label bias, label imbalance, and debatable annotations, which provide us with directions to improve the MAVEN dataset. The ED Explorer can be publicly accessed through http://edx.leafnlp.org/. The demonstration video is available at https://www.youtube.com/watch?v=6QPnxPwxg50.

preprint · 2022 · arXiv

Study of Nonlinear Interaction between Waves and Ocean Currents Using High-Fidelity Simulation and Machine Learning

Modeling ocean surface waves under complex ocean current conditions is of crucial importance to many naval applications. For example, traveling ships and underwater vehicles generate spatially heterogeneous currents behind them through their drag and propeller motions. The strong currents can influence the surface wave pattern in the ship wake. In this study, the nonlinear interactions between waves and complex wake currents are investigated using numerical simulations. An in-house code is developed for high-fidelity simulations of a nonlinear phase-resolved ocean wavefield interacting with subsurface currents. Several typical wake patterns are simulated using the present numerical method, and the influence of complex currents on the waves is analyzed quantitatively using theoretical solutions of wave-current interactions. We also present a method for solving the inverse problem of deducing the current field based on surface-wave data using machine-learning techniques. A deep neural network is designed for processing spatial-temporal surface wave data. Detailed analyses on the distributions of regression errors and the training dataset-dependency show that the proposed neural network can effectively deduce the current field.

preprint · 2022 · arXiv

Task-specific Pre-training and Prompt Decomposition for Knowledge Graph Population with Language Models

We present a system for knowledge graph population with Language Models, evaluated on the Knowledge Base Construction from Pre-trained Language Models (LM-KBC) challenge at ISWC 2022. Our system involves task-specific pre-training to improve LM representation of the masked object tokens, prompt decomposition for progressive generation of candidate objects, among other methods for higher-quality retrieval. Our system is the winner of track 1 of the LM-KBC challenge, based on BERT LM; it achieves 55.0% F-1 score on the hidden test set of the challenge.

preprint · 2022 · arXiv

Toward Systematic Considerations of Missingness in Visual Analytics

Data-driven decision making has been a common task in today's big data era, from simple choices such as finding a fast way to drive home, to complex decisions on medical treatment. It is often supported by visual analytics. For various reasons (e.g., system failure, interrupted network, intentional information hiding, or bias), visual analytics for sensemaking of data involves missingness (e.g., data loss and incomplete analysis), which impacts human decisions. For example, missing data can cost a business millions of dollars, and failing to recognize key evidence can put an innocent person in jail. Being aware of missingness is critical to avoid such catastrophes. To fulfill this, as an initial step, we consider missingness in visual analytics from two aspects: data-centric and human-centric. The former emphasizes missingness in three data-related categories: data composition, data relationship, and data usage. The latter focuses on the human-perceived missingness at three levels: observed-level, inferred-level, and ignored-level. Based on them, we discuss possible roles of visualizations for handling missingness, and conclude our discussion with future research opportunities.

preprint · 2021 · arXiv

Defect Extremal Surface for Reflected Entropy

Defect extremal surface is defined by extremizing the Ryu-Takayanagi formula corrected by the quantum defect theory. This is interesting when the AdS bulk contains a defect brane (or string). We introduce a defect extremal surface formula for reflected entropy, which is a mixed-state generalization of the entanglement entropy measure. Based on a decomposition procedure of an AdS bulk with a brane, we demonstrate the equivalence between the defect extremal surface formula and the island formula for reflected entropy in AdS$_3$/BCFT$_2$. We also compute the evolution of reflected entropy in an evaporating black hole model and find that the defect extremal surface formula agrees with the island formula.

preprint · 2021 · arXiv

Urban Epidemic Hazard Index for Chinese Cities: Why Did Small Cities Become Epidemic Hotspots?

Multiple small- to middle-scale cities, mostly located in northern China, became epidemic hotspots during the second wave of the spread of COVID-19 in early 2021. Despite qualitative discussions of potential social-economic causes, it remains unclear how this pattern could be accounted for from a quantitative approach. Through the development of an urban epidemic hazard index (EpiRank), we came up with a mathematical explanation for this phenomenon. The index is constructed from epidemic simulations on a multi-layer transportation network model on top of local SEIR transmission dynamics, which characterizes intra- and inter-city compartment population flow with a detailed mathematical description. Essentially, we argue that these highlighted cities possess greater epidemic hazards due to the combined effect of large regional population and small inter-city transportation. The proposed index, dynamic and applicable to different epidemic settings, could be a useful indicator for the risk assessment and response planning of urban epidemic hazards in China; the model framework is modularized and can be adapted for other nations without much difficulty.

preprint · 2020 · arXiv

Broadband Tunable Phase Shifter For Microwaves

We implement a broadly tunable phase shifter for microwaves based on superconducting quantum interference devices (SQUIDs) and study it both experimentally and theoretically. At different frequencies, a unit transmission coefficient, $|S_{21}|=1$, can be theoretically achieved along a curve where the phase shift is controllable by magnetic flux. The fabricated device consists of three equidistant SQUIDs interrupting a transmission line. We model each SQUID embedded at different positions along the transmission line with two parameters, capacitance and inductance, the values of which we extract from the experiments. In our experiments, the tunability of the phase shift varies from $0.07\times\pi$ to $0.14\times\pi$ radians along the full-transmission curve with the input frequency ranging from 6.00 to 6.28 GHz. The reported measurements are in good agreement with simulations, which is promising for future design work of phase shifters for different applications.

preprint · 2020 · arXiv

Detecting Problem Statements in Peer Assessments

Effective peer assessment requires students to be attentive to the deficiencies in the work they rate. Thus, their reviews should identify problems. But what ways are there to check that they do? We attempt to automate the process of deciding whether a review comment detects a problem. We use over 18,000 review comments that were labeled by the reviewees as either detecting or not detecting a problem with the work. We deploy several traditional machine-learning models, as well as neural-network models using GloVe and BERT embeddings. We find that the best performer is the Hierarchical Attention Network classifier, followed by the Bidirectional Gated Recurrent Units (GRU) Attention and Capsule model with scores of 93.1% and 90.5% respectively. The best non-neural network model was the support vector machine with a score of 89.71%. This is followed by the Stochastic Gradient Descent model and the Logistic Regression model with 89.70% and 88.98%.

preprint · 2020 · arXiv

Effective routing design for remote entanglement generation on quantum networks

Quantum networks are a promising platform for many ground-breaking applications that lie beyond the capability of their classical counterparts. Efficient entanglement generation on quantum networks with relatively limited resources such as quantum memories is essential to fully realize the network's capabilities; the solution calls for delicate network design and is currently at a primitive stage. In this study we propose an effective routing scheme to enable automatic responses for multiple requests of entanglement generation between source-terminal stations on a quantum lattice network with finite edge capacities. Multiple connection paths are exploited for each connection request while entanglement fidelity is ensured for each path by performing entanglement purification. The routing scheme is highly modularized with a flexible nature, embedding quantum operations within the algorithmic workflow, whose performance is evaluated from multiple perspectives. In particular, three algorithms are proposed and compared for the scheduling of capacity allocation on the edges of the quantum network. Embodying the ideas of proportional share and progressive filling that have been well-studied in classical routing problems, we design a new scheduling algorithm, the propagatory update method, which in certain aspects outperforms the two algorithms based on classical heuristics in scheduling performance. The general solution scheme paves the road for effective design of efficient routing and flow control protocols on practical quantum networks.
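
The two classical heuristics named in the abstract, proportional share and progressive filling, can be sketched for a single edge of finite capacity shared by several requests. This is a textbook-style illustration in plain Python, not the paper's propagatory update method; the function names, the single-edge setting, and the example demands are assumptions made for this sketch.

```python
from typing import List


def proportional_share(demands: List[float], capacity: float) -> List[float]:
    """Split capacity in proportion to demand, never exceeding any demand."""
    total = sum(demands)
    if total <= capacity:
        return list(demands)
    return [capacity * d / total for d in demands]


def progressive_filling(demands: List[float], capacity: float) -> List[float]:
    """Max-min fair split: raise all unsatisfied requests at the same rate."""
    alloc = [0.0] * len(demands)
    active = [i for i, d in enumerate(demands) if d > 0]
    remaining = capacity
    while active and remaining > 1e-12:
        share = remaining / len(active)
        # Grow by an equal share, but stop once the smallest unmet demand is filled.
        step = min(share, min(demands[i] - alloc[i] for i in active))
        for i in active:
            alloc[i] += step
        remaining -= step * len(active)
        # Requests that are fully served drop out of the next round.
        active = [i for i in active if demands[i] - alloc[i] > 1e-12]
    return alloc


# Hypothetical example: three requests competing for 9 units on one edge.
demands = [2.0, 4.0, 10.0]
prop = proportional_share(demands, 9.0)
fill = progressive_filling(demands, 9.0)
```

Proportional share scales every request by the same factor, whereas progressive filling fully serves the smallest request first and splits what is left evenly among the rest.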

preprint · 2020 · arXiv

Reflected Entropy for an Evaporating Black Hole

We study reflected entropy as a correlation measure in black hole evaporation. As a measure for bipartite mixed states, reflected entropy can be computed between black hole and radiation, radiation and radiation. We compute reflected entropy curves in three different models: 3-side wormhole model, End-of-the-World (EOW) brane model in three dimensions and two-dimensional eternal black hole plus CFT model. For 3-side wormhole model, we find that reflected entropy is dual to island cross sections. The reflected entropy between radiation and black hole increases at early time and then decreases to zero, similar to Page curve, but with a later transition time. The reflected entropy between radiation and radiation first increases and then saturates. For the EOW brane model, similar behaviors of reflected entropy are found. We propose a quantum extremal surface for reflected entropy, which we call quantum extremal cross section. In the eternal black hole plus CFT model, we find a generalized formula for reflected entropy with island cross section as its area term by considering the right half as the canonical purification of the left. Interestingly, the reflected entropy curve between the left black hole and the left radiation is nothing but the Page curve. We also find that reflected entropy between the left black hole and the right black hole decreases and goes to zero at late time. The reflected entropy between radiation and radiation increases at early time and saturates at late time.

preprint · 2020 · arXiv

Simulating the Spread of Epidemics in China on the Multi-layer Transportation Network: Beyond the Coronavirus in Wuhan

Based on the SEIR model and the modeling of urban transportation networks, a general-purpose simulator for the spread of epidemics in Chinese cities is built. The Chinese public transportation system between over 340 prefectural-level cities is modeled as a multi-layer bi-partite network, with layers representing different means of transportation (airlines, railways, sail routes and buses), and nodes divided into two categories (central cities, peripheral cities). At each city, an open-system SEIR model tracks the local spread of the disease, with population in- and out-flow exchanging with the overlying transportation network. The model accounts for (1) different transmissivities of the epidemic on different transportation media, (2) the transit of inbound flow at cities, (3) cross-infection on public transportation vehicles due to path overlap, and the realistic considerations that (4) the infected population are not entering public transportation and (5) the recovered population are not subject to repeated infections. The model could be used to simulate the city-level spread in China (and potentially other countries) of an arbitrary epidemic, characterized by its basic reproduction number, incubation period, infection period and zoonotic force, originated from any Chinese prefectural-level city(s), during the period before effective government interventions are implemented. Flowmaps are input into the system to trigger inter-city dynamics, assuming different flow strength, determined from empirical observation, within/between the bi-partite divisions of nodes. The model is used to simulate the 2019 Coronavirus epidemic in Wuhan; it shows that the framework is robust and reliable, and simulated results match public city-level datasets to an extraordinary extent.
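
For readers unfamiliar with the SEIR compartments, the sketch below integrates a closed, single-city SEIR model with forward-Euler steps. It deliberately omits everything specific to the simulator described above (transportation layers, population in- and out-flow, zoonotic force); the function name and all parameter values are illustrative assumptions, not figures from the paper.

```python
from typing import List, Tuple

State = Tuple[float, float, float, float]  # (S, E, I, R)


def simulate_seir(
    population: float,
    initial_infected: float,
    r0: float,
    incubation_days: float,
    infectious_days: float,
    days: int,
    dt: float = 0.1,
) -> List[State]:
    """Return one (S, E, I, R) snapshot per day, starting with day 0."""
    sigma = 1.0 / incubation_days  # rate of leaving the exposed compartment
    gamma = 1.0 / infectious_days  # recovery rate
    beta = r0 * gamma              # transmission rate implied by R0 = beta / gamma
    s, e, i, r = population - initial_infected, 0.0, initial_infected, 0.0
    history: List[State] = [(s, e, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical parameters, chosen only to exercise the code.
history = simulate_seir(
    population=1_000_000, initial_infected=10, r0=2.5,
    incubation_days=5, infectious_days=7, days=120,
)
```

Because each step only moves people between compartments, the four values in every snapshot sum to the total population up to floating-point error.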

preprint · 2020 · arXiv

Structural Control Analysis of System Dynamics Models

Structural control theory could be applied to study the control principles of social, economic and managerial systems. System Dynamics (SD) is the target field in social-economic sciences for endogenizing this theory, a subject that provides modeling solutions to real-world problems. SD models adopt diagrammatic representations, making it an ideal ground for transplanting structural control theory which utilizes similar graphic representations. This study sets up the theoretical ground for conducting structural control analysis (SCA) on SD models, summarized as a post-modeling workflow for SD practitioners, which serves as a specific application of the general structural control theory in social-economic sciences. Theoretical and practical establishments for SCA components are developed coordinately. Specifically, this study addresses the following questions: (1) How do SD models differ from physical control systems in graphic representations, and how do these differences affect the way of applying structural control theories to SD? (2) How could one identify control inputs in SD models, and how could different levels of system control in SD models be conceptualized? (3) What are the structural control properties for important SD components, and how could these properties and control principles help justify modeling heuristics in SD practice? (4) What are the procedures for conducting Structural Control Analysis (SCA) in SD models, and what are the implications of SCA results for model calibration and decision making? Overall, this study provides general insights for system control analysis of nonlinear dynamic simulation models, which may go beyond SD and extend to various disciplines in social-economic sciences.

preprint · 2019 · arXiv

Josephson penetration depth in coplanar junctions based on 2D materials

Josephson junctions and SQUIDs with graphene or other 2D materials as the weak link between superconductors have become a hot topic of research in recent years, with respect to both fundamental physics and potential applications. We have previously reported ultra-wide Josephson junctions (up to 80 μm wide) based on CVD graphene where the critical current was found to be uniformly distributed in the direction perpendicular to the current. In this paper, we demonstrate that the unusually large Josephson penetration depth λ_J that this corresponds to is enabled by the unique geometric structure of Josephson junctions based on 2D materials. We derive a new expression for the Josephson penetration depth of such junctions and verify our assumptions by numerical simulations.

preprint · 2019 · arXiv

Phase transitions and optimal algorithms for semi-supervised classifications on graphs: from belief propagation to graph convolution network

We perform theoretical and algorithmic studies for the problem of clustering and semi-supervised classification on graphs with both pairwise relational information and single-point feature information, upon a joint stochastic block model for generating synthetic graphs with both edges and node features. Asymptotically exact analysis based on the Bayesian inference of the underlying model is conducted, using the cavity method in statistical physics. Theoretically, we identify a phase transition of the generative model, which puts fundamental limits on the ability of all possible algorithms in the clustering task of the underlying model. Algorithmically, we propose a belief propagation algorithm that is asymptotically optimal on the generative model, and can be further extended to a belief propagation graph convolution neural network (BPGCN) for semi-supervised classification on graphs. For the first time, well-controlled benchmark datasets with asymptotically exact properties and optimal solutions could be produced for the evaluation of graph convolution neural networks, and for the theoretical understanding of their strengths and weaknesses. In particular, on these synthetic benchmark networks we observe that existing graph convolution neural networks are subject to a sparsity issue and an overfitting issue in practice, both of which are successfully overcome by our BPGCN. Moreover, when combined with classic neural network methods, BPGCN yields extraordinary classification performances on some real-world datasets that have never been achieved before.

preprint · 2019 · arXiv

Self-falsifiable Hierarchical Detection of Overlapping Communities On Social Networks

No community detection algorithm can be optimal for all possible networks, thus it is important to identify whether the algorithm is suitable for a given network. We propose a multi-step algorithmic solution scheme for overlapping community detection based on an advanced label propagation process, which imitates the community formation process on social networks. Our algorithm is parameter-free and is able to reveal the hierarchical order of communities in the graph. The unique property of our solution scheme is self-falsifiability; an automatic quality check of the results is conducted after the detection, and the fitness of the algorithm for the specific network is reported. Extensive experiments show that our algorithm is self-consistent, reliable on networks of a wide range of size and different sorts, and is more robust than existing algorithms on both sparse and large-scale social networks. Results further suggest that our solution scheme may uncover features of networks' intrinsic community structures.
