Source author record

Tengfei Ma

Tengfei Ma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning, Artificial Intelligence, Computation and Language, Cryptography and Security, Computer Vision, eess.IV, Human-Computer Interaction, math.AP, math.DG, Numerical Analysis, Other Quantitative Biology, physics.optics, quant-ph, Quantitative Methods, Social and Information Networks

Catalog footprint

What is connected

16 works

15 topics

4 close collaborators

preprint · 2026 · arXiv

DeRelayL: Sustainable Decentralized Relay Learning

In the era of big data, large-scale machine learning models have revolutionized various fields, driving significant advancements. However, large-scale model training demands substantial financial and computational resources, which only a few technology giants and well-funded institutions can afford. As a result, common users such as mobile users, the real creators of valuable data, are often excluded from fully benefiting because of these barriers, while current methods for accessing large-scale models either limit user ownership or lack sustainability. This growing gap highlights the urgent need for a collaborative model training approach that allows common users to train and share models. However, existing collaborative model training paradigms, especially federated learning (FL), primarily focus on data privacy and group-based model aggregation. To address this issue, this paper proposes a novel training paradigm named decentralized relay learning (DeRelayL), a sustainable learning system in which permissionless participants can contribute to model training in a relay-like manner and share the model. In detail, this paper presents the architecture and workflow of DeRelayL, designs incentive mechanisms to ensure sustainability, and conducts theoretical analysis and numerical simulations to demonstrate its effectiveness.

preprint · 2026 · arXiv

GraphPL: Leveraging GNN for Efficient and Robust Modalities Imputation in Patchwork Learning

Current research on distributed multi-modal learning typically assumes that clients can access complete information across all modalities, which may not hold in practice. In this paper, we explore patchwork learning, in which the modalities available to different clients vary, and the objective is to impute the missing modalities for each client in an unsupervised manner. Existing methods are shown not to fully utilize the modality information, as they tend to rely on only a subset of the observed modalities. To address this issue, we propose GraphPL, which combines graph neural networks with patchwork learning to flexibly integrate all observed modalities while remaining robust to noisy inputs. Experimental results show that GraphPL achieves state-of-the-art performance on benchmark datasets. Our results on a real-world distributed electronic health record dataset show that GraphPL learns strong downstream features and enables tasks such as disease prediction via superior modality imputation.

preprint · 2026 · arXiv

New Calabi-Yau Metrics of Taub-NUT Type on C^{N+1}

We construct a class of complete non-flat Calabi-Yau metrics on C^{N+1} for every N >= 3, which generalize the Taub-NUT metrics from C^2 and C^3 and whose tangent cone at infinity is R^N. The construction relies on the generalized Gibbons-Hawking ansatz. A key obstacle is that the volume-form defect of the ansatz fails to decay near certain components of the discriminant locus, producing singularities more severe than those encountered in dimension three; we resolve this by a gluing procedure.

preprint · 2026 · arXiv

SpecX: A Large-Scale Benchmark for Multi-Modal Spectroscopy and Cross-Paradigm Evaluation

Existing spectral benchmarks are limited in scale, modality alignment, and evaluation scope, and typically focus on either specialized models or multimodal language models (MLLMs). We introduce SpecX, a large-scale benchmark for multi-modal spectroscopy with cross-paradigm evaluation. SpecX contains 1.7M molecules with diverse spectral modalities, including NMR (1H, 13C, HSQC), IR, MS, UV, Raman, and FL, and is organized into three tiers: a large-scale dataset for pretraining, an aligned multi-spectral subset for benchmarking, and a high-quality experimental subset for evaluation. SpecX supports a range of tasks such as molecular elucidation, spectrum simulation, and spectral understanding, and enables unified evaluation across both specialized spectral models and MLLMs. Experiments show that specialized models excel at signal-level modeling, while MLLMs exhibit strengths in high-level reasoning but lack precise spectral grounding. SpecX establishes a unified benchmark for spectral intelligence and highlights the need for spectrum-native foundation models.

preprint · 2022 · arXiv

A Study of the Attention Abnormality in Trojaned BERTs

Trojan attacks raise serious security concerns. In this paper, we investigate the underlying mechanism of Trojaned BERT models. We observe the attention focus drifting behavior of Trojaned models, i.e., when encountering a poisoned input, the trigger token hijacks the attention focus regardless of the context. We provide a thorough qualitative and quantitative analysis of this phenomenon, revealing insights into the Trojan mechanism. Based on this observation, we propose an attention-based Trojan detector to distinguish Trojaned models from clean ones. To the best of our knowledge, this is the first paper to analyze the Trojan mechanism and to develop a Trojan detector based on the transformer's attention.

preprint · 2022 · arXiv

Attention Hijacking in Trojan Transformers

Trojan attacks pose a severe threat to AI systems. Transformer models have recently gained explosive popularity, and the importance of self-attention is now indisputable. This raises a central question: Can we reveal the Trojans through attention mechanisms in BERTs and ViTs? In this paper, we investigate the attention hijacking pattern in Trojan AIs, i.e., the trigger token "kidnaps" the attention weights when a specific trigger is present. We observe a consistent attention hijacking pattern in Trojan Transformers from both the Natural Language Processing (NLP) and Computer Vision (CV) domains. This intriguing property helps us to understand the Trojan mechanism in BERTs and ViTs. We also propose an Attention-Hijacking Trojan Detector (AHTD) to discriminate Trojan AIs from clean ones.

preprint · 2022 · arXiv

Cycle Representation Learning for Inductive Relation Prediction

In recent years, algebraic topology and its modern development, the theory of persistent homology, have shown great potential in graph representation learning. In this paper, based on the mathematics of algebraic topology, we propose a novel solution for inductive relation prediction, an important learning task for knowledge graph completion. To predict the relation between two entities, one can use the existence of rules, namely sequences of relations. Previous works view rules as paths and primarily focus on searching for paths between entities. The space of rules is huge, and one has to sacrifice either efficiency or accuracy. In this paper, we consider rules as cycles and show that the space of cycles has a unique structure based on the mathematics of algebraic topology. By exploring the linear structure of the cycle space, we can improve the efficiency of rule search. We propose to collect cycle bases that span the space of cycles. We build a novel GNN framework on the collected cycles to learn the representations of cycles and to predict the existence or non-existence of a relation. Our method achieves state-of-the-art performance on benchmarks.

preprint · 2022 · arXiv

GNNLens: A Visual Analytics Approach for Prediction Error Diagnosis of Graph Neural Networks

Graph Neural Networks (GNNs) aim to extend deep learning techniques to graph data and have achieved significant progress in graph analysis tasks (e.g., node classification) in recent years. However, similar to other deep neural networks like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), GNNs behave like a black box with their details hidden from model developers and users. It is therefore difficult to diagnose possible errors of GNNs. Despite many visual analytics studies being done on CNNs and RNNs, little research has addressed the challenges for GNNs. This paper fills the research gap with an interactive visual analysis tool, GNNLens, to assist model developers and users in understanding and analyzing GNNs. Specifically, Parallel Sets View and Projection View enable users to quickly identify and validate error patterns in the set of wrong predictions; Graph View and Feature Matrix View offer a detailed analysis of individual nodes to assist users in forming hypotheses about the error patterns. Since GNNs jointly model the graph structure and the node features, we reveal the relative influences of the two types of information by comparing the predictions of three models: GNN, Multi-Layer Perceptron (MLP), and GNN Without Using Features (GNNWUF). Two case studies and interviews with domain experts demonstrate the effectiveness of GNNLens in facilitating the understanding of GNN models and their errors.

preprint · 2022 · arXiv

Improving Long Tailed Document-Level Relation Extraction via Easy Relation Augmentation and Contrastive Learning

Toward real-world information extraction scenarios, relation extraction research is advancing to document-level relation extraction (DocRE). Existing approaches for DocRE aim to extract relations by encoding various information sources in the long context with novel model architectures. However, the inherent long-tailed distribution problem of DocRE is overlooked by prior work. We argue that mitigating the long-tailed distribution problem is crucial for DocRE in real-world scenarios. Motivated by the long-tailed distribution problem, we propose an Easy Relation Augmentation (ERA) method for improving DocRE by enhancing the performance on tailed relations. In addition, we propose a novel contrastive learning framework based on our ERA, i.e., ERACL, which can further improve the model performance on tailed relations and achieve competitive overall DocRE performance compared to the state of the art.

preprint · 2022 · arXiv

Wasserstein Graph Neural Networks for Graphs with Missing Attributes

Missing node attributes are a common problem in real-world graphs. Graph neural networks have demonstrated their power in graph representation learning, but their performance is affected by the completeness of graph information. Most of them are not designed for missing-attribute graphs and fail to leverage incomplete attribute information effectively. In this paper, we propose an innovative node representation learning framework, Wasserstein Graph Neural Network (WGNN), to mitigate the problem. To make the most of limited observed attribute information and capture the uncertainty caused by missing values, we express nodes as low-dimensional distributions derived from the decomposition of the attribute matrix. Furthermore, we strengthen the expressiveness of representations by developing a novel message passing schema that aggregates distributional information from neighbors in the Wasserstein space. We test WGNN in node classification tasks under two missing-attribute cases on both synthetic and real-world datasets. In addition, we find WGNN suitable for recovering missing values and adapt it to tackle matrix completion problems with graphs of users and items. Experimental results on both tasks demonstrate the superiority of our method.

preprint · 2022 · arXiv

When Does A Spectral Graph Neural Network Fail in Node Classification?

Spectral Graph Neural Networks (GNNs) with various graph filters have received wide recognition due to their promising performance in graph learning problems. However, it is known that GNNs do not always perform well. Although graph filters provide theoretical foundations for model explanations, it is unclear when a spectral GNN will fail. In this paper, focusing on node classification problems, we conduct a theoretical analysis of spectral GNNs' performance by investigating their prediction error. With the aid of graph indicators, including homophily degree and the response efficiency that we propose, we establish a comprehensive understanding of the complex relationships between graph structure, node labels, and graph filters. We show that graph filters with low response efficiency on label difference are prone to fail. To enhance GNNs' performance, we derive from our theoretical analysis a provably better strategy for filter design, namely using data-driven filter banks, and propose simple models for empirical validation. Experimental results are consistent with our theoretical results and support our strategy.

preprint · 2020 · arXiv

CHEER: Rich Model Helps Poor Model via Knowledge Infusion

There is a growing interest in applying deep learning (DL) to healthcare, driven by the availability of data with multiple feature channels in rich-data environments (e.g., intensive care units). However, in many other practical situations, we can only access data with far fewer feature channels in poor-data environments (e.g., at home), which often results in predictive models with poor performance. How can we boost the performance of models learned from such poor-data environments by leveraging knowledge extracted from existing models trained using rich data in a related environment? To address this question, we develop a knowledge infusion framework named CHEER that can succinctly summarize such a rich model into transferable representations, which can be incorporated into the poor model to improve its performance. The infused model is analyzed theoretically and evaluated empirically on several datasets. Our empirical results show that CHEER outperformed baselines by 5.60% to 46.80% in terms of the macro-F1 score on multiple physiological datasets.

preprint · 2020 · arXiv

Long-lived and multiplexed atom-photon entanglement interface with feed-forward-controlled readouts

The quantum interface (QI) that generates entanglement between photonic and spin-wave (atomic memory) qubits is a basic building block for quantum repeaters. Realizing ensemble-based repeaters in practice requires quantum memory providing long lifetime and multimode capacity. Significant progress has been achieved on these separate goals. The remaining challenge is to combine long-lived and multimode memories into a single QI. Here, by establishing multimode, magnetic-field-insensitive and long-wavelength spin-wave storage in laser-cooled atoms that are placed inside a phase-passively-stabilized polarization interferometer, we constructed a multiplexed QI that stores up to three long-lived spin-wave qubits. Using a feed-forward-controlled system, we demonstrated that the multiplexed QI gives rise to a 3-fold increase in the atom-photon (photon-photon) entanglement-generation probability compared to single-mode QIs. The measured Bell parameter is 2.5 ± 0.1, combined with a memory lifetime of up to 1 ms. The presented work represents a key step forward in realizing fiber-based long-distance quantum communications.

preprint · 2020 · arXiv

Repurpose Open Data to Discover Therapeutics for COVID-19 using Deep Learning

There have been more than 850,000 confirmed cases and over 48,000 deaths from the human coronavirus disease 2019 (COVID-19) pandemic, caused by novel severe acute respiratory syndrome coronavirus (SARS-CoV-2), in the United States alone. However, there are currently no proven effective medications against COVID-19. Drug repurposing offers a promising way to develop prevention and treatment strategies for COVID-19. This study reports an integrative, network-based deep learning methodology to identify repurposable drugs for COVID-19 (termed CoV-KGE). Specifically, we built a comprehensive knowledge graph that includes 15 million edges across 39 types of relationships connecting drugs, diseases, genes, pathways, and expressions, from a large scientific corpus of 24 million PubMed publications. Using Amazon AWS computing resources, we identified 41 repurposable drugs (including indomethacin, toremifene and niclosamide) whose therapeutic associations with COVID-19 were validated by transcriptomic and proteomic data in SARS-CoV-2 infected human cells and data from ongoing clinical trials. While this study by no means recommends specific drugs, it demonstrates a powerful deep learning methodology to prioritize existing drugs for further investigation, which holds the potential of accelerating therapeutic development for COVID-19.

preprint · 2016 · arXiv

Accelerated Kaczmarz Algorithms using History Information

The Kaczmarz algorithm is a well-known iterative method for solving overdetermined linear systems. Its randomized version yields provably exponential convergence in expectation. In this paper, we propose two new methods to speed up the randomized Kaczmarz algorithm by utilizing the past estimates in the iterations. The first one utilizes the past estimates to get a preconditioner. The second one combines the stochastic average gradient (SAG) method with the randomized Kaczmarz algorithm. It takes advantage of past gradients to improve the convergence speed. Numerical experiments indicate that the new algorithms can dramatically outperform the standard randomized Kaczmarz algorithm.
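
For orientation, the baseline that this abstract refers to, the standard randomized Kaczmarz iteration, can be sketched as below. This is a minimal sketch of the textbook method only and does not implement the paper's two accelerated variants. The function name `randomized_kaczmarz`, the small sample system, and the iteration count are assumptions chosen for illustration.

```python
import random


def randomized_kaczmarz(A, b, iterations=2000, seed=0):
    """Textbook randomized Kaczmarz for a consistent system A x = b.

    A is a list of rows (lists of floats) and b is a list of floats.
    Each step samples one row with probability proportional to its
    squared norm and projects the estimate onto that row's hyperplane.
    """
    rng = random.Random(seed)
    n = len(A[0])
    row_norms_sq = [sum(v * v for v in row) for row in A]
    indices = range(len(A))
    x = [0.0] * n
    for _ in range(iterations):
        # Rows with larger norm are picked more often.
        i = rng.choices(indices, weights=row_norms_sq, k=1)[0]
        row = A[i]
        residual = b[i] - sum(r * xj for r, xj in zip(row, x))
        step = residual / row_norms_sq[i]
        # Orthogonal projection onto {x : row . x = b[i]}.
        x = [xj + step * r for xj, r in zip(x, row)]
    return x


# Illustrative overdetermined but consistent system (assumed data).
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, 2.0]
b = [sum(r * t for r, t in zip(row, x_true)) for row in A]
x_est = randomized_kaczmarz(A, b)
```

Because the sample system is consistent, the estimate `x_est` approaches `x_true`; for noisy or inconsistent systems the plain iteration only reaches a neighborhood of the least-squares solution.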

preprint · 2016 · arXiv

Learning Crosslingual Word Embeddings without Bilingual Corpora

Crosslingual word embeddings represent lexical items from different languages in the same vector space, enabling transfer of NLP tools. However, previous attempts had expensive resource requirements, had difficulty incorporating monolingual data, or were unable to handle polysemy. We address these drawbacks in our method, which takes advantage of a high-coverage dictionary in an EM-style training algorithm over monolingual corpora in two languages. Our model achieves state-of-the-art performance on the bilingual lexicon induction task, exceeding models that use large bilingual corpora, and competitive results on the monolingual word similarity and cross-lingual document classification tasks.
