Source author record

Tong Wang

Tong Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

44 works

39 topics

4 close collaborators

preprint, 2026, arXiv

Forecasting Medium-Horizon Alzheimer's Disease Progression: Residual Gap-Aware Transformers for 24-Month CDR-SB Change from ADNI Clinical and Biomarker Histories

Medium-horizon Alzheimer's disease progression prediction is difficult because future clinical scores can remain tied to baseline severity, while biomarker histories are irregular and incompletely observed. We develop an anchor-based analysis of 24-month Clinical Dementia Rating Sum of Boxes (CDR-SB) change using harmonized Alzheimer's Disease Neuroimaging Initiative (ADNI) tables. Each labeled sample is anchored at a mild cognitive impairment visit, uses only clinical and biomarker history observed at or before that anchor, and defines the response as CDR-SB at the future visit closest to 24 months within an 18--30 month window minus anchor CDR-SB. The analytic cohort contains 2,600 labeled anchors from 858 participants and 7,276 longitudinal rows. We propose a residual gap-aware transformer that combines a mixed-effects statistical reference with transformer-based residual learning from pre-anchor clinical and biomarker histories. The model uses participant-level random intercepts in the mixed-effects reference, observation-level triplet tokenization for irregular histories, and a learned nonnegative time-gap penalty inside self-attention. We compare the proposed model with a Bayesian-information-criterion-selected linear mixed-effects baseline, GRU-D, and STraTS under repeated participant-level train--test splits. Across five participant-level random seeds, the proposed model achieves the best mean test performance across all reported metrics, reducing MSE by 13.1% and increasing prediction--observation correlation by 26.4% relative to the mixed-effects baseline. It also improves over both GRU-D and STraTS in mean error and correlation. These results show that statistical anchoring and gap-aware residual learning provide a useful structure for medium-horizon Alzheimer's disease progression prediction.
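The response definition in this abstract (CDR-SB at the future visit closest to 24 months, restricted to an 18 to 30 month window, minus anchor CDR-SB) can be illustrated with a minimal sketch. The function name, the (month, score) tuple layout, the inclusive window bounds, and the tie-breaking rule are all assumptions made for illustration; they are not taken from the paper or from the ADNI tables.

```python
from typing import Optional, Sequence, Tuple


def cdrsb_change_label(
    anchor_month: float,
    anchor_cdrsb: float,
    future_visits: Sequence[Tuple[float, float]],
    target: float = 24.0,
    window: Tuple[float, float] = (18.0, 30.0),
) -> Optional[float]:
    """Hypothetical helper: CDR-SB change from the anchor visit.

    future_visits holds (month, cdrsb) pairs on the same time axis as
    anchor_month. Returns None when no visit lies inside the window.
    Assumptions: window bounds are inclusive, and if two visits are
    equally close to the target, the one listed first is used.
    """
    lo, hi = window
    candidates = [
        (abs((month - anchor_month) - target), score)
        for month, score in future_visits
        if lo <= month - anchor_month <= hi
    ]
    if not candidates:
        return None
    # Pick the visit whose gap from the anchor is nearest to the target.
    _, score = min(candidates, key=lambda c: c[0])
    return score - anchor_cdrsb


# Anchor at month 0 with CDR-SB 1.5; visits at 20 and 25 months are in
# the window, and 25 is nearer to 24, so the label is 3.5 - 1.5.
visits = [(12.0, 2.0), (20.0, 2.5), (25.0, 3.5), (36.0, 5.0)]
label = cdrsb_change_label(0.0, 1.5, visits)
no_label = cdrsb_change_label(0.0, 1.5, [(12.0, 2.0), (36.0, 5.0)])
```

Anchors with no visit inside the window get no label here, which is one plausible reading of how an anchor becomes "labeled"; the paper may handle this differently.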

preprint, 2026, arXiv

MiVE: Multiscale Vision-language features for reference-guided video Editing

Reference-guided video editing takes a source video, a text instruction, and a reference image as inputs, requiring the model to faithfully apply the instructed edits while preserving original motion and unedited content. Existing methods fall into two paradigms, each with inherent limitations: decoupled encoders suffer from modality gaps when processing instructions and visual content independently, while unified vision-language encoders lose fine-grained spatial details by relying solely on final-layer representations. We observe that VLM layers encode complementary information hierarchically -- early layers capture localized spatial details essential for precise editing, while deeper layers encode global semantics for instruction comprehension. Building on this insight, we present MiVE (Multiscale Vision-language features for reference-guided video Editing), a framework that repurposes VLMs as multiscale feature extractors. MiVE extracts hierarchical features from Qwen3-VL and integrates them into a unified self-attention Diffusion Transformer, eliminating the modality mismatch inherent in cross-attention designs. Experiments demonstrate that MiVE achieves state-of-the-art performance by ranking highest in human preference, outperforming both academic methods and commercial systems.

preprint, 2026, arXiv

Self-Prompting Diffusion Transformer for Open-Vocabulary Scene Text Editing via In-Context Learning

Scene text editing aims to modify text in a target region of an image while preserving surrounding background style and texture. Existing methods rely solely on image background information while neglecting the visual details of target regions, which discards stylistic features in the original text and essentially degrades the task to text rendering. Moreover, the conditions imposed by a pre-trained glyph encoder limit the scope of editable text. To address these issues, this paper proposes a self-prompting scene text editing method that constructs style and glyph prompts directly from the original image, without introducing additional style or glyph encoders. We employ a two-stage training strategy: the diffusion transformer is first trained on large-scale self-supervised data and then refined using a small set of paired images. By leveraging the in-context learning capability of the Multi-Modal Diffusion Transformer (MM-DiT), the method achieves open-vocabulary and style-consistent text editing. Experimental results on various languages demonstrate that our method achieves state-of-the-art performance in both text accuracy and style consistency. Project page: https://hongxiii.github.io/mstedit

preprint, 2026, arXiv

SpecX: A Large-Scale Benchmark for Multi-Modal Spectroscopy and Cross-Paradigm Evaluation

Existing spectral benchmarks are limited in scale, modality alignment, and evaluation scope, and typically focus on either specialized models or multimodal language models (MLLMs). We introduce SpecX, a large-scale benchmark for multi-modal spectroscopy with cross-paradigm evaluation. SpecX contains 1.7M molecules with diverse spectral modalities, including NMR (1H, 13C, HSQC), IR, MS, UV, Raman, and FL, and is organized into three tiers: a large-scale dataset for pretraining, an aligned multi-spectral subset for benchmarking, and a high-quality experimental subset for evaluation. SpecX supports a range of tasks such as molecular elucidation, spectrum simulation, and spectral understanding, and enables unified evaluation across both specialized spectral models and MLLMs. Experiments show that specialized models excel at signal-level modeling, while MLLMs exhibit strengths in high-level reasoning but lack precise spectral grounding. SpecX establishes a unified benchmark for spectral intelligence and highlights the need for spectrum-native foundation models.

preprint, 2026, arXiv

UHR-Micro: Diagnosing and Mitigating the Resolution Illusion in Earth Observation VLMs

Vision-Language Models (VLMs) increasingly operate on ultra-high-resolution (UHR) Earth observation imagery, yet they remain vulnerable to a severe scale mismatch between large-scale scene context and micro-scale targets. We refer to this empirical gap as a "resolution illusion": higher input resolution provides the appearance of richer visual detail, but does not necessarily yield reliable perception of spatially small, task-relevant evidence. To benchmark this challenge, we introduce UHR-Micro, a benchmark comprising 11,253 instructions grounded in 1,212 UHR images, designed to evaluate VLMs at the spatial limits of native Earth observation imagery. UHR-Micro spans diverse micro-target scales, context requirements, task families, and visual conditions, and provides diagnostic annotations that support controlled evaluation and fine-grained error attribution. Experiments with representative high-resolution VLMs show substantial failures in spatial grounding and evidence parsing, despite access to high-resolution inputs. Further analysis suggests that these failures are not fully resolved by increasing model capacity, but are closely tied to insufficient guidance in locating and using task-relevant micro-evidence. Motivated by this finding, we propose Micro-evidence Active Perception (MAP), a reference agent that decomposes queries into evidence-seeking steps, actively inspects candidate regions, and grounds its answers in localized observations. MAP-Agent improves micro-level perception by making high-resolution reasoning evidence-centered rather than image-centered. Together, UHR-Micro and MAP-Agent provide a diagnostic platform for evaluating, understanding, and advancing high-resolution reasoning in Earth observation VLMs. Datasets and source code were released at https://github.com/MiliLab/UHR-Micro.

preprint, 2022, arXiv

A SOM-based Gradient-Free Deep Learning Method with Convergence Analysis

Because gradient descent in deep learning raises a series of problems, this paper proposes a novel gradient-free deep learning structure. By adding a new module to the traditional Self-Organizing Map and introducing residuals into the map, a Deep Valued Self-Organizing Map network is constructed. The paper also proves a convergence result for this network, giving an inequality that relates the designed parameters to the dimension of the inputs and the prediction loss.

preprint, 2022, arXiv

Better Language Model with Hypernym Class Prediction

Class-based language models (LMs) have been long devised to address context sparsity in $n$-gram LMs. In this study, we revisit this approach in the context of neural LMs. We hypothesize that class-based prediction leads to an implicit context aggregation for similar words and thus can improve generalization for rare words. We map words that have a common WordNet hypernym to the same class and train large neural LMs by gradually annealing from predicting the class to token prediction during training. Empirically, this curriculum learning strategy consistently improves perplexity over various large, highly-performant state-of-the-art Transformer-based models on two datasets, WikiText-103 and Arxiv. Our analysis shows that the performance improvement is achieved without sacrificing performance on rare words. Finally, we document other attempts that failed to yield empirical gains, and discuss future directions for the adoption of class-based LMs on a larger scale.

preprint, 2022, arXiv

Confidence Matters: Inspecting Backdoors in Deep Neural Networks via Distribution Transfer

Backdoor attacks have been shown to be a serious security threat against deep learning models, and detecting whether a given model has been backdoored becomes a crucial task. Existing defenses are mainly built upon the observation that the backdoor trigger is usually of small size or affects the activation of only a few neurons. However, the above observations are violated in many cases especially for advanced backdoor attacks, hindering the performance and applicability of the existing defenses. In this paper, we propose a backdoor defense DTInspector built upon a new observation. That is, an effective backdoor attack usually requires high prediction confidence on the poisoned training samples, so as to ensure that the trained model exhibits the targeted behavior with a high probability. Based on this observation, DTInspector first learns a patch that could change the predictions of most high-confidence data, and then decides the existence of backdoor by checking the ratio of prediction changes after applying the learned patch on the low-confidence data. Extensive evaluations on five backdoor attacks, four datasets, and three advanced attacking types demonstrate the effectiveness of the proposed defense.

preprint, 2022, arXiv

Direct Molecular Conformation Generation

Molecular conformation generation aims to generate three-dimensional coordinates of all the atoms in a molecule and is an important task in bioinformatics and pharmacology. Previous methods usually first predict the interatomic distances, the gradients of interatomic distances or the local structures (e.g., torsion angles) of a molecule, and then reconstruct its 3D conformation. How to directly generate the conformation without the above intermediate values is not fully explored. In this work, we propose a method that directly predicts the coordinates of atoms: (1) the loss function is invariant to roto-translation of coordinates and permutation of symmetric atoms; (2) the newly proposed model adaptively aggregates the bond and atom information and iteratively refines the coordinates of the generated conformation. Our method achieves the best results on GEOM-QM9 and GEOM-Drugs datasets. Further analysis shows that our generated conformations have properties (e.g., HOMO-LUMO gap) closer to those of the ground-truth conformations. In addition, our method improves molecular docking by providing better initial conformations. All the results demonstrate the effectiveness of our method and the great potential of the direct approach. The code is released at https://github.com/DirectMolecularConfGen/DMCG

preprint, 2022, arXiv

Does Order Matter? An Empirical Study on Generating Multiple Keyphrases as a Sequence

Recently, concatenating multiple keyphrases as a target sequence has been proposed as a new learning paradigm for keyphrase generation. Existing studies concatenate target keyphrases in different orders but no study has examined the effects of ordering on models' behavior. In this paper, we propose several orderings for concatenation and inspect the important factors for training a successful keyphrase generation model. By running comprehensive comparisons, we observe one preferable ordering and summarize a number of empirical findings and challenges, which can shed light on future research on this line of work.

preprint, 2022, arXiv

Efficient Estimation of the Additive Risks Model for Interval-Censored Data

In contrast to the popular Cox model which presents a multiplicative covariate effect specification on the time to event hazards, the semiparametric additive risks model (ARM) offers an attractive additive specification, allowing for direct assessment of the changes or the differences in the hazard function for changing value of the covariates. The ARM is a flexible model, allowing the estimation of both time-independent and time-varying covariates. It has a nonparametric component and a regression component identified by a finite-dimensional parameter. This chapter presents an efficient approach for maximum-likelihood (ML) estimation of the nonparametric and the finite-dimensional components of the model via the minorize-maximize (MM) algorithm for case-II interval-censored data. The operating characteristics of our proposed MM approach are assessed via simulation studies, with illustration on a breast cancer dataset via the R package MMIntAdd. It is expected that the proposed computational approach will not only provide scalability to the ML estimation scenario but may also simplify the computational burden of other complex likelihoods or models.

preprint, 2022, arXiv

Functional universality in slow-growing microbial communities arises from thermodynamic constraints

The dynamics of microbial communities is incredibly complex, determined by competition for metabolic substrates and cross-feeding of byproducts. Species in the community grow by harvesting energy from chemical reactions that transform substrates to products. In many anoxic environments, these reactions are close to thermodynamic equilibrium and growth is slow. To understand the community structure in these energy-limited environments, we developed a microbial community consumer-resource model incorporating energetic and thermodynamic constraints on an interconnected metabolic network. The central ingredient of the model is product inhibition, meaning that microbial growth may be limited not only by depletion of metabolic substrates but also by accumulation of products. We demonstrate that these additional constraints on microbial growth cause a convergence in the structure and function of the community metabolic network -- independent of species composition and biochemical details -- providing a possible explanation for convergence of community function despite taxonomic variation observed in many natural and industrial environments. Furthermore, we discovered that the structure of community metabolic network is governed by the thermodynamic principle of maximum heat dissipation. Overall, the work demonstrates how universal thermodynamic principles may constrain community metabolism and explain observed functional convergence in microbial communities.

preprint, 2022, arXiv

Multi-View Substructure Learning for Drug-Drug Interaction Prediction

Drug-drug interaction (DDI) prediction provides a drug combination strategy for systemically effective treatment. Previous studies usually model drug information constrained to a single view such as the drug itself, leading to incomplete and noisy information, which limits the accuracy of DDI prediction. In this work, we propose a novel multi-view drug substructure network for DDI prediction (MSN-DDI), which learns chemical substructures from both the representations of the single drug (intra-view) and the drug pair (inter-view) simultaneously and utilizes the substructures to update the drug representation iteratively. Comprehensive evaluations demonstrate that MSN-DDI has almost solved DDI prediction for existing drugs by achieving a relatively improved accuracy of 19.32% and an over 99% accuracy under the transductive setting. More importantly, MSN-DDI exhibits better generalization ability to unseen drugs with a relatively improved accuracy of 7.07% under more challenging inductive scenarios. Finally, MSN-DDI improves prediction performance for real-world DDI applications to new drugs.

preprint, 2022, arXiv

Research on Creative Thinking Mode Based on Category Theory

Research on the brain mechanism of creativity has two main aspects: one is the creative thinking process, and the other is the brain structure and functional connection characteristics of highly creative people. The billions of nerve cells in the brain connect and interact with each other. The human brain has a high degree of complexity at the biological level, especially in its capacity for rational thinking. Starting from the connection of molecules, cells, neural networks and the neural function structure of the brain, it may be fundamentally impossible to study the rational thinking mode of human beings. Human rational thinking has a high degree of freedom and transcendence, and such problems cannot be expected to be studied by elaborating the realization of the nervous system. The rational thinking of the brain is mainly based on the structured thinking mode, and the structured thinking mode shows great scientific power. This paper studies a theoretical model of innovative thinking based on category theory, analyzes the creation process of two scientific theories that are landmarks in the history of science, and provides an intuitive, clear interpretation model and a rigorous mathematical argument for creative thinking. The structured way of thinking offers considerable insight and help in creating new scientific theories.

preprint, 2022, arXiv

Tailoring Molecules for Protein Pockets: a Transformer-based Generative Solution for Structured-based Drug Design

Structure-based drug design is drawing growing attention in computer-aided drug discovery. Compared with the virtual screening approach, where a pre-defined library of compounds is computationally screened, de novo drug design based on the structure of a target protein can provide novel drug candidates. In this paper, we present a generative solution named TamGent (Target-aware molecule generator with Transformer) that can directly generate candidate drugs from scratch for a given target, overcoming the limits imposed by existing compound libraries. Following the Transformer framework (a state-of-the-art framework in deep learning), we design a variant of the Transformer encoder to process 3D geometric information of targets and pre-train the Transformer decoder on 10 million compounds from PubChem for candidate drug generation. Systematic evaluation of candidate compounds generated for targets from DrugBank shows that both binding affinity and druggability are largely improved. TamGent outperforms previous baselines in terms of both effectiveness and efficiency. The method is further verified by generating candidate compounds for the SARS-CoV-2 main protease and the oncogenic mutant KRAS G12C. The results show that our method not only re-discovers previously verified drug molecules, but also generates novel molecules with better docking scores, expanding the compound pool and potentially leading to the discovery of novel drugs.

preprint, 2022, arXiv

Task-adaptive Asymmetric Deep Cross-modal Hashing

Supervised cross-modal hashing aims to embed the semantic correlations of heterogeneous modality data into binary hash codes with discriminative semantic labels. Because of its advantages in retrieval and storage efficiency, it is widely used for efficient cross-modal retrieval. However, existing studies treat the different cross-modal retrieval tasks equally and simply learn the same pair of hash functions for them in a symmetric way. Under such circumstances, the uniqueness of the different cross-modal retrieval tasks is ignored, which may lead to sub-optimal performance. Motivated by this, we present a Task-adaptive Asymmetric Deep Cross-modal Hashing (TA-ADCMH) method in this paper. It can learn task-adaptive hash functions for two sub-retrieval tasks via simultaneous modality representation and asymmetric hash learning. Unlike previous cross-modal hashing approaches, our learning framework jointly optimizes semantic preserving, which transforms deep features of multimedia data into binary hash codes, and semantic regression, which directly regresses the query modality representation to an explicit label. With our model, the binary codes can effectively preserve semantic correlations across different modalities while adaptively capturing the query semantics. The superiority of TA-ADCMH is demonstrated on two standard datasets from several perspectives.

preprint, 2022, arXiv

Towards Exploring the Code Reuse from Stack Overflow during Software Development

As one of the most well-known programmer Q&A websites, Stack Overflow (i.e., SO) serves tens of thousands of developers every day. Previous work has shown that many developers reuse the code snippets on SO when they find an answer (from SO) that functionally matches the programming problem they encounter in their development activities. To study how programmers reuse code on SO during project development, we conduct a comprehensive empirical study. First, to capture the development activities of programmers, we collect 342,148 modified code snippets in commits from 793 open-source Java projects; this modified code can reflect the programming problems encountered during development. We also collect the code snippets from 1,355,617 posts on SO. Then, we employ CCFinder to detect code clones between the modified code from commits and the code from SO, and further analyze the code reuse when a programmer solves a programming problem during development. We count the code reuse ratios of the modified code snippets in the commits of each project in different years; the results show that the average code reuse ratio is 6.32%, and the maximum is 8.38%. The code reuse ratio in project commits has increased year by year, and the proportion of code reuse in newly established projects is higher than that in old projects. We also find that some projects reuse code snippets from many years ago. Additionally, we find that experienced developers seem to be more likely to reuse the knowledge on SO. Moreover, we find that the code reuse ratio in bug-related commits (6.67%) is slightly higher than that in non-bug-related commits (6.59%). Furthermore, we find that the code reuse ratio (14.44%) in Java class files that have undergone multiple modifications is more than double the overall code reuse ratio (6.32%).

preprint, 2022, arXiv

Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis

This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system, where each language is treated as an individual task and is learned sequentially and continually. It does not require pooled data from all languages at once, and thus alleviates the storage and computation burden. One of the challenges of lifelong learning methods is "catastrophic forgetting": in the TTS scenario, this means that model performance quickly degrades on previous languages when the model is adapted to a new language. We approach this problem via a data-replay-based lifelong learning method. We formulate the replay process as a supervised learning problem, and propose a simple yet effective dual-sampler framework to tackle the heavily language-imbalanced training samples. Through objective and subjective evaluations, we show that this supervised learning formulation outperforms other gradient-based and regularization-based lifelong learning methods, achieving 43% Mel-Cepstral Distortion reduction compared to a fine-tuning baseline.

preprint, 2021, arXiv

Exploring the Regulatory Function of the N-terminal Domain of SARS-CoV-2 Spike Protein Through Molecular Dynamics Simulation

SARS-CoV-2 is the virus that caused the COVID-19 pandemic. Early viral infection is mediated by the SARS-CoV-2 homo-trimeric Spike (S) protein with its receptor binding domains (RBDs) in the receptor-accessible state. We performed molecular dynamics simulation on the S protein with a focus on the function of its N-terminal domains (NTDs). Our study reveals that the NTD acts as a "wedge" and plays a crucial regulatory role in the conformational changes of the S protein. The complete RBD structural transition is allowed only when the neighboring NTD, which typically prohibits the RBD's movements as a wedge, detaches and swings away. Based on this NTD "wedge" model, we propose that the NTD-RBD interface should be a potential drug target.

preprint, 2021, arXiv

Quantum Transport in Two-Dimensional WS$_2$ with High-Efficiency Carrier Injection Through Indium Alloy Contacts

Two-dimensional transition metal dichalcogenides (TMDCs) have properties attractive for optoelectronic and quantum applications. A crucial element for devices is the metal-semiconductor interface. However, high contact resistances have hindered progress. Quantum transport studies are scant as low-quality contacts are intractable at cryogenic temperatures. Here, temperature-dependent transfer length measurements are performed on chemical vapour deposition grown single-layer and bilayer WS$_2$ devices with indium alloy contacts. The devices exhibit low contact resistances and Schottky barrier heights ($\sim$10 k$\Omega\,\mu$m at 3 K and 1.7 meV). Efficient carrier injection enables high carrier mobilities ($\sim$190 cm$^2$V$^{-1}$s$^{-1}$) and observation of resonant tunnelling. Density functional theory calculations provide insights into quantum transport and properties of the WS$_2$-indium interface. Our results reveal significant advances towards high-performance WS$_2$ devices using indium alloy contacts.

preprint, 2021, arXiv

Stochastic social behavior coupled to COVID-19 dynamics leads to waves, plateaus and an endemic state

It is well recognized that population heterogeneity plays an important role in the spread of epidemics. While individual variations in social activity are often assumed to be persistent, i.e. constant in time, here we discuss the consequences of dynamic heterogeneity. By integrating the stochastic dynamics of social activity into traditional epidemiological models we demonstrate the emergence of a new long timescale governing the epidemic in broad agreement with empirical data. Our model captures multiple features of real-life epidemics such as COVID-19, including prolonged plateaus and multiple waves, which are transiently suppressed due to the dynamic nature of social activity. The existence of the long timescale due to the interplay between epidemic and social dynamics provides a unifying picture of how a fast-paced epidemic typically will transition to the endemic state.

preprint, 2020, arXiv

A Novel Method to Design Controller Parameters by Using Uniform Design Algorithm

Parameter selection is one of the most important parts of nearly all control strategies. Traditionally, controller parameters are chosen by trial and error, which is tedious and time consuming. Moreover, this method depends heavily on the experience of researchers, which makes it hard to popularize. In this light, this paper proposes a novel parameter searching approach based on the uniform design (UD) algorithm, by which satisfactory controller parameters under a given performance index can be selected. To this end, two simulation examples are conducted to verify the effectiveness of the proposed scheme. Simulation results show that this novel approach, compared with other intelligent tuning algorithms, excels in efficiency and time saving.

preprint, 2020, arXiv

Crowd-MECS: A Novel Crowdsourcing Framework for Mobile Edge Caching and Sharing

Crowdsourced mobile edge caching and sharing (Crowd-MECS) is emerging as a promising content delivery paradigm by employing a large crowd of existing edge devices (EDs) to cache and share popular contents. The successful technology adoption of Crowd-MECS relies on a comprehensive understanding of the complicated economic interactions and strategic decision-making of different stakeholders. In this paper, we focus on studying the economic and strategic interactions between one content provider (CP) and a large crowd of EDs, where the EDs can decide whether to cache and share contents for the CP, and the CP can decide to share a certain revenue with EDs as the incentive of caching and sharing contents. We formulate such an interaction as a two-stage Stackelberg game. In Stage I, the CP aims to maximize its own profit by deciding the ratio of revenue shared with EDs. In Stage II, EDs aim to maximize their own payoffs by choosing to be agents who cache and share contents, and meanwhile gain a certain revenue from the CP, or requesters who do not cache but request contents in the on-demand fashion. We first analyze the EDs' best responses and prove the existence and uniqueness of the equilibrium in Stage II by using the non-atomic game theory. Then, we identify the piece-wise structure and the unimodal feature of the CP's profit function, based on which we design a tailored low-complexity one-dimensional search algorithm to achieve the optimal revenue sharing ratio for the CP in Stage I. Simulation results show that both the CP's profit and the EDs' total welfare can be improved significantly (e.g., by 120% and 50%, respectively) by using the proposed Crowd-MECS, comparing with the Non-MEC system where the CP serves all EDs directly.

preprint, 2020, arXiv

Efficient Estimation of Mixture Cure Frailty Model for Clustered Current Status Data

Current status data abound in the field of epidemiology and public health, where the only observable data for a subject are the random inspection time and the event status at inspection. Motivated by such current status data from a periodontal study where data are inherently clustered, we propose a unified methodology to analyze such complex data. We allow the time-to-event to follow the semiparametric GOR model with a cure fraction, and develop a unified estimation scheme powered by the EM algorithm. The within-subject correlation is accounted for by a random (frailty) effect, and the non-parametric component of the GOR model is approximated via penalized splines, with a set of knot points that increases with the sample size. The proposed methodology is accompanied by a rigorous asymptotic theory and the related semiparametric efficiency. The finite sample performance of our estimators of the model parameters is assessed via simulation studies. Furthermore, the proposed methodology is illustrated via application to the oral health data, accompanied by diagnostic checks to identify influential observations. An easy-to-use R package, CRFCSD, is also available for implementation.

preprint2020arXiv

Interpretable Companions for Black-Box Models

We present an interpretable companion model for any pre-trained black-box classifier. The idea is that for any input, a user can decide to either receive a prediction from the black-box model, with high accuracy but no explanations, or employ a companion rule to obtain an interpretable prediction with slightly lower accuracy. The companion model is trained from data and the predictions of the black-box model, with the objective combining area under the transparency--accuracy curve and model complexity. Our model provides flexible choices for practitioners who face the dilemma of choosing between always using interpretable models and always using black-box models for a predictive task, so users can, for any given input, take a step back to resort to an interpretable prediction if they find the predictive performance satisfying, or stick to the black-box model if the rules are unsatisfying. To show the value of companion models, we design a human evaluation on more than a hundred people to investigate the tolerable accuracy loss to gain interpretability for humans.

preprint2020arXiv

Modeling microbial cross-feeding at intermediate scale portrays community dynamics and species coexistence

Social interaction between microbes can be described at many levels of detail, ranging from the biochemistry of cell-cell interactions to the ecological dynamics of populations. Choosing the best level to model microbial communities without losing generality remains a challenge. Here we propose to model cross-feeding interactions at an intermediate level between genome-scale metabolic models of individual species and consumer-resource models of ecosystems, which is suited to empirical data. We applied our method to three published examples of multi-strain Escherichia coli communities with increasing complexity consisting of uni-, bi-, and multi-directional cross-feeding of either substitutable metabolic byproducts or essential nutrients. The intermediate-scale model accurately described empirical data and could quantify exchange rates elusive by other means, such as the byproduct secretions, even for a complex community of 14 amino acid auxotrophs. We used the three models to study each community's limits of robustness to perturbations such as variations in resource supply, antibiotic treatments and invasion by other "cheater" species. Our analysis provides a foundation to quantify cross-feeding interactions from experimental data, and highlights the importance of metabolic exchanges in the dynamics and stability of microbial communities.

preprint2020arXiv

Monetizing Edge Service in Mobile Internet Ecosystem

In the mobile Internet ecosystem, Mobile Users (MUs) purchase wireless data services from the Internet Service Provider (ISP) to access the Internet and acquire content services of interest (e.g., online games) from the Content Provider (CP). The popularity of intelligent functions (e.g., AI and 3D modeling) increases the computation intensity of the content services, leading to growing computation pressure on the MUs' resource-limited devices. To this end, edge computing service is emerging as a promising approach to alleviate the MUs' computation pressure while keeping their quality-of-service, via offloading some computation tasks of MUs to edge (computing) servers deployed at the local network edge. Thus, the Edge Service Provider (ESP), which deploys the edge servers and offers the edge computing service, becomes an upcoming new stakeholder in the ecosystem. In this work, we study the economic interactions of MUs, ISP, CP, and ESP in the new ecosystem with edge computing service, where MUs can acquire the computation-intensive content services (offered by CP) and offload some computation tasks, together with the necessary raw input data, to edge servers (deployed by ESP) through ISP. We first study the MU's Joint Content Acquisition and Task Offloading (J-CATO) problem, which aims to maximize the MU's long-term payoff. We derive the off-line solution with crucial insights, based on which we design an online strategy with provable performance. Then, we study the ESP's edge service monetization problem. We propose a pricing policy that can achieve a constant fraction of the ex-post optimal revenue with an extra constant loss for the ESP. Numerical results show that the edge computing service can stimulate the MUs' content acquisition and improve the payoffs of MUs, ISP, and CP.

preprint2020arXiv

One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases

Different texts by nature correspond to different numbers of keyphrases. This desideratum is largely missing from existing neural keyphrase generation models. In this study, we address this problem from both modeling and evaluation perspectives. We first propose a recurrent generative model that generates multiple keyphrases as delimiter-separated sequences. Generation diversity is further enhanced with two novel techniques that manipulate decoder hidden states. In contrast to previous approaches, our model is capable of generating diverse keyphrases and controlling the number of outputs. We further propose two evaluation metrics tailored towards variable-number generation. We also introduce a new dataset, StackEx, that expands beyond the only existing genre (i.e., academic writing) in keyphrase generation tasks. With both previous and new evaluation metrics, our model outperforms strong baselines on all datasets.
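
As a rough illustration of what "multiple keyphrases as delimiter-separated sequences" can look like as a training target, here is a minimal sketch. The token strings `<sep>` and `<eos>`, the whitespace tokenization, and both function names are assumptions made for this example; they are not taken from the paper.

```python
from typing import List

# Hypothetical special tokens; the paper's actual vocabulary is not given here.
SEP = "<sep>"
EOS = "<eos>"


def to_target_sequence(keyphrases: List[str]) -> List[str]:
    """Flatten a variable-length list of keyphrases into one token sequence."""
    tokens: List[str] = []
    for i, phrase in enumerate(keyphrases):
        if i > 0:
            tokens.append(SEP)          # delimiter between consecutive keyphrases
        tokens.extend(phrase.split())   # naive whitespace tokenization (assumption)
    tokens.append(EOS)                  # the end token implicitly fixes the count
    return tokens


def from_target_sequence(tokens: List[str]) -> List[str]:
    """Recover the keyphrase list from a delimiter-separated token sequence."""
    phrases: List[str] = []
    current: List[str] = []
    for tok in tokens:
        if tok in (SEP, EOS):
            if current:
                phrases.append(" ".join(current))
            current = []
            if tok == EOS:
                break
        else:
            current.append(tok)
    if current:                         # tolerate a sequence with no end token
        phrases.append(" ".join(current))
    return phrases
```

Because the sequence ends when the end token is emitted, the number of keyphrases is decided by the decoder rather than by a fixed cutoff, which is the property the abstract highlights.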

preprint2020arXiv

Optimizing Traffic Lights with Multi-agent Deep Reinforcement Learning and V2X communication

We consider a system to optimize the duration of traffic signals using multi-agent deep reinforcement learning and Vehicle-to-Everything (V2X) communication. This system aims at analyzing independent and shared rewards for multiple agents that control the duration of traffic lights. Each traffic light, acting as a learning agent, gets information along its lanes within a circular V2X coverage. The duration cycles of a traffic light are modeled as Markov decision processes. We investigate four variations of reward functions. The first two are unshared rewards, based on the number of waiting vehicles and the waiting time of vehicles between two cycles of a traffic light. The third and fourth are shared rewards, based on waiting cars and waiting time for all agents. Each agent has a memory for optimization through a target network and prioritized experience replay. We evaluate the agents using the Simulation of Urban MObility (SUMO) simulator. The results demonstrate the effectiveness of the proposed system to optimize traffic signals and reduce average waiting cars to 41.5% as compared to the traditional periodic traffic control system.

preprint2019arXiv

Evidence for a multi-level trophic organization of the human gut microbiome

The human gut microbiome is a complex ecosystem, in which hundreds of microbial species and metabolites coexist, in part due to an extensive network of cross-feeding interactions. However, both the large-scale trophic organization of this ecosystem, and its effects on the underlying metabolic flow, remain unexplored. Here, using a simplified model, we provide quantitative support for a multi-level trophic organization of the human gut microbiome, where microbes consume and secrete metabolites in multiple iterative steps. Using a manually-curated set of metabolic interactions between microbes, our model suggests about four trophic levels, each characterized by a high level-to-level metabolic transfer of byproducts. It also quantitatively predicts the typical metabolic environment of the gut (fecal metabolome) in approximate agreement with the real data. To understand the consequences of this trophic organization, we quantify the metabolic flow and biomass distribution, and explore patterns of microbial and metabolic diversity in different levels. The hierarchical trophic organization suggested by our model can help mechanistically establish causal links between the abundances of microbes and metabolites in the human gut.

preprint2016arXiv

A micro-scale simulation of red blood cell passage through symmetric and asymmetric bifurcated vessels

Blood exhibits a heterogeneous nature of hematocrit, velocity, and effective viscosity in microcapillaries. Microvascular bifurcations have a significant influence on the distribution of the blood cells and blood flow behavior. This paper presents a simulation study performed on the two-dimensional motions and deformation of multiple red blood cells in microvessels with diverging and converging bifurcations. Fluid dynamics and membrane mechanics were incorporated. Effects of cell shape, hematocrit, and deformability of the cell membrane on rheological behavior of the red blood cells and the hemodynamics have been investigated. It was shown that the blood entering the daughter branch with a higher flow rate tended to receive disproportionally more cells. The results also demonstrate that red blood cells in microvessels experienced lateral migration in the parent channel and blunted velocity profiles in both straight section and daughter branches, and this effect was influenced by the shape and the initial position of the cells, the hematocrit, and the membrane deformability. In addition, a cell free region around the tip of the confluence was observed. The simulation results are qualitatively consistent with existing experimental findings. This study may provide fundamental knowledge for a better understanding of hemodynamic behavior of micro-scale blood flow.

preprint2016arXiv

An Experimental Study of LSTM Encoder-Decoder Model for Text Simplification

Text simplification (TS) aims to reduce the lexical and structural complexity of a text, while still retaining the semantic meaning. Current automatic TS techniques are limited to either lexical-level applications or manually defining a large number of rules. Since deep neural networks are powerful models that have achieved excellent performance over many difficult tasks, in this paper, we propose to use the Long Short-Term Memory (LSTM) Encoder-Decoder model for sentence-level TS, which makes minimal assumptions about word sequence. We conduct preliminary experiments and find that the model is able to learn operation rules such as reversing, sorting and replacing from sequence pairs, which shows that the model may potentially discover and apply rules such as modifying sentence structure, substituting words, and removing words for TS.

preprint2016arXiv

An ultrafast polarised single photon source at 220 K

A crucial requirement for the realisation of efficient and scalable on-chip quantum communication is an ultrafast polarised single photon source operating beyond the Peltier cooling barrier of 200 K. While a few systems based on different materials and device structures have achieved single photon generation above this threshold, there has been no report of single quantum emitters with deterministic polarisation properties at the same high temperature conditions. Here, we report the first device to simultaneously achieve single photon emission with a g(2)(0) of only 0.21, a high polarisation degree of 0.80, a fixed polarisation axis determined by the underlying crystallography, and a GHz repetition rate with a radiative lifetime of 357 ps at 220 K. The temperature insensitivity of these properties, together with the simple planar growth method, and absence of complex device geometries, makes this system an excellent candidate for on-chip applications in integrated systems.

preprint2016arXiv

Experimental and theoretical analyses of strongly polarized photon emission from non-polar InGaN quantum dots

We present a comprehensive investigation of the polarization properties of non-polar a-plane InGaN quantum dots (QDs) and their origin with statistically significant experimental data and rigorous k.p modelling. The unbiased selection and study of 180 individual QDs allow us to compute an average polarization degree of 0.90, with a standard deviation of only 0.08. When coupled with theoretical insights, we show that a-plane InGaN QDs are highly insensitive to size differences, shape anisotropies, and indium content fluctuations. Furthermore, 91% of the studied QDs exhibit a polarization axis along the crystal [1-100] axis, with the other 9% polarized orthogonal to this direction. When coupled with their ability to emit single-photons, a-plane QDs are good candidates for the generation of linearly polarized single-photons, a feature attractive for quantum cryptography protocols.

preprint2016arXiv

Link Prediction in evolving networks based on the popularity of nodes

Link prediction aims to uncover the underlying relationship behind networks, which could be utilized to predict the missing edges or identify the spurious edges, and attracts much attention from various fields. The key issue of link prediction is to estimate the likelihood that a link exists between two nodes in a network. Most current approaches to link prediction are based on static structural analysis and ignore the temporal aspects of evolving networks. Unlike previous work, in this paper, we propose a popularity based structural perturbation method (PBSPM) that characterizes the similarity of an edge not only from existing connections of networks, but also from the popularity of its two endpoints, since popular nodes are much more likely to form links between themselves. By taking popularity of nodes into account, PBSPM could suppress nodes that have high importance, but gradually become inactive. Therefore the proposed method is inclined to predict potential edges between active nodes, rather than edges between inactive nodes. Experimental results on four real networks show that the proposed method outperforms the state-of-the-art methods both in accuracy and robustness in evolving networks.

preprint2016arXiv

Topic Modeling over Short Texts by Incorporating Word Embeddings

Inferring topics from the overwhelming amount of short texts has become a critical but challenging task for many content analysis tasks, such as content characterization, user interest profiling, and emerging topic detection. Existing methods such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) cannot solve this problem very well since only very limited word co-occurrence information is available in short texts. This paper studies how to incorporate external word correlation knowledge into short texts to improve the coherence of topic modeling. Based on recent results in word embeddings that learn semantic representations for words from a large corpus, we introduce a novel method, Embedding-based Topic Model (ETM), to learn latent topics from short texts. ETM not only solves the problem of very limited word co-occurrence information by aggregating short texts into long pseudo-texts, but also utilizes a Markov Random Field regularized model that gives correlated words a better chance to be put into the same topic. The experiments on real-world datasets validate the effectiveness of our model compared with the state-of-the-art models.

preprint2015arXiv

Antiferromagnetic Ordering in MnF(salen)

Antiferromagnetic order at $T_{\mathrm{N}} = 23$ K has been identified in Mn(III)F(salen), salen = H$_{14}$C$_{16}$N$_2$O$_2$, an $S = 2$ linear-chain system. Using single crystals, specific heat studies performed in magnetic fields up to 9 T revealed the presence of a field-independent cusp at the same temperature where $^1$H NMR studies conducted at 42 MHz observed dramatic changes in the spin-lattice relaxation time, $T_1$, and in the linewidths. Neutron powder diffraction performed on a randomly-oriented, as-grown, deuterated (12 of 14 H replaced by d) sample of 2.2 g at 10 K and 100 K did not resolve the magnetic ordering, while low-field (less than 0.1 T) magnetic susceptibility studies of single crystals and randomly-arranged microcrystalline samples reveal subtle features associated with the transition. Taken together, these data suggest that a magnetic signature previously detected at 3.8 T for temperatures below nominally 500 mK is a spin-flop field of small net moments arising from alternating subsets of three Mn spins along the chains.

preprint2015arXiv

Learning Optimized Or's of And's

Or's of And's (OA) models are comprised of a small number of disjunctions of conjunctions, also called disjunctive normal form. An example of an OA model is as follows: If ($x_1 = $ `blue' AND $x_2=$ `middle') OR ($x_1 = $ `yellow'), then predict $Y=1$, else predict $Y=0$. Or's of And's models have the advantage of being interpretable to human experts, since they are a set of conditions that concisely capture the characteristics of a specific subset of data. We present two optimization-based machine learning frameworks for constructing OA models, Optimized OA (OOA) and its faster version, Optimized OA with Approximations (OOAx). We prove theoretical bounds on the properties of patterns in an OA model. We build OA models as a diagnostic screening tool for obstructive sleep apnea, that achieves high accuracy with a substantial gain in interpretability over other methods.

preprint2015arXiv

Or's of And's for Interpretable Classification, with Application to Context-Aware Recommender Systems

We present a machine learning algorithm for building classifiers that are comprised of a small number of disjunctions of conjunctions (or's of and's). An example of a classifier of this form is as follows: If X satisfies (x1 = 'blue' AND x3 = 'middle') OR (x1 = 'blue' AND x2 = '<15') OR (x1 = 'yellow'), then we predict that Y=1, ELSE predict Y=0. An attribute-value pair is called a literal and a conjunction of literals is called a pattern. Models of this form have the advantage of being interpretable to human experts, since they produce a set of conditions that concisely describe a specific class. We present two probabilistic models for forming a pattern set, one with a Beta-Binomial prior, and the other with Poisson priors. In both cases, there are prior parameters that the user can set to encourage the model to have a desired size and shape, to conform with a domain-specific definition of interpretability. We provide two scalable MAP inference approaches: a pattern level search, which involves association rule mining, and a literal level search. We show stronger priors reduce computation. We apply the Bayesian Or's of And's (BOA) model to predict user behavior with respect to in-vehicle context-aware personalized recommender systems.
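
The classifier form described above (a disjunction of patterns, each pattern a conjunction of literals) can be evaluated in a few lines. The sketch below encodes only the example rule from the abstract; the type aliases, the `predict` function name, and the dictionary representation of an observation are assumptions for illustration, and nothing here reproduces the paper's learning procedure.

```python
from typing import Dict, List, Tuple

Literal = Tuple[str, str]     # (attribute, value) pair
Pattern = List[Literal]       # conjunction (AND) of literals
PatternSet = List[Pattern]    # disjunction (OR) of patterns

# The example rule quoted in the abstract, written as a pattern set.
EXAMPLE_MODEL: PatternSet = [
    [("x1", "blue"), ("x3", "middle")],
    [("x1", "blue"), ("x2", "<15")],
    [("x1", "yellow")],
]


def predict(pattern_set: PatternSet, x: Dict[str, str]) -> int:
    """Return 1 if the observation satisfies at least one pattern, else 0."""
    return int(
        any(
            all(x.get(attr) == value for attr, value in pattern)
            for pattern in pattern_set
        )
    )
```

An observation with `x1 = 'yellow'` satisfies the third pattern on its own, so it is predicted as 1 regardless of the other attributes; an observation with `x1 = 'blue'` is predicted as 1 only if `x3 = 'middle'` or `x2 = '<15'` also holds.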

preprint2014arXiv

Alternating Optimization Techniques for Power Allocation and Receiver Design in Multihop Wireless Sensor Networks

In this paper, we consider a multihop wireless sensor network with multiple relay nodes for each hop where the amplify-and-forward scheme is employed. We present algorithmic strategies to jointly design linear receivers and the power allocation parameters via an alternating optimization approach subject to different power constraints which include global, local and individual ones. Two design criteria are considered: the first one minimizes the mean-square error and the second one maximizes the sum-rate of the wireless sensor network. We derive constrained minimum mean-square error and constrained maximum sum-rate expressions for the linear receivers and the power allocation parameters that contain the optimal complex amplification coefficients for each relay node. An analysis of the computational complexity and the convergence of the algorithms is also presented. Computer simulations show good performance of our proposed methods in terms of bit error rate and sum-rate compared to the method with equal power allocation and an existing power allocation scheme.

preprint2013arXiv

Resolution-aware network coded storage

In this paper, we show that coding can be used in storage area networks (SANs) to improve various quality of service metrics under normal SAN operating conditions, without requiring additional storage space. For our analysis, we develop a model which captures modern characteristics such as constrained I/O access bandwidth limitations. Using this model, we consider two important cases: single-resolution (SR) and multi-resolution (MR) systems. For SR systems, we use blocking probability as the quality of service metric and propose the network coded storage (NCS) scheme as a way to reduce blocking probability. The NCS scheme codes across file chunks in time, exploiting file striping and file duplication. Under our assumptions, we illustrate cases where SR NCS provides an order of magnitude savings in blocking probability. For MR systems, we introduce saturation probability as a quality of service metric to manage multiple user types, and we propose the uncoded resolution-aware storage (URS) and coded resolution-aware storage (CRS) schemes as ways to reduce saturation probability. In MR URS, we align our MR layout strategy with traffic requirements. In MR CRS, we code videos across MR layers. Under our assumptions, we illustrate that URS can in some cases provide an order of magnitude gain in saturation probability over classic non-resolution aware systems. Further, we illustrate that CRS provides additional saturation probability savings over URS.

preprint2012arXiv

An Ehrenfeucht-Fraïssé Game for $L_{ω_1ω}$

Ehrenfeucht-Fraïssé games are very useful in studying separation and equivalence results in logic. The standard finite Ehrenfeucht-Fraïssé game characterizes equivalence in first order logic. The standard Ehrenfeucht-Fraïssé game in infinitary logic characterizes equivalence in $L_{\infty ω}$. The logic $L_{ω_1ω}$ is the extension of first order logic with countable conjunctions and disjunctions. There was no Ehrenfeucht-Fraïssé game for $L_{ω_1ω}$ in the literature. In this paper we develop an Ehrenfeucht-Fraïssé Game for $L_{ω_1ω}$. This game is based on a game for propositional and first order logic introduced by Hella and Väänänen. Unlike the standard Ehrenfeucht-Fraïssé games which are modeled solely after the behavior of quantifiers, this new game also takes into account the behavior of connectives in logic. We prove the adequacy theorem for this game. We also apply the new game to prove complexity results about infinite binary strings.

preprint2010arXiv

Software Design Document, Testing, Deployment and Configuration Management of the UUIS--a Team 2 COMP5541-W10 Project Approach

The Software Design Document of UUIS describes the prototype design details of the system architecture, database layer, deployment and configuration details, as well as test cases produced while working on the design and implementation of the prototype. The requirements specification of UUIS is detailed in arXiv:1005.0783.

preprint2010arXiv

Software Requirements Specification of the IUfA's UUIS -- a Team 2 COMP5541-W10 Project Approach

In the 52-page document, we describe our approach to the Software Requirements Specification of the IUfA's UUIS prototype. This includes the overall system description, functional requirements, non-functional requirements, use cases, the corresponding data dictionary for all entities involved, mock user interface (UI) design, and the overall projected cost estimate. The design specification of UUIS can be found in arXiv:1005.0665.

44 published item(s)

Forecasting Medium-Horizon Alzheimer's Disease Progression: Residual Gap-Aware Transformers for 24-Month CDR-SB Change from ADNI Clinical and Biomarker Histories

MiVE: Multiscale Vision-language features for reference-guided video Editing

Self-Prompting Diffusion Transformer for Open-Vocabulary Scene Text Editing via In-Context Learning

SpecX: A Large-Scale Benchmark for Multi-Modal Spectroscopy and Cross-Paradigm Evaluation

UHR-Micro: Diagnosing and Mitigating the Resolution Illusion in Earth Observation VLMs

A SOM-based Gradient-Free Deep Learning Method with Convergence Analysis

Better Language Model with Hypernym Class Prediction

Confidence Matters: Inspecting Backdoors in Deep Neural Networks via Distribution Transfer

Direct Molecular Conformation Generation

Does Order Matter? An Empirical Study on Generating Multiple Keyphrases as a Sequence

Efficient Estimation of the Additive Risks Model for Interval-Censored Data

Functional universality in slow-growing microbial communities arises from thermodynamic constraints

Multi-View Substructure Learning for Drug-Drug Interaction Prediction

Research on Creative Thinking Mode Based on Category Theory

Tailoring Molecules for Protein Pockets: a Transformer-based Generative Solution for Structured-based Drug Design

Task-adaptive Asymmetric Deep Cross-modal Hashing

Towards Exploring the Code Reuse from Stack Overflow during Software Development

Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis

Exploring the Regulatory Function of the N-terminal Domain of SARS-CoV-2 Spike Protein Through Molecular Dynamics Simulation

Quantum Transport in Two-Dimensional WS$_2$ with High-Efficiency Carrier Injection Through Indium Alloy Contacts

Stochastic social behavior coupled to COVID-19 dynamics leads to waves, plateaus and an endemic state

A Novel Method to Design Controller Parameters by Using Uniform Design Algorithm

Crowd-MECS: A Novel Crowdsourcing Framework for Mobile Edge Caching and Sharing

Efficient Estimation of Mixture Cure Frailty Model for Clustered Current Status Data

Interpretable Companions for Black-Box Models

Modeling microbial cross-feeding at intermediate scale portrays community dynamics and species coexistence

Monetizing Edge Service in Mobile Internet Ecosystem

One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases

Optimizing Traffic Lights with Multi-agent Deep Reinforcement Learning and V2X communication

Evidence for a multi-level trophic organization of the human gut microbiome

A micro-scale simulation of red blood cell passage through symmetric and asymmetric bifurcated vessels

An Experimental Study of LSTM Encoder-Decoder Model for Text Simplification

An ultrafast polarised single photon source at 220 K

Experimental and theoretical analyses of strongly polarized photon emission from non-polar InGaN quantum dots

Link Prediction in evolving networks based on the popularity of nodes

Topic Modeling over Short Texts by Incorporating Word Embeddings

Antiferromagnetic Ordering in MnF(salen)

Learning Optimized Or's of And's

Or's of And's for Interpretable Classification, with Application to Context-Aware Recommender Systems

Alternating Optimization Techniques for Power Allocation and Receiver Design in Multihop Wireless Sensor Networks

Resolution-aware network coded storage

An Ehrenfeucht-Fraïssé Game for $L_{ω_1ω}$

Software Design Document, Testing, Deployment and Configuration Management of the UUIS--a Team 2 COMP5541-W10 Project Approach

Software Requirements Specification of the IUfA's UUIS -- a Team 2 COMP5541-W10 Project Approach