Source author record

Hao Xue

Hao Xue appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, Unclaimed source record

Machine Learning, Computer Vision, Artificial Intelligence, math.PR

Catalog footprint

What is connected

8 works

4 topics

4 close collaborators

Preprint, 2026, arXiv

TrajDLM: Topology-Aware Block Diffusion Language Model for Trajectory Generation

Generating high-fidelity synthetic GPS trajectories is increasingly important for applications in transportation, urban planning, and what-if scenario simulation, especially as privacy concerns limit access to real-world mobility data. Existing trajectory generation models face a trade-off between efficiency and faithfulness to road network topology: continuous-space methods enable fast generation but ignore the road network, while topology-aware approaches rely on search-based autoregressive decoding that limits generation speed. We propose TrajDLM, a topology-aware trajectory generation framework based on block diffusion language models that bridges this gap. TrajDLM models trajectories as sequences of discrete road segments, combining a block diffusion backbone for efficient denoising, topology-aware embeddings from a road network encoder, and topology-constrained sampling to ensure coherent and realistic trajectories. Across three city-scale datasets, TrajDLM achieves strong performance on fine-grained local similarity metrics while being up to $2.8\times$ faster than prior work, and demonstrates strong zero-shot transfer across domains, including unseen transportation modes. These results highlight the effectiveness of block-wise discrete diffusion as a scalable approach to accurate and efficient trajectory generation. Our code is available at https://github.com/cruiseresearchgroup/TrajDLM/

Preprint, 2022, arXiv

Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

Self-Supervised Representation Learning (SSRL) has recently attracted much attention in computer vision, speech, and natural language processing (NLP), and more recently for other modalities, including time series from sensors. The popularity of self-supervised learning is driven by the fact that traditional models typically require a huge amount of well-annotated data for training. Acquiring annotated data can be a difficult and costly process. Self-supervised methods have been introduced to improve the efficiency of training data through discriminative pre-training of models using supervisory signals that have been freely obtained from the raw data. Unlike existing reviews of SSRL that have predominantly focused on methods in the fields of CV or NLP for a single modality, we aim to provide the first comprehensive review of multimodal self-supervised learning methods for temporal data. To this end, we 1) provide a comprehensive categorization of existing SSRL methods, 2) introduce a generic pipeline by defining the key components of an SSRL framework, 3) compare existing models in terms of their objective function, network architecture and potential applications, and 4) review existing multimodal techniques in each category and various modalities. Finally, we present existing weaknesses and future opportunities. We believe our work develops a perspective on the requirements of SSRL in domains that utilise multimodal and/or temporal data.

Preprint, 2022, arXiv

COCOA: Cross Modality Contrastive Learning for Sensor Data

Self-Supervised Learning (SSL) is a new paradigm for learning discriminative representations without labelled data and has reached comparable or even state-of-the-art results in comparison to supervised counterparts. Contrastive Learning (CL) is one of the most well-known approaches in SSL that attempts to learn general, informative representations of data. CL methods have been mostly developed for applications in computer vision and natural language processing where only a single sensor modality is used. A majority of pervasive computing applications, however, exploit data from a range of different sensor modalities. While existing CL methods are limited to learning from one or two data sources, we propose COCOA (Cross mOdality COntrastive leArning), a self-supervised model that employs a novel objective function to learn quality representations from multisensor data by computing the cross-correlation between different data modalities and minimizing the similarity between irrelevant instances. We evaluate the effectiveness of COCOA against eight recently introduced state-of-the-art self-supervised models, and two supervised baselines across five public datasets. We show that COCOA achieves superior classification performance to all other approaches. Also, COCOA is far more label-efficient than the other baselines including the fully supervised model using only one-tenth of available labelled data.

Preprint, 2022, arXiv

Leveraging Language Foundation Models for Human Mobility Forecasting

In this paper, we propose a novel pipeline that leverages language foundation models for temporal sequential pattern mining, such as for human mobility forecasting tasks. For example, in the task of predicting Place-of-Interest (POI) customer flows, typically the number of visits is extracted from historical logs, and only the numerical data are used to predict visitor flows. In this research, we perform the forecasting task directly on the natural language input that includes all kinds of information such as numerical values and contextual semantic information. Specific prompts are introduced to transform numerical temporal sequences into sentences so that existing language models can be directly applied. We design an AuxMobLCast pipeline for predicting the number of visitors in each POI, integrating an auxiliary POI category classification task with the encoder-decoder architecture. This research provides empirical evidence of the effectiveness of the proposed AuxMobLCast pipeline to discover sequential patterns in mobility forecasting tasks. The results, evaluated on three real-world datasets, demonstrate that pre-trained language foundation models also have good performance in forecasting temporal sequences. This study could provide visionary insights and lead to new research directions for predicting human mobility.
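The idea of turning a numerical visit history into a sentence can be illustrated with a minimal sketch. The function name `visits_to_prompt` and the sentence template are hypothetical and chosen only for illustration; the abstract does not give the actual prompt wording used in AuxMobLCast.

```python
from typing import Sequence


def visits_to_prompt(poi_category: str, dates: Sequence[str], visits: Sequence[int]) -> str:
    """Turn a numeric visit history into one natural-language prompt.

    The wording is a hypothetical template, not the prompt from the paper.
    """
    if len(dates) != len(visits):
        raise ValueError("dates and visits must have the same length")
    if not visits:
        raise ValueError("at least one observation is required")
    # Join the daily counts into a comma-separated list inside the sentence.
    history = ", ".join(str(v) for v in visits)
    return (
        f"From {dates[0]} to {dates[-1]}, there were {history} people "
        f"visiting a {poi_category} POI on each day. "
        "How many people will visit on the next day?"
    )


prompt = visits_to_prompt(
    "restaurant",
    ["2021-03-01", "2021-03-02", "2021-03-03"],
    [12, 15, 9],
)
print(prompt)
```

A sentence built this way could be passed to a pre-trained encoder-decoder language model, with the next-day count decoded from the generated text. The POI category appears in the prompt here only as context; how the auxiliary category classification task is wired in is not specified in the abstract.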

Preprint, 2021, arXiv

TERMCast: Temporal Relation Modeling for Effective Urban Flow Forecasting

Urban flow forecasting is a challenging task, given the inherent periodic characteristics of urban flow patterns. To capture the periodicity, existing urban flow prediction approaches are often designed with closeness, period, and trend components extracted from the urban flow sequence. However, these three components are often considered separately in the prediction model. These components have not been fully explored together and simultaneously incorporated in urban flow forecasting models. We introduce a novel urban flow forecasting architecture, TERMCast. A Transformer-based long-term relation prediction module is explicitly designed to discover the periodicity and enable the three components to be jointly modeled. This module predicts the periodic relation, which is then used to yield the predicted urban flow tensor. To measure the consistency of the predicted periodic relation vector and the relation vector inferred from the predicted urban flow tensor, we propose a consistency module. A consistency loss is introduced in the training process to further improve the prediction performance. Through extensive experiments on three real-world datasets, we demonstrate that TERMCast outperforms multiple state-of-the-art methods. The effectiveness of each module in TERMCast has also been investigated.
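The closeness, period, and trend components mentioned above can be sketched as three slices of the flow history. This assumes a common convention from the urban flow literature (recent steps, the same time of day on previous days, the same time of day in previous weeks); the abstract does not state the exact extraction used by TERMCast, and the function name `split_components` and its default lengths are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_components(
    flows: Sequence[float],
    t: int,
    steps_per_day: int,
    n_close: int = 3,
    n_period: int = 2,
    n_trend: int = 1,
) -> Tuple[List[float], List[float], List[float]]:
    """Extract closeness, period and trend inputs for predicting step t.

    Assumed convention (not confirmed by the source):
      closeness: the n_close steps immediately before t
      period:    the same time of day on the n_period previous days
      trend:     the same time of day in the n_trend previous weeks
    """
    steps_per_week = 7 * steps_per_day
    # The oldest index touched by any of the three components.
    earliest = t - max(n_close, n_period * steps_per_day, n_trend * steps_per_week)
    if earliest < 0 or t > len(flows):
        raise ValueError("not enough history before index t")
    closeness = [flows[t - k] for k in range(n_close, 0, -1)]
    period = [flows[t - k * steps_per_day] for k in range(n_period, 0, -1)]
    trend = [flows[t - k * steps_per_week] for k in range(n_trend, 0, -1)]
    return closeness, period, trend


# Toy hourly series where each value equals its own index.
flows = list(range(200))
closeness, period, trend = split_components(flows, t=180, steps_per_day=24)
print(closeness, period, trend)
```

Each component is returned oldest first, so the three lists can be stacked or concatenated as model inputs. Real urban flow data would hold a grid tensor per time step instead of a scalar.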

Preprint, 2021, arXiv

Time Series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Change Point Detection (CPD) methods identify the times associated with changes in the trends and properties of time series data in order to describe the underlying behaviour of the system. For instance, detecting the changes and anomalies associated with web service usage, application usage or human behaviour can provide valuable insights for downstream modelling tasks. We propose a novel self-supervised Time Series Change Point detection method based on Contrastive Predictive Coding (TS-CP^2). TS-CP^2 is the first approach to employ a contrastive learning strategy for CPD by learning an embedded representation that separates pairs of embeddings of time-adjacent intervals from pairs of interval embeddings separated across time. Through extensive experiments on three diverse, widely used time series datasets, we demonstrate that our method outperforms five state-of-the-art CPD methods, which include unsupervised and semi-supervised approaches. TS-CP^2 is shown to improve the performance of methods that use either handcrafted statistical or temporal features by 79.4% and deep learning-based methods by 17.0% with respect to the F1-score averaged across the three datasets.

Preprint, 2020, arXiv

Critical scaling for an anisotropic percolation system on $\mathbb{Z}^2$

In this article, we consider an anisotropic finite-range bond percolation model on $\mathbb{Z}^2$. On each horizontal layer $\{(x,i): x \in \mathbb{Z}\}$ we have edges $\langle(x, i),(y, i)\rangle$ for $1 \leq |x - y| \leq N$. There are also vertical edges connecting two nearest neighbor vertices on distinct layers $\langle(x, i), (x, i+1)\rangle$ for $x, i \in\mathbb{Z}$. On this graph we consider the following anisotropic independent percolation model: horizontal edges are open with probability $1/(2N)$, while vertical edges are open with probability $\epsilon$, to be suitably tuned as $N$ grows to infinity. The main result states that if $\epsilon=\kappa N^{-2/5}$, we see a phase transition in $\kappa$: positive and finite constants $C_1, C_2$ exist so that there is no percolation if $\kappa< C_1$ while percolation occurs for $\kappa> C_2$. The question is motivated by a result on the analogous layered ferromagnetic Ising model at mean field critical temperature [J. Stat. Phys. (2015), 161, 91-123] where the authors showed the existence of multiple Gibbs measures for a fixed value of the vertical interaction and conjectured a change of behavior in $\kappa$ when the vertical interaction suitably vanishes as $\kappa\gamma^b$, where $1/\gamma$ is the range of the horizontal interaction. For the product percolation model we have a value of $b$ that differs from what was conjectured in that paper. The proof relies on the analysis of the scaling limit of the critical branching random walk that dominates the growth process restricted to each horizontal layer and a careful analysis of the true horizontal growth process. This is inspired by works on the long range contact process [Probab. Th. Rel. Fields (1995), 102, 519-545]. A renormalization scheme is used for the percolative regime.
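A short computation, added here as an illustration and not taken from the abstract, shows why the horizontal probability $1/(2N)$ corresponds to a critical branching process. Each vertex $(x,i)$ has exactly $2N$ horizontal neighbours and $2$ vertical neighbours, so with $\epsilon=\kappa N^{-2/5}$ the expected numbers of open edges at a vertex are:

```latex
\[
\mathbb{E}\bigl[\#\{\text{open horizontal edges at } (x,i)\}\bigr]
  = 2N \cdot \frac{1}{2N} = 1,
\qquad
\mathbb{E}\bigl[\#\{\text{open vertical edges at } (x,i)\}\bigr]
  = 2\epsilon = 2\kappa N^{-2/5}.
\]
```

A mean of one open horizontal edge per vertex is the mean offspring number of a critical branching process, while the vertical contribution vanishes as $N$ grows and is controlled by $\kappa$.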

Preprint, 2020, arXiv

Take a NAP: Non-Autoregressive Prediction for Pedestrian Trajectories

Pedestrian trajectory prediction is a challenging task as there are three properties of human movement behaviors which need to be addressed, namely, the social influence from other pedestrians, the scene constraints, and the multimodal (multiroute) nature of predictions. Although existing methods have explored these key properties, the prediction process of these methods is autoregressive. This means they can only predict future locations sequentially. In this paper, we present NAP, a non-autoregressive method for trajectory prediction. Our method comprises specifically designed feature encoders and a latent variable generator to handle the three properties above. It also has a time-agnostic context generator and a time-specific context generator for non-autoregressive prediction. Through extensive experiments that compare NAP against several recent methods, we show that NAP has state-of-the-art trajectory prediction performance.
