Researcher profile

Lijun Sun

Lijun Sun contributes to research discovery and scholarly infrastructure.

Researcher, Affiliation not imported, Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
16 works
0 followers
13 topics
4 close collaborators

Published work

16 published items

preprint, 2026, arXiv

Bridge: Retrieval-Augmented Spatiotemporal Modeling for Urban Delivery Demand

Forecasting urban delivery demand becomes substantially more challenging when newly added service regions lack historical records. Existing spatiotemporal forecasters effectively model spatial dependence once sufficient node histories are available. Still, they remain parametric and therefore struggle to recover short-term operational dynamics in cold-start regions. Geospatial embeddings help identify where a region is and what function it serves, yet they do not directly reveal how a similar region behaves under a comparable temporal context. We propose Bridge, a retrieval-augmented spatiotemporal graph framework that combines an inductive contextual graph backbone with a time-aware memory of region-time windows. For each target region, Bridge retrieves future demand patterns from the memory using both regional context and recent dynamics, and refines the backbone forecast through a gated fusion mechanism. To align retrieval with forecasting utility, we further train the retriever with a future-aware objective that favors entries whose future trajectories best match the target. Experiments on four real-world delivery datasets show that Bridge consistently improves over competitive spatiotemporal baselines in both within-city cold-start and cross-city transfer with partial observations. The results show that retrieval augmentation provides a useful operational memory for cold-start urban demand forecasting when parametric graph generalization alone is insufficient.

preprint, 2022, arXiv

Bayesian calibration of traffic flow fundamental diagrams using Gaussian processes

Modeling the relationship between vehicle speed and density on the road is a fundamental problem in traffic flow theory. Recent research found that using the least-squares (LS) method to calibrate single-regime speed-density models is biased because of the uneven distribution of samples. This paper explains the issue of the LS method from a statistical perspective: the biased calibration is caused by the correlations/dependencies in regression residuals. Based on this explanation, we propose a new calibration method for single-regime speed-density models by modeling the covariance of residuals via a zero-mean Gaussian Process (GP). Our approach can be viewed as a generalized least-squares (GLS) method with a specific covariance structure (i.e., kernel function) and is a generalization of the existing LS and the weighted least-squares (WLS) methods. Next, we use a sparse approximation to address the scalability issue of GPs and apply a Markov chain Monte Carlo (MCMC) sampling scheme to obtain the posterior distributions of the parameters for speed-density models and the hyperparameters (i.e., length scale and variance) of the GP kernel. Finally, we calibrate six well-known single-regime speed-density models with the proposed method. Results show that the proposed GP-based methods (1) significantly reduce the biases in the LS calibration, (2) achieve a similar effect as the WLS method, (3) can be used as a non-parametric speed-density model, and (4) provide a Bayesian solution to estimate posterior distributions of parameters and speed-density functions.
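The generalized least-squares view in this abstract can be illustrated with a minimal sketch. It assumes the Greenshields model v = vf (1 - k / kj) as an illustrative single-regime model (the abstract does not name the six models it calibrates), a squared-exponential kernel, synthetic data, and fixed hyperparameters; the abstract instead infers the hyperparameters by MCMC with a sparse GP approximation, which is not shown here. The function names are hypothetical.

```python
import numpy as np

def rbf_kernel(x, length_scale, variance):
    """Squared-exponential kernel matrix for a 1-D input vector x."""
    diff = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gls_fit(X, y, cov):
    """GLS estimate beta = (X^T C^-1 X)^-1 X^T C^-1 y for residual covariance C."""
    cinv_X = np.linalg.solve(cov, X)
    cinv_y = np.linalg.solve(cov, y)
    return np.linalg.solve(X.T @ cinv_X, X.T @ cinv_y)

# Synthetic observations from Greenshields: v = vf * (1 - k / kj).
rng = np.random.default_rng(0)
n = 200
density = np.sort(rng.uniform(5.0, 120.0, size=n))
vf_true, kj_true = 100.0, 150.0
speed = vf_true * (1.0 - density / kj_true) + rng.normal(0.0, 2.0, size=n)

# Greenshields is linear in density: v = b0 + b1 * k, with b0 = vf, b1 = -vf / kj.
X = np.column_stack([np.ones(n), density])

# Assumed fixed hyperparameters: kernel variance 4, length scale 10, noise variance 4.
cov = rbf_kernel(density, length_scale=10.0, variance=4.0) + 4.0 * np.eye(n)

beta = gls_fit(X, speed, cov)
vf_hat = beta[0]
kj_hat = -beta[0] / beta[1]
print(f"vf = {vf_hat:.1f}, kj = {kj_hat:.1f}")
```

Passing an identity matrix as `cov` reduces `gls_fit` to ordinary least squares, and a diagonal matrix gives weighted least squares, which is the sense in which the GP covariance generalizes both.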

preprint, 2022, arXiv

Low-Rank Hankel Tensor Completion for Traffic Speed Estimation

This paper studies the traffic state estimation (TSE) problem using sparse observations from mobile sensors. Most existing TSE methods either rely on well-defined physical traffic flow models or require large amounts of simulation data as input to train machine learning models. Different from previous studies, we propose a purely data-driven and model-free solution in this paper. We consider the TSE as a spatiotemporal matrix completion/interpolation problem, and apply spatiotemporal delay embedding to transform the original incomplete matrix into a fourth-order Hankel structured tensor. By imposing a low-rank assumption on this tensor structure, we can approximate and characterize both global and local spatiotemporal patterns in a data-driven manner. We use the truncated nuclear norm of a balanced spatiotemporal unfolding -- in which each column represents the vectorization of a small patch in the original matrix -- to approximate the tensor rank. An efficient solution algorithm based on the Alternating Direction Method of Multipliers (ADMM) is developed for model learning. The proposed framework only involves two hyperparameters, spatial and temporal window lengths, which are easy to set given the degree of data sparsity. We conduct numerical experiments on real-world high-resolution trajectory data, and our results demonstrate the effectiveness and superiority of the proposed model in some challenging scenarios.
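As a rough illustration of the delay embedding described above, the sketch below turns a small location-by-time matrix into a fourth-order tensor of overlapping patches and then builds an unfolding whose columns are vectorized patches. The mode ordering, the function names, and the toy matrix are assumptions made for illustration; the truncated nuclear norm and the ADMM solver are not shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def delay_embed(matrix, ws, wt):
    """Fourth-order tensor of all ws-by-wt patches.

    Output shape: (N - ws + 1, T - wt + 1, ws, wt) for an N-by-T input.
    """
    return sliding_window_view(matrix, (ws, wt))

def patch_unfolding(tensor4):
    """Unfold so that each column is one vectorized ws-by-wt patch."""
    n_s, n_t, ws, wt = tensor4.shape
    return tensor4.reshape(n_s * n_t, ws * wt).T

# Toy "speed" matrix: 5 locations by 6 time steps.
speed = np.arange(30, dtype=float).reshape(5, 6)

H = delay_embed(speed, ws=2, wt=3)
U = patch_unfolding(H)
print(H.shape, U.shape)
```

Because neighbouring patches overlap, the same matrix entry appears in several columns of `U`; this repetition is what gives the tensor its Hankel structure.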

preprint, 2022, arXiv

Probabilistic forecasting of bus travel time with a Bayesian Gaussian mixture model

Accurate forecasting of bus travel time and its uncertainty is critical to service quality and operation of transit systems; for example, it can help passengers make better decisions on departure time, route choice, and even transport mode choice, and it can also help transit operators make informed decisions on tasks such as crew/vehicle scheduling and timetabling. However, most existing approaches in bus travel time forecasting are based on deterministic models that provide only point estimation. To this end, we develop in this paper a Bayesian probabilistic forecasting model for bus travel time. To characterize the strong dependencies/interactions between consecutive buses, we concatenate the link travel time vectors and the headway vector from a pair of two adjacent buses as a new augmented variable and model it with a constrained multivariate Gaussian mixture distribution. This approach can naturally capture the interactions between adjacent buses (e.g., correlated speed and smooth variation of headway), handle missing values in data, and depict the multimodality in bus travel time distributions. Next, we assume different periods in a day share the same set of Gaussian components but different mixing coefficients to characterize the systematic temporal variations in bus operation. For model inference, we develop an efficient Markov chain Monte Carlo (MCMC) sampling algorithm to obtain the posterior distributions of model parameters and make probabilistic forecasts. We test the proposed model using the data from a twenty-link bus route in Guangzhou, China. Results show our approach significantly outperforms baseline models that overlook bus-to-bus interactions in terms of both predictive means and distributions. Besides forecasting, the parameters of the proposed model contain rich information for understanding/improving the bus service.

preprint, 2022, arXiv

Real-time forecasting of metro origin-destination matrices with high-order weighted dynamic mode decomposition

Forecasting the short-term ridership among origin-destination pairs (OD matrix) of a metro system is crucial in real-time metro operation. However, this problem is notoriously difficult due to the high-dimensional, sparse, noisy, and skewed nature of OD matrices. This paper proposes a High-order Weighted Dynamic Mode Decomposition (HW-DMD) model for short-term metro OD matrices forecasting. DMD uses Singular Value Decomposition (SVD) to extract low-rank approximation from OD data, and a low-rank high-order vector autoregression model is established for forecasting. To address a practical issue that metro OD matrices cannot be observed in real-time, we use the boarding demand to replace the unavailable OD matrices. Particularly, we consider the time-evolving feature of metro systems and improve the forecast by exponentially reducing the weights for old data. Moreover, we develop a tailored online update algorithm for HW-DMD to update the model coefficients daily without storing historical data or retraining. Experiments on data from a large-scale metro system show the proposed HW-DMD is robust to the noisy and sparse data and significantly outperforms baseline models in forecasting both OD matrices and boarding flow. The online update algorithm also shows consistent accuracy over a long time when maintaining an HW-DMD model at low costs.
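The ideas of down-weighting old data exponentially and updating coefficients without storing history can be sketched with a much simpler model than HW-DMD. The code below is an assumption-laden simplification: a first-order, full-rank linear model x[t+1] ≈ A x[t] fitted by weighted least squares, with no SVD truncation, no high-order lags, and synthetic data. The names `weighted_fit` and `OnlineFit` and the forgetting factor `rho` are hypothetical and are not the paper's algorithm.

```python
import numpy as np

def weighted_fit(X, Y, rho):
    """Batch fit of Y ≈ A X; the column that is j steps old gets weight rho**j."""
    n = X.shape[1]
    w = rho ** np.arange(n - 1, -1, -1)      # newest column has weight 1
    P = (Y * w) @ X.T                        # weighted cross-covariance
    Q = (X * w) @ X.T                        # weighted input covariance
    return np.linalg.solve(Q.T, P.T).T       # A = P Q^-1

class OnlineFit:
    """Same estimate, kept up to date from two running sums only."""

    def __init__(self, dim, rho):
        self.rho = rho
        self.P = np.zeros((dim, dim))
        self.Q = np.zeros((dim, dim))

    def update(self, x, y):
        # Discount the past, then add the newest (input, output) pair.
        self.P = self.rho * self.P + np.outer(y, x)
        self.Q = self.rho * self.Q + np.outer(x, x)

    def coef(self):
        return np.linalg.solve(self.Q.T, self.P.T).T

# Synthetic linear dynamics with noise.
rng = np.random.default_rng(1)
A_true = np.array([[0.9, 0.1, 0.0], [-0.1, 0.9, 0.0], [0.0, 0.0, 0.5]])
states = [rng.normal(size=3)]
for _ in range(300):
    states.append(A_true @ states[-1] + 0.1 * rng.normal(size=3))
S = np.array(states).T                       # shape (3, 301)
X, Y = S[:, :-1], S[:, 1:]

rho = 0.98
A_batch = weighted_fit(X, Y, rho)
online = OnlineFit(3, rho)
for x, y in zip(X.T, Y.T):
    online.update(x, y)
A_online = online.coef()
```

The online version reproduces the batch estimate while keeping only two small matrices in memory, which is the property the abstract highlights for daily updates.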

preprint, 2021, arXiv

Development of water extraction system for liquid scintillator purification of JUNO

The Jiangmen Underground Neutrino Observatory (JUNO) uses 20k tons of liquid scintillator (LS) to detect neutrinos. The content of radioactive substances in the liquid scintillator will affect the experimental results. JUNO will use counter-current water extraction to reduce the radioactive metal ions inside the LS. In this article, the factors that affect the final water extraction, such as the partition coefficient and the factors that affect it, the optimal mass transfer droplet size, the flow rate ratio, the theoretical stage, and the different working modes (LS as continuous phase or dispersed phase), have been studied. We built a counter-current extraction prototype in the laboratory and a pilot plant in Daya Bay. We not only studied the factors above, but also gained considerable engineering experience that is worth sharing.

preprint, 2021, arXiv

Dynamic Spatiotemporal Graph Convolutional Neural Networks for Traffic Data Imputation with Complex Missing Patterns

Missing data is an inevitable and ubiquitous problem for traffic data collection in intelligent transportation systems. Despite extensive research regarding traffic data imputation, there still exist two limitations to be addressed: first, existing approaches fail to capture the complex spatiotemporal dependencies in traffic data, especially the dynamic spatial dependencies evolving with time; second, prior studies mainly focus on randomly missing patterns while other more complex missing scenarios are less discussed. To fill these research gaps, we propose a novel deep learning framework called Dynamic Spatiotemporal Graph Convolutional Neural Networks (DSTGCN) to impute missing traffic data. The model combines the recurrent architecture with graph-based convolutions to model the spatiotemporal dependencies. Moreover, we introduce a graph structure estimation technique to model the dynamic spatial dependencies from real-time traffic information and road network structure. Extensive experiments based on two public traffic speed datasets are conducted to compare our proposed model with state-of-the-art deep learning approaches in four types of missing patterns. The results show that our proposed model outperforms existing deep learning models in all kinds of missing scenarios and the graph structure estimation technique contributes to the model performance. We further compare our proposed model with a tensor factorization model and find distinct behaviors across different model families under different training schemes and data availability.

preprint, 2021, arXiv

Low-Rank Autoregressive Tensor Completion for Spatiotemporal Traffic Data Imputation

Spatiotemporal traffic time series (e.g., traffic volume/speed) collected from sensing systems are often incomplete with considerable corruption and large amounts of missing values, preventing users from harnessing the full power of the data. Missing data imputation has been a long-standing research topic and critical application for real-world intelligent transportation systems. A widely applied imputation method is low-rank matrix/tensor completion; however, the low-rank assumption only preserves the global structure while ignoring the strong local consistency in spatiotemporal data. In this paper, we propose a low-rank autoregressive tensor completion (LATC) framework by introducing temporal variation as a new regularization term into the completion of a third-order (sensor × time of day × day) tensor. The third-order tensor structure allows us to better capture the global consistency of traffic data, such as the inherent seasonality and day-to-day similarity. To achieve local consistency, we design the temporal variation by imposing an AR(p) model for each time series with coefficients as learnable parameters. Different from previous spatial and temporal regularization schemes, the minimization of temporal variation can better characterize temporal generative mechanisms beyond local smoothness, allowing us to deal with more challenging scenarios such as "blackout" missing. To solve the optimization problem in LATC, we introduce an alternating minimization scheme that estimates the low-rank tensor and autoregressive coefficients iteratively. We conduct extensive numerical experiments on several real-world traffic data sets, and our results demonstrate the effectiveness of LATC in diverse missing scenarios.
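One plausible reading of the temporal variation term is the sum of squared AR(p) residuals over all series. The sketch below evaluates that quantity and refits the coefficients by per-series least squares, which corresponds to the coefficient step of an alternating scheme. The function names, the lag set, and the noiseless synthetic series are assumptions for illustration; the low-rank tensor step is not shown.

```python
import numpy as np

def ar_residuals(Z, coef, lags):
    """Residuals z[m, t] - sum_i coef[m, i] * z[m, t - lags[i]] for t >= max lag."""
    lags = np.asarray(lags)
    h, T = lags.max(), Z.shape[1]
    pred = np.zeros((Z.shape[0], T - h))
    for i, lag in enumerate(lags):
        pred += coef[:, [i]] * Z[:, h - lag:T - lag]
    return Z[:, h:] - pred

def temporal_variation(Z, coef, lags):
    """Sum of squared AR residuals over all series and time steps."""
    return float(np.sum(ar_residuals(Z, coef, lags) ** 2))

def fit_ar(Z, lags):
    """Least-squares AR coefficients, estimated separately for each series."""
    lags = np.asarray(lags)
    h, T = lags.max(), Z.shape[1]
    coef = np.zeros((Z.shape[0], len(lags)))
    for m in range(Z.shape[0]):
        design = np.column_stack([Z[m, h - lag:T - lag] for lag in lags])
        coef[m] = np.linalg.lstsq(design, Z[m, h:], rcond=None)[0]
    return coef

# Two noiseless AR(2) series with different (assumed) coefficients.
lags = [1, 2]
coef_true = np.array([[1.5, -0.7], [0.5, 0.3]])
Z = np.zeros((2, 40))
Z[:, 0], Z[:, 1] = [1.0, 2.0], [0.5, -1.0]
for t in range(2, 40):
    Z[:, t] = coef_true[:, 0] * Z[:, t - 1] + coef_true[:, 1] * Z[:, t - 2]

coef_hat = fit_ar(Z, lags)
tv_true = temporal_variation(Z, coef_true, lags)
tv_zero = temporal_variation(Z, np.zeros((2, 2)), lags)
```

On these series the temporal variation is essentially zero at the true coefficients and clearly positive at zero coefficients, which is why minimizing it favours reconstructions that follow each series' own dynamics.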

preprint, 2021, arXiv

Quantifying out-of-station waiting time in oversaturated urban metro systems

Metro systems in megacities such as Beijing, Shenzhen and Guangzhou are under great passenger demand pressure. During peak hours, it is common to see oversaturated conditions (i.e., passenger demand exceeds network capacity), which bring significant operational risks and safety issues. A popular control intervention is to restrict the entering rate during peak hours by setting up out-of-station queueing with crowd control barriers. The out-of-station waiting can make up a substantial proportion of total travel time but is not well-studied in the literature. Accurate quantification of out-of-station waiting time is important to evaluating the social benefit and cost of service scheduling/optimization plans; however, out-of-station waiting time is difficult to estimate because it is not a part of smart card transactions. In this study, we propose an innovative method to estimate the out-of-station waiting time by leveraging the information from a small group of transfer passengers -- those who transfer from nearby bus routes to the metro station. Based on the estimated transfer time for this small group, we first infer the out-of-station waiting time for all passengers by developing a Gaussian Process regression with a Student-t likelihood and then use the estimated out-of-station waiting time to build queueing diagrams. We apply our method to the Tiantongyuan North station of Beijing metro as a case study; our results show that the maximum out-of-station waiting time can reach 15 minutes, and the maximum queue length can be over 3000 passengers. Our results suggest that out-of-station waiting can cause significant travel costs and thus should be considered in analyzing transit performance, mode choice, and social benefits. To the best of our knowledge, this paper is the first quantitative study for out-of-station waiting time.

preprint, 2021, arXiv

Robust Dynamic Bus Control: A Distributional Multi-agent Reinforcement Learning Approach

The bus system is a critical component of sustainable urban transportation. However, the operation of a bus fleet is unstable in nature, and bus bunching has become a common phenomenon that undermines the efficiency and reliability of bus systems. Recent research has demonstrated the promising application of multi-agent reinforcement learning (MARL) to achieve efficient vehicle holding control to avoid bus bunching. However, existing studies essentially overlook the robustness issue resulting from various events, perturbations and anomalies in a transit system, which is of utmost importance when transferring the models for real-world deployment/application. In this study, we integrate an implicit quantile network and meta-learning to develop a distributional MARL framework -- IQNC-M -- to learn continuous control. The proposed IQNC-M framework achieves efficient and reliable control decisions by better handling various uncertainties/events in real-time transit operations. Specifically, we introduce an interpretable meta-learning module to incorporate global information into the distributional MARL framework, which is an effective solution to circumvent the credit assignment issue in the transit system. In addition, we design a specific learning procedure to train each agent within the framework to pursue a robust control policy. We develop simulation environments based on real-world bus services and passenger demand data and evaluate the proposed framework against both traditional holding control models and state-of-the-art MARL models. Our results show that the proposed IQNC-M framework can effectively handle various extreme events, such as traffic state perturbations, service interruptions, and demand surges, thus improving both efficiency and reliability of the system.

preprint, 2020, arXiv

A Nonconvex Low-Rank Tensor Completion Model for Spatiotemporal Traffic Data Imputation

Sparsity and missing data problems are very common in spatiotemporal traffic data collected from various sensing systems. Accurate imputation is critical to many applications in intelligent transportation systems. In this paper, we formulate the missing data imputation problem in spatiotemporal traffic data in a low-rank tensor completion (LRTC) framework and define a novel truncated nuclear norm (TNN) on traffic tensors of location × day × time of day. In particular, we introduce a universal rate parameter to control the degree of truncation on all tensor modes in the proposed LRTC-TNN model, and this allows us to better characterize the hidden patterns in spatiotemporal traffic data. Based on the framework of the Alternating Direction Method of Multipliers (ADMM), we present an efficient algorithm to obtain the optimal solution for each variable. We conduct numerical experiments on four spatiotemporal traffic data sets, and our results show that the proposed LRTC-TNN model outperforms many state-of-the-art imputation models under various missing rates/patterns. Moreover, the proposed model also outperforms other baseline models in extreme missing scenarios.
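The truncated nuclear norm of a matrix is the sum of its singular values after the r largest are dropped. A minimal sketch of how a single rate parameter could set the truncation for every mode of a tensor follows. The rule r = ceil(theta * min(rows, cols)), the unweighted sum over modes, and the function names are assumptions for illustration rather than the paper's exact definition; the ADMM solver is not shown.

```python
import math
import numpy as np

def unfold(tensor, mode):
    """Mode-k unfolding: rows index the chosen mode, columns all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def truncated_nuclear_norm(matrix, r):
    """Sum of singular values after discarding the r largest."""
    s = np.linalg.svd(matrix, compute_uv=False)   # sorted in descending order
    return float(s[r:].sum())

def tensor_tnn(tensor, theta):
    """Sum of truncated nuclear norms over all unfoldings, with one shared rate."""
    total = 0.0
    for mode in range(tensor.ndim):
        mat = unfold(tensor, mode)
        r = math.ceil(theta * min(mat.shape))     # assumed truncation rule
        total += truncated_nuclear_norm(mat, r)
    return total

# A 4 x 5 x 6 tensor built from two rank-one terms, so every unfolding has rank <= 2.
rng = np.random.default_rng(0)
a, b, c = rng.normal(size=(2, 4)), rng.normal(size=(2, 5)), rng.normal(size=(2, 6))
low_rank = np.einsum("ri,rj,rk->ijk", a, b, c)

tnn_half = tensor_tnn(low_rank, theta=0.5)   # drops at least 2 values per mode
tnn_full = tensor_tnn(low_rank, theta=0.0)   # plain sum of nuclear norms
```

With `theta=0.5` the penalty on this low-rank tensor vanishes up to rounding, while `theta=0.0` still penalizes the dominant singular values; leaving the leading components unpenalized is the motivation for truncation.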

preprint, 2020, arXiv

Efficient Motion Planning for Automated Lane Change based on Imitation Learning and Mixed-Integer Optimization

Intelligent motion planning is one of the core components of automated vehicles and has received extensive interest. Traditional motion planning methods suffer from several drawbacks in terms of optimality, efficiency and generalization capability. Sampling-based methods cannot guarantee the optimality of the generated trajectories, whereas optimization-based methods cannot perform motion planning in real time and are limited by the simplified formalization. In this work, we propose a learning-based approach to handle those shortcomings. Mixed Integer Quadratic Problem based optimization (MIQP) is used to generate the optimal lane-change trajectories, which serve as the training dataset for learning-based action generation algorithms. A hierarchical supervised learning model is devised to make fast lane-change decisions. Numerous experiments have been conducted to evaluate the optimality, efficiency, and generalization capability of the proposed approach. The experimental results indicate that the proposed model outperforms several commonly used motion planning baselines.

preprint, 2020, arXiv

Incremental Bayesian tensor learning for structural monitoring data imputation and response forecasting

There has been increased interest in missing sensor data imputation, which is ubiquitous in the field of structural health monitoring (SHM) due to discontinuous sensing caused by sensor malfunction. To address this fundamental issue, this paper presents an incremental Bayesian tensor learning method for reconstruction of spatiotemporal missing data in SHM and forecasting of structural response. In particular, a spatiotemporal tensor is first constructed followed by Bayesian tensor factorization that extracts latent features for missing data imputation. To enable structural response forecasting based on incomplete sensing data, the tensor decomposition is further integrated with vector autoregression in an incremental learning scheme. The performance of the proposed approach is validated on continuous field-sensing data (including strain and temperature records) of a concrete bridge, based on the assumption that strain time histories are highly correlated to temperature recordings. The results indicate that the proposed probabilistic tensor learning approach is accurate and robust even in the presence of large rates of random missing, structured missing and their combination. The effect of rank selection on the imputation and prediction performance is also investigated. The results show that a better estimation accuracy can be achieved with a higher rank for random missing whereas a lower rank for structured missing.

preprint, 2020, arXiv

Industrial Topics in Urban Labor System

Categorization is an essential component for us to understand the world for ourselves and to communicate it collectively. It is therefore important to recognize that classification systems are not necessarily static, especially for economic systems, and even more so in urban areas where most innovation takes place and is implemented. Out-of-date classification systems would potentially limit further understanding of the current economy because things constantly change. Here, we develop an occupation-based classification system for the US labor economy, called industrial topics, that satisfies adaptability and representability. By leveraging the distributions of occupations across the US urban areas, we identify industrial topics - clusters of occupations based on their co-existence pattern. Industrial topics indicate the mechanisms under the systematic allocation of different occupations. Considering the densely connected occupations as an industrial topic, our approach characterizes regional economies by their topical composition. Unlike the existing survey-based top-down approach, our method provides timely information about the underlying structure of the regional economy, which is critical for policymakers and business leaders, especially in our fast-changing economy.

preprint, 2020, arXiv

Low-Rank Autoregressive Tensor Completion for Multivariate Time Series Forecasting

Time series prediction has been a long-standing research topic and an essential application in many domains. Modern time series collected from sensor networks (e.g., energy consumption and traffic flow) are often large-scale and incomplete with considerable corruption and missing values, making it difficult to perform accurate predictions. In this paper, we propose a low-rank autoregressive tensor completion (LATC) framework to model multivariate time series data. The key idea of LATC is to transform the original multivariate time series matrix (e.g., sensor × time point) to a third-order tensor structure (e.g., sensor × time of day × day) by introducing an additional temporal dimension, which allows us to model the inherent rhythms and seasonality of time series as global patterns. With the tensor structure, we can transform the time series prediction and missing data imputation problems into a universal low-rank tensor completion problem. Besides minimizing tensor rank, we also integrate a novel autoregressive norm on the original matrix representation into the objective function. The two components serve different roles. The low-rank structure allows us to effectively capture the global consistency and trends across all the three dimensions (i.e., similarity among sensors, similarity of different days, and current time vs. the same time of historical days). The autoregressive norm can better model the local temporal trends. Our numerical experiments on three real-world data sets demonstrate the superiority of the integration of global and local trends in LATC in both missing data imputation and rolling prediction tasks.
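The matrix-to-tensor transformation mentioned above amounts to folding the time axis into (time of day, day). A minimal sketch follows, assuming the series length is an exact multiple of the number of steps per day and that time points are ordered day by day; the function names are hypothetical.

```python
import numpy as np

def to_tensor(matrix, steps_per_day):
    """Fold (sensor, time point) into (sensor, time of day, day)."""
    n_sensors, n_points = matrix.shape
    if n_points % steps_per_day != 0:
        raise ValueError("series length must be a multiple of steps_per_day")
    n_days = n_points // steps_per_day
    # Consecutive time points within a day become the 'time of day' mode.
    return matrix.reshape(n_sensors, n_days, steps_per_day).transpose(0, 2, 1)

def to_matrix(tensor):
    """Inverse of to_tensor: unfold back to (sensor, time point)."""
    n_sensors, steps_per_day, n_days = tensor.shape
    return tensor.transpose(0, 2, 1).reshape(n_sensors, n_days * steps_per_day)

# Toy data: 3 sensors, 4 steps per day, 5 days.
steps_per_day, n_days = 4, 5
matrix = np.arange(3 * steps_per_day * n_days, dtype=float).reshape(3, -1)
tensor = to_tensor(matrix, steps_per_day)
print(tensor.shape)
```

Entry `tensor[s, tod, day]` equals `matrix[s, day * steps_per_day + tod]`, so each frontal slice holds one day and similarity across days shows up as low rank along the last mode.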

preprint, 2020, arXiv

Scaling of contact networks for epidemic spreading in urban transit systems

Improved mobility not only contributes to more intensive human activities but also facilitates the spread of communicable disease, thus constituting a major threat to billions of urban commuters. In this study, we present a multi-city investigation of communicable diseases percolating among metro travelers. We use smart card data from three megacities in China to construct individual-level contact networks, based on which the spread of disease is modeled and studied. We observe that, though differing in urban forms, network layouts, and mobility patterns, the metro systems of the three cities share similar contact network structures. This motivates us to develop a universal generation model that captures the distributions of the number of contacts as well as the contact duration among individual travelers. This model explains how the structural properties of the metro contact network are associated with the risk level of communicable diseases. Our results highlight the vulnerability of urban mass transit systems during disease outbreaks and suggest important planning and operation strategies for mitigating the risk of communicable diseases.