Source author record

Junwei Ma

Junwei Ma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence; Computation and Language; Computer Vision; eess.SY; Machine Learning; physics.soc-ph; Social and Information Networks; Systems and Control

Catalog footprint

What is connected

5 works

8 topics

4 close collaborators


Preprint, 2026, arXiv

DisastQA: A Comprehensive Benchmark for Evaluating Question Answering in Disaster Management

Accurate question answering (QA) in disaster management requires reasoning over uncertain and conflicting information, a setting poorly captured by existing benchmarks built on clean evidence. We introduce DisastQA, a large-scale benchmark of 3,000 rigorously verified questions (2,000 multiple-choice and 1,000 open-ended) spanning eight disaster types. The benchmark is constructed via a human-LLM collaboration pipeline with stratified sampling to ensure balanced coverage. Models are evaluated under varying evidence conditions, from closed-book to noisy evidence integration, enabling separation of internal knowledge from reasoning under imperfect information. For open-ended QA, we propose a human-verified keypoint-based evaluation protocol emphasizing factual completeness over verbosity. Experiments with 20 models reveal substantial divergences from general-purpose leaderboards such as MMLU-Pro. While recent open-weight models approach proprietary systems in clean settings, performance degrades sharply under realistic noise, exposing critical reliability gaps for disaster response. All code, data, and evaluation resources are available at https://github.com/TamuChen18/DisastQA_open.

Preprint, 2026, arXiv

TabDPT: Scaling Tabular Foundation Models on Real Data

Tabular data is one of the most ubiquitous sources of information worldwide, spanning a wide variety of domains. This inherent heterogeneity has slowed the development of Tabular Foundation Models (TFMs) capable of fast generalization to unseen datasets. In-Context Learning (ICL) has recently emerged as a promising solution for TFMs, enabling dynamic adaptation to new tasks without additional tuning. While many studies have attempted to re-purpose large language models for tabular ICL, they have had limited success, so recent works have focused on developing tabular-specific foundation models. In this work, we propose an approach to combine ICL-based retrieval with self-supervised learning to train tabular foundation models. We also investigate the utility of real vs. synthetic data for model pre-training, and show that real data can contain useful signal not easily captured in synthetic training. Specifically, we show that incorporating real data during the pre-training phase can lead to significantly faster training and better downstream generalization to unseen data. Our resulting model, TabDPT, achieves strong performance on both regression (CTR23) and classification (CC18) benchmarks. Importantly, we also demonstrate that with our pre-training procedure, scaling both model and data size leads to consistent performance improvements that follow power laws. This echoes scaling laws in LLMs and other foundation models, and suggests that large-scale TFMs can be achievable. We open-source our full pipeline: inference code including trained model weights can be found at github.com/layer6ai-labs/TabDPT-inference, and the training code to reproduce experiments can be found at github.com/layer6ai-labs/TabDPT-training.

Preprint, 2022, arXiv

Characterizing Urban Lifestyle Signatures Using Motif Properties in Network of Places

The lifestyles of urban dwellers could reveal important insights regarding the dynamics and complexity of cities. Despite growing research on analysis of lifestyle patterns in cities, little is known about the characteristics of people's lifestyle patterns at urban scale. This limitation is primarily due to challenges in characterizing lifestyle patterns when human movement data is aggregated to protect the privacy of users. In this study, we model cities based on aggregated human visitation data to construct a network of places. We then examine the subgraph signatures in the networks of places to map and characterize lifestyle patterns at city scale. Location-based data from Harris County, Dallas County, New York County, and Broward County in the United States were examined to reveal lifestyle signatures in cities. For the motif analysis, two-node, three-node, and four-node motifs without location attributes were first extracted from human visitation networks. Next, homogenized nodes in motifs were encoded with location categories from NAICS codes. Multiple statistical measures, including network metrics and motif properties, were quantified to characterize lifestyle signatures. The results show that: people's lifestyles in urban environments can be well depicted and quantified based on distribution and attributes of motifs in networks of places; motifs in networks of places show stability in quantity and distance as well as periodicity on weekends and weekdays, indicating the stability of lifestyle patterns in cities; and human visitation networks and lifestyle patterns show similarities across different metropolitan areas, implying the universality of lifestyle signatures across cities. The findings provide deeper insights into urban lifestyle signatures and support data-informed urban planning and management.

Preprint, 2022, arXiv

Quantitative Measures for Integrating Resilience into Transportation Planning Practice: Study in Texas

The objective of this study is to propose a system-level framework with quantitative measures to assess the resilience of road networks. The framework can help transportation agencies proactively incorporate resilience considerations into project development and effectively understand the resilience performance of current road networks. This study identified and implemented four quantitative metrics to classify the criticality of road segments based on critical dimensions of road network resilience, and two integrated metrics were proposed to combine all metrics to show the overall resilience performance of road segments. A case study was conducted on the Texas road networks to demonstrate the effectiveness of implementing this framework in a practical scenario. Since the data used in this study is available to other states and countries, the framework presented in this study can be adopted by other transportation agencies across the globe for regional transportation resilience assessments.

Preprint, 2022, arXiv

X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval

In text-video retrieval, the objective is to learn a cross-modal similarity function between a text and a video that ranks relevant text-video pairs higher than irrelevant pairs. However, videos inherently express a much wider gamut of information than texts. Instead, texts often capture sub-regions of entire videos and are most semantically similar to certain frames within videos. Therefore, for a given text, a retrieval model should focus on the text's most semantically similar video sub-regions to make a more relevant comparison. Yet, most existing works aggregate entire videos without directly considering text. Common text-agnostic aggregation schemes include mean-pooling or self-attention over the frames, but these are likely to encode misleading visual information not described in the given text. To address this, we propose a cross-modal attention model called X-Pool that reasons between a text and the frames of a video. Our core mechanism is a scaled dot product attention for a text to attend to its most semantically similar frames. We then generate an aggregated video representation conditioned on the text's attention weights over the frames. We evaluate our method on three benchmark datasets, MSR-VTT, MSVD and LSMDC, achieving new state-of-the-art results by up to 12% in relative improvement in Recall@1. Our findings thereby highlight the importance of joint text-video reasoning to extract important visual cues according to text. Full code and demo can be found at: https://layer6ai-labs.github.io/xpool/
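
The text-conditioned pooling idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the text and frame embeddings already live in the same space, omits any learned projections or normalization layers, and the function name `text_conditioned_pool` and the toy vectors are hypothetical.

```python
import math


def text_conditioned_pool(text_vec, frame_vecs):
    """Pool frame embeddings using attention weights from one text embedding.

    Illustrative only: the text acts as the query and each frame acts as
    both key and value, with no learned projections.
    """
    d = len(text_vec)
    # Scaled dot-product score between the text and each frame.
    scores = [
        sum(t * f for t, f in zip(text_vec, frame)) / math.sqrt(d)
        for frame in frame_vecs
    ]
    # Numerically stable softmax over the frames.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of frames: the text-conditioned video embedding.
    pooled = [
        sum(w * frame[i] for w, frame in zip(weights, frame_vecs))
        for i in range(d)
    ]
    return weights, pooled


# Toy example: three 2-d "frames" and a text that aligns with the first one.
frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
text = [2.0, 0.0]
weights, pooled = text_conditioned_pool(text, frames)
print(weights, pooled)
```

In contrast to mean-pooling, which would weight all three frames equally, the frame most similar to the text receives the largest weight here, so the pooled video vector leans toward the content the text describes.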
