Researcher profile

Chao Fan

Chao Fan contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
11 works
0 followers
8 topics
4 close collaborators

Published work

11 published items

preprint · 2022 · arXiv

FMP: Toward Fair Graph Message Passing against Topology Bias

Despite recent advances in achieving fair representations and predictions through regularization, adversarial debiasing, and contrastive learning in graph neural networks (GNNs), how the working mechanism behind GNNs (i.e., message passing) induces unfairness remains unknown. In this work, we theoretically and experimentally demonstrate that representative aggregation in message-passing schemes accumulates bias in node representations due to topology bias induced by graph topology. Thus, a Fair Message Passing (FMP) scheme is proposed to aggregate useful information from neighbors while minimizing the effect of topology bias, in a unified framework that considers graph smoothness and fairness objectives. The proposed FMP is effective, transparent, and compatible with back-propagation training. An acceleration approach for gradient calculation is also adopted to improve algorithm efficiency. Experiments on node classification tasks demonstrate that the proposed FMP outperforms the state-of-the-art baselines in effectively and efficiently mitigating bias on three real-world datasets.

preprint · 2022 · arXiv

GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality

Gait is one of the most promising biometrics for identifying individuals at a long distance. Although most previous methods have focused on recognizing silhouettes, several end-to-end methods that extract gait features directly from RGB images perform better. However, we demonstrate that these end-to-end methods may inevitably suffer from gait-irrelevant noise, i.e., low-level texture and color information. Experimentally, we design a cross-domain evaluation to support this view. In this work, we propose a novel end-to-end framework named GaitEdge which can effectively block gait-irrelevant information and release end-to-end training potential. Specifically, GaitEdge synthesizes the output of the pedestrian segmentation network and then feeds it to the subsequent recognition network, where the synthetic silhouettes consist of trainable edges of bodies and fixed interiors to limit the information that the recognition network receives. In addition, GaitAlign, a module for aligning silhouettes, is embedded into GaitEdge without losing differentiability. Experimental results on CASIA-B and our newly built TTG-200 indicate that GaitEdge significantly outperforms the previous methods and provides a more practical end-to-end paradigm. All the source code is available at https://github.com/ShiqiYu/OpenGait.

preprint · 2022 · arXiv

Human Mobility Disproportionately Extends PM2.5 Emission Exposure for Low Income Populations

Ambient exposure to fine particulate matter with diameters smaller than 2.5 μm (PM2.5) has been identified as a critical cause of respiratory disease. Disparities in exposure to PM2.5 among income groups at individual residences are known to exist and are easy to calculate. Existing approaches for exposure assessment, however, do not capture the exposure implied by the dynamic mobility of city dwellers, which accounts for a large proportion of the exposure outside homes. To overcome the challenge of gauging the exposure to PM2.5 for city dwellers, we analyzed billions of anonymized and privacy-enhanced location-based data points generated by mobile phone users in Harris County, Texas, to characterize the mobility patterns of the populations and the associated exposure. We introduce a metric for exposure extent based on the time people spend at places with the air pollutant and examine the disparities in mobility-based exposure across income groups. Our results show that PM2.5 emissions disproportionately expose low-income populations due to their mobility activities. People with higher-than-average income are exposed to lower levels of PM2.5 emissions. These disparities in mobility-based exposure are the result of frequent visits of low-income people to the industrial sectors of urban areas with high PM2.5 emissions, and the larger mobility scale of these people for life needs. The results inform environmental justice and public health strategies, not only to reduce the overall PM2.5 exposure but also to mitigate the disproportionate impacts on low-income populations. The findings also suggest that integrating extensive fine-scale population mobility and pollution emissions data can unveil new insights into inequality in air pollution exposure at the urban scale.

preprint · 2022 · arXiv

Quantitative Measures for Integrating Resilience into Transportation Planning Practice: Study in Texas

The objective of this study is to propose a system-level framework with quantitative measures to assess the resilience of road networks. The framework proposed in this paper can help transportation agencies proactively incorporate resilience considerations into project development and effectively understand the resilience performance of current road networks. This study identified and implemented four quantitative metrics to classify the criticality of road segments based on critical dimensions of road network resilience, and two integrated metrics were proposed to combine all metrics to show the overall resilience performance of road segments. A case study was conducted on the Texas road networks to demonstrate the effectiveness of implementing this framework in a practical scenario. Since the data used in this study are available to other states and countries, the framework presented in this study can be adopted by other transportation agencies across the globe for regional transportation resilience assessments.

preprint · 2021 · arXiv

Unraveling the Dynamic Importance of County-level Features in Trajectory of COVID-19

The objective of this study was to investigate the importance of multiple county-level features in the trajectory of COVID-19. We examined feature importance across 2,787 counties in the United States using a data-driven machine learning model. We trained random forest models using 23 features representing six key influencing factors affecting pandemic spread: social demographics of counties, population activities, mobility within the counties, movement across counties, disease attributes, and social network structure. Also, we categorized counties into multiple groups according to their population densities, and we divided the trajectory of COVID-19 into three stages: the outbreak stage, the social distancing stage, and the reopening stage. The study aims to answer two research questions: (1) the extent to which the importance of heterogeneous features evolves in different stages; (2) the extent to which the importance of heterogeneous features varies across counties with different characteristics. We fitted a set of random forest models to determine weekly feature importance. The results showed that: (1) social demographic features, such as gross domestic product, population density, and minority status, remained high-importance features throughout the stages of COVID-19 across the 2,787 studied counties; (2) within-county mobility features had the highest importance in county clusters with higher population densities; (3) the feature reflecting the social network structure (the Facebook social connectedness index) had higher importance in the models for counties with higher population densities. The results show that data-driven machine learning models could provide important insights to inform policymakers regarding feature importance for counties with various population densities and in different stages of a pandemic life cycle.

preprint · 2020 · arXiv

A Network Percolation-based Contagion Model of Flood Propagation and Recession in Urban Road Networks

In this study, we propose a contagion model as a simple and powerful mathematical approach for predicting the spatial spread and temporal evolution of the onset and recession of flood waters in urban road networks. A network of urban roads resilient to flooding events is essential for provision of public services and for emergency response. The spread of floodwaters in urban networks is a complex spatial-temporal phenomenon. This study presents a mathematical contagion model to describe the spatial-temporal spread and recession process of flood waters in urban road networks. The evolution of floods within networks can be captured based on three macroscopic characteristics: flood propagation rate (β), flood incubation rate (α), and recovery rate (μ), in a system of ordinary differential equations analogous to the Susceptible-Exposed-Infected-Recovered (SEIR) model. We integrated the flood contagion model with the network percolation process, in which the probability of flooding of a road segment depends on the degree to which the nearby road segments are flooded. The application of the proposed model was verified using high-resolution historical data of road flooding in Harris County during Hurricane Harvey in 2017. The results show that the model can monitor and predict the fraction of flooded roads over time. Additionally, the proposed model can achieve 90% precision and recall for the spatial spread of the flooded roads at the majority of tested time intervals. The findings suggest that the proposed mathematical contagion model offers great potential to support emergency managers, public officials, citizens, first responders, and other decision makers in flood forecasting for road networks.
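
The abstract above describes an ordinary differential equation system analogous to SEIR, driven by a propagation rate β, an incubation rate α, and a recovery rate μ. The following is a minimal sketch of the textbook SEIR equations on fractions of road segments, integrated with a forward-Euler step. It is not the paper's exact formulation: the function name `simulate_seir`, the initial fractions, the step size, and the parameter values are all illustrative assumptions, and the network percolation component is not modeled.

```python
def simulate_seir(beta, alpha, mu, s0=0.99, e0=0.0, i0=0.01, r0=0.0,
                  dt=0.01, steps=1000):
    """Forward-Euler integration of the textbook SEIR equations.

    State variables are fractions of road segments (assumed mapping):
      s: not yet affected, e: exposed (water arriving),
      i: flooded, r: recovered (water receded).
    Returns a list of (s, e, i, r) tuples, one per time step.
    """
    s, e, i, r = s0, e0, i0, r0
    history = [(s, e, i, r)]
    for _ in range(steps):
        newly_exposed = beta * s * i    # S -> E, driven by flooded fraction
        newly_flooded = alpha * e       # E -> I, after the incubation delay
        newly_recovered = mu * i        # I -> R, flood recession
        s -= dt * newly_exposed
        e += dt * (newly_exposed - newly_flooded)
        i += dt * (newly_flooded - newly_recovered)
        r += dt * newly_recovered
        history.append((s, e, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to produce a visible curve.
    trajectory = simulate_seir(beta=1.5, alpha=0.8, mu=0.3)
    peak_flooded = max(state[2] for state in trajectory)
    print(f"peak flooded fraction: {peak_flooded:.3f}")
```

Because each transition term is subtracted from one compartment and added to the next, the four fractions always sum to their initial total, which is a quick sanity check on any variant of this sketch.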

preprint · 2020 · arXiv

Adaptive Reinforcement Learning Model for Simulation of Urban Mobility during Crises

The objective of this study is to propose and test an adaptive reinforcement learning model that can learn the patterns of human mobility in a normal context and simulate the mobility during perturbations caused by crises, such as flooding, wildfire, and hurricanes. Understanding and predicting human mobility patterns, such as destination and trajectory selection, can inform responses to emerging congestion and road closures caused by disruptions in emergencies. Data related to human movement trajectories are scarce, especially in the context of emergencies, which places a limitation on applications of existing urban mobility models learned from empirical data. Models that can learn mobility patterns from data generated in normal situations and adapt to emergency situations are needed to inform emergency response and urban resilience assessments. To address this gap, this study creates and tests an adaptive reinforcement learning model that can predict the destinations of movements, estimate the trajectory for each origin and destination pair, and examine the impact of perturbations on humans' decisions related to destinations and movement trajectories. The application of the proposed model is shown in the context of Houston and the flooding scenario caused by Hurricane Harvey in August 2017. The results show that the model can achieve more than 76% precision and recall. The results also show that the model could predict traffic patterns and congestion resulting from urban flooding. The outcomes of the analysis demonstrate the capabilities of the model for analyzing urban mobility during crises, which can inform the public and decision-makers about response strategies and resilience planning to reduce the impacts of crises on urban mobility.

preprint · 2020 · arXiv

DeepCOVIDNet: An Interpretable Deep Learning Model for Predictive Surveillance of COVID-19 Using Heterogeneous Features and their Interactions

In this paper, we propose a deep learning model to forecast the range of increase in COVID-19 infected cases in future days and we present a novel method to compute equidimensional representations of multivariate time series and multivariate spatial time series data. Using this novel method, the proposed model can both take in a large number of heterogeneous features, such as census data, intra-county mobility, inter-county mobility, social distancing data, past growth of infection, among others, and learn complex interactions between these features. Using data collected from various sources, we estimate the range of increase in infected cases seven days into the future for all U.S. counties. In addition, we use the model to identify the most influential features for prediction of the growth of infection. We also analyze pairs of features and estimate the amount of observed second-order interaction between them. Experiments show that the proposed model obtains satisfactory predictive performance and fairly interpretable feature analysis results; hence, the proposed model could complement the standard epidemiological models for national-level surveillance of pandemics, such as COVID-19. The results and findings obtained from the deep learning model could potentially inform policymakers and researchers in devising effective mitigation and response strategies. To fast-track further development and experimentation, the code used to implement the proposed model has been made fully open source.

preprint · 2020 · arXiv

Disparate Patterns of Movements and Visits to Points of Interests Located in Urban Hotspots across U.S. Metropolitan Cities during COVID-19

We examined the effect of social distancing on changes in visits to urban hotspot points of interest. Urban hotspots, such as central business districts, are gravity activity centers orchestrating movement and mobility patterns in cities. In a pandemic situation, urban hotspots could be potential superspreader areas, as visits to urban hotspots can increase the risk of contact and transmission of a disease among a population. We mapped origin-destination networks from census block groups to points of interest (POIs) in sixteen cities in the United States. We adopted a coarse-grain approach to study movement patterns of visits to POIs among the hotspots and non-hotspots from January to May 2020. Also, we conducted chi-square tests to identify POIs with significant flux-in changes during the analysis period. The results showed disparate patterns across cities in terms of reduction in POI visits to hotspot areas. The sixteen cities are divided into two categories based on visits to POIs in hotspot areas. In one category, which includes the cities of San Francisco, Seattle, and Chicago, we observe a considerable decrease in visits to POIs in hotspot areas, while in another category, including the cities of Austin, Houston, and San Diego, the visits to hotspot areas did not greatly decrease during the social distancing period. In addition, while all the cities exhibited overall decreasing visits to POIs, one category maintained the proportion of visits to POIs in the hotspots. The proportion of visits to some POIs (e.g., Restaurant and Other Eating Places) remained stable during the social distancing period, while some POIs had an increased proportion of visits (e.g., Grocery Stores). The findings highlight that social distancing orders do yield disparate patterns of reduction in movements to hotspot POIs.

preprint · 2020 · arXiv

Early Indicators of COVID-19 Spread Risk Using Digital Trace Data of Population Activities

The spread of pandemics such as COVID-19 is strongly linked to human activities. The objective of this paper is to specify and examine early indicators of disease spread risk in cities during the initial stages of an outbreak based on patterns of human activities obtained from digital trace data. In this study, the Venables distance (D_v) and the activity density (D_a) are used to quantify and evaluate human activities for 193 US counties whose cumulative number of confirmed cases was greater than 100 as of March 31, 2020. Venables distance provides a measure of the agglomeration of human activities based on the average distance of human activities across a city or a county (less distance could lead to a greater contact risk). Activity density provides a measure of the overall activity level in a county or a city (more activity could lead to a greater risk). Accordingly, Pearson correlation analysis is used to examine the relationship between the two human activity indicators and the basic reproduction number in the following weeks. The results show statistically significant correlations between the indicators of human activities and the basic reproduction number in all counties, as well as a significant leader-follower relationship (time lag) between them. The results also show a lag of one to two weeks between the change in activity indicators and the decrease in the basic reproduction number. This result implies that the human activity indicators provide effective early indicators for the spread risk of the pandemic during the early stages of the outbreak. Hence, the results could be used by the authorities to proactively assess the risk of disease spread by monitoring the daily Venables distance and activity density.
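
The leader-follower (time lag) analysis mentioned in the abstract above can be illustrated with a lagged Pearson correlation. The sketch below is a generic illustration, not the authors' code: the function names `pearson`, `lagged_correlations`, and `best_lag`, and the toy series, are assumptions made for this example. It correlates an indicator at time t with an outcome at time t + lag and reports the lag with the strongest absolute correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlations(indicator, outcome, max_lag):
    """Correlate indicator[t] with outcome[t + lag] for lag = 0..max_lag."""
    if len(indicator) != len(outcome):
        raise ValueError("series must have the same length")
    result = {}
    for lag in range(max_lag + 1):
        leading = indicator[:len(indicator) - lag]  # drop the last `lag` points
        following = outcome[lag:]                   # drop the first `lag` points
        result[lag] = pearson(leading, following)
    return result


def best_lag(correlations):
    """Lag whose correlation has the largest absolute value."""
    return max(correlations, key=lambda lag: abs(correlations[lag]))


if __name__ == "__main__":
    # Toy weekly series (hypothetical): the outcome repeats the indicator
    # two steps later, so the strongest correlation should appear at lag 2.
    indicator = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
    outcome = [0, 0] + indicator[:-2]
    correlations = lagged_correlations(indicator, outcome, max_lag=3)
    print(best_lag(correlations), correlations)
```

Each additional lag shortens the overlapping window by one observation, so correlations at large lags rest on fewer points and should be read with more caution.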

preprint · 2020 · arXiv

Effects of Population Co-location Reduction on Cross-county Transmission Risk of COVID-19 in the United States

The rapid spread of COVID-19 in the United States has posed a major threat to public health, the real economy, and human well-being. In the absence of effective vaccines, the preventive actions of social distancing and travel reduction are recognized as essential non-pharmacologic approaches to control the spread of COVID-19. Prior studies demonstrated that human movement and mobility drove the spatiotemporal distribution of COVID-19 in China. Little is known, however, about the patterns and effects of co-location reduction on cross-county transmission risk of COVID-19. This study utilizes Facebook co-location data for all counties in the United States from March to early May 2020. The analysis examines the synchronicity and time lag between travel reduction and pandemic growth trajectory to evaluate the efficacy of social distancing in reducing population co-location probabilities, and subsequently the growth in weekly new cases. The results show that the mitigation effects of co-location reduction appear in the growth of weekly new cases with one week of delay. Furthermore, significant segregation is found among different county groups, which are categorized based on numbers of cases. The results suggest that within-group co-location probabilities remain stable, and social distancing policies primarily resulted in reduced cross-group co-location probabilities (due to travel reduction from counties with large numbers of cases to counties with low numbers of cases). These findings could have important practical implications for local governments to inform their intervention measures for monitoring and reducing the spread of COVID-19, as well as for adoption in future pandemics. Public policy, economic forecasting, and epidemic modeling need to account for population co-location patterns in evaluating transmission risk of COVID-19 across counties.