Researcher profile

Bilal Farooq

Bilal Farooq contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
16 works
0 followers
13 topics
4 close collaborators

Published work

16 published items

preprint · 2023 · arXiv

Attention-LSTM for Multivariate Traffic State Prediction on Rural Roads

Accurate prediction of traffic volume and speed has a wide range of applications in transportation. It can provide useful and timely information for both travellers and transportation decision-makers. In this study, an Attention-based Long Short-Term Memory model (A-LSTM) is proposed to simultaneously predict traffic volume and speed on a critical rural road segment that connects Tehran to Chalus, the most popular tourist destination city in Iran. Moreover, this study compares the results of the A-LSTM model with those of the Long Short-Term Memory (LSTM) model. Both models show acceptable performance in predicting speed and flow. However, the A-LSTM model outperforms the LSTM at 5- and 15-minute intervals. In contrast, there is no meaningful difference between the two models for the 30-minute time interval. Comparing the performance of the models across time horizons, the 15-minute horizon model outperforms the others by reaching the lowest Mean Square Error (MSE) loss of 0.0032, followed by the 30- and 5-minute horizons with 0.004 and 0.0051, respectively. In addition, this study compares the results of the models based on two transformations of temporal categorical input variables, one-hot or cyclic, for the 15-minute time interval. The results demonstrate that both LSTM and A-LSTM with cyclic feature encoding outperform those with one-hot feature encoding.
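The difference between the two temporal encodings mentioned in this abstract can be illustrated with a minimal sketch. The function names and the choice of hour-of-day as the feature are assumptions made for illustration; the abstract does not specify which temporal variables were encoded or how.

```python
import math

def cyclic_encode(value, period):
    """Map a periodic value (e.g. hour 0..23) to a (sin, cos) point on the unit circle."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)

def one_hot_encode(value, period):
    """Map a periodic value to a 0/1 vector with a single 1 at its index."""
    vec = [0] * period
    vec[value % period] = 1
    return vec

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical example: hour-of-day with a 24-hour period.
# Under cyclic encoding, 23:00 is as close to 00:00 as 00:00 is to 01:00.
d_wrap = euclidean(cyclic_encode(23, 24), cyclic_encode(0, 24))
d_next = euclidean(cyclic_encode(0, 24), cyclic_encode(1, 24))

# Under one-hot encoding, every pair of distinct hours is equally far apart.
d_onehot_near = euclidean(one_hot_encode(0, 24), one_hot_encode(1, 24))
d_onehot_far = euclidean(one_hot_encode(0, 24), one_hot_encode(12, 24))
```

Cyclic encoding keeps neighbouring time slots close together, including across the wrap-around at midnight, and uses two input dimensions instead of one per category. This may be one reason it helps a sequence model, although the abstract reports only the empirical result.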

preprint · 2022 · arXiv

Distributed Ride-Matching for Shared Ridehailing Service with Intelligent City Infrastructure

High computational time is one of the most important operational issues in centralized dynamic shared ridehailing services. To resolve this issue, we propose a distributed ride-matching system that is based on vehicle-to-infrastructure (V2I) and infrastructure-to-infrastructure (I2I) communication. An application on the downtown Toronto road network demonstrated that the distributed system achieved a 125-fold speed-up in computational time and showed high scalability. Moreover, the service rate in the proposed system improved by 7% compared to the centralized system. However, the centralized system showed 29% and 17% improvement in wait time and detour time, respectively.

preprint · 2022 · arXiv

eFedDNN: Ensemble based Federated Deep Neural Networks for Trajectory Mode Inference

As the most significant data source in smart mobility systems, GPS trajectories can help identify user travel mode. However, these GPS datasets may contain users' private information (e.g., home location), preventing many users from sharing their private information with a third party. Hence, identifying travel modes while protecting users' privacy is a significant issue. To address this challenge, we use federated learning (FL), a privacy-preserving machine learning technique that aims at collaboratively training a robust global model by accessing users' locally trained models but not their raw data. Specifically, we designed a novel ensemble-based Federated Deep Neural Network (eFedDNN). The ensemble method combines the outputs of the different models learned via FL by the users and shows an accuracy that surpasses comparable models reported in the literature. Extensive experimental studies on a real-world open-access dataset from Montreal demonstrate that the proposed inference model can achieve accurate identification of users' mode of travel without compromising privacy.

preprint · 2022 · arXiv

Green Vehicle Routing Problem: State of the Art and Future Directions

The green vehicle routing problem (GVRP) aims to reduce greenhouse gas emissions while routing vehicles. This can be achieved either by adopting Alternative Fuel Vehicles (AFVs) or with existing conventional fossil fuel vehicles in fleets. GVRP also takes into account environmental sustainability in transportation and logistics. We critically review several variations and specializations of GVRP that address issues related to charging, pickup, delivery, and energy consumption. Starting with the concepts and definitions of GVRP, we summarize the key elements and contributors to GVRP publications. Afterward, the issues regarding each category of green vehicle routing are reviewed, based on which key future research directions and challenges are suggested. We observed that the main focus of previous publications is on operational-level routing decisions and not on supply chain issues. The majority of publications used metaheuristic methods, while overlooking emerging machine learning methods. We envision that, in addition to machine learning, reinforcement learning, distributed systems, the internet of vehicles (IoV), and new fuel technologies have a strong role in developing GVRP research further.

preprint · 2022 · arXiv

Interpretable and Actionable Vehicular Greenhouse Gas Emission Prediction at Road link-level

To help systematically lower anthropogenic greenhouse gas (GHG) emissions, accurate and precise GHG emission prediction models have become a key focus of climate research. The appeal is that the predictive models will inform policymakers, who will hopefully, in turn, bring about systematic changes. Since the transportation sector is consistently among the top GHG emission contributors, especially in populated urban areas, substantial effort has been going into building more accurate and informative GHG prediction models to help create more sustainable urban environments. In this work, we seek to establish a predictive framework for GHG emissions at the urban road segment or link level of transportation networks. The key theme of the framework centers on model interpretability and actionability for high-level decision-makers using econometric Discrete Choice Modelling (DCM). We illustrate that DCM is capable of predicting link-level GHG emission levels on urban road networks in a parsimonious and effective manner. Our results show up to 85.4% prediction accuracy for the DCM models. We also argue that, since most GHG emission prediction models aim to involve high-level decision-makers in making changes and curbing emissions, the DCM-based GHG emission prediction framework is the most suitable framework.

preprint · 2022 · arXiv

Ordered-logit pedestrian stress model for traffic flow with automated vehicles

An ordered-logit model is developed to study the effects of Automated Vehicles (AVs) in the traffic mix on the average stress level of a pedestrian when crossing an urban street at mid-block. Information collected from a galvanic skin resistance sensor and virtual reality experiments are transformed into a dataset with interpretable average stress levels (low, medium, and high) and geometric, traffic, and environmental conditions. Modelling results indicate a decrease in average stress level with the increase in the percentage of AVs in the traffic mix.
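As a rough illustration of how an ordered-logit model turns a linear predictor into probabilities for the three stress levels, here is a minimal sketch. The coefficient `beta_av`, the thresholds, and the single-covariate utility are hypothetical values chosen only to show the mechanics; they are not estimates from the paper.

```python
import math

def logistic(x):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-x))

def ordered_logit_probs(utility, thresholds):
    """Return P(y = k) for each ordered category k.

    P(y <= k) = logistic(tau_k - utility), with increasing thresholds tau_k;
    category probabilities are differences of consecutive cumulative values.
    """
    cumulative = [logistic(t - utility) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

# Hypothetical parameters (NOT from the paper): a negative coefficient on the
# share of automated vehicles, and two thresholds separating low/medium/high.
beta_av = -2.0
thresholds = [-1.0, 1.0]

low_av = ordered_logit_probs(beta_av * 0.2, thresholds)   # 20% AVs in the mix
high_av = ordered_logit_probs(beta_av * 0.8, thresholds)  # 80% AVs in the mix

# Each list is [P(low), P(medium), P(high)].
print([round(p, 3) for p in low_av])
print([round(p, 3) for p in high_av])
```

With a negative coefficient, a larger AV share shifts probability mass from the high-stress category toward the low-stress one, which is the direction of effect the abstract reports.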

preprint · 2022 · arXiv

Ordinal-ResLogit: Interpretable Deep Residual Neural Networks for Ordered Choices

This study presents an Ordinal version of Residual Logit (Ordinal-ResLogit) model to investigate the ordinal responses. We integrate the standard ResLogit model into COnsistent RAnk Logits (CORAL) framework, classified as a binary classification algorithm, to develop a fully interpretable deep learning-based ordinal regression model. As the formulation of the Ordinal-ResLogit model enjoys the Residual Neural Networks concept, our proposed model addresses the main constraint of machine learning algorithms, known as black-box. Moreover, the Ordinal-ResLogit model, as a binary classification framework for ordinal data, guarantees consistency among binary classifiers. We showed that the resulting formulation is able to capture underlying unobserved heterogeneity from the data as well as being an interpretable deep learning-based model. Formulations for market share, substitution patterns, and elasticities are derived. We compare the performance of the Ordinal-ResLogit model with an Ordered Logit Model using a stated preference (SP) dataset on pedestrian wait time and a revealed preference (RP) dataset on travel distance. Our results show that Ordinal-ResLogit outperforms the traditional ordinal regression model for both datasets. Furthermore, the results obtained from the Ordinal-ResLogit RP model show that travel attributes such as driving and transit cost have significant effects on choosing the location of non-mandatory trips. In terms of the Ordinal-ResLogit SP model, our results highlight that the road-related variables and traffic condition are contributing factors in the prediction of pedestrian waiting time such that the mixed traffic condition significantly increases the probability of choosing longer waiting times.

preprint · 2022 · arXiv

Untargeted Poisoning Attack Detection in Federated Learning via Behavior Attestation

Federated Learning (FL) is a paradigm in Machine Learning (ML) that addresses data privacy, security, access rights and access to heterogeneous information issues by training a global model using distributed nodes. Despite its advantages, there is an increased potential for cyberattacks on FL-based ML techniques that can undermine the benefits. Model-poisoning attacks on FL target the availability of the model. The adversarial objective is to disrupt the training. We propose attestedFL, a defense mechanism that monitors the training of individual nodes through state persistence in order to detect a malicious worker. A fine-grained assessment of the history of the worker permits the evaluation of its behavior over time and results in innovative detection strategies. We present three lines of defense that aim at assessing whether the worker is reliable by observing whether the node is really training and advancing towards a goal. Our defense exposes an attacker's malicious behavior and removes unreliable nodes from the aggregation process so that the FL process converges faster. Through extensive evaluations and against various adversarial settings, attestedFL increased the accuracy of the model by 12% to 58% under different scenarios such as attacks performed at different stages of convergence, colluding attackers and continuous attacks.

preprint · 2021 · arXiv

Decoding pedestrian and automated vehicle interactions using immersive virtual reality and interpretable deep learning

To ensure pedestrian friendly streets in the era of automated vehicles, reassessment of current policies, practices, design, rules and regulations of urban areas is of importance. This study investigates pedestrian crossing behaviour, as an important element of urban dynamics that is expected to be affected by the presence of automated vehicles. For this purpose, an interpretable machine learning framework is proposed to explore factors affecting pedestrians' wait time before crossing mid-block crosswalks in the presence of automated vehicles. To collect rich behavioural data, we developed a dynamic and immersive virtual reality experiment, with 180 participants from a heterogeneous population in 4 different locations in the Greater Toronto Area (GTA). Pedestrian wait time behaviour is then analyzed using a data-driven Cox Proportional Hazards (CPH) model, in which the linear combination of the covariates is replaced by a flexible non-linear deep neural network. The proposed model achieved a 5% improvement in goodness of fit, but more importantly, enabled us to incorporate a richer set of covariates. A game theoretic based interpretability method is used to understand the contribution of different covariates to the time pedestrians wait before crossing. Results show that the presence of automated vehicles on roads, wider lane widths, high density on roads, limited sight distance, and lack of walking habits are the main contributing factors to longer wait times. Our study suggested that, to move towards pedestrian-friendly urban areas, national level educational programs for children, enhanced safety measures for seniors, promotion of active modes of transportation, and revised traffic rules and regulations should be considered.

preprint · 2021 · arXiv

ResLogit: A residual neural network logit model for data-driven choice modelling

This paper presents a novel deep learning-based travel behaviour choice model. Our proposed Residual Logit (ResLogit) model formulation seamlessly integrates a Deep Neural Network (DNN) architecture into a multinomial logit model. Recently, DNN models such as the Multi-layer Perceptron (MLP) and the Recurrent Neural Network (RNN) have shown remarkable success in modelling complex and noisy behavioural data. However, econometric studies have argued that machine learning techniques are a 'black-box' and difficult to interpret for use in choice analysis. We develop a data-driven choice model that extends the systematic utility function to incorporate non-linear cross-effects using a series of residual layers, with skipped connections to handle model identifiability in estimating a large number of parameters. The model structure accounts for cross-effects and choice heterogeneity arising from substitution, interactions with non-chosen alternatives and other effects in a non-linear manner. We describe the formulation, model estimation and interpretability, and examine the relative performance and econometric implications of our proposed model. We present an illustrative example of the model on a classic red/blue bus choice scenario. For a real-world application, we use a travel mode choice dataset to analyze the model characteristics compared to traditional neural networks and Logit formulations. Our findings show that our ResLogit approach significantly outperforms MLP models while providing similar interpretability to a Multinomial Logit model.

preprint · 2020 · arXiv

A bi-partite generative model framework for analyzing and simulating large scale multiple discrete-continuous travel behaviour data

The emergence of data-driven demand analysis has led to the increased use of generative modelling to learn the probabilistic dependencies between random variables. Although their apparent use has mostly been limited to image recognition and classification in recent years, generative machine learning algorithms can be a powerful tool for travel behaviour research by replicating travel behaviour through the underlying properties of data structures. In this paper, we examine the use of a generative machine learning approach for analyzing multiple discrete-continuous (MDC) travel behaviour data. We provide a plausible perspective on how machine learning techniques can be exploited to interpret the underlying heterogeneities in the data. We show that generative models are conceptually similar to the choice selection behaviour process through information entropy and variational Bayesian inference. Without loss of generality, we consider a restricted Boltzmann machine (RBM) based algorithm with multiple discrete-continuous layers, formulated as a variational Bayesian inference optimization problem. We systematically describe the proposed machine learning algorithm and develop a process for analyzing travel behaviour data from a generative learning perspective. We show parameter stability from model analysis and simulation tests on an open dataset with multiple discrete-continuous dimensions and 293,330 observations. For interpretability, we derive the conditional probabilities and elasticities, and perform statistical analysis on the latent variables. We show that our model can generate statistically similar data distributions for travel forecasting and prediction and performs better than purely discriminative methods in validation. Our results indicate that latent constructs in generative models can accurately and consistently represent the joint distribution on MDC data.

preprint · 2020 · arXiv

A Differentially Private Multi-Output Deep Generative Networks Approach For Activity Diary Synthesis

In this work, we develop a privacy-by-design generative model for synthesizing the activity diary of the travel population using state-of-the-art deep learning approaches. The proposed approach extends the literature on population synthesis by contributing novel deep learning methods to the development and application of synthetic travel data while guaranteeing privacy protection for members of the sample population on which the synthetic populations are based. First, we show a complete de-generalization of activity diaries to simulate the socioeconomic features and longitudinal sequences of geographically and temporally explicit activities. Second, we introduce a differential privacy approach to control the level of resolution disclosing the uniqueness of survey participants. Finally, we experiment using Generative Adversarial Networks (GANs). We evaluate the statistical distributions and pairwise correlations, and measure the level of privacy guaranteed on simulated datasets for varying noise. The results show that the model succeeds in simulating activity diaries composed of multiple outputs, including structured socio-economic features and sequential tour activities, in a differentially private manner.

preprint · 2020 · arXiv

A multi-layered blockchain framework for smart mobility data-markets

Blockchain has the potential to render the transaction of information more secure and transparent. Nowadays, transportation data are shared across multiple entities using heterogeneous mediums, from paper-collected data to smartphones. Most of these data are stored in central servers that are susceptible to hacks. In some cases, shady actors who may have access to such sources share the mobility data with unwanted third parties. A multi-layered Blockchain framework for Smart Mobility Data-market (BSMD) is presented for addressing the associated privacy, security, management, and scalability challenges. Each participant shares their encrypted data to the blockchain network and can transact information with other participants as long as both parties agree to the transaction rules issued by the owner of the data. Data ownership, transparency, auditability and access control are the core principles of the proposed blockchain for smart mobility data-market. In a case study of real-time mobility data sharing, we demonstrate the performance of BSMD on a 370-node blockchain running on heterogeneous and geographically separated devices communicating on a physical network. We also demonstrate how BSMD ensures the cybersecurity and privacy of individuals by safeguarding against spoofing and message interception attacks and providing information access management control.

preprint · 2020 · arXiv

Applications of brain imaging methods in driving behaviour research

Applications of neuroimaging methods have substantially contributed to the scientific understanding of human factors during driving by providing a deeper insight into the neuro-cognitive aspects of driver brain. This has been achieved by conducting simulated (and occasionally, field) driving experiments while collecting driver brain signals of certain types. Here, this sector of studies is comprehensively reviewed at both macro and micro scales. Different themes of neuroimaging driving behaviour research are identified and the findings within each theme are synthesised. The surveyed literature has reported on applications of four major brain imaging methods. These include Functional Magnetic Resonance Imaging (fMRI), Electroencephalography (EEG), Functional Near-Infrared Spectroscopy (fNIRS) and Magnetoencephalography (MEG), with the first two being the most common methods in this domain. While collecting driver fMRI signal has been particularly instrumental in studying neural correlates of intoxicated driving (e.g. alcohol or cannabis) or distracted driving, the EEG method has been predominantly utilised in relation to the efforts aiming at development of automatic fatigue/drowsiness detection systems, a topic to which the literature on neuro-ergonomics of driving particularly has shown a spike of interest within the last few years. The survey also reveals that topics such as driver brain activity in semi-automated settings or the brain activity of drivers with brain injuries or chronic neurological conditions have by contrast been investigated to a very limited extent. Further, potential topics in relation to driving behaviour are identified that could benefit from the adoption of neuroimaging methods in future studies.

preprint · 2020 · arXiv

Composite Travel Generative Adversarial Networks for Tabular and Sequential Population Synthesis

Agent-based transportation modelling has become the standard to simulate travel behaviour, mobility choices and activity preferences using disaggregate travel demand data for entire populations, data that are not typically readily available. Various methods have been proposed to synthesize population data for this purpose. We present a Composite Travel Generative Adversarial Network (CTGAN), a novel deep generative model to estimate the underlying joint distribution of a population, that is capable of reconstructing composite synthetic agents having tabular (e.g. age and sex) as well as sequential mobility data (e.g. trip trajectory and sequence). The CTGAN model is compared with other recently proposed methods such as the Variational Autoencoders (VAE) method, which has shown success in high dimensional tabular population synthesis. We evaluate the performance of the synthesized outputs based on distribution similarity, multi-variate correlations and spatio-temporal metrics. The results show the consistent and accurate generation of synthetic populations and their tabular and spatially sequential attributes, generated over varying spatial scales and dimensions.

preprint · 2020 · arXiv

Multimodal Autonomous Last Mile Delivery System Design and Application

With the rapid increase in congestion, alternative solutions are needed to efficiently use the capacity of our existing networks. This paper focuses on exploring emerging autonomous technologies for on-demand food delivery in congested cities. Three different last mile food delivery systems are proposed in this study, employing aerial and ground autonomous vehicle technologies. The three proposed systems are a robot delivery system, a drone delivery system and a hybrid delivery system. In the hybrid system, the concept of a hub-and-spoke network is explored in order to consolidate orders and reach more destinations in less time. To investigate the performance of the three proposed delivery systems, they are applied to the city of Mississauga network in an in-house agent-based simulation in MATLAB. Eighteen scenarios are tested, differing in terms of demand and fleet size. The results show that the hybrid robot-drone delivery system performs the best with a fleet size of 25 robots and 15 drones, with an average preparation and delivery time lower than that of the individual robot and drone systems by 48% and 42%, respectively.