Source author record

Hadi Jahanshahi

Hadi Jahanshahi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Software Engineering Artificial Intelligence Computation and Language Information Retrieval math.NA math.OC Numerical Analysis physics.soc-ph

Catalog footprint

What is connected

7works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

A Deep Reinforcement Learning Approach for the Meal Delivery Problem

We consider a meal delivery service fulfilling dynamic customer requests given a set of couriers over the course of a day. A courier's duty is to pick-up an order from a restaurant and deliver it to a customer. We model this service as a Markov decision process and use deep reinforcement learning as the solution approach. We experiment with the resulting policies on synthetic and real-world datasets and compare those with the baseline policies. We also examine the courier utilization for different numbers of couriers. In our analysis, we specifically focus on the impact of the limited available resources in the meal delivery problem. Furthermore, we investigate the effect of intelligent order rejection and re-positioning of the couriers. Our numerical experiments show that, by incorporating the geographical locations of the restaurants, customers, and the depot, our model significantly improves the overall service quality as characterized by the expected total reward and the delivery times. Our results present valuable insights on both the courier assignment process and the optimal number of couriers for different order frequencies on a given day. The proposed model also shows a robust performance under a variety of scenarios for real-world implementation.

preprint2022arXiv

Auto Response Generation in Online Medical Chat Services

Telehealth helps to facilitate access to medical professionals by enabling remote medical services for the patients. These services have become gradually popular over the years with the advent of necessary technological infrastructure. The benefits of telehealth have been even more apparent since the beginning of the COVID-19 crisis, as people have become less inclined to visit doctors in person during the pandemic. In this paper, we focus on facilitating the chat sessions between a doctor and a patient. We note that the quality and efficiency of the chat experience can be critical as the demand for telehealth services increases. Accordingly, we develop a smart auto-response generation mechanism for medical conversations that helps doctors respond to consultation requests efficiently, particularly during busy sessions. We explore over 900,000 anonymous, historical online messages between doctors and patients collected over nine months. We implement clustering algorithms to identify the most frequent responses by doctors and manually label the data accordingly. We then train machine learning algorithms using this preprocessed data to generate the responses. The considered algorithm has two steps: a filtering (i.e., triggering) model to filter out infeasible patient messages and a response generator to suggest the top-3 doctor responses for the ones that successfully pass the triggering phase. The method provides an accuracy of 83.28\% for precision@3 and shows robustness to its parameters.

preprint2022arXiv

Can transit investments in low-income neighbourhoods increase transit use? Exploring the nexus of income, car-ownership, and transit accessibility in Toronto

Transportation equity advocates recommend improving public transit in low-income neighbourhoods to alleviate socio-spatial inequalities and increase quality of life. However, transportation planners often overlook transit investments in neighbourhoods with "transit-captive" populations because they are assumed to result in less mode-shifting, congestion relief, and environmental benefits, compared to investments that aim to attract choice riders in wealthier communities. In North American cities, while many low-income households are already transit users, some also own and use private vehicles. It suggests that transit improvements in low-income communities could indeed result in more transit use and less car use. Accordingly, the main objective of this article is to explore the statistical relationship between transit use and transit accessibility as well as how this varies by household income and vehicle ownership in the Greater Toronto and Hamilton Area (GTHA). Using stratified regression models, we find that low-income households with one or more cars per adult have the most elastic relationship between transit accessibility and transit use; they are more likely to be transit riders if transit improves. However, we confirm that in auto-centric areas with poor transit, the transit use of low-income households drops off sharply as car ownership increases. On the other hand, a sensitivity analysis suggests more opportunities for increasing transit ridership among car-deficit households when transit is improved. These findings indicate that improving transit in low-income inner suburbs, where most low-income car-owning households are living, would align social with environmental planning goals.

preprint2022arXiv

nTreeClus: a Tree-based Sequence Encoder for Clustering Categorical Series

The overwhelming presence of categorical/sequential data in diverse domains emphasizes the importance of sequence mining. The challenging nature of sequences proves the need for continuing research to find a more accurate and faster approach providing a better understanding of their (dis)similarities. This paper proposes a new Model-based approach for clustering sequence data, namely nTreeClus. The proposed method deploys Tree-based Learners, k-mers, and autoregressive models for categorical time series, culminating with a novel numerical representation of the categorical sequences. Adopting this new representation, we cluster sequences, considering the inherent patterns in categorical time series. Accordingly, the model showed robustness to its parameter. Under different simulated scenarios, nTreeClus improved the baseline methods for various internal and external cluster validation metrics for up to 10.7% and 2.7%, respectively. The empirical evaluation using synthetic and real datasets, protein sequences, and categorical time series showed that nTreeClus is competitive or superior to most state-of-the-art algorithms.

preprint2022arXiv

S-DABT: Schedule and Dependency-Aware Bug Triage in Open-Source Bug Tracking Systems

Fixing bugs in a timely manner lowers various potential costs in software maintenance. However, manual bug fixing scheduling can be time-consuming, cumbersome, and error-prone. In this paper, we propose the Schedule and Dependency-aware Bug Triage (S-DABT), a bug triaging method that utilizes integer programming and machine learning techniques to assign bugs to suitable developers. Unlike prior works that largely focus on a single component of the bug reports, our approach takes into account the textual data, bug fixing costs, and bug dependencies. We further incorporate the schedule of developers in our formulation to have a more comprehensive model for this multifaceted problem. As a result, this complete formulation considers developers' schedules and the blocking effects of the bugs while covering the most significant aspects of the previously proposed methods. Our numerical study on four open-source software systems, namely, EclipseJDT, LibreOffice, GCC, and Mozilla, shows that taking into account the schedules of the developers decreases the average bug fixing times. We find that S-DABT leads to a high level of developer utilization through a fair distribution of the tasks among the developers and efficient use of the free spots in their schedules. Via the simulation of the issue tracking system, we also show how incorporating the schedule in the model formulation reduces the bug fixing time, improves the assignment accuracy, and utilizes the capability of each developer without much comprising in the model run times. We find that S-DABT decreases the complexity of the bug dependency graph by prioritizing blocking bugs and effectively reduces the infeasible assignment ratio due to bug dependencies. Consequently, we recommend considering developers' schedules while automating bug triage.

preprint2022arXiv

Wayback Machine: A tool to capture the evolutionary behaviour of the bug reports and their triage process in open-source software systems

The issue tracking system (ITS) is a rich data source for data-driven decision-making. Different characteristics of bugs, such as severity, priority, and time to fix, provide a clear picture of an ITS. Nevertheless, such information may be misleading. For example, the exact time and the effort spent on a bug might be significantly different from the actual reporting time and the fixing time. Similarly, these values may be subjective, e.g., severity and priority values are assigned based on the intuition of a user or a developer rather than a structured and well-defined procedure. Hence, we explore the evolution of the bug dependency graph together with priority and severity levels to explore the actual triage process. Inspired by the idea of the "Wayback Machine" for the World Wide Web, we aim to reconstruct the historical decisions made in the ITS. Therefore, any bug prioritization or bug triage algorithms/scenarios can be applied in the same environment using our proposed ITS Wayback Machine. More importantly, we track the evolutionary metrics in the ITS when a custom triage/prioritization strategy is employed. We test the efficiency of the proposed algorithm using data extracted from three open-source projects. Our empirical study sheds light on the overlooked evolutionary metrics--e.g., overdue bugs and developers' loads--which are facilitated via our proposed past-event re-generator.

preprint2021arXiv

Does chronology matter in JIT defect prediction? A Partial Replication Study

Just-In-Time (JIT) models detect the fix-inducing changes (or defect-inducing changes). These models are designed based on the assumption that past code change properties are similar to future ones. However, as the system evolves, the expertise of developers and/or the complexity of the system also changes. In this work, we aim to investigate the effect of code change properties on JIT models over time. We also study the impact of using recent data as well as all available data on the performance of JIT models. Further, we analyze the effect of weighted sampling on the performance of fix-inducing properties of JIT models. For this purpose, we used datasets from Eclipse JDT, Mozilla, Eclipse Platform, and PostgreSQL. We used five families of change-code properties such as size, diffusion, history, experience, and purpose. We used Random Forest to train and test the JIT model and Brier Score and the area under the ROC curve for performance measurement. Our paper suggests that the predictive power of JIT models does not change over time. Furthermore, we observed that the chronology of data in JIT defect prediction models can be discarded by considering all the available data. On the other hand, the importance score of families of code change properties is found to oscillate over time. To mitigate the impact of the evolution of code change properties, it is recommended to use a weighted sampling approach in which more emphasis is placed upon the changes occurring closer to the current time. Moreover, since properties such as "Expertise of the Developer" and "Size" evolve with time, the models obtained from old data may exhibit different characteristics compared to those employing the newer dataset. Hence, practitioners should constantly retrain JIT models to include fresh data.

Hadi Jahanshahi

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

A Deep Reinforcement Learning Approach for the Meal Delivery Problem

Auto Response Generation in Online Medical Chat Services

Can transit investments in low-income neighbourhoods increase transit use? Exploring the nexus of income, car-ownership, and transit accessibility in Toronto

nTreeClus: a Tree-based Sequence Encoder for Clustering Categorical Series

S-DABT: Schedule and Dependency-Aware Bug Triage in Open-Source Bug Tracking Systems

Wayback Machine: A tool to capture the evolutionary behaviour of the bug reports and their triage process in open-source software systems

Does chronology matter in JIT defect prediction? A Partial Replication Study