Researcher profile

Debajyoti Mondal

Debajyoti Mondal contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
9works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

9 published item(s)

preprint2022arXiv

Leveraging Structural Properties of Source Code Graphs for Just-In-Time Bug Prediction

The most common use of data visualization is to minimize the complexity for proper understanding. A graph is one of the most commonly used representations for understanding relational data. It produces a simplified representation of data that is challenging to comprehend if kept in a textual format. In this study, we propose a methodology to utilize the relational properties of source code in the form of a graph to identify Just-in-Time (JIT) bug prediction in software systems during different revisions of software evolution and maintenance. We presented a method to convert the source codes of commit patches to equivalent graph representations and named it Source Code Graph (SCG). To understand and compare multiple source code graphs, we extracted several structural properties of these graphs, such as the density, number of cycles, nodes, edges, etc. We then utilized the attribute values of those SCGs to visualize and detect buggy software commits. We process more than 246K software commits from 12 subject systems in this investigation. Our investigation on these 12 open-source software projects written in C++ and Java programming languages shows that if we combine the features from SCG with conventional features used in similar studies, we will get the increased performance of Machine Learning (ML) based buggy commit detection models. We also find the increase of F1~Scores in predicting buggy and non-buggy commits statistically significant using the Wilcoxon Signed Rank Test. Since SCG-based feature values represent the style or structural properties of source code updates or changes in the software system, it suggests the importance of careful maintenance of source code style or structure for keeping a software system bug-free.

preprint2022arXiv

Minimum Ply Covering of Points with Unit Squares

Given a set $P$ of points and a set $U$ of axis-parallel unit squares in the Euclidean plane, a minimum ply cover of $P$ with $U$ is a subset of $U$ that covers $P$ and minimizes the number of squares that share a common intersection, called the minimum ply cover number of $P$ with $U$. Biedl et al. [Comput. Geom., 94:101712, 2020] showed that determining the minimum ply cover number for a set of points by a set of axis-parallel unit squares is NP-hard, and gave a polynomial-time 2-approximation algorithm for instances in which the minimum ply cover number is constant. The question of whether there exists a polynomial-time approximation algorithm remained open when the minimum ply cover number is $ω(1)$. We settle this open question and present a polynomial-time $(8+\varepsilon)$-approximation algorithm for the general problem, for every fixed $\varepsilon>0$.

preprint2022arXiv

Oriented Diameter of Planar Triangulations

The diameter of an undirected or a directed graph is defined to be the maximum shortest path distance over all pairs of vertices in the graph. Given an undirected graph $G$, we examine the problem of assigning directions to each edge of $G$ such that the diameter of the resulting oriented graph is minimized. The minimum diameter over all strongly connected orientations is called the oriented diameter of $G$. The problem of determining the oriented diameter of a graph is known to be NP-hard, but the time-complexity question is open for planar graphs. In this paper we compute the exact value of the oriented diameter for triangular grid graphs. We then prove an $n/3$ lower bound and an $n/2+O(\sqrt{n})$ upper bound on the oriented diameter of planar triangulations. It is known that given a planar graph $G$ with bounded treewidth and a fixed positive integer $k$, one can determine in linear time whether the oriented diameter of $G$ is at most $k$. In contrast, we consider a weighted version of the oriented diameter problem and show it to be is weakly NP-complete for planar graphs with bounded pathwidth.

preprint2022arXiv

StreamTable: An Area Proportional Visualization for Tables with Flowing Streams

Let $M$ be a two-dimensional table with each cell weighted by a nonzero positive number. A StreamTable visualization of $M$ represents the columns as non-overlapping vertical streams and the rows as horizontal stripes such that the intersection between a stream and a stripe is a rectangle with area equal to the weight of the corresponding cell. To avoid large wiggle of the streams, it is desirable to keep the consecutive cells in a stream to be adjacent. Let $B$ be the smallest axis-aligned bounding box containing the StreamTable. Then the difference between the area of $B$ and the sum of the weights is referred to as the excess area. We attempt to optimize various StreamTable aesthetics (e.g., minimizing excess area, or maximizing cell adjacencies in streams). (A) If the row permutation is fixed and the row heights are given, then we give an $O(rc)$-time algorithm to optimize these aesthetics, where $r$ and $c$ are the number of rows and columns, respectively. (B) If the row permutation is fixed but the row heights can be chosen, then we discuss a technique to compute an aesthetic (but not necessarily optimal) StreamTable by solving a quadratically-constrained quadratic program, followed by iterative improvements. If the row heights are restricted to be integers, then we prove the problem to be NP-hard. (C) If the row permutations can be chosen, then we show that it is NP-hard to find a row permutation that optimizes the area or adjacency aesthetics.

preprint2021arXiv

APX-Hardness and Approximation for the k-Burning Number Problem

Consider an information diffusion process on a graph $G$ that starts with $k>0$ burnt vertices, and at each subsequent step, burns the neighbors of the currently burnt vertices, as well as $k$ other unburnt vertices. The \emph{$k$-burning number} of $G$ is the minimum number of steps $b_k(G)$ such that all the vertices can be burned within $b_k(G)$ steps. Note that the last step may have smaller than $k$ unburnt vertices available, where all of them are burned. The $1$-burning number coincides with the well-known burning number problem, which was proposed to model the spread of social contagion. The generalization to $k$-burning number allows us to examine different worst-case contagion scenarios by varying the spread factor $k$. In this paper we prove that computing $k$-burning number is APX-hard, for any fixed constant $k$. We then give an $O((n+m)\log n)$-time 3-approximation algorithm for computing $k$-burning number, for any $k\ge 1$, where $n$ and $m$ are the number of vertices and edges, respectively. Finally, we show that even if the burning sources are given as an input, computing a burning sequence itself is an NP-hard problem.

preprint2021arXiv

Simultaneous Embedding of Colored Graphs

A set of colored graphs are compatible, if for every color $i$, the number of vertices of color $i$ is the same in every graph. A simultaneous embedding of $k$ compatibly colored graphs, each with $n$ vertices, consists of $k$ planar polyline drawings of these graphs such that the vertices of the same color are mapped to a common set of vertex locations. We prove that simultaneous embedding of $k\in o(\log \log n)$ colored planar graphs, each with $n$ vertices, can always be computed with a sublinear number of bends per edge. Specifically, we show an $O(\min\{c, n^{1-1/γ}\})$ upper bound on the number of bends per edge, where $γ= 2^{\lceil k/2 \rceil}$ and $c$ is the total number of colors. Our bound, which results from a better analysis of a previously known algorithm [Durocher and Mondal, SIAM J. Discrete Math., 32(4), 2018], improves the bound for $k$, as well as the bend complexity by a factor of $\sqrt{2}^{k}$. The algorithm can be generalized to obtain small universal point sets for colored graphs. We prove that $n\lceil c/b \rceil$ vertex locations, where $b\ge 1$, suffice to embed any set of compatibly colored $n$-vertex planar graphs with bend complexity $O(b)$, where $c$ is the number of colors.

preprint2020arXiv

(Faster) Multi-Sided Boundary Labelling

A 1-bend boundary labelling problem consists of an axis-aligned rectangle $B$, $n$ points (called sites) in the interior, and $n$ points (called ports) on the labels along the boundary of $B$. The goal is to find a set of $n$ axis-aligned curves (called leaders), each having at most one bend and connecting one site to one port, such that the leaders are pairwise disjoint. A 1-bend boundary labelling problem is $k$-sided ($1\leq k\leq 4$) if the ports appear on $k$ different sides of $B$. Kindermann et al. ["Multi-Sided Boundary Labeling", Algorithmica, 76(1): 225-258, 2016] showed that the 1-bend three-sided and four-sided boundary labelling problems can be solved in $O(n^4)$ and $O(n^9)$ time, respectively. Bose et al. [SWAT, 12:1-12:14, 2018] improved the latter running time to $O(n^6)$ by reducing the problem to computing maximum independent set in an outerstring graph. In this paper, we improve both previous results by giving new algorithms with running times $O(n^3\log n)$ and $O(n^5)$ to solve the 1-bend three-sided and four-sided boundary labelling problems, respectively.

preprint2020arXiv

Compatible Paths on Labelled Point Sets

Let $P$ and $Q$ be finite point sets of the same cardinality in $\mathbb{R}^2$, each labelled from $1$ to $n$. Two noncrossing geometric graphs $G_P$ and $G_Q$ spanning $P$ and $Q$, respectively, are called compatible if for every face $f$ in $G_P$, there exists a corresponding face in $G_Q$ with the same clockwise ordering of the vertices on its boundary as in $f$. In particular, $G_P$ and $G_Q$ must be straight-line embeddings of the same connected $n$-vertex graph. Deciding whether two labelled point sets admit compatible geometric paths is known to be NP-complete. We give polynomial-time algorithms to find compatible paths or report that none exist in three scenarios: $O(n)$ time for points in convex position; $O(n^2)$ time for two simple polygons, where the paths are restricted to remain inside the closed polygons; and $O(n^2 \log n)$ time for points in general position if the paths are restricted to be monotone.

preprint2020arXiv

Parameterized Complexity of Two-Interval Pattern Problem

A \emph{2-interval} is the union of two disjoint intervals on the real line. Two 2-intervals $D_1$ and $D_2$ are \emph{disjoint} if their intersection is empty (i.e., no interval of $D_1$ intersects any interval of $D_2$). There can be three different relations between two disjoint 2-intervals; namely, preceding ($<$), nested ($\sqsubset$) and crossing ($\between$). Two 2-intervals $D_1$ and $D_2$ are called \emph{$R$-comparable} for some $R\in\{<,\sqsubset,\between\}$, if either $D_1RD_2$ or $D_2RD_1$. A set $\mathcal{D}$ of disjoint 2-intervals is $\mathcal{R}$-comparable, for some $\mathcal{R}\subseteq\{<,\sqsubset,\between\}$ and $\mathcal{R}\neq\emptyset$, if every pair of 2-intervals in $\mathcal{R}$ are $R$-comparable for some $R\in\mathcal{R}$. Given a set of 2-intervals and some $\mathcal{R}\subseteq\{<,\sqsubset,\between\}$, the objective of the \emph{2-interval pattern problem} is to find a largest subset of 2-intervals that is $\mathcal{R}$-comparable. The 2-interval pattern problem is known to be $W[1]$-hard when $|\mathcal{R}|=3$ and $NP$-hard when $|\mathcal{R}|=2$ (except for $\mathcal{R}=\{<,\sqsubset\}$, which is solvable in quadratic time). In this paper, we fully settle the parameterized complexity of the problem by showing it to be $W[1]$-hard for both $\mathcal{R}=\{\sqsubset,\between\}$ and $\mathcal{R}=\{<,\between\}$ (when parameterized by the size of an optimal solution); this answers an open question posed by Vialette [Encyclopedia of Algorithms, 2008].