Topic overview

Data Structures and Algorithms

2032 works, 4165 researchers, 0 institutions

Papers in this area

24 featured work(s)

preprint 2013 arXiv

Modulated String Searching

In his 1987 paper entitled "Generalized String Matching", Abrahamson introduced pattern matching with character classes and provided the first efficient algorithm to solve it. The best known solution to date is due to Linhart and Shamir (2009). Another broad yet comparatively less studied class of string matching problems is that of numerical string searching, such as the "less-than" or $L_1$-norm string searching. The best known solutions for problems in this class are based on FFT convolution after some suitable re-encoding. The present paper introduces modulated string searching as a unified framework for string matching problems where the numerical conditions can be combined with some Boolean/numerical decision conditions on the character classes. One example problem in this class is the locally bounded $L_1$-norm matching problem on character classes: here the "match" between a character at some position in the text and a set of characters at some position in the pattern is assessed based on the smallest $L_1$ distance between the text character and one of those pattern characters. The two positions "match" if the (absolute value of the) difference between the two characters does not exceed a predefined constant. The pattern has an occurrence in an alignment with the text if the sum of all such differences does not exceed a second predefined constant value. This problem requires a pointwise evaluation of the quality of each match and has no known solution based on the previously mentioned algorithms.
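To make the locally bounded $L_1$-norm matching condition concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm. It assumes that characters are encoded as integers, that each pattern position is a non-empty set of integers, and that an alignment counts as an occurrence only when every position matches locally and the summed distances stay within the global bound. The function and parameter names are hypothetical.

```python
from typing import List, Sequence, Set


def locally_bounded_l1_occurrences(
    text: Sequence[int],
    pattern: Sequence[Set[int]],
    local_bound: int,
    global_bound: int,
) -> List[int]:
    """Return all start positions where the pattern occurs in the text.

    Naive O(n * m * class size) scan; assumes every character class is non-empty.
    """
    n, m = len(text), len(pattern)
    hits = []
    for start in range(n - m + 1):
        total = 0
        ok = True
        for offset, char_class in enumerate(pattern):
            # Smallest L1 distance between the text character and the class.
            dist = min(abs(text[start + offset] - c) for c in char_class)
            if dist > local_bound:
                ok = False  # this position fails the local condition
                break
            total += dist
        if ok and total <= global_bound:
            hits.append(start)
    return hits


if __name__ == "__main__":
    example_text = [1, 5, 9, 4, 6]
    example_pattern = [{4, 5}, {8, 10}]
    print(locally_bounded_l1_occurrences(example_text, example_pattern, 1, 1))
```

The quadratic scan is only a reference for checking faster methods on small inputs.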

preprint 2013 arXiv

Minimum length path decompositions

We consider a bi-criteria generalization of the pathwidth problem, where, for given integers $k,l$ and a graph $G$, we ask whether there exists a path decomposition $\mathcal{P}$ of $G$ such that the width of $\mathcal{P}$ is at most $k$ and the number of bags in $\mathcal{P}$, i.e., the length of $\mathcal{P}$, is at most $l$. We provide a complete complexity classification of the problem in terms of $k$ and $l$ for general graphs. Contrary to the original pathwidth problem, which is fixed-parameter tractable with respect to $k$, we prove that the generalized problem is NP-complete for any fixed $k\geq 4$, and is also NP-complete for any fixed $l\geq 2$. On the other hand, we give a polynomial-time algorithm that, for any (possibly disconnected) graph $G$ and integers $k\leq 3$ and $l>0$, constructs a path decomposition of width at most $k$ and length at most $l$, if any exists. As a by-product, we obtain an almost complete classification of the problem in terms of $k$ and $l$ for connected graphs. Namely, the problem is NP-complete for any fixed $k\geq 5$ and it is polynomial-time for any $k\leq 3$. This leaves open the case $k=4$ for connected graphs.
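The two criteria can be illustrated with a small verifier. The sketch below only checks a candidate decomposition against the standard definition (every vertex and edge is covered by some bag, and the bags containing a vertex are consecutive) and then compares its width and length with $k$ and $l$. It does not construct decompositions, and the function names are hypothetical.

```python
from typing import Hashable, Iterable, List, Sequence, Set, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def is_path_decomposition(
    vertices: Iterable[Vertex], edges: Iterable[Edge], bags: Sequence[Set[Vertex]]
) -> bool:
    """Check the three standard path-decomposition conditions."""
    for v in vertices:
        positions: List[int] = [i for i, bag in enumerate(bags) if v in bag]
        # v must appear somewhere, and its bags must form a contiguous run.
        if not positions or positions[-1] - positions[0] + 1 != len(positions):
            return False
    # Every edge must have both endpoints together in at least one bag.
    return all(any(u in bag and w in bag for bag in bags) for u, w in edges)


def fits_bounds(
    vertices: Iterable[Vertex],
    edges: Iterable[Edge],
    bags: Sequence[Set[Vertex]],
    k: int,
    l: int,
) -> bool:
    """True if bags is a path decomposition of width <= k and length <= l."""
    vertices, edges = list(vertices), list(edges)
    if not is_path_decomposition(vertices, edges, bags):
        return False
    width = max((len(bag) for bag in bags), default=0) - 1
    return width <= k and len(bags) <= l


if __name__ == "__main__":
    path_vertices = ["a", "b", "c", "d"]
    path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
    print(fits_bounds(path_vertices, path_edges, [{"a", "b"}, {"b", "c"}, {"c", "d"}], 1, 3))
```

On the four-vertex path, lowering the length budget forces larger bags, which is the trade-off the bi-criteria problem captures.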

preprint 2015 arXiv

Structural Properties of an Open Problem in Preemptive Scheduling

Structural properties of optimal preemptive schedules have been studied in a number of recent papers with a primary focus on two structural parameters: the minimum number of preemptions necessary, and a tight lower bound on "shifts", i.e., the sizes of intervals bounded by the times created by preemptions, job starts, or completions. So far only rough bounds for these parameters have been derived for specific problems. This paper sharpens the bounds on these structural parameters for a well-known open problem in the theory of preemptive scheduling: Instances consist of in-trees of $n$ unit-execution-time jobs with release dates, and the objective is to minimize the total completion time on two processors. This is among the current, tantalizing "threshold" problems of scheduling theory: Our literature survey reveals that any significant generalization leads to an NP-hard problem, but that any significant simplification leads to a tractable problem. For the above problem, we show that the number of preemptions necessary for optimality need not exceed $2n-1$; that the number must be of order $Ω(\log n)$ for some instances; and that the minimum shift need not be less than $2^{-2n+1}$. These bounds are obtained by combinatorial analysis of optimal schedules rather than by the analysis of polytope corners for linear-program formulations, an approach to be found in earlier papers. The bounds immediately follow from a fundamental structural property called "normality", by which minimal shifts of a job are exponentially decreasing functions. In particular, the first interval between a preempted job's start and its preemption is a multiple of 1/2, the second such interval is a multiple of 1/4, and in general, the $i$-th preemption occurs at a multiple of $2^{-i}$. We expect the new structural properties to play a prominent role in finally settling a vexing, still-open question of complexity.

preprint 2016 arXiv

A Game-Theoretic Approach for Detection of Overlapping Communities in Dynamic Complex Networks

Complex networks tend to display communities, which are groups of nodes cohesively connected among themselves and sparsely connected to the remainder of the network. Detecting such communities is an important computational problem, since it provides insight into the functionality of networks. Further, investigating community structure in a dynamic network, where the network is subject to change, is even more challenging. This paper presents a game-theoretical technique for detecting community structures in dynamic as well as static complex networks. In our method, each node takes the role of a player that attempts to gain a higher payoff by joining one or more communities or switching between them. The goal of the game is to reveal community structure formed by these players by finding a Nash-equilibrium point among them. To the best of our knowledge, this is the first game-theoretic algorithm which is able to extract overlapping communities from either static or dynamic networks. We present the experimental results illustrating the effectiveness of the proposed method on both synthetic and real-world networks.

preprint 2018 arXiv

On tradeoffs between width- and fill-like graph parameters

In this work we consider two two-criteria optimization problems: given an input graph, the goal is to find its interval (or chordal) supergraph that minimizes the number of edges and its clique number simultaneously. For the interval supergraph, the problem can be restated as simultaneous minimization of the pathwidth $pw(G)$ and the profile $p(G)$ of the input graph $G$. We prove that for an arbitrary graph $G$ and an integer $t\in\{1,\ldots,pw(G)+1\}$, there exists an interval supergraph $G'$ of $G$ whose clique number satisfies $ω(G')\leq(1+\frac{2}{t})(pw(G)+1)$ and whose number of edges is bounded by $|E(G')|\leq(t+2)p(G)$. In other words, the pathwidth and the profile of a graph can be simultaneously minimized within the factors of $1+\frac{2}{t}$ (plus a small constant) and $t+2$, respectively. Note that for a fixed $t$, both upper bounds provide constant factor approximations. On the negative side, we show an example that proves that, for some graphs, there is no solution in which both parameters are optimal. In the case of finding a chordal supergraph, the two corresponding graph parameters that reflect its clique size and number of edges are the treewidth and fill-in. We obtain that the treewidth and the fill-in problems are also "orthogonal" in the sense that for some graphs, a solution that minimizes one of those parameters cannot minimize the other. As a motivating example, we recall graph searching games, which illustrate the need for simultaneous minimization of these pairs of graph parameters.

preprint 2017 arXiv

Parameterized Shifted Combinatorial Optimization

Shifted combinatorial optimization is a new nonlinear optimization framework which is a broad extension of standard combinatorial optimization, involving the choice of several feasible solutions at a time. This framework captures well studied and diverse problems ranging from so-called vulnerability problems to sharing and partitioning problems. In particular, every standard combinatorial optimization problem has its shifted counterpart, which is typically much harder. Even with an explicitly given input set, the shifted problem may be NP-hard. In this article we initiate a study of the parameterized complexity of this framework. First we show that shifting over an explicitly given set with its cardinality as the parameter may be in XP, FPT or P, depending on the objective function. Second, we study the shifted problem over sets definable in MSO logic (which includes, e.g., the well known MSO partitioning problems). Our main results here are that shifted combinatorial optimization over MSO definable sets is in XP with respect to the MSO formula and the treewidth (or more generally clique-width) of the input graph, and is W[1]-hard even under further severe restrictions.

preprint 2018 arXiv

Energy-Efficient Scheduling: Classification, Bounds, and Algorithms

The problem of attaining energy efficiency in distributed systems is of importance, but a general, non-domain-specific theory of energy-minimal scheduling is far from developed. In this paper, we classify the problems of energy-minimal scheduling and present theoretical foundations of the same. We derive results concerning energy-minimal scheduling of independent jobs in a distributed system with functionally similar machines with different working and idle power ratings. The machines considered in our system can have identical as well as different speeds. If the jobs can be divided into arbitrary parts, we show that the minimum-energy schedule can be generated in linear time and give exact scheduling algorithms. For the cases where jobs are non-divisible, we prove that the scheduling problems are NP-hard and also give approximation algorithms for the same along with their bounds.

preprint 2018 arXiv

Finding small-width connected path decompositions in polynomial time

A connected path decomposition of a simple graph $G$ is a path decomposition $(X_1,\ldots,X_l)$ such that the subgraph of $G$ induced by $X_1\cup\cdots\cup X_i$ is connected for each $i\in\{1,\ldots,l\}$. The connected pathwidth of $G$ is then the minimum width over all connected path decompositions of $G$. We prove that for each fixed $k$, the connected pathwidth of any input graph can be computed in polynomial time. This answers an open question raised by Fedor V. Fomin during the GRASTA 2017 workshop, since connected pathwidth is equivalent to the connected (monotone) node search game.
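The connectivity condition in this definition can be checked directly. The following minimal sketch assumes the bag sequence is already a valid path decomposition and only tests whether each prefix union $X_1\cup\cdots\cup X_i$ induces a connected subgraph. It is a verifier, not the paper's algorithm, and the function names are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Sequence, Set, Tuple

Vertex = Hashable


def induces_connected(vertex_set: Set[Vertex], adjacency: Dict[Vertex, Set[Vertex]]) -> bool:
    """Breadth-first search restricted to vertex_set."""
    if not vertex_set:
        return True
    start = next(iter(vertex_set))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adjacency.get(u, ()):
            if w in vertex_set and w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(vertex_set)


def has_connected_prefixes(
    edges: Iterable[Tuple[Vertex, Vertex]], bags: Sequence[Set[Vertex]]
) -> bool:
    """True if every prefix union of the bags induces a connected subgraph."""
    adjacency: Dict[Vertex, Set[Vertex]] = {}
    for u, w in edges:
        adjacency.setdefault(u, set()).add(w)
        adjacency.setdefault(w, set()).add(u)
    prefix: Set[Vertex] = set()
    for bag in bags:
        prefix |= set(bag)
        if not induces_connected(prefix, adjacency):
            return False
    return True


if __name__ == "__main__":
    star_edges = [("a", "c"), ("b", "c")]
    print(has_connected_prefixes(star_edges, [{"a", "c"}, {"b", "c"}]))
    print(has_connected_prefixes(star_edges, [{"a", "b"}, {"a", "b", "c"}]))
```

In the second call the first bag holds two non-adjacent vertices, so that decomposition is valid but not connected.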

preprint 2017 arXiv

Shared processor scheduling

We study the shared processor scheduling problem with a single shared processor where a unit time saving (weight) obtained by processing a job on the shared processor depends on the job. A polynomial-time optimization algorithm has been given for the problem with equal weights in the literature. This paper extends that result by showing an $O(n \log n)$ optimization algorithm for a class of instances in which non-decreasing order of jobs with respect to processing times provides a non-increasing order with respect to weights; this class generalizes the unweighted case of the problem. This algorithm also leads to a $\frac{1}{2}$-approximation algorithm for the general weighted problem. The complexity of the weighted problem remains open.

preprint 2018 arXiv

A Parameterized Strongly Polynomial Algorithm for Block Structured Integer Programs

The theory of $n$-fold integer programming has been recently emerging as an important tool in parameterized complexity. The input to an $n$-fold integer program (IP) consists of parameter $A$, dimension $n$, and numerical data of binary encoding length $L$. It was known for some time that such programs can be solved in polynomial time using $O(n^{g(A)}L)$ arithmetic operations where $g$ is an exponential function of the parameter. In 2013 it was shown that it can be solved in fixed-parameter tractable (FPT) time using $O(f(A)n^3L)$ arithmetic operations for a single-exponential function $f$. This, and a faster algorithm for a special case of combinatorial $n$-fold IP, have led to several very recent breakthroughs in the parameterized complexity of scheduling, stringology, and computational social choice. In 2015 it was shown that it can be solved in strongly polynomial time using $O(n^{g(A)})$ arithmetic operations. Here we establish a result which subsumes all three of the above results by showing that $n$-fold IP can be solved in strongly polynomial FPT time using $O(f(A)n^3)$ arithmetic operations. In fact, our results are much more general, briefly outlined as follows.
- There is a strongly polynomial algorithm for ILP whenever a so-called Graver-best oracle is realizable for it.
- Graver-best oracles for the large classes of multi-stage stochastic and tree-fold ILPs can be realized in FPT time. Together with the previous oracle algorithm, this newly shows two large classes of ILP to be strongly polynomial; in contrast, only few classes of ILP were previously known to be strongly polynomial.
- We show that ILP is FPT parameterized by the largest coefficient $\|A\|_\infty$ and the primal or dual treedepth of $A$, and that this parameterization cannot be relaxed, signifying substantial progress in understanding the parameterized complexity of ILP.

preprint 2017 arXiv

Metric Distortion of Social Choice Rules: Lower Bounds and Fairness Properties

We study social choice rules under the utilitarian distortion framework, with an additional metric assumption on the agents' costs over the alternatives. In this approach, these costs are given by an underlying metric on the set of all agents plus alternatives. Social choice rules have access to only the ordinal preferences of agents but not the latent cardinal costs that induce them. Distortion is then defined as the ratio between the social cost (typically the sum of agent costs) of the alternative chosen by the mechanism at hand, and that of the optimal alternative chosen by an omniscient algorithm. The worst-case distortion of a social choice rule is, therefore, a measure of how close it always gets to the optimal alternative without any knowledge of the underlying costs. Under this model, it has been conjectured that Ranked Pairs, the well-known weighted-tournament rule, achieves a distortion of at most 3 [Anshelevich et al. 2015]. We disprove this conjecture by constructing a sequence of instances which shows that the worst-case distortion of Ranked Pairs is at least 5. Our lower bound on the worst case distortion of Ranked Pairs matches a previously known upper bound for the Copeland rule, proving that in the worst case, the simpler Copeland rule is at least as good as Ranked Pairs. And as long as we are limited to (weighted or unweighted) tournament rules, we demonstrate that randomization cannot help achieve an expected worst-case distortion of less than 3. Using the concept of approximate majorization within the distortion framework, we prove that Copeland and Randomized Dictatorship achieve low constant factor fairness-ratios (5 and 3 respectively), which is a considerable generalization of similar results for the sum of costs and single largest cost objectives. In addition to all of the above, we outline several interesting directions for further research in this space.
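The distortion ratio defined in this abstract can be computed for a toy instance. The sketch below assumes agents and alternatives are points on the real line with cost equal to absolute distance, which is only one possible metric. It evaluates the ratio for a given chosen alternative and does not implement Ranked Pairs or Copeland. All names are hypothetical.

```python
import math
from typing import Dict, Sequence


def social_cost(agents: Sequence[float], alternative: float) -> float:
    """Sum of agent costs, with cost taken as distance on the line (an assumption)."""
    return sum(abs(agent - alternative) for agent in agents)


def distortion(agents: Sequence[float], alternatives: Dict[str, float], chosen: str) -> float:
    """Ratio of the chosen alternative's social cost to the optimal social cost."""
    costs = {name: social_cost(agents, pos) for name, pos in alternatives.items()}
    optimal = min(costs.values())
    if optimal == 0:
        # Avoid division by zero: the ratio is 1 only if the choice is also free.
        return 1.0 if costs[chosen] == 0 else math.inf
    return costs[chosen] / optimal


if __name__ == "__main__":
    voters = [0.0, 0.0, 1.0]
    candidates = {"A": 0.0, "B": 1.0}
    print(distortion(voters, candidates, "A"), distortion(voters, candidates, "B"))
```

A rule's worst-case distortion is the supremum of this ratio over all metrics consistent with the ordinal preferences the rule sees, which this single-instance calculation does not attempt to search.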

preprint 2019 arXiv

Permutation Code Equivalence is not Harder than Graph Isomorphism when Hulls are Trivial

The paper deals with the problem of deciding if two finite-dimensional linear subspaces over an arbitrary field are identical up to a permutation of the coordinates. This problem is referred to as the permutation code equivalence. We show that given access to a subroutine that decides if two weighted undirected graphs are isomorphic, one may deterministically decide the permutation code equivalence provided that the underlying vector spaces intersect trivially with their orthogonal complement with respect to an arbitrary inner product. Such a class of vector spaces is usually called linear codes with trivial hulls. The reduction is efficient because it essentially boils down to computing the inverse of a square matrix of order the length of the involved codes. Experimental results obtained with randomly drawn binary codes having trivial hulls show that permutation code equivalence can be decided in a few minutes for lengths up to 50,000.

preprint 2019 arXiv

FPTAS for barrier covering problem with equal circles in 2D

In this paper, we consider a problem of covering a straight line segment by equal circles that are initially arbitrarily placed on a plane by moving their centers on a segment or on a straight line containing a segment so that the segment is completely covered, the neighboring circles in the cover are touching each other and the total length of the paths traveled by circles is minimal. The complexity status of the problem is not known. We propose an $O(n^{2+c}/\varepsilon^2)$-time FPTAS for this problem, where $n$ is the number of circles and $c>0$ is an arbitrarily small real number.

preprint 2019 arXiv

Statistical physics approaches to Unique Games

We show how two techniques from statistical physics can be adapted to solve a variant of the notorious Unique Games problem, potentially opening new avenues towards the Unique Games Conjecture. The variant, which we call Count Unique Games, is a promise problem in which the "yes" case guarantees a certain number of highly satisfiable assignments to the Unique Games instance. In the standard Unique Games problem, the "yes" case only guarantees at least one such assignment. We exhibit efficient algorithms for Count Unique Games based on approximating a suitable partition function for the Unique Games instance via (i) a zero-free region and polynomial interpolation, and (ii) the cluster expansion. We also show that a modest improvement to the parameters for which we give results would refute the Unique Games Conjecture.

preprint 2019 arXiv

Enumerating Isolated Cliques in Temporal Networks

Isolation is a concept from the world of clique enumeration that is mostly used to model communities that do not have much contact with the outside world. Herein, a clique is considered isolated if it has few edges connecting it to the rest of the graph. Motivated by recent work on enumerating cliques in temporal networks, we lift the isolation concept to this setting. We discover that the addition of the time dimension leads to six distinct natural isolation concepts. Our main contribution is the development of fixed-parameter enumeration algorithms for five of these six clique types employing the parameter "degree of isolation". On the empirical side, we implement and test these algorithms on (temporal) social network data, obtaining encouraging preliminary results.

preprint 2019 arXiv

h-Index Manipulation by Undoing Merges

The h-index is an important bibliographic measure used to assess the performance of researchers. Dutiful researchers merge different versions of their articles in their Google Scholar profile even though this can decrease their h-index. In this article, we study the manipulation of the h-index by undoing such merges. In contrast to manipulation by merging articles (van Bevern et al. [Artif. Intel. 240:19-35, 2016]) such manipulation is harder to detect. We present numerous results on computational complexity (from linear-time algorithms to parameterized computational hardness results) and empirically indicate that at least small improvements of the h-index by splitting merged articles are unfortunately easily achievable.
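The effect described here, where merging can lower the h-index and undoing a merge can restore it, is easy to reproduce on a small example. The sketch below computes the h-index from citation counts and models a merge as a single article whose count is the sum of the two originals. That sum model is an assumption made for illustration and may differ from the merge semantics studied in the paper or used by Google Scholar. Function names are hypothetical.

```python
from typing import List, Sequence


def h_index(citations: Sequence[int]) -> int:
    """Largest h such that at least h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def merge_by_sum(citations: Sequence[int], i: int, j: int) -> List[int]:
    """Replace articles i and j by one article with the summed count (assumed model)."""
    merged = [c for idx, c in enumerate(citations) if idx not in (i, j)]
    merged.append(citations[i] + citations[j])
    return merged


if __name__ == "__main__":
    split_profile = [3, 3, 3]
    merged_profile = merge_by_sum(split_profile, 0, 1)
    print(h_index(split_profile), h_index(merged_profile))
```

Here the unmerged profile scores higher than the merged one, so undoing the merge raises the index.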

preprint 2019 arXiv

Classical algorithms, correlation decay, and complex zeros of partition functions of quantum many-body systems

In this paper, we present a quasi-polynomial time classical algorithm that estimates the partition function of quantum many-body systems at temperatures above the thermal phase transition point. It is known that in the worst case, the same problem is NP-hard below this point. Together with our work, this shows that the transition in the phase of a quantum system is also accompanied by a transition in the hardness of approximation. We also show that in a system of n particles above the phase transition point, the correlation between two observables whose distance is at least log(n) decays exponentially. We can improve the factor of log(n) to a constant when the Hamiltonian has commuting terms or is on a 1D chain. The key to our results is a characterization of the phase transition and the critical behavior of the system in terms of the complex zeros of the partition function. Our work extends a seminal work of Dobrushin and Shlosman on the equivalence between the decay of correlations and the analyticity of the free energy in classical spin models. On the algorithmic side, our result extends the scope of a recent approach due to Barvinok for solving classical counting problems to quantum many-body systems.

preprint 2020 arXiv

Stable-Matching Voronoi Diagrams: Combinatorial Complexity and Algorithms

We study algorithms and combinatorial complexity bounds for stable-matching Voronoi diagrams, where a set, $S$, of $n$ point sites in the plane determines a stable matching between the points in $\mathbb{R}^2$ and the sites in $S$ such that (i) the points prefer sites closer to them and sites prefer points closer to them, and (ii) each site has a quota or "appetite" indicating the area of the set of points that can be matched to it. Thus, a stable-matching Voronoi diagram is a solution to the well-known post office problem with the added (realistic) constraint that each post office has a limit on the size of its jurisdiction. Previous work on the stable-matching Voronoi diagram provided existence and uniqueness proofs, but did not analyze its combinatorial or algorithmic complexity. In this paper, we show that a stable-matching Voronoi diagram of $n$ point sites has $O(n^{2+\varepsilon})$ faces and edges, for any $\varepsilon>0$, and show that this bound is almost tight by giving a family of diagrams with $Θ(n^2)$ faces and edges. We also provide a discrete algorithm for constructing it in $O(n^3\log n+n^2f(n))$ time in the real-RAM model of computation, where $f(n)$ is the runtime of a geometric primitive (which we define) that can be approximated numerically, but cannot, in general, be performed exactly in an algebraic model of computation. We show, however, how to compute the geometric primitive exactly for polygonal convex distance functions.

preprint 2020 arXiv

Multi-objective scheduling on two dedicated processors

We study a multi-objective scheduling problem on two dedicated processors. The aim is to minimize simultaneously the makespan, the total tardiness and the total completion time. This NP-hard problem requires the use of well-adapted methods. For this, we adapted genetic algorithms to the multi-objective case. Three methods are presented to solve this problem. The first is aggregative, the second is Pareto and the third is non-dominated sorting genetic algorithm II (NSGA-II). We proposed some adapted lower bounds for each criterion to evaluate the quality of the found results on a large set of instances. Indeed, these bounds also make it possible to determine the dominance of one algorithm over another based on the different results found by each of them. We used two metrics to measure the quality of the Pareto front: the hypervolume indicator (HV) and the number of solutions in the optimal front (ND). The obtained results show the effectiveness of the proposed algorithms.
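The notion of a Pareto front over the three criteria can be sketched as a plain non-dominated filter. The example assumes each schedule is summarized by a tuple (makespan, total tardiness, total completion time), all to be minimized. The numbers are made up for illustration, the names are hypothetical, and this is not the genetic algorithm or NSGA-II procedure used in the paper.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b if it is no worse in every criterion and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: Sequence[Objectives]) -> List[Objectives]:
    """Return the non-dominated points, dropping exact duplicates, in input order."""
    unique = list(dict.fromkeys(points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


if __name__ == "__main__":
    # Illustrative (makespan, total tardiness, total completion time) values.
    schedules = [(10, 5, 30), (12, 4, 28), (11, 6, 31), (10, 5, 30)]
    print(pareto_front(schedules))
```

The size of the returned list corresponds to a count of non-dominated solutions, which is the kind of quantity the ND metric reports.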

preprint 2020 arXiv

Optimal-size problem kernels for $d$-Hitting Set in linear time and space

The known linear-time kernelizations for $d$-Hitting Set guarantee linear worst-case running times using a quadratic-size data structure (that is not fully initialized). Getting rid of this data structure, we show that problem kernels of asymptotically optimal size $O(k^d)$ for $d$-Hitting Set are computable in linear time and space. Additionally, we experimentally compare the linear-time kernelizations for $d$-Hitting Set to each other and to a classical data reduction algorithm due to Weihe.

preprint 2020 arXiv

Quantum pattern matching Oracle construction

We propose a couple of oracle construction methods for quantum pattern matching. We in turn show that one of the constructions can be used with Grover's search algorithm for exact and partial pattern matching, deterministically. The other one also points to the matched indices, but primarily provides a means to generate the Hamming distance between the pattern to be searched and all the possible substrings in the input string, in a probabilistic way.
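For reference, the quantity the second oracle targets, the Hamming distance between the pattern and every equal-length substring of the input, has a straightforward classical baseline. The sketch below is that classical computation only. It says nothing about the quantum oracle constructions themselves, and the function name is hypothetical.

```python
from typing import List


def hamming_profile(text: str, pattern: str) -> List[int]:
    """Hamming distance between pattern and each length-m window of text."""
    n, m = len(text), len(pattern)
    return [
        sum(1 for a, b in zip(text[start:start + m], pattern) if a != b)
        for start in range(n - m + 1)
    ]


if __name__ == "__main__":
    profile = hamming_profile("abcabd", "abd")
    # Windows with distance 0 are exact matches; small distances are partial matches.
    exact = [i for i, d in enumerate(profile) if d == 0]
    print(profile, exact)
```

Exact matching corresponds to the indices with distance zero, and partial matching to thresholding the same profile.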

preprint 2020 arXiv

Two-Bar Charts Packing Problem

We consider a Bar Charts Packing Problem (BCPP), in which it is necessary to pack bar charts (BCs) in a strip of minimum length. The problem is, on the one hand, a generalization of the Bin Packing Problem (BPP), and, on the other hand, a particular case of the Project Scheduling Problem with multidisciplinary jobs and one limited non-accumulative resource. Earlier, we proposed a polynomial algorithm that constructs the optimal package for a given order of non-increasing BCs. This result generalizes a similar result for BPP. For Two-Bar Charts Packing Problem (2-BCPP), when each BC consists of two bars, the algorithm we have proposed constructs a package in polynomial time, the length of which does not exceed $2\ OPT+1$, where $OPT$ is the minimum possible length of the packing. As far as we know, this is the first guaranteed estimate for 2-BCPP. We also conducted a numerical experiment in which we compared the solutions built by our approximate algorithms with the optimal solutions built by the CPLEX package. The experimental results confirmed the high efficiency of the developed algorithms.

preprint 2020 arXiv

COPT: Coordinated Optimal Transport for Graph Sketching

We introduce COPT, a novel distance metric between graphs defined via an optimization routine, computing a coordinated pair of optimal transport maps simultaneously. This gives an unsupervised way to learn general-purpose graph representation, applicable to both graph sketching and graph comparison. COPT involves simultaneously optimizing dual transport plans, one between the vertices of two graphs, and another between graph signal probability distributions. We show theoretically that our method preserves important global structural information on graphs, in particular spectral information, and analyze connections to existing studies. Empirically, COPT outperforms state of the art methods in graph classification on both synthetic and real datasets.

People in this topic

12 visible researcher(s)