Source author record

Mordecai Golin

Mordecai Golin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms Information Theory math.IT Computational Geometry math.CA

Catalog footprint

What is connected

12works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

On Huang and Wong's Algorithm for Generalized Binary Split Trees

Huang and Wong [1984] proposed a polynomial-time dynamic-programming algorithm for computing optimal generalized binary split trees. We show that their algorithm is incorrect. Thus, it remains open whether such trees can be computed in polynomial time. Spuler [1994] proposed modifying Huang and Wong's algorithm to obtain an algorithm for a different problem: computing optimal two-way-comparison search trees. We show that the dynamic program underlying Spuler's algorithm is not valid, in that it does not satisfy the necessary optimal-substructure property and its proposed recurrence relation is incorrect. It remains unknown whether the algorithm is guaranteed to compute a correct overall solution.

preprint2021arXiv

A Simple Algorithm for Optimal Search Trees with Two-Way Comparisons

We present a simple $O(n^4)$-time algorithm for computing optimal search trees with two-way comparisons. The only previous solution to this problem, by Anderson et al., has the same running time, but is significantly more complicated and is restricted to the variant where only successful queries are allowed. Our algorithm extends directly to solve the standard full variant of the problem, which also allows unsuccessful queries and for which no polynomial-time algorithm was previously known. The correctness proof of our algorithm relies on a new structural theorem for two-way-comparison search trees.

preprint2021arXiv

On the Cost of Unsuccessful Searches in Search Trees with Two-way Comparisons

Search trees are commonly used to implement access operations to a set of stored keys. If this set is static and the probabilities of membership queries are known in advance, then one can precompute an optimal search tree, namely one that minimizes the expected access cost. For a non-key query, a search tree can determine its approximate location by returning the inter-key interval containing the query. This is in contrast to other dictionary data structures, like hash tables, that only report a failed search. We address the question "what is the additional cost of determining approximate locations for non-key queries"? We prove that for two-way comparison trees this additional cost is at most 1. Our proof is based on a novel probabilistic argument that involves converting a search tree that does not identify non-key queries into a random tree that does.

preprint2021arXiv

Optimal Search Trees with 2-Way Comparisons

In 1971, Knuth gave an $O(n^2)$-time algorithm for the classic problem of finding an optimal binary search tree. Knuth's algorithm works only for search trees based on 3-way comparisons, while most modern computers support only 2-way comparisons (e.g., $<, \le, =, \ge$, and $>$). Until this paper, the problem of finding an optimal search tree using 2-way comparisons remained open -- poly-time algorithms were known only for restricted variants. We solve the general case, giving (i) an $O(n^4)$-time algorithm and (ii) an $O(n \log n)$-time additive-3 approximation algorithm. Also, for finding optimal binary split trees, we (iii) obtain a linear speedup and (iv) prove some previous work incorrect.

preprint2020arXiv

Minmax Regret for sink location on paths with general capacities

In dynamic flow networks, every vertex starts with items (flow) that need to be shipped to designated sinks. All edges have two associated quantities: length, the amount of time required for a particle to traverse the edge, and capacity, the number of units of flow that can enter the edge in unit time. The goal is move all flow to the sinks. A variation of the problem, modelling evacuation protocols, is to find the sink location(s) that minimize evacuation time, restricting the flow to be CONFLUENT. Solving this problem is is NP-hard on general graphs, and thus research into optimal algorithms has traditionally been restricted to special graphs such as paths, and trees. A specialized version of robust optimization is minmax REGRET, in which the input flows at the vertices are only partially defined by constraints. The goal is to find a sink location that has the minimum{ regret} over all input flows that satisfy the partially defined constraints. Regret for a fully defined input flow and a sink is defined to be the difference between the evacuation time to that sink and the optimal evacuation time. A large recent literature derives polynomial time algorithms for the minmax regret $k$-sink location problem on paths and trees under the simplifying condition that all edges have the same (uniform) capacity. This paper develops a $O(n^4 \log n)$ time algorithm for the minmax regret $1$-sink problem on paths with general (non-uniform) capacities. To the best of our knowledge, this is the first minmax regret result for dynamic flow problems in any type of graph with general capacities.

preprint2020arXiv

Polynomial Time Algorithms for Constructing Optimal Binary AIFV-$2$ Codes

Huffman Codes are optimal Instantaneous Fixed-to-Variable (FV) codes in which every source symbol can only be encoded by one codeword. Relaxing these constraints permits constructing better FV codes. More specifically, recent work has shown that AIFV-$m$ codes can beat Huffman coding. AIFV-$m$ codes construct am $m$-tuple of different coding trees between which the code alternates and are only almost instantaneous (AI). This means that decoding a word might require a delay of a finite number of bits. Current algorithms for constructing optimal AIFV-$m$ codes are iterative processes that construct progressively "better sets" of code trees. The processes have been proven to finitely converge to the optimal code but with no known bounds on the convergence rate. This paper derives a geometric interpretation of the space of binary AIFV-$2$ codes, permitting the development of the first polynomially time-bounded iterative procedures for constructing optimal AIFV codes. We first show that a simple binary search procedure can replace the current iterative process to construct optimal binary AIFV-$2$ codes. We then describe how to frame the problem as a linear programming with an exponential number of constraints but a polynomial-time separability oracle. This permits using the Grötschel, Lovász and Schrijver ellipsoid method to solve the problem in a polynomial number of steps. While more complicated, this second method has the potential to lead to a polynomial time algorithm to construct optimal AIFV-$m$ codes for general $m$.

preprint2020arXiv

Scheduling with Gaps: New Models and Algorithms

We consider scheduling problems for unit jobs with release times, where the number or size of the gaps in the schedule is taken into consideration, either in the objective function or as a constraint. Except for a few papers on energy minimization, there is no work in the scheduling literature that uses performance metrics depending on the gap structure of a schedule. One of our objectives is to initiate the study of such scheduling problems with gaps. We show that such problems often lead to interesting algorithmic problems, with connections to other areas of algorithmics. We focus on the model with unit jobs. First we examine scheduling problems with deadlines, where we consider variants of minimum-gap scheduling, including maximizing throughput with a budget for gaps or minimizing the number of gaps with a throughput requirement. We then turn to other objective functions. For example, in some scenarios, gaps in a schedule may be actually desirable, leading to the problem of maximizing the number of gaps. Other versions we study include minimizing maximum gap or maximizing minimum gap. The second part of the paper examines the model without deadlines, where we focus on the tradeoff between the number of gaps and the total or maximum flow time. For all these problems we provide polynomial time algorithms, with running times ranging from $O(n \log n)$ for some problems, to $O(n^7)$ for other. The solutions involve a spectrum of algorithmic techniques, including different dynamic programming formulations, speed-up techniques based on searching Monge arrays, searching X + Y matrices, or implicit binary search.

preprint2020arXiv

Speeding up the AIFV-$2$ dynamic programs by two orders of magnitude using Range Minimum Queries

AIFV-$2$ codes are a new method for constructing lossless codes for memoryless sources that provide better worst-case redundancy than Huffman codes. They do this by using two code trees instead of one and also allowing some bounded delay in the decoding process. Known algorithms for constructing AIFV-code are iterative; at each step they replace the current code tree pair with a "better" one. The current state of the art for performing this replacement is a pair of Dynamic Programming (DP) algorithms that use $O(n^5)$ time to fill in two tables, each of size $O(n^3)$ (where $n$ is the number of different characters in the source). This paper describes how to reduce the time for filling in the DP tables by two orders of magnitude, down to $O(n^3)$. It does this by introducing a grouping technique that permits separating the $Θ(n^3)$-space tables into $Θ(n)$ groups, each of size $O(n^2)$, and then using Two-Dimensional Range-Minimum Queries (RMQs) to fill in that group's table entries in $O(n^2)$ time. This RMQ speedup technique seems to be new and might be of independent interest.

preprint2016arXiv

Minmax Tree Facility Location and Sink Evacuation with Dynamic Confluent Flows

Let $G=(V,E)$ be a graph modelling a building or road network in which edges have-both travel times (lengths) and capacities associated with them. An edge's capacity is the number of people that can enter that edge in a unit of time. In emergencies, people evacuate towards the exits. If too many people try to evacuate through the same edge, congestion builds up and slows down the evacuation. Graphs with both lengths and capacities are known as Dynamic Flow networks. An evacuation plan for $G$ consists of a choice of exit locations and a partition of the people at the vertices into groups, with each group evacuating to the same exit. The evacuation time of a plan is the time it takes until the last person evacuates. The $k$-sink evacuation problem is to provide an evacuation plan with $k$ exit locations that minimizes the evacuation time. It is known that this problem is NP-Hard for general graphs but no polynomial time algorithm was previously known even for the case of $G$ a tree. This paper presents an $O(n k^2 \log^5 n)$ algorithm for the $k$-sink evacuation problem on trees. Our algorithms also apply to a more general class of problems, which we call minmax tree facility location.

preprint2012arXiv

Huffman Coding with Letter Costs: A Linear-Time Approximation Scheme

We give a polynomial-time approximation scheme for the generalization of Huffman Coding in which codeword letters have non-uniform costs (as in Morse code, where the dash is twice as long as the dot). The algorithm computes a (1+epsilon)-approximate solution in time O(n + f(epsilon) log^3 n), where n is the input size.

preprint2010arXiv

Multidimensional Divide-and-Conquer and Weighted Digital Sums

This paper studies three types of functions arising separately in the analysis of algorithms that we analyze exactly using similar Mellin transform techniques. The first is the solution to a Multidimensional Divide-and-Conquer (MDC) recurrence that arises when solving problems on points in $d$-dimensional space. The second involves weighted digital sums. Write $n$ in its binary representation $n=(b_i b_{i-1}... b_1 b_0)_2$ and set $S_M(n) = \sum_{t=0}^i t^{\bar{M}} b_t 2^t$. We analyze the average $TS_M(n) = \frac{1}{n}\sum_{j<n} S_M(j)$. The third is a different variant of weighted digital sums. Write $n$ as $n=2^{i_1} + 2^{i_2} + ... + 2^{i_k}$ with $i_1 > i_2 > ... > i_k\geq 0$ and set $W_M(n) = \sum_{t=1}^k t^M 2^{i_t}$. We analyze the average $TW_M(n) = \frac{1}{n}\sum_{j<n} W_M(j)$. We show that both the MDC functions and $TS_M(n)$ (with $d=M+1$) have solutions of the form $λ_d n \lg^{d-1}n + \sum_{m=0}^{d-2}(n\lg^m n)A_{d,m}(\lg n) + c_d,$ where $λ_d,c_d$ are constants and $A_{d,m}(u)$'s are periodic functions with period one (given by absolutely convergent Fourier series). We also show that $TW_M(n)$ has a solution of the form $n G_M(\lg n) + d_M \lg^M n + \sum_{d=0}^{M-1}(\lg^d n)G_{M,d}(\lg n),$ where $d_M$ is a constant, $G_M(u)$ and $G_{M,d}(u)$'s are again periodic functions with period one (given by absolutely convergent Fourier series).

preprint2002arXiv

Prefix Codes: Equiprobable Words, Unequal Letter Costs

Describes a near-linear-time algorithm for a variant of Huffman coding, in which the letters may have non-uniform lengths (as in Morse code), but with the restriction that each word to be encoded has equal probability. [See also ``Huffman Coding with Unequal Letter Costs'' (2002).]

Mordecai Golin

What is connected

Connect this record

See the researcher in context

Building this map preview

12 published item(s)

On Huang and Wong's Algorithm for Generalized Binary Split Trees

A Simple Algorithm for Optimal Search Trees with Two-Way Comparisons

On the Cost of Unsuccessful Searches in Search Trees with Two-way Comparisons

Optimal Search Trees with 2-Way Comparisons

Minmax Regret for sink location on paths with general capacities

Polynomial Time Algorithms for Constructing Optimal Binary AIFV-$2$ Codes

Scheduling with Gaps: New Models and Algorithms

Speeding up the AIFV-$2$ dynamic programs by two orders of magnitude using Range Minimum Queries

Minmax Tree Facility Location and Sink Evacuation with Dynamic Confluent Flows

Huffman Coding with Letter Costs: A Linear-Time Approximation Scheme

Multidimensional Divide-and-Conquer and Weighted Digital Sums

Prefix Codes: Equiprobable Words, Unequal Letter Costs