Researcher profile

Ho-Leung Chan

Ho-Leung Chan contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
1topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2011arXiv

Edit Distance to Monotonicity in Sliding Windows

Given a stream of items each associated with a numerical value, its edit distance to monotonicity is the minimum number of items to remove so that the remaining items are non-decreasing with respect to the numerical value. The space complexity of estimating the edit distance to monotonicity of a data stream is becoming well-understood over the past few years. Motivated by applications on network quality monitoring, we extend the study to estimating the edit distance to monotonicity of a sliding window covering the $w$ most recent items in the stream for any $w \ge 1$. We give a deterministic algorithm which can return an estimate within a factor of $(4+\eps)$ using $O(\frac{1}{\eps^2} \log^2(\eps w))$ space. We also extend the study in two directions. First, we consider a stream where each item is associated with a value from a partial ordered set. We give a randomized $(4+ε)$-approximate algorithm using $O(\frac{1}{ε^2} \log ε^2 w \log w)$ space. Second, we consider an out-of-order stream where each item is associated with a creation time and a numerical value, and items may be out of order with respect to their creation times. The goal is to estimate the edit distance to monotonicity with respect to the numerical value of items arranged in the order of creation times. We show that any randomized constant-approximate algorithm requires linear space.

preprint2010arXiv

Continuous Monitoring of Distributed Data Streams over a Time-based Sliding Window

The past decade has witnessed many interesting algorithms for maintaining statistics over a data stream. This paper initiates a theoretical study of algorithms for monitoring distributed data streams over a time-based sliding window (which contains a variable number of items and possibly out-of-order items). The concern is how to minimize the communication between individual streams and the root, while allowing the root, at any time, to be able to report the global statistics of all streams within a given error bound. This paper presents communication-efficient algorithms for three classical statistics, namely, basic counting, frequent items and quantiles. The worst-case communication cost over a window is $O(\frac{k} ε \log \frac{εN}{k})$ bits for basic counting and $O(\frac{k}ε \log \frac{N}{k})$ words for the remainings, where $k$ is the number of distributed data streams, $N$ is the total number of items in the streams that arrive or expire in the window, and $ε< 1$ is the desired error bound. Matching and nearly matching lower bounds are also obtained.

preprint2010arXiv

The Sorting Buffer Problem is NP-hard

We consider the offline sorting buffer problem. The input is a sequence of items of different types. All items must be processed one by one by a server. The server is equipped with a random-access buffer of limited capacity which can be used to rearrange items. The problem is to design a scheduling strategy that decides upon the order in which items from the buffer are sent to the server. Each type change incurs unit cost, and thus, the cost minimizing objective is to minimize the total number of type changes for serving the entire sequence. This problem is motivated by various applications in manufacturing processes and computer science, and it has attracted significant attention in the last few years. The main focus has been on online competitive algorithms. Surprisingly little is known on the basic offline problem. In this paper, we show that the sorting buffer problem with uniform cost is NP-hard and, thus, close one of the most fundamental questions for the offline problem. On the positive side, we give an O(1)-approximation algorithm when the scheduler is given a buffer only slightly larger than double the original size. We also give a dynamic programming algorithm for the special case of buffer size two that solves the problem exactly in linear time, improving on the standard DP which runs in cubic time.