Source author record

Ada Wai-Chee Fu

Ada Wai-Chee Fu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Databases

Catalog footprint

What is connected

4works

1topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2014arXiv

(α, k)-Minimal Sorting and Skew Join in MPI and MapReduce

As computer clusters are found to be highly effective for handling massive datasets, the design of efficient parallel algorithms for such a computing model is of great interest. We consider (α, k)-minimal algorithms for such a purpose, where α is the number of rounds in the algorithm, and k is a bound on the deviation from perfect workload balance. We focus on new (α, k)-minimal algorithms for sorting and skew equijoin operations for computer clusters. To the best of our knowledge the proposed sorting and skew join algorithms achieve the best workload balancing guarantee when compared to previous works. Our empirical study shows that they are close to optimal in workload balancing. In particular, our proposed sorting algorithm is around 25% more efficient than the state-of-the-art Terasort algorithm and achieves significantly more even workload distribution by over 50%.

preprint2014arXiv

Hop Doubling Label Indexing for Point-to-Point Distance Querying on Scale-Free Networks

We study the problem of point-to-point distance querying for massive scale-free graphs, which is important for numerous applications. Given a directed or undirected graph, we propose to build an index for answering such queries based on a hop-doubling labeling technique. We derive bounds on the index size, the computation costs and I/O costs based on the properties of unweighted scale-free graphs. We show that our method is much more efficient compared to the state-of-the-art technique, in terms of both querying time and indexing time. Our empirical study shows that our method can handle graphs that are orders of magnitude larger than existing methods.

preprint2012arXiv

IS-LABEL: an Independent-Set based Labeling Scheme for Point-to-Point Distance Querying on Large Graphs

We study the problem of computing shortest path or distance between two query vertices in a graph, which has numerous important applications. Quite a number of indexes have been proposed to answer such distance queries. However, all of these indexes can only process graphs of size barely up to 1 million vertices, which is rather small in view of many of the fast-growing real-world graphs today such as social networks and Web graphs. We propose an efficient index, which is a novel labeling scheme based on the independent set of a graph. We show that our method can handle graphs of size three orders of magnitude larger than those existing indexes.

preprint2012arXiv

Small Count Privacy and Large Count Utility in Data Publishing

While the introduction of differential privacy has been a major breakthrough in the study of privacy preserving data publication, some recent work has pointed out a number of cases where it is not possible to limit inference about individuals. The dilemma that is intrinsic in the problem is the simultaneous requirement of data utility in the published data. Differential privacy does not aim to protect information about an individual that can be uncovered even without the participation of the individual. However, this lack of coverage may violate the principle of individual privacy. Here we propose a solution by providing protection to sensitive information, by which we refer to the answers for aggregate queries with small counts. Previous works based on $\ell$-diversity can be seen as providing a special form of this kind of protection. Our method is developed with another goal which is to provide differential privacy guarantee, and for that we introduce a more refined form of differential privacy to deal with certain practical issues. Our empirical studies show that our method can preserve better utilities than a number of state-of-the-art methods although these methods do not provide the protections that we provide.

Ada Wai-Chee Fu

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

(α, k)-Minimal Sorting and Skew Join in MPI and MapReduce

Hop Doubling Label Indexing for Point-to-Point Distance Querying on Scale-Free Networks

IS-LABEL: an Independent-Set based Labeling Scheme for Point-to-Point Distance Querying on Large Graphs

Small Count Privacy and Large Count Utility in Data Publishing