Source author record

Kanat Tangwongsan

Kanat Tangwongsan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing; Databases; Numerical Analysis; Social and Information Networks

Catalog footprint

What is connected

6 works

5 topics

4 close collaborators

Preprint, 2016, arXiv

Faster and Simpler Width-Independent Parallel Algorithms for Positive Semidefinite Programming

This paper studies the problem of finding a $(1+ε)$-approximate solution to positive semidefinite programs. These are semidefinite programs in which all matrices in the constraints and objective are positive semidefinite and all scalars are non-negative. We present a simpler NC parallel algorithm that, on input with $n$ constraint matrices, requires $O(\frac{1}{ε^3} \log^3 n)$ iterations, each of which involves only simple matrix operations and computing the trace of the product of a matrix exponential and a positive semidefinite matrix. Further, given a positive SDP in a factorized form, the total work of our algorithm is nearly linear in the number of non-zero entries in the factorization.

Preprint, 2016, arXiv

Parallel Shortest-Paths Using Radius Stepping

The single-source shortest path problem (SSSP) with nonnegative edge weights is a notoriously difficult problem to solve efficiently in parallel: it is one of the graph problems said to suffer from the transitive-closure bottleneck. In practice, the $Δ$-stepping algorithm of Meyer and Sanders (J. Algorithms, 2003) often works efficiently but has no known theoretical bounds on general graphs. The algorithm takes a sequence of steps, each increasing the radius by a user-specified value $Δ$. Each step settles the vertices in its annulus but can take $Θ(n)$ substeps, each requiring $Θ(m)$ work ($n$ vertices and $m$ edges). In this paper, we describe Radius-Stepping, an algorithm with the best-known tradeoff between work and depth bounds for SSSP with nearly linear ($\tilde{O}(m)$) work. The algorithm is a $Δ$-stepping-like algorithm but uses a variable instead of a fixed-size increase in radii, allowing us to prove a bound on the number of steps. In particular, by using what we define as a vertex $k$-radius, each step takes at most $k+2$ substeps. Furthermore, we define a $(k, ρ)$-graph property and show that if an undirected graph has this property, then the number of steps can be bounded by $O(\frac{n}{ρ} \log(ρL))$, for a total of $O(\frac{kn}{ρ} \log(ρL))$ substeps, each parallel. We describe how to preprocess a graph to have this property. Altogether, Radius-Stepping takes $O((m+n\log n)\log \frac{n}{ρ})$ work and $O(\frac{n}{ρ}\log n \log(ρL))$ depth per source after preprocessing. The preprocessing step can be done in $O(m\log n + nρ^2)$ work and $O(ρ^2)$ depth or in $O(m\log n + nρ^2\log n)$ work and $O(ρ\log ρ)$ depth, and adds no more than $O(nρ)$ edges.
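
The step structure described in this abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the paper's implementation: the function name `radius_stepping`, the adjacency-dict input, and the caller-supplied `radius` map are all hypothetical, and neither the vertex $k$-radius computation nor the preprocessing is included. Each step picks a target distance from the tentative distances plus radii, then repeats Bellman-Ford style relaxation substeps until nothing changes.

```python
import math

def radius_stepping(adj, source, radius):
    """Sequential simulation of a radius-stepping-style SSSP loop.

    adj:    dict vertex -> list of (neighbor, nonnegative weight)
    radius: dict vertex -> nonnegative radius r(v), supplied by the caller
            (an assumption here; the paper derives radii, this sketch does not)
    Returns (dist, steps).
    """
    dist = {v: math.inf for v in adj}
    dist[source] = 0.0
    settled = {source}
    for u, w in adj[source]:            # relax the source's edges once
        dist[u] = min(dist[u], w)

    steps = 0
    while len(settled) < len(adj):
        frontier = [v for v in adj if v not in settled and dist[v] < math.inf]
        if not frontier:                # remaining vertices are unreachable
            break
        # Variable-size step: the target distance depends on the radii.
        d_i = min(dist[v] + radius[v] for v in frontier)
        changed = True
        while changed:                  # substeps: relax until stable
            changed = False
            for v in adj:
                if v not in settled and dist[v] <= d_i:
                    for u, w in adj[v]:
                        if dist[v] + w < dist[u]:
                            dist[u] = dist[v] + w
                            changed = True
        for v in adj:                   # everything within d_i is now final
            if dist[v] <= d_i:
                settled.add(v)
        steps += 1
    return dist, steps

# Small undirected example graph (hypothetical data).
graph = {
    0: [(1, 1), (2, 4)],
    1: [(0, 1), (2, 2)],
    2: [(1, 2), (0, 4), (3, 1)],
    3: [(2, 1)],
}
dist_zero, steps_zero = radius_stepping(graph, 0, {v: 0 for v in graph})
dist_big, steps_big = radius_stepping(graph, 0, {v: 10 for v in graph})
```

With all radii set to zero the loop settles vertices one distance level at a time, much like Dijkstra's algorithm; with very large radii it degenerates to a single Bellman-Ford style step. The distances agree in both cases, and only the number of steps and substeps changes.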

Preprint, 2016, arXiv

Work-Efficient Parallel and Incremental Graph Connectivity

On an evolving graph that is continuously updated by a high-velocity stream of edges, how can one efficiently maintain whether two vertices are connected? This is the connectivity problem, a fundamental and widely studied problem on graphs. We present the first shared-memory parallel algorithm for incremental graph connectivity that is both provably work-efficient and has polylogarithmic parallel depth. We also present a simpler algorithm with slightly worse theoretical properties, but which is easier to implement and has good practical performance. Our experiments show a throughput of hundreds of millions of edges per second on a 20-core machine.
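
For orientation, the incremental connectivity problem itself can be stated in a few lines with a classic sequential union-find structure. The sketch below is a baseline for the problem, not the parallel algorithm the abstract describes; the class name `UnionFind` and its methods are illustrative choices.

```python
class UnionFind:
    """Sequential incremental connectivity over vertices 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, v):
        # Locate the root, then compress the path behind us.
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[v] != root:
            self.parent[v], v = root, self.parent[v]
        return root

    def union(self, u, v):
        """Process an arriving edge (u, v)."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False                # already connected
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru             # attach the smaller tree to the larger
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]
        return True

    def connected(self, u, v):
        return self.find(u) == self.find(v)

uf = UnionFind(5)
for edge in [(0, 1), (3, 4), (1, 3)]:   # a tiny edge stream
    uf.union(*edge)
```

After this stream, vertices 0, 1, 3, and 4 form one component and vertex 2 remains isolated.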

Preprint, 2013, arXiv

Parallel Triangle Counting in Massive Streaming Graphs

The number of triangles in a graph is a fundamental metric, used in social network analysis, link classification and recommendation, and more. Driven by these applications and the trend that modern graph datasets are both large and dynamic, we present the design and implementation of a fast and cache-efficient parallel algorithm for estimating the number of triangles in a massive undirected graph whose edges arrive as a stream. It brings together the benefits of streaming algorithms and parallel algorithms. By building on the streaming algorithms framework, the algorithm has a small memory footprint. By leveraging the parallel cache-oblivious framework, it makes efficient use of the memory hierarchy of modern multicore machines without needing to know its specific parameters. We prove theoretical bounds on accuracy, memory access cost, and parallel runtime complexity, and show empirically that the algorithm yields accurate results and substantial speedups compared to an optimized sequential implementation. (This is an expanded version of a CIKM'13 paper of the same title.)
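
To make the quantity being estimated concrete, here is an exact counter for an edge stream. It is a reference baseline only, not the paper's estimator: it stores every edge, so its memory grows with the graph, which is the cost a small-footprint streaming estimator is meant to avoid. The class name `ExactTriangleCounter` is hypothetical.

```python
from collections import defaultdict

class ExactTriangleCounter:
    """Exact triangle count for an undirected edge stream (stores all edges)."""

    def __init__(self):
        self.adj = defaultdict(set)
        self.triangles = 0

    def add_edge(self, u, v):
        # Ignore self-loops and repeated edges.
        if u == v or v in self.adj[u]:
            return
        # Each common neighbor of u and v closes one new triangle.
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

counter = ExactTriangleCounter()
# Stream the six edges of the complete graph on four vertices.
for u, v in [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]:
    counter.add_edge(u, v)
```

A baseline like this is mainly useful for checking an estimator's accuracy on graphs small enough to hold in memory.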

Preprint, 2011, arXiv

Near Linear-Work Parallel SDD Solvers, Low-Diameter Decomposition, and Low-Stretch Subgraphs

We present the design and analysis of a near linear-work parallel algorithm for solving symmetric diagonally dominant (SDD) linear systems. On input of an SDD $n$-by-$n$ matrix $A$ with $m$ non-zero entries and a vector $b$, our algorithm computes a vector $\tilde{x}$ such that $\|\tilde{x} - A^+b\|_A \leq ε \cdot \|A^+b\|_A$ in $O(m\log^{O(1)}{n}\log{\frac{1}{ε}})$ work and $O(m^{1/3+θ}\log \frac{1}{ε})$ depth for any fixed $θ > 0$. The algorithm relies on a parallel algorithm for generating low-stretch spanning trees or spanning subgraphs. To this end, we first develop a parallel decomposition algorithm that, in polylogarithmic depth and $\tilde{O}(|E|)$ work, partitions a graph into components with polylogarithmic diameter such that only a small fraction of the original edges are between the components. This can be used to generate low-stretch spanning trees with average stretch $O(n^α)$ in $O(n^{1+α})$ work and $O(n^α)$ depth. Alternatively, it can be used to generate spanning subgraphs with polylogarithmic average stretch in $\tilde{O}(|E|)$ work and polylogarithmic depth. We apply this subgraph construction to derive a parallel linear system solver. By using this solver in known applications, our results imply improved parallel randomized algorithms for several problems, including single-source shortest paths, maximum flow, minimum-cost flow, and approximate maximum flow.

Preprint, 2010, arXiv

Parallel Approximation Algorithms for Facility-Location Problems

This paper presents the design and analysis of parallel approximation algorithms for facility-location problems, including NC and RNC algorithms for (metric) facility location, $k$-center, $k$-median, and $k$-means. These problems have received considerable attention during the past decades from the approximation algorithms community, concentrating primarily on improving the approximation guarantees. In this paper, we ask: is it possible to parallelize some of the beautiful results from the sequential setting? Our starting point is a small, but diverse, subset of results in approximation algorithms for facility-location problems, with a primary goal of developing techniques for devising their efficient parallel counterparts. We focus on giving algorithms with low depth, near work efficiency (compared to the sequential versions), and low cache complexity. Common to the algorithms we present is the idea that instead of picking only the most cost-effective element, we make room for parallelism by allowing a small slack (e.g., a $(1+ε)$ factor) in what can be selected; we then use a clean-up step to ensure that the behavior does not deviate too much from the sequential steps. All the algorithms we developed are "cache efficient" in that the cache complexity is bounded by $O(w/B)$, where $w$ is the work in the EREW model and $B$ is the block size.
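
The slack idea in this abstract can be shown in isolation. The sketch below covers only the selection step, under assumptions of my own: candidates are given as a dict from a name to a positive cost-effectiveness ratio (lower is better), and the function name `within_slack` is hypothetical. The clean-up step the abstract mentions is not modeled here.

```python
def within_slack(cost_effectiveness, eps):
    """Return every candidate within a (1 + eps) factor of the best ratio.

    cost_effectiveness: dict name -> positive ratio, lower is better
    eps:                nonnegative slack parameter
    """
    best = min(cost_effectiveness.values())
    threshold = (1 + eps) * best
    return sorted(name for name, ratio in cost_effectiveness.items()
                  if ratio <= threshold)

candidates = {"a": 1.0, "b": 1.05, "c": 2.0}   # hypothetical ratios
strict = within_slack(candidates, 0.0)         # only the single best element
relaxed = within_slack(candidates, 0.1)        # a batch that can be handled together
```

With zero slack the function reduces to the sequential greedy choice; a positive slack returns a batch of near-best candidates, which is what leaves room for processing several of them in one parallel round.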
