Source author record

Wai Ming Tai

Wai Ming Tai appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computational Geometry Data Structures and Algorithms Artificial Intelligence math.ST Statistics Theory

Catalog footprint

What is connected

4works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Optimal Coreset for Gaussian Kernel Density Estimation

Given a point set $P\subset \mathbb{R}^d$, the kernel density estimate of $P$ is defined as \[ \overline{\mathcal{G}}_P(x) = \frac{1}{\left|P\right|}\sum_{p\in P}e^{-\left\lVert x-p \right\rVert^2} \] for any $x\in\mathbb{R}^d$. We study how to construct a small subset $Q$ of $P$ such that the kernel density estimate of $P$ is approximated by the kernel density estimate of $Q$. This subset $Q$ is called a coreset. The main technique in this work is constructing a $\pm 1$ coloring on the point set $P$ by discrepancy theory and we leverage Banaszczyk's Theorem. When $d>1$ is a constant, our construction gives a coreset of size $O\left(\frac{1}{\varepsilon}\right)$ as opposed to the best-known result of $O\left(\frac{1}{\varepsilon}\sqrt{\log\frac{1}{\varepsilon}}\right)$. It is the first result to give a breakthrough on the barrier of $\sqrt{\log}$ factor even when $d=2$.

preprint2022arXiv

Optimal estimation of Gaussian DAG models

We study the optimal sample complexity of learning a Gaussian directed acyclic graph (DAG) from observational data. Our main results establish the minimax optimal sample complexity for learning the structure of a linear Gaussian DAG model in two settings of interest: 1) Under equal variances without knowledge of the true ordering, and 2) For general linear models given knowledge of the ordering. In both cases the sample complexity is $n\asymp q\log(d/q)$, where $q$ is the maximum number of parents and $d$ is the number of nodes. We further make comparisons with the classical problem of learning (undirected) Gaussian graphical models, showing that under the equal variance assumption, these two problems share the same optimal sample complexity. In other words, at least for Gaussian models with equal error variances, learning a directed graphical model is statistically no more difficult than learning an undirected graphical model. Our results also extend to more general identification assumptions as well as subgaussian errors.

preprint2020arXiv

The GaussianSketch for Almost Relative Error Kernel Distance

We introduce two versions of a new sketch for approximately embedding the Gaussian kernel into Euclidean inner product space. These work by truncating infinite expansions of the Gaussian kernel, and carefully invoking the RecursiveTensorSketch [Ahle et al. SODA 2020]. After providing concentration and approximation properties of these sketches, we use them to approximate the kernel distance between points sets. These sketches yield almost $(1+\varepsilon)$-relative error, but with a small additive $α$ term. In the first variants the dependence on $1/α$ is poly-logarithmic, but has higher degree of polynomial dependence on the original dimension $d$. In the second variant, the dependence on $1/α$ is still poly-logarithmic, but the dependence on $d$ is linear.

preprint2014arXiv

Tracking the Frequency Moments at All Times

The traditional requirement for a randomized streaming algorithm is just {\em one-shot}, i.e., algorithm should be correct (within the stated $\eps$-error bound) at the end of the stream. In this paper, we study the {\em tracking} problem, where the output should be correct at all times. The standard approach for solving the tracking problem is to run $O(\log m)$ independent instances of the one-shot algorithm and apply the union bound to all $m$ time instances. In this paper, we study if this standard approach can be improved, for the classical frequency moment problem. We show that for the $F_p$ problem for any $1 < p \le 2$, we actually only need $O(\log \log m + \log n)$ copies to achieve the tracking guarantee in the cash register model, where $n$ is the universe size. Meanwhile, we present a lower bound of $Ω(\log m \log\log m)$ bits for all linear sketches achieving this guarantee. This shows that our upper bound is tight when $n=(\log m)^{O(1)}$. We also present an $Ω(\log^2 m)$ lower bound in the turnstile model, showing that the standard approach by using the union bound is essentially optimal.