Source author record

Jiayi Xian

Jiayi Xian appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Data Structures and Algorithms Artificial Intelligence

Catalog footprint

What is connected

6works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Improving Uncertainty Calibration of Deep Neural Networks via Truth Discovery and Geometric Optimization

Deep Neural Networks (DNNs), despite their tremendous success in recent years, could still cast doubts on their predictions due to the intrinsic uncertainty associated with their learning process. Ensemble techniques and post-hoc calibrations are two types of approaches that have individually shown promise in improving the uncertainty calibration of DNNs. However, the synergistic effect of the two types of methods has not been well explored. In this paper, we propose a truth discovery framework to integrate ensemble-based and post-hoc calibration methods. Using the geometric variance of the ensemble candidates as a good indicator for sample uncertainty, we design an accuracy-preserving truth estimator with provably no accuracy drop. Furthermore, we show that post-hoc calibration can also be enhanced by truth discovery-regularized optimization. On large-scale datasets including CIFAR and ImageNet, our method shows consistent improvement against state-of-the-art calibration approaches on both histogram-based and kernel density-based evaluation metrics. Our codes are available at https://github.com/horsepurve/truly-uncertain.

preprint2021arXiv

Meta-Learning with Neural Tangent Kernels

Model Agnostic Meta-Learning (MAML) has emerged as a standard framework for meta-learning, where a meta-model is learned with the ability of fast adapting to new tasks. However, as a double-looped optimization problem, MAML needs to differentiate through the whole inner-loop optimization path for every outer-loop training step, which may lead to both computational inefficiency and sub-optimal solutions. In this paper, we generalize MAML to allow meta-learning to be defined in function spaces, and propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK). Within this paradigm, we introduce two meta-learning algorithms in the RKHS, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework. We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory. Extensive experimental studies demonstrate advantages of our paradigm in both efficiency and quality of solutions compared to related meta-learning algorithms. Another interesting feature of our proposed methods is that they are demonstrated to be more robust to adversarial attacks and out-of-distribution adaptation than popular baselines, as demonstrated in our experiments.

preprint2020arXiv

Consistent $k$-Median: Simpler, Better and Robust

In this paper we introduce and study the online consistent $k$-clustering with outliers problem, generalizing the non-outlier version of the problem studied in [Lattanzi-Vassilvitskii, ICML17]. We show that a simple local-search based online algorithm can give a bicriteria constant approximation for the problem with $O(k^2 \log^2 (nD))$ swaps of medians (recourse) in total, where $D$ is the diameter of the metric. When restricted to the problem without outliers, our algorithm is simpler, deterministic and gives better approximation ratio and recourse, compared to that of [Lattanzi-Vassilvitskii, ICML17].

preprint2020arXiv

Graph Neural Networks with Composite Kernels

Learning on graph structured data has drawn increasing interest in recent years. Frameworks like Graph Convolutional Networks (GCNs) have demonstrated their ability to capture structural information and obtain good performance in various tasks. In these frameworks, node aggregation schemes are typically used to capture structural information: a node's feature vector is recursively computed by aggregating features of its neighboring nodes. However, most of aggregation schemes treat all connections in a graph equally, ignoring node feature similarities. In this paper, we re-interpret node aggregation from the perspective of kernel weighting, and present a framework to consider feature similarity in an aggregation scheme. Specifically, we show that normalized adjacency matrix is equivalent to a neighbor-based kernel matrix in a Krein Space. We then propose feature aggregation as the composition of the original neighbor-based kernel and a learnable kernel to encode feature similarities in a feature space. We further show how the proposed method can be extended to Graph Attention Network (GAT). Experimental results demonstrate better performance of our proposed framework in several real-world applications.

preprint2020arXiv

On Approximating Degree-Bounded Network Design Problems

Directed Steiner Tree (DST) is a central problem in combinatorial optimization and theoretical computer science: Given a directed graph $G=(V, E)$ with edge costs $c \in \mathbb{R}_{\geq 0}^E$, a root $r \in V$ and $k$ terminals $K\subseteq V$, we need to output the minimum-cost arborescence in $G$ that contains an $r$\textrightarrow $t$ path for every $t \in K$. Recently, Grandoni, Laekhanukit and Li, and independently Ghuge and Nagarajan, gave quasi-polynomial time $O(\log^2k/\log \log k)$-approximation algorithms for the problem, which are tight under popular complexity assumptions. In this paper, we consider the more general Degree-Bounded Directed Steiner Tree (DB-DST) problem, where we are additionally given a degree bound $d_v$ on each vertex $v \in V$, and we require that every vertex $v$ in the output tree has at most $d_v$ children. We give a quasi-polynomial time $(O(\log n \log k), O(\log^2 n))$-bicriteria approximation: The algorithm produces a solution with cost at most $O(\log n\log k)$ times the cost of the optimum solution that violates the degree constraints by at most a factor of $O(\log^2n)$. This is the first non-trivial result for the problem. While our cost-guarantee is nearly optimal, the degree violation factor of $O(\log^2n)$ is an $O(\log n)$-factor away from the approximation lower bound of $Ω(\log n)$ from the set-cover hardness. The hardness result holds even on the special case of the {\em Degree-Bounded Group Steiner Tree} problem on trees (DB-GST-T). With the hope of closing the gap, we study the question of whether the degree violation factor can be made tight for this special case. We answer the question in the affirmative by giving an $(O(\log n\log k), O(\log n))$-bicriteria approximation algorithm for DB-GST-T.

preprint2020arXiv

The Power of Recourse: Better Algorithms for Facility Location in Online and Dynamic Models

In this paper we study the facility location problem in the online with recourse and dynamic algorithm models. In the online with recourse model, clients arrive one by one and our algorithm needs to maintain good solutions at all time steps with only a few changes to the previously made decisions (called recourse). We show that the classic local search technique can lead to a $(1+\sqrt{2}+ε)$-competitive online algorithm for facility location with only $O\left(\frac{\log n}ε\log\frac1ε\right)$ amortized facility and client recourse. We then turn to the dynamic algorithm model for the problem, where the main goal is to design fast algorithms that maintain good solutions at all time steps. We show that the result for online facility location, combined with the randomized local search technique of Charikar and Guha [10], leads to an $O(1+\sqrt{2}+ε)$ approximation dynamic algorithm with amortized update time of $\tilde O(n)$ in the incremental setting. Notice that the running time is almost optimal, since in general metric space it takes $Ω(n)$ time to specify a new client's position. The approximation factor of our algorithm also matches the best offline analysis of the classic local search algorithm. Finally, we study the fully dynamic model for facility location, where clients can both arrive and depart. Our main result is an $O(1)$-approximation algorithm in this model with $O(|F|)$ preprocessing time and $O(\log^3 D)$ amortized update time for the HST metric spaces. Using the seminal results of Bartal [4] and Fakcharoenphol, Rao and Talwar [17], which show that any arbitrary $N$-point metric space can be embedded into a distribution over HSTs such that the expected distortion is at most $O(\log N)$, we obtain a $O(\log |F|)$ approximation with preprocessing time of $O(|F|^2\log |F|)$ and $O(\log^3 D)$ amortized update time.

Jiayi Xian

What is connected

Connect this record

See the researcher in context

Building this map preview

6 published item(s)

Improving Uncertainty Calibration of Deep Neural Networks via Truth Discovery and Geometric Optimization

Meta-Learning with Neural Tangent Kernels

Consistent $k$-Median: Simpler, Better and Robust

Graph Neural Networks with Composite Kernels

On Approximating Degree-Bounded Network Design Problems

The Power of Recourse: Better Algorithms for Facility Location in Online and Dynamic Models