Researcher profile

Jie Han

Jie Han contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
12works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

12 published item(s)

preprint2022arXiv

$F$-factors in Quasi-random Hypergraphs

Given $k\ge 2$ and two $k$-graphs ($k$-uniform hypergraphs) $F$ and $H$, an $F$-factor in $H$ is a set of vertex-disjoint copies of $F$ that together covers the vertex set of $H$. Lenz and Mubayi [J. Combin. Theory Ser. B, 2016] studied the $F$-factor problem in quasi-random $k$-graphs with minimum degree $Ω(n^{k-1})$. They posed the problem of characterizing the $k$-graphs $F$ such that every sufficiently large quasi-random $k$-graph with constant edge density and minimum degree $Ω(n^{k-1})$ contains an $F$-factor, and in particular, they showed that all linear $k$-graphs satisfy this property. In this paper we prove a general theorem on $F$-factors which reduces the $F$-factor problem of Lenz and Mubayi to a natural sub-problem, that is, the $F$-cover problem. By using this result, we answer the question of Lenz and Mubayi for those $F$ which are $k$-partite $k$-graphs, and for all 3-graphs $F$, separately. Our characterization result on 3-graphs is motivated by the recent work of Reiher, Rödl and Schacht [J. Lond. Math. Soc., 2018] that classifies the 3-graphs with vanishing Turán density in quasi-random $k$-graphs.

preprint2022arXiv

$H$-factors in graphs with small independence number

Let $H$ be an $h$-vertex graph. The vertex arboricity $ar(H)$ of $H$ is the least integer $r$ such that $V(H)$ can be partitioned into $r$ parts and each part induces a forest in $H$. We show that for sufficiently large $n\in h\mathbb{N}$, every $n$-vertex graph $G$ with $δ(G)\geq \max\left\{\left(1-\frac{2}{f(H)}+o(1)\right)n, \left(\frac{1}{2}+o(1)\right)n\right\}$ and $α(G)=o(n)$ contains an $H$-factor, where $f(H)=2ar(H)$ or $2ar(H)-1$. The result can be viewed an analogue of the Alon--Yuster theorem \cite{MR1376050} in Ramsey--Turán theory, which generalises the results of Balogh--Molla--Sharifzadeh~\cite{MR3570984} and Knierm--Su~\cite{MR4193066} on clique factors. In particular the degree conditions are asymptotically sharp for infinitely many graphs $H$ which are not cliques.

preprint2022arXiv

Adversarial Fine-tune with Dynamically Regulated Adversary

Adversarial training is an effective method to boost model robustness to malicious, adversarial attacks. However, such improvement in model robustness often leads to a significant sacrifice of standard performance on clean images. In many real-world applications such as health diagnosis and autonomous surgical robotics, the standard performance is more valued over model robustness against such extremely malicious attacks. This leads to the question: To what extent we can boost model robustness without sacrificing standard performance? This work tackles this problem and proposes a simple yet effective transfer learning-based adversarial training strategy that disentangles the negative effects of adversarial samples on model's standard performance. In addition, we introduce a training-friendly adversarial attack algorithm, which facilitates the boost of adversarial robustness without introducing significant training complexity. Extensive experimentation indicates that the proposed method outperforms previous adversarial training algorithms towards the target: to improve model robustness while preserving model's standard performance on clean data.

preprint2022arXiv

Cover 3-uniform hypergraphs by vertex-disjoint tight paths

Let $H$ be an $n$-vertex 3-uniform hypergraph such that every pair of vertices is in at least $n/3+o(n)$ edges. We show that $H$ contains two vertex-disjoint tight paths whose union covers the vertex set of $H$. The quantity two here is best possible and the degree condition is asymptotically best possible. This result also has an interpretation as the \emph{deficiency problems}, recently introduced by Nenadov, Sudakov and Wagner: every such $H$ can be made Hamiltonian by adding at most two vertices and all triples intersecting them.

preprint2022arXiv

Dropout Prediction Uncertainty Estimation Using Neuron Activation Strength

Dropout has been commonly used to quantify prediction uncertainty, i.e, the variations of model predictions on a given input example. However, using dropout in practice can be expensive as it requires running dropout inferences many times. In this paper, we study how to estimate dropout prediction uncertainty in a resource-efficient manner. We demonstrate that we can use neuron activation strengths to estimate dropout prediction uncertainty under different dropout settings and on a variety of tasks using three large datasets, MovieLens, Criteo, and EMNIST. Our approach provides an inference-once method to estimate dropout prediction uncertainty as a cheap auxiliary task. We also demonstrate that using activation features from a subset of the neural network layers can be sufficient to achieve uncertainty estimation performance almost comparable to that of using activation features from all layers, thus reducing resources even further for uncertainty estimation.

preprint2022arXiv

Matching of given sizes in hypergraphs

For all integers $k,d$ such that $k \geq 3$ and $k/2\leq d \leq k-1$, let $n$ be a sufficiently large integer {\rm(}which may not be divisible by $k${\rm)} and let $s\le \lfloor n/k\rfloor-1$. We show that if $H$ is a $k$-uniform hypergraph on $n$ vertices with $δ_{d}(H)>\binom{n-d}{k-d}-\binom{n-d-s+1}{k-d}$, then $H$ contains a matching of size $s$. This improves a recent result of Lu, Yu, and Yuan and also answers a question of Kühn, Osthus, and Townsend. In many cases, our result can be strengthened to $s\leq \lfloor n/k\rfloor$, which then covers the entire possible range of $s$. On the other hand, there are examples showing that the result does not hold for certain $n, k, d$ and $s= \lfloor n/k\rfloor$.

preprint2022arXiv

Semantic Compression with Side Information: A Rate-Distortion Perspective

We consider the semantic rate-distortion problem motivated by task-oriented video compression. The semantic information corresponding to the task, which is not observable to the encoder, shows impacts on the observations through a joint probability distribution. The similarities among intra-frame segments and inter-frames in video compression are formulated as side information available at both the encoder and the decoder. The decoder is interested in recovering the observation and making an inference of the semantic information under certain distortion constraints. We establish the information-theoretic limits for the tradeoff between compression rates and distortions by fully characterizing the rate-distortion function. We further evaluate the rate-distortion function under specific Markov conditions for three scenarios: i) both the task and the observation are binary sources; ii) the task is a binary classification of an integer observation as even and odd; iii) Gaussian correlated task and observation. We also illustrate through numerical results that recovering only the semantic information can reduce the coding rate comparing to recovering the source observation.

preprint2021arXiv

Boost-S: Gradient Boosted Trees for Spatial Data and Its Application to FDG-PET Imaging Data

Boosting Trees are one of the most successful statistical learning approaches that involve sequentially growing an ensemble of simple regression trees (i.e., "weak learners"). However, gradient boosted trees are not yet available for spatially correlated data. This paper proposes a new gradient Boosted Trees algorithm for Spatial Data (Boost-S) with covariate information. Boost-S integrates the spatial correlation structure into the classical framework of gradient boosted trees. Each tree is grown by solving a regularized optimization problem, where the objective function involves two penalty terms on tree complexity and takes into account the underlying spatial correlation. A computationally-efficient algorithm is proposed to obtain the ensemble trees. The proposed Boost-S is applied to the spatially-correlated FDG-PET (fluorodeoxyglucose-positron emission tomography) imaging data collected during cancer chemoradiotherapy. Our numerical investigations successfully demonstrate the advantages of the proposed Boost-S over existing approaches for this particular application.

preprint2021arXiv

Minimum degree thresholds for Hamilton $(k/2)$-cycles in $k$-uniform hypergraphs

For any even integer $k\ge 6$, integer $d$ such that $k/2\le d\le k-1$, and sufficiently large $n\in (k/2)\mathbb N$, we find a tight minimum $d$-degree condition that guarantees the existence of a Hamilton $(k/2)$-cycle in every $k$-uniform hypergraph on $n$ vertices. When $n\in k\mathbb N$, the degree condition coincides with the one for the existence of perfect matchings provided by Rödl, Ruciński and Szemerédi (for $d=k-1$) and Treglown and Zhao (for $d\ge k/2$), and thus our result strengthens theirs in this case.

preprint2021arXiv

Structural Entropy of the Stochastic Block Models

With the rapid expansion of graphs and networks and the growing magnitude of data from all areas of science, effective treatment and compression schemes of context-dependent data is extremely desirable. A particularly interesting direction is to compress the data while keeping the "structural information" only and ignoring the concrete labelings. Under this direction, Choi and Szpankowski introduced the structures (unlabeled graphs) which allowed them to compute the structural entropy of the Erdős--Rényi random graph model. Moreover, they also provided an asymptotically optimal compression algorithm that (asymptotically) achieves this entropy limit and runs in expectation in linear time. In this paper, we consider the Stochastic Block Models with an arbitrary number of parts. Indeed, we define a partitioned structural entropy for Stochastic Block Models, which generalizes the structural entropy for unlabeled graphs and encodes the partition information as well. We then compute the partitioned structural entropy of the Stochastic Block Models, and provide a compression scheme that asymptotically achieves this entropy limit.

preprint2020arXiv

Tilings in randomly perturbed graphs: bridging the gap between Hajnal-Szemerédi and Johansson-Kahn-Vu

A perfect $K_r$-tiling in a graph $G$ is a collection of vertex-disjoint copies of $K_r$ that together cover all the vertices in $G$. In this paper we consider perfect $K_r$-tilings in the setting of randomly perturbed graphs; a model introduced by Bohman, Frieze and Martin where one starts with a dense graph and then adds $m$ random edges to it. Specifically, given any fixed $0< α<1-1/r$ we determine how many random edges one must add to an $n$-vertex graph $G$ of minimum degree $δ(G) \geq αn$ to ensure that, asymptotically almost surely, the resulting graph contains a perfect $K_r$-tiling. As one increases $α$ we demonstrate that the number of random edges required `jumps&#39; at regular intervals, and within these intervals our result is best-possible. This work therefore closes the gap between the seminal work of Johansson, Kahn and Vu (which resolves the purely random case, i.e., $α=0$) and that of Hajnal and Szemerédi (which demonstrates that for $α\geq 1-1/r$ the initial graph already houses the desired perfect $K_r$-tiling).