Source author record

Han Xiao

Han Xiao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning; math.ST; Statistics Theory; Discrete Mathematics; Computer Science and Game Theory; Computation and Language; Social and Information Networks; Artificial Intelligence; Computer Vision; cs.CY; eess.SP; Information Retrieval; math.CO; Methodology; physics.geo-ph; physics.optics; physics.soc-ph; quant-ph

Catalog footprint

What is connected

21 works

18 topics

4 close collaborators

Preprint, 2026, arXiv

jina-embeddings-v5-omni: Geometry-preserving Embeddings via Locked Aligned Towers

In this work, we introduce GELATO (Geometry-preserving Embeddings via Locked Aligned TOwers), a novel approach to multimodal embedding models. We build on the VLM-style architecture, in which non-text encoders are adapted to produce input for a language model, which in turn generates embeddings for all varieties of input. We present the result: the jina-embeddings-v5-omni suite, a pair of models that encode text, image, audio, and video input into a single semantic embedding space. GELATO extends the two Jina Embeddings v5 Text models to support additional modalities by adding encoders for images and audio. The backbone text embedding models and the added non-text modality encoders remain frozen. We trained only the connecting components, which represent 0.35% of the total weights of the joint model. Training is therefore much more efficient than full-parameter retraining. Additionally, the language model remains effectively unaltered, producing exactly the same embeddings for text inputs as the Jina Embeddings v5 Text models. Our evaluations show that GELATO produces results that are competitive with the state of the art, yielding nearly equal performance to larger multimodal embedding models.

Preprint, 2022, arXiv

AI Enlightens Wireless Communication: A Transformer Backbone for CSI Feedback

This paper builds on the 2nd Wireless Communication Artificial Intelligence (AI) Competition (WAIC), hosted by the IMT-2020(5G) Promotion Group 5G+AI Work Group. The framework of the eigenvector-based channel state information (CSI) feedback problem is first provided. Then a basic Transformer backbone for CSI feedback, referred to as EVCsiNet-T, is proposed. Moreover, a series of potential enhancements for deep learning based (DL-based) CSI feedback are introduced, including i) data augmentation, ii) loss function design, iii) training strategy, and iv) model ensemble. Experimental results comparing EVCsiNet-T with traditional codebook methods over different channels are further provided; they show the advanced performance and promising prospects of the Transformer for the DL-based CSI feedback problem.

Preprint, 2022, arXiv

Arboricity games: the core and the nucleolus

The arboricity of a graph is the minimum number of forests required to cover all its edges. In this paper, we examine arboricity from a game-theoretic perspective and investigate cost-sharing in the minimum forest cover problem. We introduce the arboricity game as a cooperative cost game defined on a graph. The players are edges, and the cost of each coalition is the arboricity of the subgraph induced by the coalition. We study properties of the core and propose an efficient algorithm for computing the nucleolus when the core is not empty. In order to compute the nucleolus in the core, we introduce the prime partition which is built on the densest subgraph lattice. The prime partition decomposes the edge set of a graph into a partially ordered set defined from minimal densest minors and their invariant precedence relation. Moreover, edges from the same partition always have the same value in a core allocation. Consequently, when the core is not empty, the prime partition significantly reduces the number of variables and constraints required in the linear programs of Maschler's scheme and allows us to compute the nucleolus in polynomial time. Besides, the prime partition provides a graph decomposition analogous to the celebrated core decomposition and the density-friendly decomposition, which may be of independent interest.
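The coalition cost described above can be evaluated on very small graphs with the Nash-Williams formula, which gives the arboricity as the maximum of ceil(|E(H)| / (|V(H)| - 1)) over subgraphs H with at least two vertices. The sketch below is a brute-force illustration only (exponential in the number of vertices), not the polynomial-time procedure of the paper; the function names `arboricity` and `coalition_cost` are hypothetical, and the input is assumed to be a loop-free edge list.

```python
from itertools import combinations
from math import ceil


def arboricity(edges):
    """Arboricity of the graph formed by `edges`, by brute force.

    Uses the Nash-Williams formula: the maximum over vertex subsets U with
    |U| >= 2 of ceil(e(U) / (|U| - 1)), where e(U) counts the edges with both
    endpoints in U. Assumes no self-loops.
    """
    edges = list(edges)
    if not edges:
        return 0
    vertices = sorted({v for edge in edges for v in edge})
    best = 0
    for size in range(2, len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            inside = sum(1 for u, v in edges if u in chosen and v in chosen)
            best = max(best, ceil(inside / (size - 1)))
    return best


def coalition_cost(coalition):
    """Cost of a coalition of edge-players: arboricity of the subgraph they form."""
    return arboricity(coalition)


# Hypothetical example: the complete graph on four vertices.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(coalition_cost(k4))      # cost of the grand coalition
print(coalition_cost(k4[:2]))  # cost of a two-edge coalition
```

Enumerating `coalition_cost` over all edge subsets gives the full cost function of the game for toy instances, which is enough to check candidate core allocations by hand.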

Preprint, 2022, arXiv

On the Population Monotonicity of Independent Set Games

An independent set game is a cooperative game defined on graphs and dealing with profit-sharing in maximum independent set problems. A population monotonic allocation scheme is a rule specifying how to share the profit of each coalition among its participants such that every participant is better off when the coalition expands. In this paper, we provide a necessary and sufficient characterization for population monotonic allocation schemes in independent set games. Moreover, our characterization can be verified efficiently.

Preprint, 2022, arXiv

Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search

In this paper, we propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search. Differentiable architecture search (DARTS) acquires the optimal architectures by optimizing the architecture parameters with gradient descent, which significantly reduces the search cost. However, the magnitude of architecture parameters updated by gradient descent fails to reveal the actual operation importance to the task performance and therefore harms the effectiveness of obtained architectures. By contrast, we propose to evaluate the direct influence of operations on validation accuracy. To deal with the complex relationships between supernet components, we leverage Shapley value to quantify their marginal contributions by considering all possible combinations. Specifically, we iteratively optimize the supernet weights and update the architecture parameters by evaluating operation contributions via Shapley value, so that the optimal architectures are derived by selecting the operations that contribute significantly to the tasks. Since the exact computation of Shapley value is NP-hard, the Monte-Carlo sampling based algorithm with early truncation is employed for efficient approximation, and the momentum update mechanism is adopted to alleviate fluctuation of the sampling process. Extensive experiments on various datasets and various search spaces show that our Shapley-NAS outperforms the state-of-the-art methods by a considerable margin with light search cost. The code is available at https://github.com/Euphoria16/Shapley-NAS.git
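To make the sampling idea concrete, here is a minimal sketch of plain permutation-sampling Shapley estimation on a toy value function. It is not the authors' implementation (see their repository for that): the early truncation and momentum update mentioned above are omitted, and the operation names, weights, and the `toy_value` function are hypothetical stand-ins for validation accuracy.

```python
import random


def shapley_monte_carlo(players, value, num_samples=2000, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled player orderings."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(num_samples):
        rng.shuffle(order)
        coalition = set()
        previous = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = value(frozenset(coalition))
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / num_samples for p, total in totals.items()}


# Hypothetical toy game: each operation adds a fixed gain, and two of them
# interact, so their individual importance is not visible in isolation.
WEIGHTS = {"conv3x3": 0.5, "skip": 0.2, "pool": 0.1}


def toy_value(coalition):
    score = sum(WEIGHTS[p] for p in coalition)
    if "conv3x3" in coalition and "skip" in coalition:
        score += 0.1  # interaction bonus shared by the two operations
    return score


estimates = shapley_monte_carlo(list(WEIGHTS), toy_value)
for name, phi in sorted(estimates.items()):
    print(f"{name}: {phi:.3f}")
```

Because each sampled ordering telescopes from the empty coalition to the full set, the estimates always sum to the value of the full set, regardless of the number of samples.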

Preprint, 2022, arXiv

Support Vector Machines under Adversarial Label Contamination

Machine learning algorithms are increasingly being applied in security-related tasks such as spam and malware detection, although their security properties against deliberate attacks have not yet been widely understood. Intelligent and adaptive attackers may indeed exploit specific vulnerabilities exposed by machine learning techniques to violate system security. Being robust to adversarial data manipulation is thus an important, additional requirement for machine learning algorithms to successfully operate in adversarial settings. In this work, we evaluate the security of Support Vector Machines (SVMs) to well-crafted, adversarial label noise attacks. In particular, we consider an attacker that aims to maximize the SVM's classification error by flipping a number of labels in the training data. We formalize a corresponding optimal attack strategy, and solve it by means of heuristic approaches to keep the computational complexity tractable. We report an extensive experimental analysis on the effectiveness of the considered attacks against linear and non-linear SVMs, both on synthetic and real-world datasets. We finally argue that our approach can also provide useful insights for developing more secure SVM learning algorithms, and also novel techniques in a number of related research areas, such as semi-supervised and active learning.

Preprint, 2021, arXiv

Excite atom-photon bound state inside the coupled-resonator waveguide coupled with a giant atom

Controlling the light-matter interaction has long been of fundamental interest in the field of quantum information processing. However, the usual excitation with a propagating photon can hardly excite a localized state of light while keeping the atom under a subradiant decay in an atom-waveguide system. Here, we propose a model of coupling between a giant atom and a dynamically modulated coupled-resonator waveguide and find that a bound state, in which the light shows a localization effect and the atom exhibits a subradiant decay time, can be excited by a propagating photon. An analytical treatment based on the separation of the propagating and localized states of light provides an explanation of our finding: a propagating photon can be efficiently converted to localized light through the light-atom interactions in three resonators at a frequency difference precisely equal to the external modulation frequency. Our work therefore provides an alternative method for actively localizing the photon in a modulated coupled-resonator waveguide system interacting with a giant atom, and also points out a way to study the light-atom interaction in a synthetic frequency dimension with a similar Hamiltonian.

Preprint, 2020, arXiv

COVID-19 societal response captured by seismic noise in China and Italy

Seismic noise with frequencies above 1 Hz is often called cultural noise and generally correlates well with human activities. Recently, cities in mainland China and Italy imposed lockdown restrictions in response to COVID-19, which gave us an unprecedented opportunity to study the relationship between seismic noise above 1 Hz and human activities. Using seismic records from stations in China and Italy, we show that seismic noise above 1 Hz was primarily generated by the local transportation systems. The lockdown of the cities and the imposition of travel restrictions led to a ~4-12 dB energy decrease in seismic noise in mainland China. Data also show that different Chinese cities experienced distinct periods of diminished cultural noise, related to differences in local response to the epidemic. In contrast, there was only a ~1-6 dB energy decrease in seismic noise in Italy after the country was put under lockdown. The noise data indicate that traffic flow did not decrease as much in Italy, but show how different cities reacted distinctly to the lockdown conditions.
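For readers less used to decibels, the reported drops can be converted into fractions of the original noise energy. The snippet below is a small arithmetic sketch that assumes the usual energy (power) convention, a drop of d dB corresponding to a ratio of 10^(-d/10); the function name is hypothetical.

```python
def db_drop_to_energy_ratio(drop_db):
    """Fraction of the original energy remaining after a drop of `drop_db` dB,
    assuming the power convention: ratio = 10 ** (-dB / 10)."""
    return 10 ** (-drop_db / 10)


# End points of the ranges quoted in the abstract.
for drop in (1, 4, 6, 12):
    ratio = db_drop_to_energy_ratio(drop)
    print(f"{drop:>2} dB decrease -> {ratio:.3f} of the original energy")
```

Under that convention, a 4-12 dB decrease leaves roughly 40% down to about 6% of the original energy, while a 1-6 dB decrease leaves roughly 79% down to about 25%.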

Preprint, 2020, arXiv

KoPA: Automated Kronecker Product Approximation

We consider the problem of matrix approximation and denoising induced by the Kronecker product decomposition. Specifically, we propose to approximate a given matrix by the sum of a few Kronecker products of matrices, which we refer to as the Kronecker product approximation (KoPA). Because the Kronecker product is an extension of the outer product from vectors to matrices, KoPA extends low rank matrix approximation and includes it as a special case. Compared with the latter, KoPA also offers greater flexibility, since it allows the user to choose the configuration, that is, the dimensions of the two smaller matrices forming the Kronecker product. On the other hand, the configuration to be used is usually unknown and needs to be determined from the data in order to achieve the optimal balance between accuracy and parsimony. We propose to use extended information criteria to select the configuration. Under the paradigm of high dimensional analysis, we show that the proposed procedure is able to select the true configuration with probability tending to one, under suitable conditions on the signal-to-noise ratio. We demonstrate the superiority of KoPA over low rank approximations through numerical studies and several benchmark image examples.

Preprint, 2020, arXiv

KSR: A Semantic Representation of Knowledge Graph within a Novel Unsupervised Paradigm

Knowledge representation is a long-standing and important topic in AI. A variety of models have been proposed for knowledge graph embedding, which projects symbolic entities and relations into a continuous vector space. However, most related methods focus merely on fitting the knowledge graph data and ignore interpretable semantic expression. Thus, traditional embedding methods are poorly suited to applications that require semantic analysis, such as question answering and entity retrieval. To this end, this paper proposes a semantic representation method for knowledge graphs (KSR), which imposes a two-level hierarchical generative process that globally extracts many aspects and then locally assigns a specific category in each aspect to every triple. Since both aspects and categories are semantics-relevant, the collection of categories in each aspect is treated as the semantic representation of the triple. Extensive experiments show that our model substantially outperforms other state-of-the-art baselines.

Preprint, 2020, arXiv

On the Convexity of Independent Set Games

Independent set games are cooperative games defined on graphs, where players are edges and the value of a coalition is the maximum cardinality of independent sets in the subgraph defined by the coalition. In this paper, we investigate the convexity of independent set games, as convex games possess many nice properties both economically and computationally. For independent set games introduced by Deng et al. (Math. Oper. Res., 24:751-766, 1999), we provide a necessary and sufficient characterization for the convexity, i.e., every non-pendant edge is incident to a pendant edge in the underlying graph. Our characterization immediately yields a polynomial time algorithm for recognizing convex instances of independent set games. Besides, we introduce a new class of independent set games and provide an efficient characterization for the convexity.
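The recognition condition quoted above (every non-pendant edge is incident to a pendant edge) is simple enough to check directly. The following sketch reads that condition literally and assumes a simple, loop-free graph given as an edge list, with "pendant edge" taken to mean an edge that has an endpoint of degree one; the function name is hypothetical and this is not code from the paper.

```python
from collections import Counter


def satisfies_pendant_condition(edges):
    """Return True if every non-pendant edge shares a vertex with a pendant edge.

    Assumes a simple graph without loops; an edge is treated as pendant when
    at least one of its endpoints has degree one.
    """
    degree = Counter(v for edge in edges for v in edge)
    pendant = {e for e in edges if degree[e[0]] == 1 or degree[e[1]] == 1}
    touched = {v for e in pendant for v in e}  # vertices on some pendant edge
    return all(
        e in pendant or e[0] in touched or e[1] in touched
        for e in edges
    )


# Hypothetical test graphs.
path_4 = [(0, 1), (1, 2), (2, 3)]          # middle edge touches both pendant edges
path_6 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]  # edge (2, 3) touches none
triangle = [(0, 1), (1, 2), (0, 2)]        # no pendant edges at all

print(satisfies_pendant_condition(path_4))
print(satisfies_pendant_condition(path_6))
print(satisfies_pendant_condition(triangle))
```

The check runs in time linear in the number of edges, which is consistent with the abstract's remark that the characterization yields a polynomial-time recognition algorithm.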

Preprint, 2020, arXiv

Searching for polarization in signed graphs: a local spectral approach

Signed graphs have been used to model interactions in social networks, which can be either positive (friendly) or negative (antagonistic). The model has been used to study polarization and other related phenomena in social networks, which can be harmful to the process of democratic deliberation in our society. An interesting and challenging task in this application domain is to detect polarized communities in signed graphs. A number of different methods have been proposed for this task. However, existing approaches aim at finding globally optimal solutions. Instead, in this paper we are interested in finding polarized communities that are related to a small set of seed nodes provided as input. Seed nodes may consist of two sets, which constitute the two sides of a polarized structure. We formulate the problem of finding local polarized communities in signed graphs as a locally-biased eigen-problem. By viewing the eigenvector associated with the smallest eigenvalue of the Laplacian matrix as the solution of a constrained optimization problem, we are able to incorporate the local information as an additional constraint. In addition, we show that the locally-biased vector can be used to find communities with an approximation guarantee with respect to a local analogue of the Cheeger constant on signed graphs. By exploiting the sparsity in the input graph, an indicator vector for the polarized communities can be found in time linear in the graph size. Our experiments on real-world networks validate the proposed algorithm and demonstrate its usefulness in finding local structures in this semi-supervised manner.

Preprint, 2016, arXiv

Discovering topically- and temporally-coherent events in interaction networks

With the increasing use of online communication platforms, such as email, Twitter, and messaging applications, we are faced with a growing amount of data that combine content (what is said), time (when), and user (by whom) information. An important computational challenge is to analyze these data, discover meaningful patterns, and understand what is happening. We consider the problem of mining online communication data and finding top-k temporal events. We define a temporal event to be a coherent topic that is discussed frequently, in a relatively short time span, while the information flow of the event respects the underlying network structure. We construct our model for detecting temporal events in two steps. We first introduce the notion of an interaction meta-graph, which connects associated interactions. Using this notion, we define a temporal event to be a subset of interactions that (i) are topically and temporally close and (ii) correspond to a tree that captures the information flow. Finding the best temporal event leads to a budget version of the prize-collecting Steiner-tree (PCST) problem, which we solve using three different methods: a greedy approach, a dynamic-programming algorithm, and an adaptation of an existing approximation algorithm. The problem of finding the top-k events among a set of candidate events maps to the maximum set-cover problem and is thus solved by a greedy algorithm. We compare and analyze our algorithms on both synthetic and real datasets, such as Twitter and email communication. The results show that our methods are able to detect meaningful temporal events.

Preprint, 2016, arXiv

Wald tests of singular hypotheses

Motivated by the problem of testing tetrad constraints in factor analysis, we study the large-sample distribution of Wald statistics at parameter points at which the gradient of the tested constraint vanishes. When based on an asymptotically normal estimator, the Wald statistic converges to a rational function of a normal random vector. The rational function is determined by a homogeneous polynomial and a covariance matrix. For quadratic forms and bivariate monomials of arbitrary degree, we show unexpected relationships to chi-square distributions that explain conservative behavior of certain Wald tests. For general monomials, we offer a conjecture according to which the reciprocal of a certain quadratic form in the reciprocals of dependent normal random variables is chi-square distributed.

Preprint, 2015, arXiv

Margin-Based Feed-Forward Neural Network Classifiers

The margin-based principle was proposed long ago and has been shown, both theoretically and practically, to reduce structural risk and improve performance. Meanwhile, the feed-forward neural network is a traditional classifier that is currently very popular in deeper architectures. However, the training algorithm of feed-forward neural networks derives from the Widrow-Hoff principle, which minimizes the squared error. In this paper, we propose a new training algorithm for feed-forward neural networks based on the margin-based principle, which can effectively improve the accuracy and generalization ability of neural network classifiers with fewer labelled samples and a flexible network. We conducted experiments on four open UCI datasets and achieved good results, as expected. In conclusion, our model can handle sparsely labelled and high-dimensional datasets with high accuracy, while migrating from the old ANN method to ours is easy and requires almost no work.

Preprint, 2015, arXiv

Max-Entropy Feed-Forward Clustering Neural Network

The outputs of a non-linear feed-forward neural network are positive and can be treated as probabilities once they are normalized to sum to one. Under the entropy-based principle, the outputs for each sample can be read as the distribution of that sample over the clusters. The entropy-based principle is the principle by which an unknown distribution can be estimated under limited conditions. This paper defines two processes in the feed-forward neural network: an abstraction process and a clustering process. The limited condition is the abstracted features of the samples, which are computed in the abstraction process, and the final outputs are the probability distribution over clusters produced in the clustering process. Incorporating the entropy-based principle into the feed-forward neural network thus yields a clustering method. We conducted experiments on six open UCI datasets, comparing against several baselines and using purity as the evaluation measure. The results show that our method outperforms all the baselines, which are among the most popular clustering methods.

Preprint, 2015, arXiv

On Kernel Mengerian Orientations of Line Multigraphs

We present a polyhedral description of kernels in orientations of line multigraphs. Given a digraph $D$, let $FK(D)$ denote the fractional kernel polytope defined on $D$, and let $σ(D)$ denote the linear system defining $FK(D)$. A digraph $D$ is called kernel perfect if every induced subdigraph $D^\prime$ has a kernel, kernel ideal if $FK(D^\prime)$ is integral for each induced subdigraph $D^\prime$, and kernel Mengerian if $σ(D^\prime)$ is TDI for each induced subdigraph $D^\prime$. We show that an orientation of a line multigraph is kernel perfect iff it is kernel ideal iff it is kernel Mengerian. Our result strengthens the theorem of Borodin et al. [3] on kernel perfect digraphs and generalizes the theorem of Kiraly and Pap [7] on the stable matching problem.

Preprint, 2015, arXiv

TransA: An Adaptive Approach for Knowledge Graph Embedding

Knowledge representation is a major topic in AI, and many studies attempt to represent the entities and relations of a knowledge base in a continuous vector space. Among these attempts, translation-based methods build entity and relation vectors by minimizing the translation loss from a head entity to a tail one. Despite their success, translation-based methods suffer from an oversimplified loss metric and are not competitive enough to model the varied and complex entities and relations in knowledge bases. To address this issue, we propose TransA, an adaptive metric approach for embedding that uses ideas from metric learning to provide a more flexible embedding method. Experiments are conducted on benchmark datasets, and our proposed method makes significant and consistent improvements over the state-of-the-art baselines.

Preprint, 2012, arXiv

Covariance matrix estimation for stationary time series

We obtain a sharp convergence rate for banded covariance matrix estimates of stationary processes. A precise order of magnitude is derived for the spectral radius of sample covariance matrices. We also consider a thresholded covariance matrix estimator that can better characterize sparsity if the true covariance matrix is sparse. As our main tool, we implement Toeplitz's [Math. Ann. 70 (1911) 351-376] idea and relate eigenvalues of covariance matrices to the spectral densities or Fourier transforms of the covariances. We develop a large deviation result for quadratic forms of stationary processes using m-dependence approximation, under the framework of causal representation and physical dependence measures.
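As an illustration of what a banded estimate looks like, the sketch below builds the Toeplitz matrix of sample autocovariances and zeroes every entry whose lag exceeds a chosen band. This is one common form of the estimator and is an assumption here, since the abstract does not spell out the exact definition; the function names, the `band` parameter, and the sample series are hypothetical.

```python
def sample_autocovariances(x):
    """Sample autocovariances gamma_0, ..., gamma_{n-1} of the series x,
    each normalized by n (the length of the series)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / n
        for k in range(n)
    ]


def banded_covariance(x, band):
    """Toeplitz covariance estimate that keeps lags |i - j| <= band
    and sets all other entries to zero."""
    gamma = sample_autocovariances(x)
    n = len(x)
    return [
        [gamma[abs(i - j)] if abs(i - j) <= band else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical short series, banded at lag 1.
series = [1.0, 2.0, 4.0, 3.0, 5.0]
estimate = banded_covariance(series, band=1)
for row in estimate:
    print(["{:.2f}".format(v) for v in row])
```

The band controls the trade-off the abstract alludes to: a small band discards the noisy high-lag autocovariances at the cost of some bias.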

Preprint, 2011, arXiv

Asymptotic Inference of Autocovariances of Stationary Processes

The paper presents a systematic theory for asymptotic inference of autocovariances of stationary processes. We consider nonparametric tests for serial correlations based on the maximum (or ${\cal L}^\infty$) and the quadratic (or ${\cal L}^2$) deviations. For these two cases, with proper centering and rescaling, the asymptotic distributions of the deviations are Gumbel and Gaussian, respectively. To establish such an asymptotic theory, as byproducts, we develop a normal comparison principle and propose a sufficient condition for summability of joint cumulants of stationary processes. We adopt a simulation-based block of blocks bootstrapping procedure that improves the finite-sample performance.

Preprint, 2011, arXiv

Simultaneous Inference of Covariances

We consider asymptotic distributions of maximum deviations of sample covariance matrices, a fundamental problem in high-dimensional inference of covariances. Under mild dependence conditions on the entries of the data matrices, we establish the Gumbel convergence of the maximum deviations. Our result substantially generalizes earlier ones where the entries are assumed to be independent and identically distributed, and it provides a theoretical foundation for high-dimensional simultaneous inference of covariances.
