Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
12 works
0 followers
17 topics
4 close collaborators

Published work

12 published item(s)

preprint, 2026, arXiv

jina-embeddings-v5-omni: Geometry-preserving Embeddings via Locked Aligned Towers

In this work, we introduce GELATO (Geometry-preserving Embeddings via Locked Aligned TOwers), a novel approach to multimodal embedding models. We build on the VLM-style architecture, in which non-text encoders are adapted to produce input for a language model, which in turn generates embeddings for all varieties of input. We present the result: the jina-embeddings-v5-omni suite, a pair of models that encode text, image, audio, and video input into a single semantic embedding space. GELATO extends the two Jina Embeddings v5 Text models to support additional modalities by adding encoders for images and audio. The backbone text embedding models and the added non-text modality encoders remain frozen. We only trained the connecting components, representing 0.35% of the total weights of the joint model. Training is therefore much more efficient than full-parameter retraining. Additionally, the language model remains effectively unaltered, producing exactly the same embeddings for text inputs as the Jina Embeddings v5 Text models. Our evaluations show that GELATO produces results that are competitive with the state-of-the-art, yielding nearly equal performance to larger multimodal embedding models.

preprint, 2022, arXiv

AI Enlightens Wireless Communication: A Transformer Backbone for CSI Feedback

This paper is set against the background of the 2nd Wireless Communication Artificial Intelligence (AI) Competition (WAIC), hosted by the IMT-2020 (5G) Promotion Group 5G+AI Work Group. The framework of the eigenvector-based channel state information (CSI) feedback problem is provided first. Then a basic Transformer backbone for CSI feedback, referred to as EVCsiNet-T, is proposed. Moreover, a series of potential enhancements for deep learning based (DL-based) CSI feedback are introduced, including i) data augmentation, ii) loss function design, iii) training strategy, and iv) model ensemble. Experimental results comparing EVCsiNet-T with traditional codebook methods over different channels are further provided; they show the advanced performance and promising prospects of the Transformer for the DL-based CSI feedback problem.

preprint, 2022, arXiv

Arboricity games: the core and the nucleolus

The arboricity of a graph is the minimum number of forests required to cover all its edges. In this paper, we examine arboricity from a game-theoretic perspective and investigate cost-sharing in the minimum forest cover problem. We introduce the arboricity game as a cooperative cost game defined on a graph. The players are edges, and the cost of each coalition is the arboricity of the subgraph induced by the coalition. We study properties of the core and propose an efficient algorithm for computing the nucleolus when the core is not empty. In order to compute the nucleolus in the core, we introduce the prime partition which is built on the densest subgraph lattice. The prime partition decomposes the edge set of a graph into a partially ordered set defined from minimal densest minors and their invariant precedence relation. Moreover, edges from the same partition always have the same value in a core allocation. Consequently, when the core is not empty, the prime partition significantly reduces the number of variables and constraints required in the linear programs of Maschler's scheme and allows us to compute the nucleolus in polynomial time. Besides, the prime partition provides a graph decomposition analogous to the celebrated core decomposition and the density-friendly decomposition, which may be of independent interest.
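To make the coalition cost in the abstract above concrete, here is a minimal sketch that computes arboricity by brute force using the Nash-Williams formula (the maximum, over subgraphs H with at least two vertices, of ceil(|E(H)| / (|V(H)| - 1))). It is not the paper's algorithm: the function name `arboricity` is hypothetical, the input is assumed to be a simple graph given as a list of edges, and the running time is exponential, so it only suits tiny examples.

```python
from itertools import combinations
from math import ceil


def arboricity(edges):
    """Brute-force arboricity of a simple graph via the Nash-Williams formula.

    `edges` is a list of (u, v) pairs. Only induced subgraphs need to be
    checked, because they maximise the edge count for a fixed vertex set.
    """
    vertices = list({x for edge in edges for x in edge})
    best = 0
    for size in range(2, len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            # Count edges with both endpoints inside the chosen vertex set.
            m = sum(1 for u, v in edges if u in chosen and v in chosen)
            best = max(best, ceil(m / (size - 1)))
    return best


if __name__ == "__main__":
    triangle = [("a", "b"), ("b", "c"), ("a", "c")]
    # In the arboricity game the players are edges, so the cost of a
    # coalition is the arboricity of the graph formed by its edges.
    print(arboricity(triangle))      # whole edge set
    print(arboricity(triangle[:2]))  # a two-edge coalition (a path)
```

Under these assumptions the triangle needs two forests while any two of its edges form a single forest, which is the kind of cost difference the game's core and nucleolus have to allocate.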

preprint, 2022, arXiv

On the Population Monotonicity of Independent Set Games

An independent set game is a cooperative game defined on graphs and dealing with profit-sharing in maximum independent set problems. A population monotonic allocation scheme is a rule specifying how to share the profit of each coalition among its participants such that every participant is better off when the coalition expands. In this paper, we provide a necessary and sufficient characterization for population monotonic allocation schemes in independent set games. Moreover, our characterization can be verified efficiently.

preprint, 2022, arXiv

Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search

In this paper, we propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search. Differentiable architecture search (DARTS) acquires the optimal architectures by optimizing the architecture parameters with gradient descent, which significantly reduces the search cost. However, the magnitude of architecture parameters updated by gradient descent fails to reveal the actual operation importance to the task performance and therefore harms the effectiveness of obtained architectures. By contrast, we propose to evaluate the direct influence of operations on validation accuracy. To deal with the complex relationships between supernet components, we leverage Shapley value to quantify their marginal contributions by considering all possible combinations. Specifically, we iteratively optimize the supernet weights and update the architecture parameters by evaluating operation contributions via Shapley value, so that the optimal architectures are derived by selecting the operations that contribute significantly to the tasks. Since the exact computation of Shapley value is NP-hard, the Monte-Carlo sampling based algorithm with early truncation is employed for efficient approximation, and the momentum update mechanism is adopted to alleviate fluctuation of the sampling process. Extensive experiments on various datasets and various search spaces show that our Shapley-NAS outperforms the state-of-the-art methods by a considerable margin with light search cost. The code is available at https://github.com/Euphoria16/Shapley-NAS.git
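The Monte Carlo idea mentioned in the abstract above can be illustrated with a generic permutation-sampling estimator. This is only a sketch under stated assumptions, not the Shapley-NAS code: the names `monte_carlo_shapley` and `toy_value` are hypothetical, the value function is a toy stand-in for validation accuracy, and the early truncation and momentum update described in the abstract are omitted.

```python
import random


def monte_carlo_shapley(players, value, num_samples=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled player orderings."""
    rng = random.Random(seed)
    players = list(players)
    totals = {p: 0.0 for p in players}
    for _ in range(num_samples):
        order = players[:]
        rng.shuffle(order)
        coalition = set()
        previous = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = value(frozenset(coalition))
            # Marginal contribution of p given the players before it.
            totals[p] += current - previous
            previous = current
    return {p: totals[p] / num_samples for p in players}


# Hypothetical additive scores standing in for per-operation usefulness.
TOY_SCORES = {"conv3x3": 3, "skip": 1, "pool": 2}


def toy_value(coalition):
    """Toy value function: the sum of the scores of the chosen operations."""
    return sum(TOY_SCORES[op] for op in coalition)


if __name__ == "__main__":
    print(monte_carlo_shapley(TOY_SCORES, toy_value))
```

For an additive toy game like this one, every sampled ordering yields the same marginal contribution, so the estimate matches each operation's score; with interacting components the estimate only converges as the number of samples grows.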

preprint, 2022, arXiv

Support Vector Machines under Adversarial Label Contamination

Machine learning algorithms are increasingly being applied in security-related tasks such as spam and malware detection, although their security properties against deliberate attacks have not yet been widely understood. Intelligent and adaptive attackers may indeed exploit specific vulnerabilities exposed by machine learning techniques to violate system security. Being robust to adversarial data manipulation is thus an important, additional requirement for machine learning algorithms to successfully operate in adversarial settings. In this work, we evaluate the security of Support Vector Machines (SVMs) to well-crafted, adversarial label noise attacks. In particular, we consider an attacker that aims to maximize the SVM's classification error by flipping a number of labels in the training data. We formalize a corresponding optimal attack strategy, and solve it by means of heuristic approaches to keep the computational complexity tractable. We report an extensive experimental analysis on the effectiveness of the considered attacks against linear and non-linear SVMs, both on synthetic and real-world datasets. We finally argue that our approach can also provide useful insights for developing more secure SVM learning algorithms, and also novel techniques in a number of related research areas, such as semi-supervised and active learning.

preprint, 2021, arXiv

Excite atom-photon bound state inside the coupled-resonator waveguide coupled with a giant atom

Controlling the light-matter interaction has long been of fundamental interest in the field of quantum information processing. However, conventional excitation with a propagating photon can hardly excite a localized state of light while keeping the atom under subradiant decay in an atom-waveguide system. Here, we propose a model that couples a giant atom to a dynamically modulated coupled-resonator waveguide and find that a bound state, in which the light is localized and the atom exhibits a subradiant decay time, can be excited by a propagating photon. An analytical treatment based on separating the propagating and localized states of light explains this finding: a propagating photon can be efficiently converted to localized light through the light-atom interactions in three resonators at a frequency difference precisely equal to the external modulation frequency. Our work therefore provides an alternative method for actively localizing the photon in a modulated coupled-resonator waveguide system interacting with a giant atom, and also points to a way of studying the light-atom interaction in a synthetic frequency dimension that has a similar Hamiltonian.

preprint, 2020, arXiv

COVID-19 societal response captured by seismic noise in China and Italy

Seismic noise with frequencies above 1 Hz is often called cultural noise and is generally correlated quite well with human activities. Recently, cities in mainland China and Italy imposed lockdown restrictions in response to COVID-19, which gave us an unprecedented opportunity to study the relationship between seismic noise above 1 Hz and human activities. Using seismic records from stations in China and Italy, we show that seismic noise above 1 Hz was primarily generated by the local transportation systems. The lockdown of the cities and the imposition of travel restrictions led to a ~4-12 dB energy decrease in seismic noise in mainland China. Data also show that different Chinese cities experienced distinct periods of diminished cultural noise, related to differences in local response to the epidemic. In contrast, there was only ~1-6 dB energy decrease of seismic noise in Italy, after the country was put under a lockdown. The noise data indicate that traffic flow did not decrease as much in Italy, but show how different cities reacted distinctly to the lockdown conditions.
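As a rough aid for reading the decibel figures in the abstract above, the sketch below converts an energy drop in dB into the fraction of energy that remains. It assumes the usual power-ratio convention (dB = 10 log10 of the energy ratio), which the abstract does not spell out, and the function name `energy_fraction_remaining` is hypothetical.

```python
def energy_fraction_remaining(db_drop):
    """Fraction of the original energy left after a drop of `db_drop` dB,
    assuming dB = 10 * log10(energy ratio)."""
    return 10 ** (-db_drop / 10)


if __name__ == "__main__":
    # The dB ranges quoted in the abstract for China and Italy.
    for label, drop in [("China, low", 4), ("China, high", 12),
                        ("Italy, low", 1), ("Italy, high", 6)]:
        fraction = energy_fraction_remaining(drop)
        print(f"{label}: {drop} dB drop leaves {fraction:.1%} of the energy")
```

Under that convention, a 4 to 12 dB decrease leaves roughly 40% down to 6% of the original noise energy, while a 1 to 6 dB decrease leaves roughly 79% down to 25%.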

preprint, 2020, arXiv

KoPA: Automated Kronecker Product Approximation

We consider the problem of matrix approximation and denoising induced by the Kronecker product decomposition. Specifically, we propose to approximate a given matrix by the sum of a few Kronecker products of matrices, which we refer to as the Kronecker product approximation (KoPA). Because the Kronecker product is an extension of the outer product from vectors to matrices, KoPA extends the low rank matrix approximation and includes it as a special case. Compared with the latter, KoPA also offers greater flexibility, since it allows the user to choose the configuration, that is, the dimensions of the two smaller matrices forming the Kronecker product. On the other hand, the configuration to be used is usually unknown and needs to be determined from the data in order to achieve the optimal balance between accuracy and parsimony. We propose to use extended information criteria to select the configuration. Under the paradigm of high dimensional analysis, we show that the proposed procedure is able to select the true configuration with probability tending to one, under suitable conditions on the signal-to-noise ratio. We demonstrate the superiority of KoPA over low rank approximations through numerical studies and several benchmark image examples.
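The remark that the Kronecker product extends the outer product can be checked with a small pure-Python sketch. The helper `kron` is hypothetical and operates on nested lists; it only illustrates the product itself, not the KoPA estimation or configuration-selection procedure.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists.

    Entry (i * rows_b + k, j * cols_b + l) of the result is a[i][j] * b[k][l].
    """
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(cols_b)]
        for i in range(len(a))
        for k in range(rows_b)
    ]


if __name__ == "__main__":
    column = [[1], [2]]   # a 2 x 1 matrix (column vector)
    row = [[3, 4]]        # a 1 x 2 matrix (row vector)
    # With these shapes the Kronecker product is the ordinary outer product,
    # which is why low rank approximation is a special configuration.
    print(kron(column, row))
```

Choosing other shapes for the two factors gives the other configurations the abstract refers to; for a fixed target size, each configuration fixes the dimensions of both factor matrices.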

preprint, 2020, arXiv

KSR: A Semantic Representation of Knowledge Graph within a Novel Unsupervised Paradigm

Knowledge representation is a long-standing and important topic in AI. A variety of models have been proposed for knowledge graph embedding, which projects symbolic entities and relations into a continuous vector space. However, most related methods focus merely on fitting the knowledge graph data and ignore interpretable semantic expression. Thus, traditional embedding methods are ill-suited to applications that require semantic analysis, such as question answering and entity retrieval. To this end, this paper proposes a semantic representation method for knowledge graphs (KSR), which imposes a two-level hierarchical generative process that globally extracts many aspects and then locally assigns a specific category in each aspect for every triple. Since both aspects and categories are semantics-relevant, the collection of categories in each aspect is treated as the semantic representation of this triple. Extensive experiments show that our model outperforms other state-of-the-art baselines substantially.

preprint, 2020, arXiv

On the Convexity of Independent Set Games

Independent set games are cooperative games defined on graphs, where players are edges and the value of a coalition is the maximum cardinality of independent sets in the subgraph defined by the coalition. In this paper, we investigate the convexity of independent set games, as convex games possess many nice properties both economically and computationally. For independent set games introduced by Deng et al. (Math. Oper. Res., 24:751-766, 1999), we provide a necessary and sufficient characterization for the convexity, i.e., every non-pendant edge is incident to a pendant edge in the underlying graph. Our characterization immediately yields a polynomial time algorithm for recognizing convex instances of independent set games. Besides, we introduce a new class of independent set games and provide an efficient characterization for the convexity.
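The convexity condition stated in the abstract above (every non-pendant edge is incident to a pendant edge) is simple enough to check directly. The sketch below is one reading of that condition, not the authors' code: it assumes a simple graph given as an edge list, takes a pendant edge to be an edge with an endpoint of degree one, treats two edges as incident when they share an endpoint, and uses the hypothetical name `satisfies_pendant_condition`.

```python
from collections import Counter


def satisfies_pendant_condition(edges):
    """Return True if every non-pendant edge shares an endpoint
    with some pendant edge (an edge having a degree-one endpoint)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    def is_pendant(edge):
        u, v = edge
        return degree[u] == 1 or degree[v] == 1

    # All vertices touched by at least one pendant edge.
    pendant_endpoints = {x for e in edges if is_pendant(e) for x in e}
    return all(
        is_pendant((u, v)) or u in pendant_endpoints or v in pendant_endpoints
        for u, v in edges
    )


if __name__ == "__main__":
    path4 = [("a", "b"), ("b", "c"), ("c", "d")]
    triangle = [("a", "b"), ("b", "c"), ("a", "c")]
    print(satisfies_pendant_condition(path4))
    print(satisfies_pendant_condition(triangle))
```

The check runs in time linear in the number of edges, which is consistent with the abstract's claim that the characterization yields a polynomial time recognition algorithm.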

preprint, 2020, arXiv

Searching for polarization in signed graphs: a local spectral approach

Signed graphs have been used to model interactions in social networks, which can be either positive (friendly) or negative (antagonistic). The model has been used to study polarization and other related phenomena in social networks, which can be harmful to the process of democratic deliberation in our society. An interesting and challenging task in this application domain is to detect polarized communities in signed graphs. A number of different methods have been proposed for this task. However, existing approaches aim at finding globally optimal solutions. Instead, in this paper we are interested in finding polarized communities that are related to a small set of seed nodes provided as input. Seed nodes may consist of two sets, which constitute the two sides of a polarized structure. In this paper we formulate the problem of finding local polarized communities in signed graphs as a locally-biased eigen-problem. By viewing the eigenvector associated with the smallest eigenvalue of the Laplacian matrix as the solution of a constrained optimization problem, we are able to incorporate the local information as an additional constraint. In addition, we show that the locally-biased vector can be used to find communities with an approximation guarantee with respect to a local analogue of the Cheeger constant on signed graphs. By exploiting the sparsity in the input graph, an indicator vector for the polarized communities can be found in time linear in the graph size. Our experiments on real-world networks validate the proposed algorithm and demonstrate its usefulness in finding local structures in this semi-supervised manner.