Source author record

Zhiyong Yang

Zhiyong Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence Information Retrieval physics.acc-ph hep-ex Methodology

Catalog footprint

What is connected

9works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems

It is well-known that deep learning models are vulnerable to adversarial examples. Existing studies of adversarial training have made great progress against this challenge. As a typical trait, they often assume that the class distribution is overall balanced. However, long-tail datasets are ubiquitous in a wide spectrum of applications, where the amount of head class instances is larger than the tail classes. Under such a scenario, AUC is a much more reasonable metric than accuracy since it is insensitive toward class distribution. Motivated by this, we present an early trial to explore adversarial training methods to optimize AUC. The main challenge lies in that the positive and negative examples are tightly coupled in the objective function. As a direct result, one cannot generate adversarial examples without a full scan of the dataset. To address this issue, based on a concavity regularization scheme, we reformulate the AUC optimization problem as a saddle point problem, where the objective becomes an instance-wise function. This leads to an end-to-end training protocol. Furthermore, we provide a convergence guarantee of the proposed algorithm. Our analysis differs from the existing studies since the algorithm is asked to generate adversarial examples by calculating the gradient of a min-max problem. Finally, the extensive experimental results show the performance and robustness of our algorithm in three long-tail datasets.

preprint2022arXiv

ER: Equivariance Regularizer for Knowledge Graph Completion

Tensor factorization and distanced based models play important roles in knowledge graph completion (KGC). However, the relational matrices in KGC methods often induce a high model complexity, bearing a high risk of overfitting. As a remedy, researchers propose a variety of different regularizers such as the tensor nuclear norm regularizer. Our motivation is based on the observation that the previous work only focuses on the "size" of the parametric space, while leaving the implicit semantic information widely untouched. To address this issue, we propose a new regularizer, namely, Equivariance Regularizer (ER), which can suppress overfitting by leveraging the implicit semantic information. Specifically, ER can enhance the generalization ability of the model by employing the semantic equivariance between the head and tail entities. Moreover, it is a generic solution for both distance based models and tensor factorization based models. The experimental results indicate a clear and substantial improvement over the state-of-the-art relation prediction methods.

preprint2022arXiv

Geometry Interaction Knowledge Graph Embeddings

Knowledge graph (KG) embeddings have shown great power in learning representations of entities and relations for link prediction tasks. Previous work usually embeds KGs into a single geometric space such as Euclidean space (zero curved), hyperbolic space (negatively curved) or hyperspherical space (positively curved) to maintain their specific geometric structures (e.g., chain, hierarchy and ring structures). However, the topological structure of KGs appears to be complicated, since it may contain multiple types of geometric structures simultaneously. Therefore, embedding KGs in a single space, no matter the Euclidean space, hyperbolic space or hyperspheric space, cannot capture the complex structures of KGs accurately. To overcome this challenge, we propose Geometry Interaction knowledge graph Embeddings (GIE), which learns spatial structures interactively between the Euclidean, hyperbolic and hyperspherical spaces. Theoretically, our proposed GIE can capture a richer set of relational information, model key inference patterns, and enable expressive semantic matching across entities. Experimental results on three well-established knowledge graph completion benchmarks show that our GIE achieves the state-of-the-art performance with fewer parameters.

preprint2022arXiv

Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction

Click-through rate (CTR) prediction, whose goal is to predict the probability of the user to click on an item, has become increasingly significant in the recommender systems. Recently, some deep learning models with the ability to automatically extract the user interest from his/her behaviors have achieved great success. In these work, the attention mechanism is used to select the user interested items in historical behaviors, improving the performance of the CTR predictor. Normally, these attentive modules can be jointly trained with the base predictor by using gradient descents. In this paper, we regard user interest modeling as a feature selection problem, which we call user interest selection. For such a problem, we propose a novel approach under the framework of the wrapper method, which is named Meta-Wrapper. More specifically, we use a differentiable module as our wrapping operator and then recast its learning problem as a continuous bilevel optimization. Moreover, we use a meta-learning algorithm to solve the optimization and theoretically prove its convergence. Meanwhile, we also provide theoretical analysis to show that our proposed method 1) efficiencies the wrapper-based feature selection, and 2) achieves better resistance to overfitting. Finally, extensive experiments on three public datasets manifest the superiority of our method in boosting the performance of CTR prediction.

preprint2022arXiv

Optimizing Two-way Partial AUC with an End-to-end Framework

The Area Under the ROC Curve (AUC) is a crucial metric for machine learning, which evaluates the average performance over all possible True Positive Rates (TPRs) and False Positive Rates (FPRs). Based on the knowledge that a skillful classifier should simultaneously embrace a high TPR and a low FPR, we turn to study a more general variant called Two-way Partial AUC (TPAUC), where only the region with $\mathsf{TPR} \ge α, \mathsf{FPR} \le β$ is included in the area. Moreover, recent work shows that the TPAUC is essentially inconsistent with the existing Partial AUC metrics where only the FPR range is restricted, opening a new problem to seek solutions to leverage high TPAUC. Motivated by this, we present the first trial in this paper to optimize this new metric. The critical challenge along this course lies in the difficulty of performing gradient-based optimization with end-to-end stochastic training, even with a proper choice of surrogate loss. To address this issue, we propose a generic framework to construct surrogate optimization problems, which supports efficient end-to-end training with deep learning. Moreover, our theoretical analyses show that: 1) the objective function of the surrogate problems will achieve an upper bound of the original problem under mild conditions, and 2) optimizing the surrogate problems leads to good generalization performance in terms of TPAUC with a high probability. Finally, empirical studies over several benchmark datasets speak to the efficacy of our framework.

preprint2022arXiv

Rethinking Collaborative Metric Learning: Toward an Efficient Alternative without Negative Sampling

The recently proposed Collaborative Metric Learning (CML) paradigm has aroused wide interest in the area of recommendation systems (RS) owing to its simplicity and effectiveness. Typically, the existing literature of CML depends largely on the \textit{negative sampling} strategy to alleviate the time-consuming burden of pairwise computation. However, in this work, by taking a theoretical analysis, we find that negative sampling would lead to a biased estimation of the generalization error. Specifically, we show that the sampling-based CML would introduce a bias term in the generalization bound, which is quantified by the per-user \textit{Total Variance} (TV) between the distribution induced by negative sampling and the ground truth distribution. This suggests that optimizing the sampling-based CML loss function does not ensure a small generalization error even with sufficiently large training data. Moreover, we show that the bias term will vanish without the negative sampling strategy. Motivated by this, we propose an efficient alternative without negative sampling for CML named \textit{Sampling-Free Collaborative Metric Learning} (SFCML), to get rid of the sampling bias in a practical sense. Finally, comprehensive experiments over seven benchmark datasets speak to the superiority of the proposed algorithm.

preprint2020arXiv

Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction

As an effective learning paradigm against insufficient training samples, Multi-Task Learning (MTL) encourages knowledge sharing across multiple related tasks so as to improve the overall performance. In MTL, a major challenge springs from the phenomenon that sharing the knowledge with dissimilar and hard tasks, known as negative transfer, often results in a worsened performance. Though a substantial amount of studies have been carried out against the negative transfer, most of the existing methods only model the transfer relationship as task correlations, with the transfer across features and tasks left unconsidered. Different from the existing methods, our goal is to alleviate negative transfer collaboratively across features and tasks. To this end, we propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL). Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks and suppressing inter-group knowledge sharing. We then propose an optimization method for the model. Extensive theoretical analysis shows that our proposed method has the following benefits: (a) it enjoys the global convergence property and (b) it provides a block-diagonal structure recovery guarantee. As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks. We further apply it to the personalized attribute prediction problem with fine-grained modeling of user behaviors. Finally, experimental results on both simulated dataset and real-world datasets demonstrate the effectiveness of our proposed method

preprint2016arXiv

Energy-spread measurement of triple-pulse electron beams based on the magnetic dispersion principle

The energy-spread of the triple-pulse electron beam generated by the Dragon-II linear induction accelerator is measured using the method of energy dispersion in the magnetic field. A sector magnet is applied for energy analyzing of the electron beam, which has a bending radius of 300 mm and a deflection angle of 90 degrees. For each pulse, both the time-resolved and the integral images of the electron position at the output port of the bending beam line are recorded by a streak camera and a CCD camera, respectively. Experimental results demonstrate an energy-spread of less than +-2.0% for the electron pulses. The cavity voltage waveforms obtained by different detectors are also analyzed for comparison.

preprint2015arXiv

Experimental observation of spatial jitters of a triple-pulse x-ray source based on the pinhole imaging technique

In high-energy flash radiography, scattered photons will degrade the acquiring image, which limits the resolving power of the interface and density of the dense object. The application of large anti-scatter grid is capable of remarkably decreasing scattered photons, whereas requires a very stable source position in order to reduce the loss of signal photons in the grid structure. The pinhole imaging technique is applied to observe spatial jitters of a triple-pulse radiographic source produced by a linear induction accelerator. Numerical simulations are taken to analyze the performance of the imaging technique with same or close parameters of the pinhole object and experimental alignment Experiments are carried out to observe spatial jitters of the source between different measurements. Deviations of the source position between different pulses are also measured in each experiment.

Zhiyong Yang

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems

ER: Equivariance Regularizer for Knowledge Graph Completion

Geometry Interaction Knowledge Graph Embeddings

Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction

Optimizing Two-way Partial AUC with an End-to-end Framework

Rethinking Collaborative Metric Learning: Toward an Efficient Alternative without Negative Sampling

Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction

Energy-spread measurement of triple-pulse electron beams based on the magnetic dispersion principle

Experimental observation of spatial jitters of a triple-pulse x-ray source based on the pinhole imaging technique