Source author record

Yuguang Yue

Yuguang Yue appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Computation, Information Retrieval, math.ST, Methodology, Statistics Theory

Catalog footprint

What is connected

4 works

6 topics

4 close collaborators

Preprint, 2022, arXiv

Learning to Rank For Push Notifications Using Pairwise Expected Regret

Listwise ranking losses have been widely studied in recommender systems. However, new paradigms of content consumption present new challenges for ranking methods. In this work we contribute an analysis of learning to rank for personalized mobile push notifications and discuss the unique challenges this presents compared to traditional ranking problems. To address these challenges, we introduce a novel ranking loss based on weighting the pairwise loss between candidates by the expected regret incurred for misordering the pair. We demonstrate that the proposed method can outperform prior methods both in a simulated environment and in a production experiment on a major social network.
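As a rough illustration of the idea in this abstract, the sketch below weights a pairwise logistic loss by a per-pair regret term. It is an assumption-laden reading, not the paper's formulation: the function name, the use of a logistic loss, the normalization, and the choice of the utility gap as the "expected regret" of misordering a pair are all hypothetical.

```python
import math


def regret_weighted_pairwise_loss(scores, utilities):
    """Pairwise logistic loss with each pair weighted by a regret proxy.

    Assumption (not from the paper): the regret of ranking candidate j
    above candidate i is taken to be the utility gap u_i - u_j.
    """
    total = 0.0
    weight_sum = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            gap = utilities[i] - utilities[j]
            if gap <= 0:
                continue  # only pairs where i should outrank j
            margin = scores[i] - scores[j]
            # Logistic loss is small when i is scored well above j.
            total += gap * math.log1p(math.exp(-margin))
            weight_sum += gap
    # Normalize by total weight; no informative pairs gives zero loss.
    return total / weight_sum if weight_sum > 0 else 0.0


utilities = [0.9, 0.5, 0.1]
well_ordered = regret_weighted_pairwise_loss([2.0, 1.0, 0.0], utilities)
mis_ordered = regret_weighted_pairwise_loss([0.0, 1.0, 2.0], utilities)
```

Under this weighting, swapping two candidates with nearly equal utility costs little, while swapping the best and worst candidates dominates the loss.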

Preprint, 2020, arXiv

A Unified Framework for Tuning Hyperparameters in Clustering Problems

Selecting hyperparameters for unsupervised learning problems is challenging in general due to the lack of ground truth for validation. Despite the prevalence of this issue in statistics and machine learning, especially in clustering problems, there are not many methods for tuning these hyperparameters with theoretical guarantees. In this paper, we provide a framework with provable guarantees for selecting hyperparameters in a number of distinct models. We consider both the subgaussian mixture model and network models to serve as examples of i.i.d. and non-i.i.d. data. We demonstrate that the same framework can be used to choose the Lagrange multipliers of penalty terms in semi-definite programming (SDP) relaxations for community detection, and the bandwidth parameter for constructing kernel similarity matrices for spectral clustering. By incorporating a cross-validation procedure, we show the framework can also do consistent model selection for network models. Using a variety of simulated and real data examples, we show that our framework outperforms other widely used tuning procedures in a broad range of parameter settings.

Preprint, 2020, arXiv

Discrete Action On-Policy Learning with Action-Value Critic

Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension, making it challenging to apply existing on-policy gradient based deep RL algorithms efficiently. To effectively operate in multidimensional discrete action spaces, we construct a critic to estimate action-value functions, apply it on correlated actions, and combine these critic estimated action values to control the variance of gradient estimation. We follow rigorous statistical analysis to design how to generate and combine these correlated actions, and how to sparsify the gradients by shutting down the contributions from certain dimensions. These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques. We demonstrate these properties on OpenAI Gym benchmark tasks, and illustrate how discretizing the action space could benefit the exploration phase and hence facilitate convergence to a better local optimal solution thanks to the flexibility of discrete policy.

Preprint, 2020, arXiv

T-optimal designs for multi-factor polynomial regression models via a semidefinite relaxation method

We consider T-optimal experiment design problems for discriminating multi-factor polynomial regression models where the design space is defined by polynomial inequalities and the regression parameters are constrained to given convex sets. Our proposed optimality criterion is formulated as a convex optimization problem with a moment cone constraint. When the regression models have one factor, an exact semidefinite representation of the moment cone constraint can be applied to obtain an equivalent semidefinite program. When there are two or more factors in the models, we apply a moment relaxation technique and approximate the moment cone constraint by a hierarchy of semidefinite-representable outer approximations. When the relaxation hierarchy converges, an optimal discrimination design can be recovered from the optimal moment matrix, and its optimality is confirmed by an equivalence theorem. The methodology is illustrated with several examples.
