Source author record

Yi Su

Yi Su appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning, math.CO, Computation and Language, Computer Science and Game Theory, Artificial Intelligence, Computer Vision, Information Retrieval, math.MG, math.RT, math.ST, Methodology, Statistics Theory

Catalog footprint

What is connected

14 works

12 topics

4 close collaborators

preprint, 2022, arXiv

Context-Aware Language Modeling for Goal-Oriented Dialogue Systems

Goal-oriented dialogue systems face a trade-off between fluent language generation and task-specific control. While supervised learning with large language models is capable of producing realistic text, how to steer such responses towards completing a specific task without sacrificing language quality remains an open question. In this work, we formulate goal-oriented dialogue as a partially observed Markov decision process, interpreting the language model as a representation of both the dynamics and the policy. This view allows us to extend techniques from learning-based control, such as task relabeling, to derive a simple and effective method to finetune language models in a goal-aware way, leading to significantly improved task performance. We additionally introduce a number of training strategies that serve to better focus the model on the task at hand. We evaluate our method, Context-Aware Language Models (CALM), on a practical flight-booking task using AirDialogue. Empirically, CALM outperforms the state-of-the-art method by 7% in terms of task success, matching human-level task performance.

preprint, 2022, arXiv

Greykite: Deploying Flexible Forecasting at Scale at LinkedIn

Forecasts help businesses allocate resources and achieve objectives. At LinkedIn, product owners use forecasts to set business targets, track outlook, and monitor health. Engineers use forecasts to efficiently provision hardware. Developing a forecasting solution to meet these needs requires accurate and interpretable forecasts on diverse time series with sub-hourly to quarterly frequencies. We present Greykite, an open-source Python library for forecasting that has been deployed on over twenty use cases at LinkedIn. Its flagship algorithm, Silverkite, provides interpretable, fast, and highly flexible univariate forecasts that capture effects such as time-varying growth and seasonality, autocorrelation, holidays, and regressors. The library enables self-serve accuracy and trust by facilitating data exploration, model configuration, execution, and interpretation. Our benchmark results show excellent out-of-the-box speed and accuracy on datasets from a variety of domains. Over the past two years, Greykite forecasts have been trusted by Finance, Engineering, and Product teams for resource planning and allocation, target setting and progress tracking, anomaly detection and root cause analysis. We expect Greykite to be useful to forecast practitioners with similar applications who need accurate, interpretable forecasts that capture complex dynamics common to time series related to human activity.

preprint, 2022, arXiv

Online Adaptation to Label Distribution Shift

Machine learning models often encounter distribution shifts when deployed in the real world. In this paper, we focus on adaptation to label distribution shift in the online setting, where the test-time label distribution is continually changing and the model must dynamically adapt to it without observing the true label. Leveraging a novel analysis, we show that the lack of true label does not hinder estimation of the expected test loss, which enables the reduction of online label shift adaptation to conventional online learning. Informed by this observation, we propose adaptation algorithms inspired by classical online learning techniques such as Follow The Leader (FTL) and Online Gradient Descent (OGD) and derive their regret bounds. We empirically verify our findings under both simulated and real world label distribution shifts and show that OGD is particularly effective and robust to a variety of challenging label shift scenarios.
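As a rough illustration of the ingredients named in this abstract, the sketch below shows the standard label-shift correction of a classifier's posterior and a generic projected Online Gradient Descent step on a label-prior estimate. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function names are hypothetical, and the gradient passed to `ogd_step` stands in for whatever estimate of the test-loss gradient the method provides.

```python
def reweight_posterior(probs, train_prior, test_prior):
    """Adjust p(y|x) from a classifier trained under train_prior to test_prior."""
    scaled = [p * q / t for p, t, q in zip(probs, train_prior, test_prior)]
    total = sum(scaled)
    return [s / total for s in scaled]


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold of the last index that stays positive
    return [max(x - theta, 0.0) for x in v]


def ogd_step(prior_estimate, gradient, step_size):
    """One projected gradient step on the estimated test-time label prior."""
    moved = [p - step_size * g for p, g in zip(prior_estimate, gradient)]
    return project_to_simplex(moved)


# Hypothetical usage: a balanced classifier adapted to a skewed test prior.
adapted = reweight_posterior([0.5, 0.5], [0.5, 0.5], [0.9, 0.1])
updated_prior = ogd_step([0.5, 0.5], [1.0, -1.0], step_size=0.1)
```

The projection keeps the prior estimate a valid probability vector after every update, which is what lets a plain gradient step be reused in the online setting.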

preprint, 2022, arXiv

R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling

Human language understanding operates at multiple levels of granularity (e.g., words, phrases, and sentences) with increasing levels of abstraction that can be hierarchically combined. However, existing deep models with stacked layers do not explicitly model any sort of hierarchical process. This paper proposes a recursive Transformer model based on differentiable CKY style binary trees to emulate the composition process. We extend the bidirectional language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes. To scale up our approach, we also introduce an efficient pruned tree induction algorithm to enable encoding in just a linear number of composition steps. Experimental results on language modeling and unsupervised parsing show the effectiveness of our approach.

preprint, 2022, arXiv

Tianshou: a Highly Modularized Deep Reinforcement Learning Library

In this paper, we present Tianshou, a highly modularized Python library for deep reinforcement learning (DRL) that uses PyTorch as its backend. Tianshou intends to be research-friendly by providing a flexible and reliable infrastructure of DRL algorithms. It supports online and offline training with more than 20 classic algorithms through a unified interface. To facilitate related research and prove Tianshou's reliability, we have released Tianshou's benchmark of MuJoCo environments, covering eight classic algorithms with state-of-the-art performance. We open-sourced Tianshou at https://github.com/thu-ml/tianshou/.

preprint, 2021, arXiv

Game-based Pricing and Task Offloading in Mobile Edge Computing Enabled Edge-Cloud Systems

As a key enabler of the Internet of Things (IoT), mobile edge computing (MEC) provides IoT mobile devices (MDs) with powerful external computing and storage resources. However, a mechanism addressing distributed task offloading and price competition for the open exchange marketplace has not been properly established, which has become a major obstacle to MEC's application in the IoT market. In this paper, we formulate a distributed mechanism to analyze the interaction between OSPs and IoT MDs in the MEC-enabled edge-cloud system by applying multi-leader multi-follower two-tier Stackelberg game theory. We first prove the existence of the Stackelberg equilibrium, and then we propose two distributed algorithms, namely the iterative proximal offloading algorithm (IPOA) and the iterative Stackelberg game pricing algorithm (ISPA). IPOA solves the follower non-cooperative game among IoT MDs, and ISPA uses backward induction to deal with the price competition among OSPs. Experimental results show that IPOA can markedly reduce the disutility of IoT MDs compared with other traditional task offloading schemes and that the price of anarchy is always less than 150%. In addition, the results demonstrate that ISPA is reliable in boosting the revenue of OSPs.

preprint, 2020, arXiv

Accurate Tumor Tissue Region Detection with Accelerated Deep Convolutional Neural Networks

Manual annotation of pathology slides for cancer diagnosis is laborious and repetitive. Therefore, much effort has been devoted to developing computer vision solutions. Our approach, FLASH, is based on a Deep Convolutional Neural Network (DCNN) architecture. It reduces computational costs and is faster than typical deep learning approaches by two orders of magnitude, making high-throughput processing a possibility. In computer vision approaches using deep learning methods, the input image is subdivided into patches which are separately passed through the neural network. Features extracted from these patches are used by the classifier to annotate the corresponding region. Our approach aggregates all the extracted features into a single matrix before passing them to the classifier. Previously, the features were extracted from overlapping patches. Aggregating the features eliminates the need for processing overlapping patches, which reduces the computations required. DCNN and FLASH demonstrate high sensitivity (~0.96), good precision (~0.78) and high F1 scores (~0.84). The average time taken to process each sample for FLASH and DCNN is 96.6 seconds and 9489.20 seconds, respectively. Our approach was approximately 100 times faster than the original DCNN approach while simultaneously preserving high accuracy and precision.
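The speed claim can be checked directly from the two per-sample timings quoted in the abstract. The short calculation below is only an arithmetic check on those reported numbers; the variable names are illustrative.

```python
import math

# Average per-sample processing times reported in the abstract (seconds).
flash_seconds = 96.6
dcnn_seconds = 9489.20

speedup = dcnn_seconds / flash_seconds      # ratio of DCNN time to FLASH time
orders_of_magnitude = math.log10(speedup)   # the same ratio on a log10 scale

print(f"speedup: {speedup:.1f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio comes out at roughly 98, consistent with the abstract's "approximately 100 times faster" and "two orders of magnitude".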

preprint, 2020, arXiv

Adaptive Estimator Selection for Off-Policy Evaluation

We develop a generic data-driven method for estimator selection in off-policy policy evaluation settings. We establish a strong performance guarantee for the method, showing that it is competitive with the oracle estimator, up to a constant factor. Via in-depth case studies in contextual bandits and reinforcement learning, we demonstrate the generality and applicability of the method. We also perform comprehensive experiments, demonstrating the empirical efficacy of our approach and comparing with related approaches. In both case studies, our method compares favorably with existing methods.

preprint, 2020, arXiv

Doubly robust off-policy evaluation with shrinkage

We propose a new framework for designing estimators for off-policy evaluation in contextual bandits. Our approach is based on the asymptotically optimal doubly robust estimator, but we shrink the importance weights to minimize a bound on the mean squared error, which results in a better bias-variance tradeoff in finite samples. We use this optimization-based framework to obtain three estimators: (a) a weight-clipping estimator, (b) a new weight-shrinkage estimator, and (c) the first shrinkage-based estimator for combinatorial action sets. Extensive experiments in both standard and combinatorial bandit benchmark problems show that our estimators are highly adaptive and typically outperform state-of-the-art methods.
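To make the weight-clipping variant (a) concrete, here is a minimal sketch of a doubly robust estimate with clipped importance weights. The record fields and the function name are hypothetical, the reward-model values are assumed to be supplied by the caller, and the choice of the clipping threshold (which the paper derives from a bound on the mean squared error) is not shown.

```python
import math


def doubly_robust_clipped(logs, clip):
    """Doubly robust off-policy value estimate with importance weights capped at clip.

    Each record holds the observed reward, the logging and target probabilities
    of the logged action, the reward model's prediction for the logged action,
    and the reward model's expected prediction under the target policy.
    """
    total = 0.0
    for rec in logs:
        weight = min(rec["target_prob"] / rec["logging_prob"], clip)
        residual = rec["reward"] - rec["model_logged"]
        total += rec["model_target"] + weight * residual
    return total / len(logs)


# Hypothetical logged data: two interactions.
logs = [
    {"reward": 1.0, "logging_prob": 0.5, "target_prob": 1.0,
     "model_logged": 0.5, "model_target": 0.5},
    {"reward": 0.0, "logging_prob": 0.1, "target_prob": 0.5,
     "model_logged": 0.2, "model_target": 0.3},
]

unclipped = doubly_robust_clipped(logs, clip=math.inf)  # standard doubly robust
clipped = doubly_robust_clipped(logs, clip=2.0)         # large weights capped at 2
model_only = doubly_robust_clipped(logs, clip=0.0)      # falls back to the reward model
```

The two extremes show the trade-off: an infinite threshold recovers the unbiased but high-variance doubly robust estimate, while a threshold of zero relies entirely on the reward model.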

preprint, 2020, arXiv

Off-policy Bandits with Deficient Support

Learning effective contextual-bandit policies from past actions of a deployed system is highly desirable in many settings (e.g. voice assistants, recommendation, search), since it enables the reuse of large amounts of log data. State-of-the-art methods for such off-policy learning, however, are based on inverse propensity score (IPS) weighting. A key theoretical requirement of IPS weighting is that the policy that logged the data has "full support", which typically translates into requiring non-zero probability for any action in any context. Unfortunately, many real-world systems produce support-deficient data, especially when the action space is large, and we show how existing methods can fail catastrophically. To overcome this gap between theory and applications, we identify three approaches that provide various guarantees for IPS-based learning despite the inherent limitations of support-deficient data: restricting the action space, reward extrapolation, and restricting the policy space. We systematically analyze the statistical and computational properties of these three approaches, and we empirically evaluate their effectiveness. In addition to providing the first systematic analysis of support deficiency in contextual-bandit learning, we conclude with recommendations that provide practical guidance.
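The full-support requirement can be illustrated with a plain IPS estimate and a check for target-policy probability that falls on actions the logging policy never takes. This is a minimal sketch for a single context with hypothetical names and data; it does not implement any of the three remedies the abstract lists.

```python
def ips_estimate(logs):
    """Inverse propensity score estimate of a target policy's value.

    Each log entry is (reward, logging_prob, target_prob) for the logged action.
    """
    return sum(r * t / l for r, l, t in logs) / len(logs)


def unsupported_mass(logging_policy, target_policy):
    """Target-policy probability placed on actions the logging policy never takes."""
    return sum(
        prob for action, prob in target_policy.items()
        if logging_policy.get(action, 0.0) == 0.0
    )


# Hypothetical single-context example: the logger never plays action "b".
logging_policy = {"a": 1.0, "b": 0.0}
target_policy = {"a": 0.3, "b": 0.7}

missing = unsupported_mass(logging_policy, target_policy)
value = ips_estimate([(1.0, 0.5, 1.0), (0.0, 0.5, 0.0)])
```

Any reward earned through the unsupported mass is invisible to IPS, because no logged sample can ever carry a weight for those actions.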

preprint, 2014, arXiv

Electrical Lie Algebra of Classical Types

We investigate the structure of electrical Lie algebras of finite Dynkin type. These Lie algebras were introduced by Lam-Pylyavskyy in the study of circular planar electrical networks. The corresponding Lie group acts on such networks via some combinatorial operations studied by Curtis-Ingerman-Morrow and Colin de Verdière-Gitler-Vertigan. Lam-Pylyavskyy studied the electrical Lie algebra of type $A$ of even rank in detail, and gave a conjecture for the dimension of electrical Lie algebras of finite Dynkin types. We prove this conjecture for all classical Dynkin types, that is, $A$, $B$, $C$, and $D$. Furthermore, we are able to explicitly describe the structure of the corresponding electrical Lie algebras as the semisimple product of the symplectic Lie algebra with its finite dimensional irreducible representations.

preprint, 2010, arXiv

Induced subgraphs in sparse random graphs with given degree sequence

For any $S\subset [n]$, we compute the probability that the subgraph of $\mathcal{G}_{n,d}$ induced by $S$ is a given graph $H$ on the vertex set $S$. The result holds for any $d=o(n^{1/3})$ and is further extended to $\mathcal{G}_{\bf d}$, the probability space of random graphs with a given degree sequence $\bf d$.

preprint, 2010, arXiv

Structural Solutions For Additively Coupled Sum Constrained Games

We propose and analyze a broad family of games played by resource-constrained players, which are characterized by the following central features: 1) each user has a multi-dimensional action space, subject to a single sum resource constraint; 2) each user's utility in a particular dimension depends on an additive coupling between the user's action in the same dimension and the actions of the other users; and 3) each user's total utility is the sum of the utilities obtained in each dimension. Familiar examples of such multi-user environments in communication systems include power control over frequency-selective Gaussian interference channels and flow control in Jackson networks. In settings where users cannot exchange messages in real-time, we study how users can adjust their actions based on their local observations. We derive sufficient conditions under which a unique Nash equilibrium exists and the best-response algorithm converges globally and linearly to the Nash equilibrium. In settings where users can exchange messages in real-time, we focus on user choices that optimize the overall utility. We provide the convergence conditions of two distributed action update mechanisms, gradient play and Jacobi update.
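For the power-control example mentioned in this abstract, the best response of each user is the classical water-filling allocation, and repeating it user by user gives iterative water-filling. The sketch below assumes unit direct gains, a single symmetric cross gain, and utility log(1 + power / (noise + interference)) summed over channels; these modeling choices and all names are illustrative assumptions, not the paper's general framework.

```python
def water_fill(noise_levels, budget, iterations=100):
    """Best response: p_k = max(0, mu - noise_k), with mu chosen so sum(p_k) == budget."""
    low, high = min(noise_levels), max(noise_levels) + budget
    for _ in range(iterations):
        mu = (low + high) / 2
        used = sum(max(0.0, mu - n) for n in noise_levels)
        if used > budget:
            high = mu  # water level too high, spends more than the budget
        else:
            low = mu   # water level too low (or exact), raise it
    mu = (low + high) / 2
    return [max(0.0, mu - n) for n in noise_levels]


def best_response_dynamics(noise, cross_gain, budgets, rounds=50):
    """Users take turns best-responding to the interference created by the others."""
    users, channels = len(budgets), len(noise)
    powers = [[b / channels] * channels for b in budgets]
    for _ in range(rounds):
        for n in range(users):
            effective = [
                noise[k] + cross_gain * sum(powers[m][k] for m in range(users) if m != n)
                for k in range(channels)
            ]
            powers[n] = water_fill(effective, budgets[n])
    return powers


# Hypothetical two-user, two-channel instance with weak coupling.
powers = best_response_dynamics(noise=[0.1, 0.5], cross_gain=0.1, budgets=[1.0, 1.0])
```

Each user's allocation always exhausts its sum constraint, and with weak cross gains the updates settle quickly, in line with the kind of convergence conditions the abstract describes.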

preprint, 2010, arXiv

The lattice of integer flows of a regular matroid

For a finite multigraph G, let Λ(G) denote the lattice of integer flows of G; this is a finitely generated free abelian group with an integer-valued positive definite bilinear form. Bacher, de la Harpe, and Nagnibeda show that if G and H are 2-isomorphic graphs then Λ(G) and Λ(H) are isometric, and remark that they were unable to find a pair of nonisomorphic 3-connected graphs for which the corresponding lattices are isometric. We explain this by examining the lattice Λ(M) of integer flows of any regular matroid M. Let $M_\bullet$ be the minor of M obtained by contracting all co-loops. We show that Λ(M) and Λ(N) are isometric if and only if $M_\bullet$ and $N_\bullet$ are isomorphic.
