Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
14works
0followers
17topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

14 published item(s)

preprint2025arXiv

Fundamental limits for weighted empirical approximations of tilted distributions

Consider the task of generating samples from a tilted distribution of a random vector whose underlying distribution is unknown, but samples from it are available. This finds applications in fields such as finance and climate science, and in rare event simulation. In this article, we discuss the asymptotic efficiency of a self-normalized importance sampler of the tilted distribution. We provide a sharp characterization of its accuracy, given the number of samples and the degree of tilt. Our findings reveal a surprising dichotomy: while the number of samples needed to accurately tilt a bounded random vector increases polynomially in the tilt amount, it increases at a super polynomial rate for unbounded distributions.

preprint2022arXiv

Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems

We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon $T$ with fixed and known cost matrices $Q,R$, but unknown and non-stationary dynamics $\{A_t, B_t\}$. The sequence of dynamics matrices can be arbitrary, but with a total variation, $V_T$, assumed to be $o(T)$ and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $\tilde{\mathcal{O}}\left(V_T^{2/5}T^{3/5}\right)$. With piece-wise constant dynamics, our algorithm achieves the optimal regret of $\tilde{\mathcal{O}}(\sqrt{ST})$ where $S$ is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual Multi-armed Bandit problems. We also argue that non-adaptive forgetting (e.g., restarting or using sliding window learning with a static window size) may not be regret optimal for the LQR problem, even when the window size is optimally tuned with the knowledge of $V_T$. The main technical challenge in the analysis of our algorithm is to prove that the ordinary least squares (OLS) estimator has a small bias when the parameter to be estimated is non-stationary. Our analysis also highlights that the key motif driving the regret is that the LQR problem is in spirit a bandit problem with linear feedback and locally quadratic cost. This motif is more universal than the LQR problem itself, and therefore we believe our results should find wider application.

preprint2022arXiv

Online Stochastic Bin Packing

Bin packing is an algorithmic problem that arises in diverse applications such as remnant inventory systems, shipping logistics, and appointment scheduling. In its simplest variant, a sequence of $T$ items (e.g., orders for raw material, packages for delivery) is revealed one at a time, and each item must be packed on arrival in an available bin (e.g., remnant pieces of raw material in inventory, shipping containers). The sizes of items are i.i.d. samples from an unknown distribution, but the sizes are known when the items arrive. The goal is to minimize the number of non-empty bins (equivalently waste, defined to be the total unused space in non-empty bins). This problem has been extensively studied in the Operations Research and Theoretical Computer Science communities, yet all existing heuristics either rely on learning the distribution or exhibit $o(T)$ additive suboptimality compared to the optimal offline algorithm only for certain classes of distributions (those with sublinear optimal expected waste). In this paper, we propose a family of algorithms which are the first truly distribution-oblivious algorithms for stochastic bin packing, and achieve $\mathcal{O}(\sqrt{T})$ additive suboptimality for all item size distributions. Our algorithms are inspired by approximate interior-point algorithms for convex optimization. In addition to regret guarantees for discrete i.i.d. sequences, we extend our results to continuous item size distribution with bounded density, and also prove a family of novel regret bounds for non-i.i.d. input sequences. To the best of our knowledge these are the first such results for non-i.i.d. and non-random-permutation input sequences for online stochastic packing.

preprint2022arXiv

Practical Adversarial Multivalid Conformal Prediction

We give a simple, generic conformal prediction method for sequential prediction that achieves target empirical coverage guarantees against adversarially chosen data. It is computationally lightweight -- comparable to split conformal prediction -- but does not require having a held-out validation set, and so all data can be used for training models from which to derive a conformal score. It gives stronger than marginal coverage guarantees in two ways. First, it gives threshold calibrated prediction sets that have correct empirical coverage even conditional on the threshold used to form the prediction set from the conformal score. Second, the user can specify an arbitrary collection of subsets of the feature space -- possibly intersecting -- and the coverage guarantees also hold conditional on membership in each of these subsets. We call our algorithm MVP, short for MultiValid Prediction. We give both theory and an extensive set of empirical evaluations.

preprint2022arXiv

Voting Rights, Markov Chains, and Optimization by Short Bursts

Finding outlying elements in probability distributions can be a hard problem. Taking a real example from Voting Rights Act enforcement, we consider the problem of maximizing the number of simultaneous majority-minority districts in a political districting plan. An unbiased random walk on districting plans is unlikely to find plans that approach this maximum. A common search approach is to use a biased random walk: preferentially select districting plans with more majority-minority districts. Here, we present a third option, called short bursts, in which an unbiased random walk is performed for a small number of steps (called the burst length), then re-started from the most extreme plan that was encountered in the last burst. We give empirical evidence that short-burst runs outperform biased random walks for the problem of maximizing the number of majority-minority districts, and that there are many values of burst length for which we see this improvement. Abstracting from our use case, we also consider short bursts where the underlying state space is a line with various probability distributions, and then explore some features of more complicated state spaces and how these impact the effectiveness of short bursts.

preprint2021arXiv

Implementing partisan symmetry: Problems and paradoxes

We consider the measures of partisan symmetry proposed for practical use in the political science literature, as clarified and developed in Katz, King, and Rosenblatt (2020). Elementary mathematical manipulation shows the symmetry metrics to have surprising properties that call their meaningfulness into question. To accompany the general analysis, we study measures of partisan symmetry with respect to recent voting patterns in Utah, Texas, and North Carolina, flagging problems in each case. Taken together, these observations should raise major concerns about the available techniques for quantitative scores of partisan symmetry -- including the mean-median score, the partisan bias score, and the more general "partisan symmetry standard" -- as the decennial redistricting begins.

preprint2020arXiv

A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM

We present Korbit, a large-scale, open-domain, mixed-interface, dialogue-based intelligent tutoring system (ITS). Korbit uses machine learning, natural language processing and reinforcement learning to provide interactive, personalized learning online. Korbit has been designed to easily scale to thousands of subjects, by automating, standardizing and simplifying the content creation process. Unlike other ITS, a teacher can develop new learning modules for Korbit in a matter of hours. To facilitate learning across a widerange of STEM subjects, Korbit uses a mixed-interface, which includes videos, interactive dialogue-based exercises, question-answering, conceptual diagrams, mathematical exercises and gamification elements. Korbit has been built to scale to millions of students, by utilizing a state-of-the-art cloud-based micro-service architecture. Korbit launched its first course in 2019 on machine learning, and since then over 7,000 students have enrolled. Although Korbit was designed to be open-domain and highly scalable, A/B testing experiments with real-world students demonstrate that both student learning outcomes and student motivation are substantially improved compared to typical online courses.

preprint2020arXiv

Automated Personalized Feedback Improves Learning Gains in an Intelligent Tutoring System

We investigate how automated, data-driven, personalized feedback in a large-scale intelligent tutoring system (ITS) improves student learning outcomes. We propose a machine learning approach to generate personalized feedback, which takes individual needs of students into account. We utilize state-of-the-art machine learning and natural language processing techniques to provide the students with personalized hints, Wikipedia-based explanations, and mathematical hints. Our model is used in Korbit, a large-scale dialogue-based ITS with thousands of students launched in 2019, and we demonstrate that the personalized feedback leads to considerable improvement in student learning outcomes and in the subjective evaluation of the feedback.

preprint2020arXiv

Estimation of a Low-rank Topic-Based Model for Information Cascades

We consider the problem of estimating the latent structure of a social network based on the observed information diffusion events, or cascades, where the observations for a given cascade consist of only the timestamps of infection for infected nodes but not the source of the infection. Most of the existing work on this problem has focused on estimating a diffusion matrix without any structural assumptions on it. In this paper, we propose a novel model based on the intuition that an information is more likely to propagate among two nodes if they are interested in similar topics which are also prominent in the information content. In particular, our model endows each node with an influence vector (which measures how authoritative the node is on each topic) and a receptivity vector (which measures how susceptible the node is for each topic). We show how this node-topic structure can be estimated from the observed cascades, and prove the consistency of the estimator. Experiments on synthetic and real data demonstrate the improved performance and better interpretability of our model compared to existing state-of-the-art methods.

preprint2020arXiv

Greed Works -- Online Algorithms For Unrelated Machine Stochastic Scheduling

This paper establishes performance guarantees for online algorithms that schedule stochastic, nonpreemptive jobs on unrelated machines to minimize the expected total weighted completion time. Prior work on unrelated machine scheduling with stochastic jobs was restricted to the offline case, and required linear or convex programming relaxations for the assignment of jobs to machines. The algorithms introduced in this paper are purely combinatorial. The performance bounds are of the same order of magnitude as those of earlier work, and depend linearly on an upper bound on the squared coefficient of variation of the jobs' processing times. Specifically for deterministic processing times, without and with release times, the competitive ratios are 4 and 7.216, respectively. As to the technical contribution, the paper shows how dual fitting techniques can be used for stochastic and nonpreemptive scheduling problems.

preprint2020arXiv

On BPS Strings in ${\mathcal N}=4$ Yang-Mills Theory

We study singular time-dependent $\frac{1}{8}$-BPS configurations in the abelian sector of ${{\mathcal N}= 4}$ supersymmetric Yang-Mills theory that represent BPS string-like defects in ${{\mathbb R}\times S^3}$ spacetime. Such BPS strings can be described as intersections of the zeros of holomorphic functions in two complex variables with a 3-sphere. We argue that these BPS strings map to $\frac{1}{8}$-BPS surface operators under the state-operator correspondence of the CFT. We show that the string defects are holographically dual to noncompact probe D3-branes in global $AdS_5\times S^5$ that share supersymmetries with a class of dual-giant gravitons. For simple configurations, we demonstrate how to define a good variational problem and propose a regularization scheme that leads to finite energy and global charges on both sides of the holographic correspondence.

preprint2020arXiv

SiEVE: Semantically Encoded Video Analytics on Edge and Cloud

Recent advances in computer vision and neural networks have made it possible for more surveillance videos to be automatically searched and analyzed by algorithms rather than humans. This happened in parallel with advances in edge computing where videos are analyzed over hierarchical clusters that contain edge devices, close to the video source. However, the current video analysis pipeline has several disadvantages when dealing with such advances. For example, video encoders have been designed for a long time to please human viewers and be agnostic of the downstream analysis task (e.g., object detection). Moreover, most of the video analytics systems leverage 2-tier architecture where the encoded video is sent to either a remote cloud or a private edge server but does not efficiently leverage both of them. In response to these advances, we present SIEVE, a 3-tier video analytics system to reduce the latency and increase the throughput of analytics over video streams. In SIEVE, we present a novel technique to detect objects in compressed video streams. We refer to this technique as semantic video encoding because it allows video encoders to be aware of the semantics of the downstream task (e.g., object detection). Our results show that by leveraging semantic video encoding, we achieve close to 100% object detection accuracy with decompressing only 3.5% of the video frames which results in more than 100x speedup compared to classical approaches that decompress every video frame.

preprint2020arXiv

Simultaneous Inference for Pairwise Graphical Models with Generalized Score Matching

Probabilistic graphical models provide a flexible yet parsimonious framework for modeling dependencies among nodes in networks. There is a vast literature on parameter estimation and consistent model selection for graphical models. However, in many of the applications, scientists are also interested in quantifying the uncertainty associated with the estimated parameters and selected models, which current literature has not addressed thoroughly. In this paper, we propose a novel estimator for statistical inference on edge parameters in pairwise graphical models based on generalized Hyvärinen scoring rule. Hyvärinen scoring rule is especially useful in cases where the normalizing constant cannot be obtained efficiently in a closed form, which is a common problem for graphical models, including Ising models and truncated Gaussian graphical models. Our estimator allows us to perform statistical inference for general graphical models whereas the existing works mostly focus on statistical inference for Gaussian graphical models where finding normalizing constant is computationally tractable. Under mild conditions that are typically assumed in the literature for consistent estimation, we prove that our proposed estimator is $\sqrt{n}$-consistent and asymptotically normal, which allows us to construct confidence intervals and build hypothesis tests for edge parameters. Moreover, we show how our proposed method can be applied to test hypotheses that involve a large number of model parameters simultaneously. We illustrate validity of our estimator through extensive simulation studies on a diverse collection of data-generating processes.

preprint2010arXiv

Stability of the bipartite matching model

We consider the bipartite matching model of customers and servers introduced by Caldentey, Kaplan, and Weiss (Adv. Appl. Probab., 2009). Customers and servers play symmetrical roles. There is a finite set C resp. S, of customer, resp. server, classes. Time is discrete and at each time step, one customer and one server arrive in the system according to a joint probability measure on CxS, independently of the past. Also, at each time step, pairs of matched customer and server, if they exist, depart from the system. Authorized matchings are given by a fixed bipartite graph. A matching policy is chosen, which decides how to match when there are several possibilities. Customers/servers that cannot be matched are stored in a buffer. The evolution of the model can be described by a discrete time Markov chain. We study its stability under various admissible matching policies including: ML (Match the Longest), MS (Match the Shortest), FIFO (match the oldest), priorities. There exist natural necessary conditions for stability (independent of the matching policy) defining the maximal possible stability region. For some bipartite graphs, we prove that the stability region is indeed maximal for any admissible matching policy. For the ML policy, we prove that the stability region is maximal for any bipartite graph. For the MS and priority policies, we exhibit a bipartite graph with a non-maximal stability region.