Source author record

Yaming Yu

Yaming Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.ST, Statistics Theory, math.PR, Computation, Methodology, Information Theory, math.IT, math.CA, Applications, astro-ph.IM, astro-ph.SR, Machine Learning, math.FA

Catalog footprint

What is connected

22 works

13 topics

4 close collaborators

preprint, 2016, arXiv

On Stochastic Comparisons of Order Statistics from Heterogeneous Exponential Samples

We show that the $k$th order statistic from a heterogeneous sample of $n\geq k$ exponential random variables is larger than that from a homogeneous exponential sample in the sense of star ordering, as conjectured by Xu and Balakrishnan (2012). As a consequence, we establish hazard rate ordering for order statistics between heterogeneous and homogeneous exponential samples, resolving an open problem of Pǎltǎnea (2008). Extensions to general spacings are also presented.

preprint, 2016, arXiv

On the Unique Crossing Conjecture of Diaconis and Perlman on Convolutions of Gamma Random Variables

Diaconis and Perlman (1990) conjecture that the distribution functions of two weighted sums of iid gamma random variables cross exactly once if one weight vector majorizes the other. We disprove this conjecture when the shape parameter of the gamma variates is $\alpha<1$ and prove it when $\alpha\geq 1$.

preprint, 2013, arXiv

Thompson Sampling in Dynamic Systems for Contextual Bandit Problems

We consider multi-armed bandit problems in a time-varying dynamic system with rich structural features. For the nonlinear dynamic model, we propose approximate inference for the posterior distributions based on the Laplace approximation. For contextual bandit problems, Thompson sampling is adopted based on the underlying posterior distributions of the parameters. More specifically, we introduce discount decays on the impact of previous samples and analyze different decay rates against the underlying sample dynamics. Consequently, exploration and exploitation are traded off adaptively according to the dynamics of the system.
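The idea of discounting older samples can be illustrated with a much simpler model than the one in the abstract. The sketch below is an assumption-laden toy: it uses a context-free Bernoulli bandit with Beta posteriors instead of the paper's nonlinear dynamic model and Laplace approximation, and the names `discounted_thompson`, `success_prob`, `gamma` and `drifting` are hypothetical.

```python
import random


def discounted_thompson(success_prob, n_arms, horizon, gamma=0.95, seed=0):
    """Bernoulli Thompson sampling with exponentially discounted counts.

    success_prob(t, arm) is a hypothetical callback returning the reward
    probability of `arm` at step t; gamma in (0, 1] is the decay rate.
    """
    rng = random.Random(seed)
    wins = [0.0] * n_arms
    losses = [0.0] * n_arms
    pulls = []
    for t in range(horizon):
        # One posterior draw per arm from Beta(1 + wins, 1 + losses).
        draws = [rng.betavariate(1.0 + wins[k], 1.0 + losses[k])
                 for k in range(n_arms)]
        arm = max(range(n_arms), key=lambda k: draws[k])
        reward = 1 if rng.random() < success_prob(t, arm) else 0
        # Decay all past evidence, then add the new observation.
        wins = [gamma * w for w in wins]
        losses = [gamma * v for v in losses]
        wins[arm] += reward
        losses[arm] += 1 - reward
        pulls.append(arm)
    return pulls


def drifting(t, arm):
    # Toy environment (assumption): the better arm switches halfway through.
    best = 0 if t < 500 else 1
    return 0.9 if arm == best else 0.1


pulls = discounted_thompson(drifting, n_arms=2, horizon=1000)
late_share_arm1 = sum(pulls[-200:]) / 200  # fraction of late pulls on arm 1
```

With `gamma` below 1 the effective sample size stays bounded, so the posterior of an abandoned arm widens again and the sampler keeps re-exploring it. That is one way to read the adaptive trade-off described above.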

preprint, 2012, arXiv

A Bayesian Analysis of the Correlations Among Sunspot Cycles

Sunspot numbers form a comprehensive, long-duration proxy of solar activity and have been used numerous times to empirically investigate the properties of the solar cycle. A number of correlations have been discovered over the 24 cycles for which observational records are available. Here we carry out a sophisticated statistical analysis of the sunspot record that reaffirms these correlations, and sets up an empirical predictive framework for future cycles. An advantage of our approach is that it allows for rigorous assessment of both the statistical significance of various cycle features and the uncertainty associated with predictions. We summarize the data into three sequential relations that estimate the amplitude, duration, and time of rise to maximum for any cycle, given the values from the previous cycle. We find that there is no indication of a persistence in predictive power beyond one cycle, and conclude that the dynamo does not retain memory beyond one cycle. Based on sunspot records up to October 2011, we obtain, for Cycle 24, an estimated maximum smoothed monthly sunspot number of $97 \pm 15$, to occur in January--February 2014 $\pm$ 6 months.

preprint, 2012, arXiv

On the inclusion probabilities in some unequal probability sampling plans without replacement

Comparison results are obtained for the inclusion probabilities in some unequal probability sampling plans without replacement. For either successive sampling or Hájek's rejective sampling, the larger the sample size, the more uniform the inclusion probabilities in the sense of majorization. In particular, the inclusion probabilities are more uniform than the drawing probabilities. For the same sample size, and given the same set of drawing probabilities, the inclusion probabilities are more uniform for rejective sampling than for successive sampling. This last result confirms a conjecture of Hájek (Sampling from a Finite Population (1981) Dekker). Results are also presented in terms of the Kullback--Leibler divergence, showing that the inclusion probabilities for successive sampling are more proportional to the drawing probabilities.
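For a small population the majorization claim for successive sampling can be checked by brute-force enumeration. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that inclusion probabilities are divided by the sample size so that they sum to one before being compared with the drawing probabilities.

```python
from itertools import permutations


def successive_inclusion(p, n):
    """Inclusion probabilities when n units are drawn one at a time,
    without replacement, with probability proportional to p among the
    units not yet drawn (successive sampling)."""
    incl = [0.0] * len(p)
    for sample in permutations(range(len(p)), n):
        prob, remaining = 1.0, 1.0
        for i in sample:
            prob *= p[i] / remaining
            remaining -= p[i]
        for i in sample:
            incl[i] += prob
    return incl


def majorizes(x, y, tol=1e-9):
    """True if x majorizes y: equal totals, and the partial sums of x
    sorted in decreasing order dominate those of y."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    if abs(sum(xs) - sum(ys)) > tol:
        return False
    cx = cy = 0.0
    for a, b in zip(xs, ys):
        cx += a
        cy += b
        if cx < cy - tol:
            return False
    return True


p = [0.5, 0.3, 0.2]                       # drawing probabilities
pi2 = successive_inclusion(p, 2)          # inclusion probabilities, n = 2
pi2_normalized = [v / 2 for v in pi2]     # rescaled to sum to one
more_uniform = majorizes(p, pi2_normalized)
```

Enumeration costs grow factorially with the population size, so this is only suitable for toy examples.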

preprint, 2011, arXiv

A natural derivative on [0,n] and a binomial Poincaré inequality

We consider probability measures supported on a finite discrete interval $[0,n]$. We introduce a new finite-difference operator $\nabla_n$, defined as a linear combination of left and right finite differences. We show that this operator $\nabla_n$ plays a key role in a new Poincaré (spectral gap) inequality with respect to binomial weights, with the orthogonal Krawtchouk polynomials acting as eigenfunctions of the relevant operator. We briefly discuss the relationship of this operator to the problem of optimal transport of probability measures.

preprint, 2011, arXiv

On Log-concavity of the Generalized Marcum Q Function

It is shown that, if $\nu \geq 1/2$, then the generalized Marcum Q function $Q_\nu(a, b)$ is log-concave in $0 \leq b < \infty$. This proves a conjecture of Sun, Baricz and Zhou (2010). We also point out relevant results in the statistics literature.

preprint, 2011, arXiv

On Normal Variance-Mean Mixtures

Normal variance-mean mixtures encompass a large family of useful distributions such as the generalized hyperbolic distribution, which itself includes the Student t, Laplace, hyperbolic, normal inverse Gaussian, and variance gamma distributions as special cases. We study shape properties of normal variance-mean mixtures, in both the univariate and multivariate cases, and determine conditions for unimodality and log-concavity of the density functions. This leads to a short proof of the unimodality of all generalized hyperbolic densities. We also interpret such results in practical terms and discuss discrete analogues.

preprint, 2011, arXiv

Prior Ordering and Monotonicity in Dirichlet Bandits

One of two independent stochastic processes (arms) is to be selected at each of $n$ stages. The selection is sequential and depends on past observations as well as the prior information. Observations from arm $i$ are independent given a distribution $P_i$, and, following Clayton and Berry (1985), the $P_i$'s have independent Dirichlet process priors. The objective is to maximize the expected future-discounted sum of the $n$ observations. We study structural properties of the bandit, in particular how the maximum expected payoff and the optimal strategy vary with the Dirichlet process priors. The main results are (i) for a particular arm and a fixed prior weight, the maximum expected payoff increases as the mean of the Dirichlet process prior becomes larger in the increasing convex order; (ii) for a fixed prior mean, the maximum expected payoff decreases as the prior weight increases. Specializing to the one-armed bandit, the second result captures the intuition that, given the same immediate payoff, the more is known about an arm, the less desirable it becomes because there is less to learn when selecting that arm. This extends some results of Gittins and Wang (1992) on Bernoulli bandits and settles a conjecture of Clayton and Berry (1985).

preprint, 2011, arXiv

Some stochastic inequalities for weighted sums

We compare weighted sums of i.i.d. positive random variables according to the usual stochastic order. The main inequalities are derived using majorization techniques under certain log-concavity assumptions. Specifically, let $Y_i$ be i.i.d. random variables on $\mathbf{R}_+$. Assuming that $\log Y_i$ has a log-concave density, we show that $\sum a_iY_i$ is stochastically smaller than $\sum b_iY_i$, if $(\log a_1,...,\log a_n)$ is majorized by $(\log b_1,...,\log b_n)$. On the other hand, assuming that $Y_i^p$ has a log-concave density for some $p>1$, we show that $\sum a_iY_i$ is stochastically larger than $\sum b_iY_i$, if $(a_1^q,...,a_n^q)$ is majorized by $(b_1^q,...,b_n^q)$, where $p^{-1}+q^{-1}=1$. These unify several stochastic ordering results for specific distributions. In particular, a conjecture of Hitczenko [Sankhyā A 60 (1998) 171--175] on Weibull variables is proved. Potential applications in reliability and wireless communications are mentioned.

preprint, 2011, arXiv

Structural Properties of Bayesian Bandits with Exponential Family Distributions

We study a bandit problem where observations from each arm have an exponential family distribution and different arms are assigned independent conjugate priors. At each of $n$ stages, one arm is to be selected based on past observations. The goal is to find a strategy that maximizes the expected discounted sum of the $n$ observations. Two structural results hold in broad generality: (i) for a fixed prior weight, an arm becomes more desirable as its prior mean increases; (ii) for a fixed prior mean, an arm becomes more desirable as its prior weight decreases. These generalize and unify several results in the literature concerning specific problems including Bernoulli and normal bandits. The second result captures an aspect of the exploration-exploitation dilemma in precise terms: given the same immediate payoff, the less one knows about an arm, the more desirable it becomes because there remains more information to be gained when selecting that arm. For Bernoulli and normal bandits we also obtain extensions to nonconjugate priors.
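The two structural results can be observed numerically in the simplest special case. The sketch below is a hedged illustration, not the paper's method: it assumes a Bernoulli arm with a Beta(a, b) prior (prior mean a/(a+b), prior weight a+b) competing against a second arm with a known constant payoff, and solves the finite-horizon problem by dynamic programming. The function name `bandit_value` and its parameters `known` and `beta` are hypothetical.

```python
from functools import lru_cache


def bandit_value(a, b, horizon, known=0.5, beta=0.9):
    """Maximum expected discounted payoff over `horizon` stages when one
    arm is Bernoulli with a Beta(a, b) prior and the other arm pays the
    constant `known` at every stage; `beta` is the discount factor."""

    @lru_cache(maxsize=None)
    def value(s, f, m):
        # s, f: successes and failures observed so far; m: stages left.
        if m == 0:
            return 0.0
        mean = (a + s) / (a + b + s + f)
        pull_unknown = mean + beta * (
            mean * value(s + 1, f, m - 1)
            + (1.0 - mean) * value(s, f + 1, m - 1)
        )
        pull_known = known + beta * value(s, f, m - 1)
        return max(pull_unknown, pull_known)

    return value(0, 0, horizon)


# Same prior mean 1/2, different prior weights (2 versus 10).
light_prior = bandit_value(1, 1, horizon=10)
heavy_prior = bandit_value(5, 5, horizon=10)

# Same prior weight 4, different prior means (3/4 versus 1/2).
high_mean = bandit_value(3, 1, horizon=10)
low_mean = bandit_value(2, 2, horizon=10)
```

Under these assumptions one would expect `light_prior >= heavy_prior` (less prior weight leaves more to learn) and `high_mean >= low_mean`, matching results (ii) and (i) respectively.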

preprint, 2011, arXiv

The Shape of the Noncentral Chi-square Density

A noncentral chi-square density is log-concave if the degrees of freedom satisfy $\nu\geq 2$. We complement this known result by showing that, for each $0<\nu<2$, there exists $\lambda_\nu>0$ such that the chi-square with $\nu$ degrees of freedom and noncentrality parameter $\lambda$ has a decreasing density if $\lambda \leq \lambda_\nu$, and is bi-modal otherwise. The critical $\lambda_\nu$ is characterized by an equation involving a ratio of modified Bessel functions. When an interior mode exists we derive precise bounds on its location.

preprint, 2010, arXiv

Concave Renewal Functions Do Not Imply DFR Inter-Renewal Times

Brown (1980, 1981) proved that the renewal function is concave if the inter-renewal distribution is DFR (decreasing failure rate), and conjectured the converse. This note settles Brown's conjecture with a class of counter-examples. We also give a short proof of Shanthikumar's (1988) result that the DFR property is closed under geometric compounding.

preprint, 2010, arXiv

D-optimal designs via a cocktail algorithm

A fast new algorithm is proposed for numerical computation of (approximate) D-optimal designs. This "cocktail algorithm" extends the well-known vertex direction method (VDM; Fedorov 1972) and the multiplicative algorithm (Silvey, Titterington and Torsney, 1978), and shares their simplicity and monotonic convergence properties. Numerical examples show that the cocktail algorithm can lead to dramatically improved speed, sometimes by orders of magnitude, relative to either the multiplicative algorithm or the vertex exchange method (a variant of VDM). Key to the improved speed is a new nearest neighbor exchange strategy, which acts locally and complements the global effect of the multiplicative algorithm. Possible extensions to related problems such as nonparametric maximum likelihood estimation are mentioned.

preprint, 2010, arXiv

Improved EM for Mixture Proportions with Applications to Nonparametric ML Estimation for Censored Data

Improved EM strategies, based on the idea of efficient data augmentation (Meng and van Dyk 1997, 1998), are presented for ML estimation of mixture proportions. The resulting algorithms inherit the simplicity, ease of implementation, and monotonic convergence properties of EM, but have considerably improved speed. Because conventional EM tends to be slow when there exists a large overlap between the mixture components, we can improve the speed without sacrificing the simplicity or stability, if we can reformulate the problem so as to reduce the amount of overlap. We propose simple "squeezing" strategies for that purpose. Moreover, for high-dimensional problems, such as computing the nonparametric MLE of the distribution function with censored data, a natural and effective remedy for conventional EM is to add exchange steps (based on improved EM) between adjacent mixture components, where the overlap is most severe. Theoretical considerations show that the resulting EM-type algorithms, when carefully implemented, are globally convergent. Simulated and real data examples show dramatic improvement in speed in realistic situations.
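For reference, the conventional EM update for mixture proportions that these strategies aim to accelerate can be written in a few lines. The sketch below shows only that baseline, not the "squeezing" or exchange steps of the paper. The function name, the two fixed normal components and the toy data are assumptions made for illustration.

```python
import math


def em_mixture_proportions(lik, w, iters=200):
    """Conventional EM for mixture proportions with known components.

    lik[i][j] is the density of component j at observation i and w is the
    starting weight vector. Returns the final weights and the
    log-likelihood recorded before each update."""
    n, m = len(lik), len(w)
    history = []
    for _ in range(iters):
        mix = [sum(w[j] * lik[i][j] for j in range(m)) for i in range(n)]
        history.append(sum(math.log(v) for v in mix))
        # w_j <- w_j * average over i of lik[i][j] / mixture density at i
        w = [w[j] * sum(lik[i][j] / mix[i] for i in range(n)) / n
             for j in range(m)]
    return w, history


def normal_pdf(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)


data = [-2.1, -1.9, -2.0, 2.0]            # toy observations (assumption)
lik = [[normal_pdf(x, -2.0), normal_pdf(x, 2.0)] for x in data]
weights, loglik = em_mixture_proportions(lik, [0.5, 0.5])
```

Each update keeps the weights on the simplex and never decreases the log-likelihood. Convergence slows as the component densities overlap more, which is the situation the abstract addresses.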

preprint, 2010, arXiv

Monotonic convergence of a general algorithm for computing optimal designs

Monotonic convergence is established for a general class of multiplicative algorithms introduced by Silvey, Titterington and Torsney [Comm. Statist. Theory Methods 14 (1978) 1379--1389] for computing optimal designs. A conjecture of Titterington [Appl. Stat. 27 (1978) 227--234] is confirmed as a consequence. Optimal designs for logistic regression are used as an illustration.
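A minimal instance of the multiplicative algorithm makes the monotonicity statement concrete. The sketch below assumes D-optimality for simple linear regression with regressor $(1, x)$ on a finite grid, where the information matrix is $2 \times 2$ and can be inverted in closed form. The update $w_i \leftarrow w_i\, d(x_i)/2$ uses the variance function $d(x) = (1, x) M(w)^{-1} (1, x)^\top$. The function name and the grid are hypothetical, and this is the basic multiplicative update only, not a general implementation.

```python
def multiplicative_d_optimal(xs, iters=200):
    """Multiplicative algorithm for a D-optimal design on the grid `xs`
    for the regression function (1, x). Returns the final weights and the
    determinant of the information matrix before each update."""
    w = [1.0 / len(xs)] * len(xs)
    dets = []
    for _ in range(iters):
        m1 = sum(wi * x for wi, x in zip(w, xs))       # first moment
        m2 = sum(wi * x * x for wi, x in zip(w, xs))   # second moment
        det = m2 - m1 * m1                             # det of [[1, m1], [m1, m2]]
        dets.append(det)
        # d(x) = (m2 - 2*m1*x + x^2) / det; the model has 2 parameters.
        w = [wi * (m2 - 2.0 * m1 * x + x * x) / (2.0 * det)
             for wi, x in zip(w, xs)]
    return w, dets


grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
design, dets = multiplicative_d_optimal(grid)
```

On this grid the weights drift toward the two end points, and the recorded determinants form a nondecreasing sequence, which is the behaviour the monotonic convergence result describes.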

preprint, 2010, arXiv

Monotonicity, thinning and discrete versions of the Entropy Power Inequality

We consider the entropy of sums of independent discrete random variables, in analogy with Shannon's Entropy Power Inequality, where equality holds for normals. In our case, infinite divisibility suggests that equality should hold for Poisson variables. We show that some natural analogues of the Entropy Power Inequality do not in fact hold, but propose an alternative formulation which does always hold. The key to many proofs of Shannon's Entropy Power Inequality is the behaviour of entropy on scaling of continuous random variables. We believe that Rényi's operation of thinning discrete random variables plays a similar role to scaling, and give a sharp bound on how the entropy of ultra log-concave random variables behaves on thinning. In the spirit of the monotonicity results established by Artstein, Ball, Barthe and Naor, we prove a stronger version of concavity of entropy, which implies a strengthened form of our discrete Entropy Power Inequality.

preprint, 2010, arXiv

On a Multiplicative Algorithm for Computing Bayesian D-optimal Designs

We use the minorization-maximization principle (Lange, Hunter and Yang 2000) to establish the monotonicity of a multiplicative algorithm for computing Bayesian D-optimal designs. This proves a conjecture of Dette, Pepelyshev and Zhigljavsky (2008).

preprint, 2010, arXiv

Relative log-concavity and a pair of triangle inequalities

The relative log-concavity ordering $\leq_{\mathrm{lc}}$ between probability mass functions (pmf's) on non-negative integers is studied. Given three pmf's $f,g,h$ that satisfy $f\leq_{\mathrm{lc}}g\leq_{\mathrm{lc}}h$, we present a pair of (reverse) triangle inequalities: if $\sum_i i f_i=\sum_i i g_i<\infty$, then \[D(f|h)\geq D(f|g)+D(g|h)\] and if $\sum_i i g_i=\sum_i i h_i<\infty$, then \[D(h|f)\geq D(h|g)+D(g|f),\] where $D(\cdot|\cdot)$ denotes the Kullback--Leibler divergence. These inequalities, interesting in themselves, are also applied to several problems, including maximum entropy characterizations of Poisson and binomial distributions and the best binomial approximation in relative entropy. We also present parallel results for continuous distributions and discuss the behavior of $\leq_{\mathrm{lc}}$ under convolution.
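The first inequality can be checked numerically on a small example. The sketch below assumes the usual reading of $f\leq_{\mathrm{lc}}g$ as "$f/g$ is log-concave on the support of $f$", under which Binomial(2, 1/2) $\leq_{\mathrm{lc}}$ Binomial(5, 1/5) $\leq_{\mathrm{lc}}$ Poisson(1), all three with mean 1. The helper names are hypothetical.

```python
import math


def binomial_pmf(n, p, size):
    return [math.comb(n, k) * p ** k * (1 - p) ** (n - k) if k <= n else 0.0
            for k in range(size)]


def poisson_pmf(lam, size):
    return [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(size)]


def kl(p, q):
    """Kullback-Leibler divergence D(p|q) in nats; terms with p_k = 0 are skipped."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)


size = 6                        # covers the supports of f and g
f = binomial_pmf(2, 0.5, size)
g = binomial_pmf(5, 0.2, size)
h = poisson_pmf(1.0, size)      # only its values on {0, ..., 5} are needed

lhs = kl(f, h)
rhs = kl(f, g) + kl(g, h)
```

Here `lhs` is about 0.134 and `rhs` about 0.094, so $D(f|h)\geq D(f|g)+D(g|h)$ holds with room to spare in this instance.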

preprint, 2010, arXiv

Sharp Bounds on the Entropy of the Poisson Law and Related Quantities

One of the difficulties in calculating the capacity of certain Poisson channels is that $H(\lambda)$, the entropy of the Poisson distribution with mean $\lambda$, is not available in a simple form. In this work we derive upper and lower bounds for $H(\lambda)$ that are asymptotically tight and easy to compute. The derivation of such bounds involves only simple probabilistic and analytic tools. This complements the asymptotic expansions of Knessl (1998), Jacquet and Szpankowski (1999), and Flajolet (1999). The same method yields tight bounds on the relative entropy $D(n, p)$ between a binomial and a Poisson, thus refining the work of Harremoës and Ruzankin (2004). Bounds on the entropy of the binomial also follow easily.
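Since the abstract does not state the bounds themselves, the sketch below only shows the quantity being bounded: it evaluates the Poisson entropy by direct summation (in nats) and compares it with the leading asymptotic term $\tfrac12\log(2\pi e\lambda)$. The function name and the truncation rule are assumptions made for illustration.

```python
import math


def poisson_entropy(lam):
    """Entropy in nats of the Poisson law with mean lam > 0, by direct
    summation of -p_k * log(p_k) over a generously truncated range."""
    upper = int(lam + 20.0 * math.sqrt(lam) + 50)
    h = 0.0
    for k in range(upper):
        log_p = -lam + k * math.log(lam) - math.lgamma(k + 1)
        h -= math.exp(log_p) * log_p
    return h


def leading_term(lam):
    # Entropy of a normal law with variance lam, the large-lam approximation.
    return 0.5 * math.log(2.0 * math.pi * math.e * lam)


h1 = poisson_entropy(1.0)
h50 = poisson_entropy(50.0)
gap50 = leading_term(50.0) - h50
```

The direct sum is cheap for moderate means, but it gives no closed form, which is why explicit and easily computed bounds are useful.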

preprint, 2010, arXiv

Strict Monotonicity and Convergence Rate of Titterington's Algorithm for Computing D-optimal Designs

We study a class of multiplicative algorithms introduced by Silvey et al. (1978) for computing D-optimal designs. Strict monotonicity is established for a variant considered by Titterington (1978). A formula for the rate of convergence is also derived. This is used to explain why modifications considered by Titterington (1978) and Dette et al. (2008) usually converge faster.

preprint, 2009, arXiv

Squeezing the Arimoto-Blahut algorithm for faster convergence

The Arimoto--Blahut algorithm for computing the capacity of a discrete memoryless channel is revisited. A so-called "squeezing" strategy is used to design algorithms that preserve its simplicity and monotonic convergence properties, but have provably better rates of convergence.
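For context, the classical Arimoto--Blahut iteration that the paper sets out to accelerate is sketched below. This is the standard baseline only, not the squeezed variant. The function names are hypothetical, and the starting input distribution is assumed to be strictly positive.

```python
import math


def mutual_information(p, W):
    """Mutual information in nats for input distribution p and channel
    matrix W, where W[x][y] = P(y | x)."""
    nx, ny = len(W), len(W[0])
    q = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
    return sum(p[x] * W[x][y] * math.log(W[x][y] / q[y])
               for x in range(nx) for y in range(ny)
               if p[x] > 0 and W[x][y] > 0)


def arimoto_blahut(W, p=None, iters=200):
    """Classical Arimoto-Blahut iteration. Returns (capacity estimate in
    nats, input distribution). The start p must be strictly positive."""
    nx, ny = len(W), len(W[0])
    p = list(p) if p is not None else [1.0 / nx] * nx
    for _ in range(iters):
        q = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
        # d[x] = KL divergence between row x of W and the output law q.
        d = [sum(W[x][y] * math.log(W[x][y] / q[y])
                 for y in range(ny) if W[x][y] > 0)
             for x in range(nx)]
        weights = [p[x] * math.exp(d[x]) for x in range(nx)]
        total = sum(weights)
        p = [v / total for v in weights]
    return mutual_information(p, W), p


# Binary symmetric channel with crossover probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
capacity, p_opt = arimoto_blahut(bsc, p=[0.8, 0.2])
closed_form = math.log(2.0) + 0.9 * math.log(0.9) + 0.1 * math.log(0.1)
```

For this channel the iteration recovers the uniform input distribution and the closed-form capacity $\log 2 - H_b(0.1)$ in nats.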
