Source author record

Wensong Wu

Wensong Wu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

math.ST Statistics Theory math.NA

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators


Preprint, 2015, arXiv

Computational Modeling of Spectral Data Fitting with Nonlinear Distortions

Substances such as chemical compounds are invisible to the human eye, so they are usually detected by sensing equipment through their spectral fingerprints. Though spectra of pure chemicals can be identified by visual inspection, the spectra of their mixtures take a variety of complicated forms. Given the reference spectra of the constituent chemicals, the task of data fitting is to retrieve their weights, which can usually be obtained by solving a least squares problem. Complications occur when the basis functions (reference spectra) cannot be used directly to best fit the data. In fact, random distortions (spectral variability) such as shifting, compression, and expansion have been observed in some source spectra when the underlying substances are mixed. In this paper, we formulate mathematical models for such nonlinear effects and build them into data fitting algorithms. If only minimal knowledge of the distortions is available, a deterministic approach termed augmented least squares is developed, which fits the spectral references along with their derivatives to the mixtures. If the distribution of the distortions is known a priori, we consider solving the problem with maximum likelihood estimators that incorporate the shifts into the variance matrix. The proposed methods are substantiated with numerical examples including data from Raman spectroscopy (RS), nuclear magnetic resonance (NMR), and differential optical absorption spectroscopy (DOAS), and show satisfactory results.
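The idea of fitting a reference spectrum together with its derivative can be illustrated with a small sketch. It is only one plausible reading of the abstract, not the paper's algorithm: it assumes a single Gaussian reference peak, a small pure shift, and a first-order Taylor expansion f(x - s) ≈ f(x) - s f'(x). All function names, the grid, and the chosen weight and shift are hypothetical.

```python
import math

def gaussian(x, center=0.0, width=1.0):
    """Hypothetical reference spectrum: a single Gaussian peak."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def gaussian_derivative(x, center=0.0, width=1.0):
    """Analytic derivative of the reference peak with respect to x."""
    return -(x - center) / width ** 2 * gaussian(x, center, width)

def fit_two_columns(col_a, col_b, y):
    """Least squares fit y ~ c_a*col_a + c_b*col_b via the 2x2 normal equations."""
    aa = sum(a * a for a in col_a)
    ab = sum(a * b for a, b in zip(col_a, col_b))
    bb = sum(b * b for b in col_b)
    ay = sum(a * v for a, v in zip(col_a, y))
    by = sum(b * v for b, v in zip(col_b, y))
    det = aa * bb - ab * ab
    return (ay * bb - by * ab) / det, (by * aa - ay * ab) / det

def squared_residual(y, fitted):
    return sum((v - f) ** 2 for v, f in zip(y, fitted))

# Assumed toy data: the mixture contains the reference with weight 2, shifted by 0.1.
xs = [i * 0.05 for i in range(-200, 201)]
true_weight, true_shift = 2.0, 0.1
reference = [gaussian(x) for x in xs]
derivative = [gaussian_derivative(x) for x in xs]
observed = [true_weight * gaussian(x - true_shift) for x in xs]

# Plain least squares: fit the unshifted reference only.
plain_weight = (sum(r * o for r, o in zip(reference, observed))
                / sum(r * r for r in reference))
plain_residual = squared_residual(
    observed, [plain_weight * r for r in reference])

# Augmented fit: reference plus its derivative as an extra basis column.
weight, deriv_coeff = fit_two_columns(reference, derivative, observed)
augmented_residual = squared_residual(
    observed, [weight * r + deriv_coeff * d for r, d in zip(reference, derivative)])

# Under the Taylor model the derivative coefficient equals -weight * shift.
estimated_shift = -deriv_coeff / weight
print(weight, estimated_shift, plain_residual, augmented_residual)
```

In this toy setting the extra derivative column absorbs most of the misfit caused by the shift, and the ratio of the two coefficients gives an estimate of the shift itself. Compression and expansion, which the abstract also mentions, are not modeled here.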

Preprint, 2011, arXiv

Bayes Multiple Decision Functions

This paper deals with the problem of simultaneously making many (M) binary decisions based on one realization of a random data matrix X. M is typically large and X will usually have M rows associated with each of the M decisions to make, but for each row the data may be low dimensional. A Bayesian decision-theoretic approach for this problem is implemented with the overall loss function being a cost-weighted linear combination of Type I and Type II loss functions. The class of loss functions considered allows for the use of the false discovery rate (FDR), false nondiscovery rate (FNR), and missed discovery rate (MDR) in assessing the decision. Through this Bayesian paradigm, the Bayes multiple decision function (BMDF) is derived and an efficient algorithm to obtain the optimal Bayes action is described. In contrast to many works in the literature where the rows of the matrix X are assumed to be stochastically independent, we allow in this paper a dependent data structure with the associations obtained through a class of frailty-induced Archimedean copulas. In particular, non-Gaussian dependent data structure, which is the norm rather than the exception when dealing with failure-time data, can be entertained. The numerical implementation of the determination of the Bayes optimal action is facilitated through sequential Monte Carlo techniques. The main theory developed could also be extended to the problem of multiple hypotheses testing, multiple classification and prediction, and high-dimensional variable selection. The proposed procedure is illustrated for the simple versus simple and for the composite hypotheses setting via simulation studies. The procedure is also applied to a subset of a real microarray data set from a colon cancer study.

Preprint, 2011, arXiv

Power-enhanced multiple decision functions controlling family-wise error and false discovery rates

Improved procedures, in terms of smaller missed discovery rates (MDR), for performing multiple hypotheses testing with weak and strong control of the family-wise error rate (FWER) or the false discovery rate (FDR) are developed and studied. The improvement over existing procedures such as the Šidák procedure for FWER control and the Benjamini–Hochberg (BH) procedure for FDR control is achieved by exploiting possible differences in the powers of the individual tests. Results signal the need to take into account the powers of the individual tests and to have multiple hypotheses decision functions which are not limited to simply using the individual p-values, as is the case, for example, with the Šidák, Bonferroni, or BH procedures. They also enhance understanding of the role of the powers of individual tests, or more precisely the receiver operating characteristic (ROC) functions of decision processes, in the search for better multiple hypotheses testing procedures. A decision-theoretic framework is utilized, and through auxiliary randomizers the procedures could be used with discrete or mixed-type data or with rank-based nonparametric tests. This is in contrast to existing p-value based procedures whose theoretical validity is contingent on each of these p-value statistics being stochastically equal to or greater than a standard uniform variable under the null hypothesis. Proposed procedures are relevant in the analysis of high-dimensional "large M, small n" data sets arising in the natural, physical, medical, economic and social sciences, whose generation and creation is accelerated by advances in high-throughput technology, notably, but not limited to, microarray technology.
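For orientation, the two existing baselines named in the abstract can be sketched in a few lines. This shows only the standard Šidák threshold and the standard BH step-up rule, both of which use p-values alone. It is not the power-enhanced procedure proposed in the paper, and the p-values below are made up for illustration.

```python
def sidak_threshold(alpha, m):
    """Per-test p-value cutoff of the Sidak procedure for FWER level alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

def benjamini_hochberg(p_values, alpha):
    """BH step-up rule: returns a list of booleans, True where the null is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank k whose ordered p-value satisfies p_(k) <= alpha * k / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values from M = 10 tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
alpha = 0.05

bh_rejections = benjamini_hochberg(p_values, alpha)
cutoff = sidak_threshold(alpha, len(p_values))
sidak_rejections = [p <= cutoff for p in p_values]

print(sum(bh_rejections), sum(sidak_rejections))
```

Both rules treat every test identically once its p-value is known, which is the limitation the abstract points to when it argues for taking the powers of the individual tests into account.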
