Source author record

Raymond K. W. Wong

Raymond K. W. Wong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Methodology Machine Learning Applications astro-ph.HE astro-ph.IM eess.SP math.ST nucl-th Statistics Theory

Catalog footprint

What is connected

14works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Fair Regression under Demographic Parity: A Unified Framework

We propose a unified framework for fair regression tasks formulated as risk minimization problems subject to a demographic parity constraint. Unlike many existing approaches that are limited to specific loss functions or rely on challenging non-convex optimization, our framework is applicable to a broad spectrum of regression tasks. Examples include linear regression with squared loss, binary classification with cross-entropy loss, quantile regression with pinball loss, and robust regression with Huber loss. We derive a novel characterization of the fair risk minimizer, which yields a computationally efficient estimation procedure for general loss functions. Theoretically, we establish the asymptotic consistency of the proposed estimator and derive its convergence rates under mild assumptions. We illustrate the method's versatility through detailed discussions of several common loss functions. Numerical results demonstrate that our approach effectively minimizes risk while satisfying fairness constraints across various regression settings.

preprint2022arXiv

A New Constraint on the Nuclear Equation of State from Statistical Distributions of Compact Remnants of Supernovae

Understanding how matter behaves at the highest densities and temperatures is a major open problem in both nuclear physics and relativistic astrophysics. This physics is often encapsulated in the so-called high-temperature nuclear equation of state, which influences compact binary mergers, core-collapse supernovae, and many more phenomena. One such case is the type (either black hole or neutron star) and mass of the remnant of the core collapse of a massive star. For each of six candidate equations of state, we use a very large suite of spherically symmetric supernova models to generate a suite of synthetic populations of such remnants. We then compare these synthetic populations to the observed remnant population. We thus provide a novel constraint on the high-temperature nuclear equation of state and describe which EOS candidates are more or less favored by this metric.

preprint2022arXiv

Extending the Use of MDL for High-Dimensional Problems: Variable Selection, Robust Fitting, and Additive Modeling

In the signal processing and statistics literature, the minimum description length (MDL) principle is a popular tool for choosing model complexity. Successful examples include signal denoising and variable selection in linear regression, for which the corresponding MDL solutions often enjoy consistent properties and produce very promising empirical results. This paper demonstrates that MDL can be extended naturally to the high-dimensional setting, where the number of predictors $p$ is larger than the number of observations $n$. It first considers the case of linear regression, then allows for outliers in the data, and lastly extends to the robust fitting of nonparametric additive models. Results from numerical experiments are presented to demonstrate the efficiency and effectiveness of the MDL approach.

preprint2022arXiv

Hierarchical nuclear norm penalization for multi-view data

The prevalence of data collected on the same set of samples from multiple sources (i.e., multi-view data) has prompted significant development of data integration methods based on low-rank matrix factorizations. These methods decompose signal matrices from each view into the sum of shared and individual structures, which are further used for dimension reduction, exploratory analyses, and quantifying associations across views. However, existing methods have limitations in modeling partially-shared structures due to either too restrictive models, or restrictive identifiability conditions. To address these challenges, we formulate a new model for partially-shared signals based on grouping the views into so-called hierarchical levels. The proposed hierarchy leads us to introduce a new penalty, hierarchical nuclear norm (HNN), for signal estimation. In contrast to existing methods, HNN penalization avoids scores and loadings factorization of the signals and leads to a convex optimization problem, which we solve using a dual forward-backward algorithm. We propose a simple refitting procedure to adjust the penalization bias and develop an adapted version of bi-cross-validation for selecting tuning parameters. Extensive simulation studies and analysis of the genotype-tissue expression data demonstrate the advantages of our method over existing alternatives.

preprint2022arXiv

Projected State-action Balancing Weights for Offline Reinforcement Learning

Offline policy evaluation (OPE) is considered a fundamental and challenging problem in reinforcement learning (RL). This paper focuses on the value estimation of a target policy based on pre-collected data generated from a possibly different policy, under the framework of infinite-horizon Markov decision processes. Motivated by the recently developed marginal importance sampling method in RL and the covariate balancing idea in causal inference, we propose a novel estimator with approximately projected state-action balancing weights for the policy value estimation. We obtain the convergence rate of these weights and show that the proposed value estimator is semi-parametric efficient under technical conditions. In terms of asymptotics, our results scale with both the number of trajectories and the number of decision points at each trajectory. As such, consistency can still be achieved with a limited number of subjects when the number of decision points diverges. In addition, we develop a necessary and sufficient condition for establishing the well-posedness of the Bellman operator in the off-policy setting, which characterizes the difficulty of OPE and may be of independent interest. Numerical experiments demonstrate the promising performance of our proposed estimator.

preprint2021arXiv

Estimation of Partially Conditional Average Treatment Effect by Hybrid Kernel-covariate Balancing

We study nonparametric estimation for the partially conditional average treatment effect, defined as the treatment effect function over an interested subset of confounders. We propose a hybrid kernel weighting estimator where the weights aim to control the balancing error of any function of the confounders from a reproducing kernel Hilbert space after kernel smoothing over the subset of interested variables. In addition, we present an augmented version of our estimator which can incorporate estimations of outcome mean functions. Based on the representer theorem, gradient-based algorithms can be applied for solving the corresponding infinite-dimensional optimization problem. Asymptotic properties are studied without any smoothness assumptions for propensity score function or the need of data splitting, relaxing certain existing stringent assumptions. The numerical performance of the proposed estimator is demonstrated by a simulation study and an application to the effect of a mother's smoking on a baby's birth weight conditioned on the mother's age.

preprint2020arXiv

Adjusting for Spatial Effects in Genomic Prediction

This paper investigates the problem of adjusting for spatial effects in genomic prediction. Despite being seldomly considered in genomic prediction, spatial effects often affect phenotypic measurements of plants. We consider a Gaussian random field model with an additive covariance structure that incorporates genotype effects, spatial effects and subpopulation effects. An empirical study shows the existence of spatial effects and heterogeneity across different subpopulation families, while simulations illustrate the improvement in selecting genotypically superior plants by adjusting for spatial effects in genomic prediction.

preprint2020arXiv

Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis

A remarkable recent discovery in machine learning has been that deep neural networks can achieve impressive performance (in terms of both lower training error and higher generalization capacity) in the regime where they are massively over-parameterized. Consequently, over the past year, the community has devoted growing interest in analyzing optimization and generalization properties of over-parameterized networks, and several breakthrough works have led to important theoretical progress. However, the majority of existing work only applies to supervised learning scenarios and hence are limited to settings such as classification and regression. In contrast, the role of over-parameterization in the unsupervised setting has gained far less attention. In this paper, we study the gradient dynamics of two-layer over-parameterized autoencoders with ReLU activation. We make very few assumptions about the given training dataset (other than mild non-degeneracy conditions). Starting from a randomly initialized autoencoder network, we rigorously prove the linear convergence of gradient descent in two learning regimes, namely: (i) the weakly-trained regime where only the encoder is trained, and (ii) the jointly-trained regime where both the encoder and the decoder are trained. Our results indicate the considerable benefits of joint training over weak training for finding global optima, achieving a dramatic decrease in the required level of over-parameterization. We also analyze the case of weight-tied autoencoders (which is a commonly used architectural choice in practical settings) and prove that in the over-parameterized setting, training such networks from randomly initialized points leads to certain unexpected degeneracies.

preprint2020arXiv

Matrix Completion under Low-Rank Missing Mechanism

Matrix completion is a modern missing data problem where both the missing structure and the underlying parameter are high dimensional. Although missing structure is a key component to any missing data problems, existing matrix completion methods often assume a simple uniform missing mechanism. In this work, we study matrix completion from corrupted data under a novel low-rank missing mechanism. The probability matrix of observation is estimated via a high dimensional low-rank matrix estimation procedure, and further used to complete the target matrix via inverse probabilities weighting. Due to both high dimensional and extreme (i.e., very small) nature of the true probability matrix, the effect of inverse probability weighting requires careful study. We derive optimal asymptotic convergence rates of the proposed estimators for both the observation probabilities and the target matrix.

preprint2020arXiv

Median Matrix Completion: from Embarrassment to Optimality

In this paper, we consider matrix completion with absolute deviation loss and obtain an estimator of the median matrix. Despite several appealing properties of median, the non-smooth absolute deviation loss leads to computational challenge for large-scale data sets which are increasingly common among matrix completion problems. A simple solution to large-scale problems is parallel computing. However, embarrassingly parallel fashion often leads to inefficient estimators. Based on the idea of pseudo data, we propose a novel refinement step, which turns such inefficient estimators into a rate (near-)optimal matrix completion procedure. The refined estimator is an approximation of a regularized least median estimator, and therefore not an ordinary regularized empirical risk estimator. This leads to a non-standard analysis of asymptotic behaviors. Empirical results are also provided to confirm the effectiveness of the proposed method.

preprint2015arXiv

A Frequentist Approach to Computer Model Calibration

This paper considers the computer model calibration problem and provides a general frequentist solution. Under the proposed framework, the data model is semi-parametric with a nonparametric discrepancy function which accounts for any discrepancy between the physical reality and the computer model. In an attempt to solve a fundamentally important (but often ignored) identifiability issue between the computer model parameters and the discrepancy function, this paper proposes a new and identifiable parametrization of the calibration problem. It also develops a two-step procedure for estimating all the relevant quantities under the new parameterization. This estimation procedure is shown to enjoy excellent rates of convergence and can be straightforwardly implemented with existing software. For uncertainty quantification, bootstrapping is adopted to construct confidence regions for the quantities of interest. The practical performance of the proposed methodology is illustrated through simulation examples and an application to a computational fluid dynamics model.

preprint2015arXiv

Detecting Abrupt Changes in the Spectra of High-Energy Astrophysical Sources

Variable-intensity astronomical sources are the result of complex and often extreme physical processes. Abrupt changes in source intensity are typically accompanied by equally sudden spectral shifts, i.e., sudden changes in the wavelength distribution of the emission. This article develops a method for modeling photon counts collected from observation of such sources. We embed change points into a marked Poisson process, where photon wavelengths are regarded as marks and both the Poisson intensity parameter and the distribution of the marks are allowed to change. To the best of our knowledge this is the first effort to embed change points into a marked Poisson process. Between the change points, the spectrum is modeled non-parametrically using a mixture of a smooth radial basis expansion and a number of local deviations from the smooth term representing spectral emission lines. Because the model is over parameterized we employ an $\ell_1$ penalty. The tuning parameter in the penalty and the number of change points are determined via the minimum description length principle. Our method is validated via a series of simulation studies and its practical utility is illustrated in the analysis of the ultra-fast rotating yellow giant star known as FK Com.

preprint2015arXiv

Fiber Direction Estimation, Smoothing and Tracking in Diffusion MRI

Diffusion magnetic resonance imaging is an imaging technology designed to probe anatomical architectures of biological samples in an in vivo and non-invasive manner through measuring water diffusion. The contribution of this paper is threefold. First it proposes a new method to identify and estimate multiple diffusion directions within a voxel through a new and identifiable parametrization of the widely used multi-tensor model. Unlike many existing methods, this method focuses on the estimation of diffusion directions rather than the diffusion tensors. Second, this paper proposes a novel direction smoothing method which greatly improves direction estimation in regions with crossing fibers. This smoothing method is shown to have excellent theoretical and empirical properties. Lastly, this paper develops a fiber tracking algorithm that can handle multiple directions within a voxel. The overall methodology is illustrated with simulated data and a data set collected for the study of Alzheimer's disease by the Alzheimer's Disease Neuroimaging Initiative (ADNI).

preprint2014arXiv

Automatic estimation of flux distributions of astrophysical source populations

In astrophysics a common goal is to infer the flux distribution of populations of scientifically interesting objects such as pulsars or supernovae. In practice, inference for the flux distribution is often conducted using the cumulative distribution of the number of sources detected at a given sensitivity. The resulting "$\log(N>S)$-$\log (S)$" relationship can be used to compare and evaluate theoretical models for source populations and their evolution. Under restrictive assumptions the relationship should be linear. In practice, however, when simple theoretical models fail, it is common for astrophysicists to use prespecified piecewise linear models. This paper proposes a methodology for estimating both the number and locations of "breakpoints" in astrophysical source populations that extends beyond existing work in this field. An important component of the proposed methodology is a new interwoven EM algorithm that computes parameter estimates. It is shown that in simple settings such estimates are asymptotically consistent despite the complex nature of the parameter space. Through simulation studies it is demonstrated that the proposed methodology is capable of accurately detecting structural breaks in a variety of parameter configurations. This paper concludes with an application of our methodology to the Chandra Deep Field North (CDFN) data set.

Raymond K. W. Wong

What is connected

Connect this record

See the researcher in context

Building this map preview

14 published item(s)

Fair Regression under Demographic Parity: A Unified Framework

A New Constraint on the Nuclear Equation of State from Statistical Distributions of Compact Remnants of Supernovae

Extending the Use of MDL for High-Dimensional Problems: Variable Selection, Robust Fitting, and Additive Modeling

Hierarchical nuclear norm penalization for multi-view data

Projected State-action Balancing Weights for Offline Reinforcement Learning

Estimation of Partially Conditional Average Treatment Effect by Hybrid Kernel-covariate Balancing

Adjusting for Spatial Effects in Genomic Prediction

Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis

Matrix Completion under Low-Rank Missing Mechanism

Median Matrix Completion: from Embarrassment to Optimality

A Frequentist Approach to Computer Model Calibration

Detecting Abrupt Changes in the Spectra of High-Energy Astrophysical Sources

Fiber Direction Estimation, Smoothing and Tracking in Diffusion MRI

Automatic estimation of flux distributions of astrophysical source populations