Researcher profile

Alfred O. Hero

Alfred O. Hero contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
11 works
0 followers
13 topics
4 close collaborators

Published work

11 published items

preprint | 2022 | arXiv

A unified framework for correlation mining in ultra-high dimension

Many applications benefit from theory relevant to the identification of variables having large correlations or partial correlations in high dimension. Recently there has been progress in the ultra-high dimensional setting when the sample size $n$ is fixed and the dimension $p$ tends to infinity. Despite these advances, the correlation screening framework suffers from practical, methodological and theoretical deficiencies. For instance, previous correlation screening theory requires that the population covariance matrix be sparse and block diagonal. This block sparsity assumption is, however, restrictive in practical applications. As a second example, correlation and partial correlation screening requires the estimation of dependence measures, which can be computationally prohibitive. In this paper, we propose a unifying approach to correlation and partial correlation mining that is not restricted to block diagonal correlation structure, thus yielding a methodology that is suitable for modern applications. By making connections to random geometric graphs, the number of highly correlated or partially correlated variables is shown to have compound Poisson finite-sample characterizations, which hold for both the finite $p$ case and when $p$ tends to infinity. The unifying framework also demonstrates a duality between correlation and partial correlation screening with theoretical and practical consequences.
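
As a rough illustration of the screening step this abstract refers to, the sketch below counts the variable pairs whose sample correlation magnitude reaches a threshold. The function names `pearson` and `count_correlated_pairs`, the threshold `rho`, and the toy data are illustrative assumptions, not taken from the paper. The brute-force pass over all pairs is the kind of estimate-every-dependence computation that the abstract describes as potentially prohibitive when $p$ is large.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def count_correlated_pairs(columns, rho):
    """Count variable pairs whose |sample correlation| is at least rho."""
    p = len(columns)
    count = 0
    for i in range(p):
        for j in range(i + 1, p):
            if abs(pearson(columns[i], columns[j])) >= rho:
                count += 1
    return count

# Hypothetical toy data: p = 3 variables observed on n = 4 samples.
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.1, 3.9, 6.2, 7.8],    # nearly a multiple of the first variable
    [1.0, -1.0, 1.0, -1.0],  # only weakly related to the other two
]
n_pairs = count_correlated_pairs(columns, rho=0.9)
```

Only the first two variables clear the 0.9 threshold here, so `n_pairs` is 1. The paper's contribution concerns the distribution of such counts, which this sketch does not attempt to model.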

preprint | 2022 | arXiv

An Improvement on the Hotelling $T^2$ Test Using the Ledoit-Wolf Nonlinear Shrinkage Estimator

Hotelling's $T^2$ test is a classical approach for discriminating the means of two multivariate normal samples that share a population covariance matrix. Hotelling's test is not ideal for high-dimensional samples because the eigenvalues of the estimated sample covariance matrix are inconsistent estimators for their population counterparts. We replace the sample covariance matrix with the nonlinear shrinkage estimator of Ledoit and Wolf 2020. We observe empirically for sub-Gaussian data that the resulting algorithm dominates past methods (Bai and Saranadasa 1996, Chen and Qin 2010, and Li et al. 2020) for a family of population covariance matrices that includes matrices with high or low condition number and many or few nontrivial -- i.e., spiked -- eigenvalues.
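
For reference, the classical two-sample statistic can be written in its standard textbook form as follows. The symbols $n_1$, $n_2$ (sample sizes), $\bar{x}_1$, $\bar{x}_2$ (sample means), $S_1$, $S_2$ (sample covariance matrices) and $S_{\mathrm{pooled}}$ are notation introduced here, not taken from the abstract.

```latex
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,
      (\bar{x}_1 - \bar{x}_2)^\top S_{\mathrm{pooled}}^{-1} (\bar{x}_1 - \bar{x}_2),
\qquad
S_{\mathrm{pooled}} = \frac{(n_1 - 1)\, S_1 + (n_2 - 1)\, S_2}{n_1 + n_2 - 2}
```

When the dimension exceeds $n_1 + n_2 - 2$, $S_{\mathrm{pooled}}$ is singular and the inverse does not exist, which is one concrete way the classical test breaks down in high dimension. As the abstract describes, the proposed method substitutes the Ledoit-Wolf nonlinear shrinkage estimator for the sample covariance matrix; the details of that substitution are not given in the abstract.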

preprint | 2020 | arXiv

A Geometric Approach to Online Streaming Feature Selection

Online Streaming Feature Selection (OSFS) is a sequential learning problem where individual features across all samples are made available to algorithms in a streaming fashion. In this work, firstly, we assert that OSFS's main assumption of having data from all the samples available at runtime is unrealistic and introduce a new setting where features and samples are streamed concurrently, called OSFS with Streaming Samples (OSFS-SS). Secondly, the primary OSFS method, SAOLA, utilizes an unbounded mutual information measure and requires multiple comparison steps between the stored and incoming feature sets to evaluate a feature's importance. We introduce Geometric Online Adaption (GOA), an algorithm that requires relatively fewer feature comparison steps and uses a bounded conditional geometric dependency measure. Our algorithm outperforms several OSFS baselines, including SAOLA, on a variety of datasets. We also extend SAOLA to work in the OSFS-SS setting and show that GOA continues to achieve the best results. Thirdly, the current paradigm of OSFS algorithm comparison is flawed. Algorithms are measured by comparing the number of features used and the accuracy obtained by the learner, two properties that are fundamentally at odds with one another. Without fixing a limit on either of these properties, the qualities of features obtained by different algorithms are incomparable. We try to rectify this inconsistency by fixing the maximum number of features available to the learner and comparing algorithms in terms of their accuracy. Additionally, we characterize the behaviour of SAOLA and GOA on feature sets derived from popular deep convolutional featurizers.

preprint | 2020 | arXiv

Learning to Bound the Multi-class Bayes Error

In the context of supervised learning, meta learning uses features, metadata and other information to learn about the difficulty, behavior, or composition of the problem. Using this knowledge can be useful to contextualize classifier results or allow for targeted decisions about future data sampling. In this paper, we are specifically interested in learning the Bayes error rate (BER) based on a labeled data sample. Providing a tight bound on the BER that is also feasible to estimate has been a challenge. Previous work [1] has shown that a pairwise bound based on the sum of Henze-Penrose (HP) divergence over label pairs can be directly estimated using a sum of Friedman-Rafsky (FR) multivariate run test statistics. However, in situations in which the dataset and number of classes are large, this bound is computationally infeasible to calculate and may not be tight. Other multi-class bounds also suffer from computationally complex estimation procedures. In this paper, we present a generalized HP divergence measure that allows us to estimate the Bayes error rate with log-linear computation. We prove that the proposed bound is tighter than both the pairwise method and a bound proposed by Lin [2]. We also empirically show that these bounds are close to the BER. We illustrate the proposed method on the MNIST dataset, and show its utility for the evaluation of feature reduction strategies. We further demonstrate an approach for evaluation of deep learning architectures using the proposed bounds.
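
The core quantity behind the Friedman-Rafsky statistic mentioned in this abstract is the number of edges in a Euclidean minimum spanning tree (MST) that join points with different labels. The sketch below computes that count for a two-class toy sample. The function names and data are illustrative assumptions, and the simple quadratic-time Prim's algorithm used here is not the log-linear procedure or the generalized HP estimator proposed in the paper.

```python
import math

def mst_edges(points):
    """Return the edges (i, j) of a Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_from = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Update the cheapest known connection for every remaining point.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

def friedman_rafsky_count(points, labels):
    """Number of MST edges that join points with different labels."""
    return sum(1 for i, j in mst_edges(points) if labels[i] != labels[j])

# Hypothetical toy sample: two well-separated classes in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = [0, 0, 0, 1, 1, 1]
cross_edges = friedman_rafsky_count(points, labels)
```

For well-separated classes the tree needs only a single edge to bridge the two clusters, so the count is small; heavily overlapping classes produce many cross-label edges, which is what makes the count informative about class separability.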

preprint | 2020 | arXiv

OrthoReg: Robust Network Pruning Using Orthonormality Regularization

Network pruning in Convolutional Neural Networks (CNNs) has been extensively investigated in recent years. To determine the impact of pruning a group of filters on a network's accuracy, state-of-the-art pruning methods consistently assume filters of a CNN are independent. This allows the importance of a group of filters to be estimated as the sum of importances of individual filters. However, overparameterization in modern networks results in highly correlated filters that invalidate this assumption, thereby resulting in incorrect importance estimates. To address this issue, we propose OrthoReg, a principled regularization strategy that enforces orthonormality on a network's filters to reduce inter-filter correlation, thereby allowing reliable, efficient determination of group importance estimates, improved trainability of pruned networks, and efficient, simultaneous pruning of large groups of filters. When used for iterative pruning on VGG-13, MobileNet-V1, and ResNet-34, OrthoReg consistently outperforms five baseline techniques, including the state-of-the-art, on CIFAR-100 and Tiny-ImageNet. For the recently proposed Early-Bird Ticket hypothesis, which claims networks become amenable to pruning early on in training and can be pruned after a few epochs to minimize training expenditure, we find OrthoReg significantly outperforms prior work. Code is available at https://github.com/EkdeepSLubana/OrthoReg.
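
To make the idea of an orthonormality regularizer concrete, the sketch below evaluates a generic soft penalty, the squared Frobenius norm of $W W^\top - I$, where each row of $W$ is one flattened filter. This particular form, the function name `orthonormality_penalty`, and the toy filters are assumptions made for illustration; the abstract does not state the exact regularizer used by OrthoReg or how it is weighted in the training loss.

```python
def orthonormality_penalty(filters):
    """Squared Frobenius norm of (W W^T - I), with one flattened filter per row of W."""
    k = len(filters)
    penalty = 0.0
    for i in range(k):
        for j in range(k):
            # Entry (i, j) of the Gram matrix W W^T.
            gram_ij = sum(a * b for a, b in zip(filters[i], filters[j]))
            target = 1.0 if i == j else 0.0
            penalty += (gram_ij - target) ** 2
    return penalty

# Hypothetical flattened filters.
orthonormal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
correlated = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0]]  # unit norm, but inner product 0.8

penalty_orthonormal = orthonormality_penalty(orthonormal)
penalty_correlated = orthonormality_penalty(correlated)
```

The penalty is zero for the orthonormal pair and grows with the inner product between filters, so adding it to a training loss pushes filters toward being decorrelated.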

preprint | 2020 | arXiv

Pattern-Based Analysis of Time Series: Estimation

While Internet of Things (IoT) devices and sensors create continuous streams of information, Big Data infrastructures are expected to handle the influx of data in real time. One type of such a continuous stream of information is time series data. Due to the richness of information in time series and the inadequacy of summary statistics to encapsulate structures and patterns in such data, development of new approaches to learn time series is of interest. In this paper, we propose a novel method, called pattern tree, to learn patterns in the time series using a binary-structured tree. While a pattern tree can be used for many purposes such as lossless compression, prediction and anomaly detection, in this paper we focus on its application in time series estimation and forecasting. In comparison to other methods, our proposed pattern tree method improves the mean squared error of estimation.

preprint | 2020 | arXiv

Predicting solar flares with machine learning: investigating solar cycle dependence

A deep learning network, the Long Short-Term Memory (LSTM) network, is used in this work to predict whether the maximum flare class an active region (AR) will produce in the next 24 hours is class $\Gamma$. We consider $\Gamma$ to be $\ge M$, $\ge C$, or any flare class. The essence of using LSTM, which is a recurrent neural network, is its capability to capture temporal information of the data samples. The input features are time sequences of 20 magnetic parameters from SHARPs (Space-weather HMI Active Region Patches). We analyzed active regions from June 2010 to Dec 2018 and labeled the data samples with the ARs identified in the Geostationary Operational Environmental Satellite (GOES) X-ray flare catalogs. Our results show that (i) the skill scores are consistent with recently published results using LSTMs and better than previous work using a single time input (e.g., DeFN), and (ii) the skill scores from the model show essential differences when different years of data are chosen for training and testing.

preprint | 2020 | arXiv

Robust Distributed Fixed-Time Economic Dispatch under Time-Varying Topology

The centralized power generation infrastructure that defines the North American electric grid is slowly moving to a distributed architecture due to the explosion in the use of renewable generation and distributed energy resources (DERs), such as residential solar, wind turbines and battery storage. Furthermore, variable pricing policies and a profusion of flexible loads entail frequent and severe changes in the power outputs required from the individual generation units, requiring fast availability of power allocation. To this end, a fixed-time convergent, fully distributed economic dispatch algorithm for scheduling optimal power generation among a set of DERs is proposed. The proposed algorithm incorporates both load balance and generation capacity constraints.
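
To show what the load balance and generation capacity constraints mean in practice, the sketch below solves a small economic dispatch problem centrally by bisection on the common incremental cost. It is a textbook baseline, not the distributed fixed-time algorithm proposed in the paper. The quadratic cost model $a p^2 + b p$, the function name `dispatch`, and the unit data are assumptions made for illustration, and the demand is assumed to lie between the total minimum and total maximum capacity.

```python
def dispatch(units, demand, tol=1e-9):
    """Centralized economic dispatch by bisection on the incremental cost lam.

    Each unit is (a, b, p_min, p_max) with cost a*p**2 + b*p and a > 0.
    Assumes sum(p_min) <= demand <= sum(p_max).
    """
    def output(unit, lam):
        a, b, p_min, p_max = unit
        # Unconstrained optimum sets marginal cost 2*a*p + b equal to lam;
        # the result is then clipped to the unit's capacity limits.
        return min(max((lam - b) / (2.0 * a), p_min), p_max)

    # At lo every unit sits at p_min; at hi every unit sits at p_max.
    lo = min(2.0 * a * p_min + b for a, b, p_min, _ in units)
    hi = max(2.0 * a * p_max + b for a, b, _, p_max in units)
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        if sum(output(u, lam) for u in units) < demand:
            lo = lam  # total generation too low: raise the price
        else:
            hi = lam  # total generation sufficient: lower the price
    lam = 0.5 * (lo + hi)
    return [output(u, lam) for u in units], lam

# Hypothetical units: (a, b, p_min, p_max). The first unit is capped at 40.
units = [(0.01, 2.0, 0.0, 40.0), (0.02, 1.0, 0.0, 100.0)]
powers, lam = dispatch(units, demand=100.0)
```

In this example the first unit would produce 50 without its cap, so it is held at its 40-unit limit and the second unit covers the remaining 60, with the incremental cost settling near 3.4.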

preprint | 2020 | arXiv

Testing that a Local Optimum of the Likelihood is Globally Optimum using Reparameterized Embeddings

Many mathematical imaging problems are posed as non-convex optimization problems. When numerically tractable global optimization procedures are not available, one is often interested in testing ex post facto whether or not a locally convergent algorithm has found the globally optimal solution. When the problem is formulated in terms of maximizing the likelihood function under a statistical model for the measurements, one can construct a statistical test that a local maximum is in fact the global maximum. A one-sided test is proposed for the case that the statistical model is a member of the generalized location family of probability distributions, a condition often satisfied in imaging and other inverse problems. We propose a general method for improving the accuracy of the test by reparameterizing the likelihood function to embed its domain into a higher dimensional parameter space. We show that the proposed global maximum testing method results in improved accuracy and reduced computation for a physically-motivated joint-inverse problem arising in camera-blur estimation.

preprint | 2019 | arXiv

Identifying Solar Flare Precursors Using Time Series of SDO/HMI Images and SHARP Parameters

In this paper, we present several methods for constructing precursors of solar flare events that show great promise for early prediction. A data pre-processing pipeline is built to extract useful data from multiple sources, Geostationary Operational Environmental Satellites (GOES) and Solar Dynamics Observatory (SDO)/Helioseismic and Magnetic Imager (HMI), to prepare inputs for machine learning algorithms. Two classification models are presented: classification of flares from quiet times for active regions and classification of strong versus weak flare events. We adopt deep learning algorithms to capture both the spatial and temporal information from HMI magnetogram data. Effective feature extraction and feature selection with raw magnetogram data using deep learning and statistical algorithms enable us to train classification models that perform almost as well as those using the active region parameters provided in HMI/Space-Weather HMI-Active Region Patch (SHARP) data files. Case studies show a significant increase in the prediction score around 20 hours before strong solar flare events.

preprint | 2009 | arXiv

Joint Bayesian endmember extraction and linear unmixing for hyperspectral imagery

This paper studies a fully Bayesian algorithm for endmember extraction and abundance estimation for hyperspectral imagery. Each pixel of the hyperspectral image is decomposed as a linear combination of pure endmember spectra following the linear mixing model. The estimation of the unknown endmember spectra is conducted in a unified manner by generating the posterior distribution of abundances and endmember parameters under a hierarchical Bayesian model. This model assumes conjugate prior distributions for these parameters, accounts for non-negativity and full-additivity constraints, and exploits the fact that the endmember proportions lie on a lower dimensional simplex. A Gibbs sampler is proposed to overcome the complexity of evaluating the resulting posterior distribution. This sampler generates samples distributed according to the posterior distribution and estimates the unknown parameters using these generated samples. The accuracy of the joint Bayesian estimator is illustrated by simulations conducted on synthetic and real AVIRIS images.
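
The linear mixing model and its two constraints can be illustrated with a few lines of code. The sketch below builds a noise-free mixed pixel from known endmember spectra and abundances; it does not implement the hierarchical Bayesian model or the Gibbs sampler described in the abstract. The function name `mix_pixel` and the spectra are hypothetical values chosen for illustration.

```python
def mix_pixel(endmembers, abundances):
    """Noise-free linear mixing model: pixel = sum_r abundances[r] * endmembers[r]."""
    # Non-negativity constraint on the abundances.
    if any(a < 0 for a in abundances):
        raise ValueError("abundances must be non-negative")
    # Full-additivity (sum-to-one) constraint on the abundances.
    if abs(sum(abundances) - 1.0) > 1e-9:
        raise ValueError("abundances must sum to one")
    n_bands = len(endmembers[0])
    return [
        sum(a * m[band] for a, m in zip(abundances, endmembers))
        for band in range(n_bands)
    ]

# Hypothetical spectra: 3 endmembers observed in 4 spectral bands.
endmembers = [
    [0.1, 0.2, 0.3, 0.4],
    [0.9, 0.7, 0.2, 0.1],
    [0.5, 0.5, 0.5, 0.5],
]
abundances = [0.5, 0.25, 0.25]
pixel = mix_pixel(endmembers, abundances)
```

Because the three abundances are non-negative and sum to one, only two of them are free, which is the lower-dimensional simplex structure that the abstract says the model exploits. The joint problem studied in the paper is the inverse one: recovering both the endmembers and the abundances from observed pixels.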