Source author record

Howard Bondell

Howard Bondell appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Applications, Machine Learning, physics.soc-ph, math.ST, Methodology, stat.OT, Statistics Theory

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

Preprint, 2023, arXiv

FedDAG: Federated DAG Structure Learning

To date, most directed acyclic graph (DAG) structure learning approaches require data to be stored on a central server. However, because of privacy concerns, data owners increasingly refuse to share their personalized raw data to avoid private information leakage, which blocks this task at its first step. Thus, a puzzle arises: how do we discover the underlying DAG structure from decentralized data? In this paper, focusing on the additive noise models (ANMs) assumption of data generation, we take the first step in developing a gradient-based learning framework named FedDAG, which can learn the DAG structure without directly touching the local data and can also naturally handle data heterogeneity. Our method benefits from a two-level structure of each local model. The first level learns the edges and directions of the graph and communicates with the server to obtain model information from other clients during learning, while the second level approximates the mechanisms among variables and is updated only on the client's own data to accommodate data heterogeneity. Moreover, FedDAG formulates the overall learning task as a continuous optimization problem by taking advantage of an equality acyclicity constraint, which can be solved by gradient descent methods to improve search efficiency. Extensive experiments on both synthetic and real-world datasets verify the efficacy of the proposed method.
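The equality acyclicity constraint mentioned in this abstract can be illustrated with a small sketch. One widely used constraint of this kind is the NOTEARS-style function h(W) = tr(exp(W ∘ W)) − d, which is zero exactly when the weighted adjacency matrix W describes an acyclic graph. Whether FedDAG uses this exact form is an assumption here; the function name `acyclicity` and the truncated power series are illustrative choices, not taken from the paper.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity(w, terms=20):
    """Approximate h(W) = tr(exp(W * W)) - d, with * the elementwise product.

    The matrix exponential is replaced by its power series truncated after
    `terms` terms (an illustrative simplification).
    """
    d = len(w)
    squared = [[x * x for x in row] for row in w]  # elementwise square
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    trace_sum = 0.0
    factorial = 1.0
    for k in range(terms):
        if k > 0:
            power = matmul(power, squared)  # (W * W)^k
            factorial *= k                  # k!
        trace_sum += sum(power[i][i] for i in range(d)) / factorial
    return trace_sum - d


# Hypothetical example graphs: a chain 0 -> 1 -> 2 and a 2-cycle 0 <-> 1.
dag = [[0.0, 0.8, -1.2], [0.0, 0.0, 0.5], [0.0, 0.0, 0.0]]
cycle = [[0.0, 1.0], [1.0, 0.0]]
print(acyclicity(dag))    # zero for an acyclic graph
print(acyclicity(cycle))  # strictly positive when a cycle exists
```

Because h is differentiable in W, a constraint h(W) = 0 can be attached to a continuous loss and handled with gradient-based solvers, which is the general idea the abstract refers to.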

Preprint, 2023, arXiv

MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models

State-of-the-art causal discovery methods usually assume that the observational data is complete. However, the missing data problem is pervasive in many practical scenarios such as clinical trials, economics, and biology. One straightforward way to address the missing data problem is first to impute the data using off-the-shelf imputation methods and then apply existing causal discovery methods. However, such a two-step method may be suboptimal, as the imputation algorithm may introduce bias in modeling the underlying data distribution. In this paper, we develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations. Focusing mainly on the assumptions of ignorable missingness and the identifiable additive noise models (ANMs), MissDAG maximizes the expected likelihood of the visible part of observations under the expectation-maximization (EM) framework. In the E-step, in cases where computing the posterior distributions of parameters in closed form is not feasible, Monte Carlo EM is leveraged to approximate the likelihood. In the M-step, MissDAG leverages the density transformation to model the noise distributions with simpler and specific formulations by virtue of the ANMs and uses a likelihood-based causal discovery algorithm with a directed acyclic graph constraint. We demonstrate the flexibility of MissDAG for incorporating various causal discovery algorithms and its efficacy through extensive simulations and real data experiments.

Preprint, 2022, arXiv

Comparing the dynamics of COVID-19 infection and mortality in the United States, India, and Brazil

This paper compares and contrasts the spread and impact of COVID-19 in the three countries most heavily impacted by the pandemic: the United States (US), India and Brazil. All three of these countries have a federal structure, in which the individual states have largely determined the response to the pandemic. Thus, we perform an extensive analysis of the individual states of these three countries to determine patterns of similarity within each. First, we analyse structural similarity and anomalies in the trajectories of cases and deaths as multivariate time series. Next, we study the lengths of the different waves of the virus outbreaks across the three countries and their states. Finally, we investigate suitable time offsets between cases and deaths as a function of the distinct outbreak waves. In all these analyses, we consistently reveal more characteristically distinct behaviour between US and Indian states, while Brazilian states exhibit less structure in their wave behaviour and changing progression between cases and deaths.
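The idea of a time offset between cases and deaths can be sketched as follows. This is one simple way to estimate such an offset, by picking the lag that maximizes the Pearson correlation between the two series; it is not necessarily the procedure used in the paper. The function names and the synthetic data are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def best_offset(cases, deaths, max_lag):
    """Return the lag (in time steps) at which deaths best track earlier cases."""
    scores = {}
    for lag in range(max_lag + 1):
        # Pair cases at time t with deaths at time t + lag.
        x = cases[:len(cases) - lag]
        y = deaths[lag:]
        scores[lag] = pearson(x, y)
    return max(scores, key=scores.get)


# Synthetic single wave: deaths are 2% of cases, delayed by 3 time steps.
cases = [0, 1, 4, 9, 16, 9, 4, 1, 0, 0, 0, 0, 0]
deaths = [0.0, 0.0, 0.0] + [0.02 * c for c in cases[:10]]
print(best_offset(cases, deaths, max_lag=5))
```

Running the estimate separately on each outbreak wave would give an offset per wave, in the spirit of studying offsets as a function of the distinct waves.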

Preprint, 2022, arXiv

In search of peak human athletic potential: A mathematical investigation

This paper applies existing and new approaches to study trends in the performance of elite athletes over time. We study both track and field scores of men and women athletes on a yearly basis from 2001 to 2019, revealing several trends and findings. First, we perform a detailed regression study to reveal the existence of an "Olympic effect", where average performance improves during Olympic years. Next, we study the rate of change in athlete performance and fail to reject the notion that athlete scores are leveling off, at least among the top 100 annual scores. Third, we examine the relationship in performance trends between the men's and women's categories of the same event, revealing striking similarity, together with some anomalous events. Finally, we analyze the geographic composition of the world's top athletes, attempting to understand how the diversity by country and continent varies over time across events. We challenge a widely held conception of athletics, that certain events are more geographically dominated than others. Our methods and findings could be applied more generally to identify evolutionary dynamics in group performance and highlight spatio-temporal trends in group composition.

Preprint, 2022, arXiv

Temporal and spectral governing dynamics of Australian hydrological streamflow time series

We use new and established methodologies in multivariate time series analysis to study the dynamics of 414 Australian hydrological stations' streamflow. First, we analyze our collection of time series in the temporal domain, and compare the similarity in hydrological stations' candidate trajectories. Then, we introduce a Whittle Likelihood-based optimization framework to study the collective similarity in periodic phenomena among our collection of stations. Having identified noteworthy similarity in the temporal and spectral domains, we introduce an algorithmic procedure to estimate a governing hydrological streamflow process across Australia. To determine the stability of such behaviours over time, we then study the evolution of the governing dynamics and underlying time series with time-varying applications of principal components analysis (PCA) and spectral analysis.
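For readers unfamiliar with the Whittle likelihood named in this abstract, here is a minimal sketch of its standard form: the negative Whittle log-likelihood sums log f(ω) + I(ω)/f(ω) over the Fourier frequencies, where I is the periodogram and f is a candidate spectral density. The normalization by 2πn, the function names and the test series are assumptions of this sketch; the paper's optimization framework is not reproduced here.

```python
import cmath
import math


def periodogram(x):
    """Return (frequency, periodogram value) pairs at positive Fourier frequencies.

    Uses I(w) = |sum_t x_t exp(-i w t)|^2 / (2 pi n) on the mean-centered series.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    result = []
    for j in range(1, (n - 1) // 2 + 1):
        omega = 2 * math.pi * j / n
        dft = sum(v * cmath.exp(-1j * omega * t) for t, v in enumerate(centered))
        result.append((omega, abs(dft) ** 2 / (2 * math.pi * n)))
    return result


def whittle_neg_loglik(pgram, spectral_density):
    """Negative Whittle log-likelihood (up to constants) for a candidate density."""
    return sum(math.log(spectral_density(w)) + i_w / spectral_density(w)
               for w, i_w in pgram)


# Hypothetical series with two periodic components.
series = [math.sin(0.7 * t) + 0.3 * math.cos(2.1 * t) for t in range(32)]
pgram = periodogram(series)

# For a flat (white noise) density, the minimizer is the mean periodogram value.
flat_level = sum(i_w for _, i_w in pgram) / len(pgram)
print(whittle_neg_loglik(pgram, lambda w: flat_level))
```

Comparing this objective across candidate densities, or across stations, is one way a spectral fit can be turned into a similarity measure for periodic behaviour.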

Preprint, 2018, arXiv

Bayesian inference in high-dimensional linear models using an empirical correlation-adaptive prior

In the context of a high-dimensional linear regression model, we propose the use of an empirical correlation-adaptive prior that makes use of information in the observed predictor variable matrix to adaptively address high collinearity, determining if parameters associated with correlated predictors should be shrunk together or kept apart. Under suitable conditions, we prove that this empirical Bayes posterior concentrates around the true sparse parameter at the optimal rate asymptotically. A simplified version of a shotgun stochastic search algorithm is employed to implement the variable selection procedure, and we show, via simulation experiments across different settings and a real-data application, the favorable performance of the proposed method compared to existing methods.
