Researcher profile

Joseph McDonald

Joseph McDonald contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
8 topics
4 close collaborators

Published work

5 published items

Preprint, 2022, arXiv

An Evaluation of Low Overhead Time Series Preprocessing Techniques for Downstream Machine Learning

In this paper we address the application of pre-processing techniques to multi-channel time series data with varying lengths, which we refer to as the alignment problem, for downstream machine learning. The misalignment of multi-channel time series data may occur for a variety of reasons, such as missing data, varying sampling rates, or inconsistent collection times. We consider multi-channel time series data collected from the MIT SuperCloud High Performance Computing (HPC) center, where different job start times and varying run times of HPC jobs result in misaligned data. This misalignment makes it challenging to build AI/ML approaches for tasks such as compute workload classification. Building on previous supervised classification work with the MIT SuperCloud Dataset, we address the alignment problem via three broad, low overhead approaches: sampling a fixed subset from a full time series, performing summary statistics on a full time series, and sampling a subset of coefficients from time series mapped to the frequency domain. Our best performing models achieve a classification accuracy greater than 95%, outperforming previous approaches to multi-channel time series classification with the MIT SuperCloud Dataset by 5%. These results indicate our low overhead approaches to solving the alignment problem, in conjunction with standard machine learning techniques, are able to achieve high levels of classification accuracy, and serve as a baseline for future approaches to addressing the alignment problem, such as kernel methods.
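The three broad approaches named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the choice of evenly spaced sampling, the particular summary statistics, and the use of low-frequency DFT magnitudes are all assumptions made for illustration. Each extractor maps a non-empty channel of any length to a fixed-length list, so jobs with different run times yield feature vectors of equal size.

```python
import cmath
import statistics


def fixed_subset(series, n):
    """Approach 1 (assumed variant): n evenly spaced samples from the series."""
    if n == 1:
        return [series[0]]
    step = (len(series) - 1) / (n - 1)
    return [series[round(i * step)] for i in range(n)]


def summary_stats(series):
    """Approach 2 (assumed statistics): min, max, mean, population std. dev."""
    return [
        min(series),
        max(series),
        statistics.fmean(series),
        statistics.pstdev(series),
    ]


def low_freq_magnitudes(series, k):
    """Approach 3 (assumed variant): magnitudes of the first k DFT
    coefficients, normalized by the series length (naive DFT)."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                for t, x in enumerate(series))) / n
        for f in range(k)
    ]


def featurize(channels, extract):
    """Concatenate the per-channel features produced by `extract`."""
    features = []
    for channel in channels:
        features.extend(extract(channel))
    return features


# Hypothetical two-channel jobs with different run lengths (made-up data).
short_job = [[1.0, 2.0, 3.0], [0.0, 0.0, 1.0]]
long_job = [[float(i) for i in range(50)], [1.0] * 50]

for job in (short_job, long_job):
    vector = featurize(job, lambda ch: fixed_subset(ch, 4))
    print(len(vector))
```

Any of the three extractors can be passed to `featurize`; the resulting fixed-length vectors could then be fed to a standard classifier.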

Preprint, 2022, arXiv

Benchmarking Resource Usage for Efficient Distributed Deep Learning

Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. Neural architecture searches, hyperparameter sweeps, and rapid prototyping consume immense resources that can prevent resource-constrained researchers from experimenting with large models and carry considerable environmental impact. As such, it becomes essential to understand how different deep neural networks (DNNs) and training leverage increasing compute and energy resources -- especially specialized computationally-intensive models across different domains and applications. In this paper, we conduct over 3,400 experiments training an array of deep networks representing various domains/tasks -- natural language processing, computer vision, and chemistry -- on up to 424 graphics processing units (GPUs). During training, our experiments systematically vary compute resource characteristics and energy-saving mechanisms such as power utilization and GPU clock rate limits to capture and illustrate the different trade-offs and scaling behaviors each representative model exhibits under various resource and energy-constrained regimes. We fit power law models that describe how training time scales with available compute resources and energy constraints. We anticipate that these findings will help inform and guide high-performance computing providers in optimizing resource utilization, by selectively reducing energy consumption for different deep learning tasks/workflows with minimal impact on training.
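As an illustration of what fitting a power law to scaling data can look like, the sketch below fits y = a * x^b by ordinary least squares in log-log space. The abstract does not say how the authors performed their fits, so the method, the function name `fit_power_law`, and the synthetic numbers are assumptions and are not results from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b).

    Assumes all xs and ys are positive and xs contains at least two
    distinct values.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                      # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log-log space = log(a)
    return a, b


# Synthetic, made-up data: training time generated from 100 * gpus**-0.9.
gpus = [1, 2, 4, 8, 16]
hours = [100.0 * g ** -0.9 for g in gpus]

a, b = fit_power_law(gpus, hours)
print(f"time ~ {a:.2f} * gpus^{b:.2f}")
```

An exponent near -1 would indicate near-linear speedup with added GPUs, while an exponent closer to 0 would indicate diminishing returns.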

Preprint, 2022, arXiv

Choice-free duality for orthocomplemented lattices by means of spectral spaces

The existing topological representation of an orthocomplemented lattice via the clopen orthoregular subsets of a Stone space depends upon Alexander's Subbase Theorem, which asserts that a topological space $X$ is compact if every subbasic open cover of $X$ admits of a finite subcover. This is an easy consequence of the Ultrafilter Theorem - whose proof depends upon Zorn's Lemma, which is well known to be equivalent to the Axiom of Choice. Within this work, we give a choice-free topological representation of orthocomplemented lattices by means of a special subclass of spectral spaces; choice-free in the sense that our representation avoids use of Alexander's Subbase Theorem, along with its associated nonconstructive choice principles. We then introduce a new subclass of spectral spaces which we call \emph{upper Vietoris orthospaces} in order to characterize (up to homeomorphism and isomorphism) the spectral space of proper lattice filters used in our representation. It is then shown how our constructions give rise to a choice-free dual equivalence of categories between the category of orthocomplemented lattices and the dual category of upper Vietoris orthospaces. Our duality combines Bezhanishvili and Holliday's choice-free spectral space approach to Stone duality for Boolean algebras with Goldblatt and Bimbó's choice-dependent orthospace approach to Stone duality for orthocomplemented lattices.

Preprint, 2022, arXiv

The MIT Supercloud Workload Classification Challenge

High-Performance Computing (HPC) centers and cloud providers support an increasingly diverse set of applications on heterogeneous hardware. As Artificial Intelligence (AI) and Machine Learning (ML) workloads have become an increasingly large share of the compute workloads, new approaches to optimized resource usage, allocation, and deployment of new AI frameworks are needed. By identifying compute workloads and their utilization characteristics, HPC systems may be able to better match available resources with the application demand. By leveraging datacenter instrumentation, it may be possible to develop AI-based approaches that can identify workloads and provide feedback to researchers and datacenter operators for improving operational efficiency. To enable this research, we released the MIT Supercloud Dataset, which provides detailed monitoring logs from the MIT Supercloud cluster. This dataset includes CPU and GPU usage by jobs, memory usage, and file system logs. In this paper, we present a workload classification challenge based on this dataset. We introduce a labelled dataset that can be used to develop new approaches to workload classification and present initial results based on existing approaches. The goal of this challenge is to foster algorithmic innovations in the analysis of compute workloads that can achieve higher accuracy than existing methods. Data and code will be made publicly available via the Datacenter Challenge website: https://dcc.mit.edu.

Preprint, 2020, arXiv

A Sampling Theorem for Deconvolution in Two Dimensions

This work studies the problem of estimating a two-dimensional superposition of point sources or spikes from samples of their convolution with a Gaussian kernel. Our results show that minimizing a continuous counterpart of the $\ell_1$ norm exactly recovers the true spikes if they are sufficiently separated, and the samples are sufficiently dense. In addition, we provide numerical evidence that our results extend to non-Gaussian kernels relevant to microscopy and telescopy.
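One way to write down the setting described in this abstract is sketched below. The notation is assumed rather than taken from the paper: $a_j$ and $t_j$ denote spike amplitudes and locations, $s_i$ the sample locations, $\sigma$ the kernel width, and the "continuous counterpart of the $\ell_1$ norm" is read here as the total variation norm of a measure, which is the usual interpretation but is not confirmed by the abstract.

```latex
% Signal: a superposition of point sources in the plane
\mu = \sum_{j} a_j \, \delta_{t_j}, \qquad t_j \in \mathbb{R}^2

% Data: samples of the convolution with a Gaussian kernel
y_i = (K * \mu)(s_i) = \sum_{j} a_j \, K(s_i - t_j), \quad i = 1, \dots, n,
\qquad
K(t) = \exp\!\left( -\frac{\|t\|_2^2}{2\sigma^2} \right)

% Estimator: minimize the continuous analogue of the \ell_1 norm
\min_{\tilde{\mu}} \; \|\tilde{\mu}\|_{\mathrm{TV}}
\quad \text{subject to} \quad
(K * \tilde{\mu})(s_i) = y_i, \quad i = 1, \dots, n
```

Under this reading, the abstract's claim is that the minimizer equals $\mu$ when the $t_j$ are sufficiently separated and the $s_i$ are sufficiently dense.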