Source author record

Sarah Michalak

Sarah Michalak appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

astro-ph.IM Machine Learning

Catalog footprint

What is connected

2 works

2 topics

4 close collaborators

preprint · 2020 · arXiv

On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks

Mixup (Zhang et al., 2017) is a recently proposed method for training deep neural networks where additional samples are generated during training by convexly combining random pairs of images and their associated labels. While simple to implement, it has been shown to be a surprisingly effective method of data augmentation for image classification: DNNs trained with mixup show noticeable gains in classification performance on a number of image classification benchmarks. In this work, we discuss a hitherto untouched aspect of mixup training -- the calibration and predictive uncertainty of models trained with mixup. We find that DNNs trained with mixup are significantly better calibrated -- i.e., the predicted softmax scores are much better indicators of the actual likelihood of a correct prediction -- than DNNs trained in the regular fashion. We conduct experiments on a number of image classification architectures and datasets -- including large-scale datasets like ImageNet -- and find this to be the case. Additionally, we find that merely mixing features does not result in the same calibration benefit and that the label smoothing in mixup training plays a significant role in improving calibration. Finally, we also observe that mixup-trained DNNs are less prone to over-confident predictions on out-of-distribution and random-noise data. We conclude that the typical overconfidence seen in neural networks, even on in-distribution data, is likely a consequence of training with hard labels, suggesting that mixup be employed for classification tasks where predictive uncertainty is a significant concern.
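The convex combination of image and label pairs described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `alpha` default, and the use of a single mixing coefficient per batch are assumptions. The calibration helper computes expected calibration error, a commonly used calibration metric that the abstract itself does not name.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Mix each sample with a randomly chosen partner from the same batch.

    alpha is the Beta(alpha, alpha) shape parameter (illustrative default).
    One mixing coefficient is drawn per batch, which is an assumption here.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)            # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))          # random partner for every sample
    x_mix = lam * x + (1.0 - lam) * x[perm]
    # Mixing the labels too is what turns hard one-hot targets into soft ones.
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between accuracy and confidence over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap      # weight by fraction of samples in bin
    return ece
```

A perfectly calibrated model would score zero on the helper above, while a model that reports 90% confidence but is right only half the time would score 0.4.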

preprint · 2012 · arXiv

Comparison of RFI Mitigation Strategies for Dispersed Pulse Detection

Impulsive radio-frequency signals from astronomical sources are dispersed by the frequency-dependent index of refraction of the interstellar medium and so appear as chirped signals when they reach Earth. Searches for dispersed impulses have been limited by false detections due to radio frequency interference (RFI) and, in some cases, artifacts of the instrumentation. Many authors have discussed techniques to excise or mitigate RFI in searches for fast transients, but comparisons between different approaches are lacking. This work develops RFI mitigation techniques for use in searches for dispersed pulses, employing data recorded in a "Fly's Eye" mode of the Allen Telescope Array as a test case. We gauge the performance of several RFI mitigation techniques by adding dispersed signals to data containing RFI and comparing false alarm rates at the observed signal-to-noise ratios of the added signals. We find that Huber filtering is most effective at removing broadband interferers, while frequency centering is most effective at removing narrow-frequency interferers. Neither of these methods is effective over a broad range of interferers. A method that combines Huber filtering and adaptive interference cancellation provides the lowest number of false positives over the interferers considered here. The methods developed here have application to other searches for dispersed pulses in incoherent spectra, especially those involving multiple beam systems.
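The chirp produced by dispersion, and the idea of adding a dispersed signal to recorded data, can be sketched as follows. This is an illustration under stated assumptions, not the paper's pipeline: it uses the standard cold-plasma delay relation, a single-sample pulse per channel, and hypothetical function names and parameters. None of the RFI mitigation methods named in the abstract are implemented here.

```python
import numpy as np

# Standard dispersion constant, in ms GHz^2 cm^3 / pc.
K_DM_MS = 4.148808

def dispersion_delay_ms(dm, freq_ghz, ref_freq_ghz):
    """Arrival delay at freq_ghz relative to ref_freq_ghz for dispersion measure dm (pc/cm^3)."""
    return K_DM_MS * dm * (freq_ghz ** -2.0 - ref_freq_ghz ** -2.0)

def inject_dispersed_pulse(spectra, freqs_ghz, dt_ms, dm, t0_ms, amplitude):
    """Return a copy of a (time, channel) array with a one-sample dispersed pulse added.

    t0_ms is the arrival time in the highest-frequency channel. Channels whose
    delayed arrival falls outside the time range are left unchanged.
    """
    out = np.array(spectra, dtype=float, copy=True)
    freqs = np.asarray(freqs_ghz, dtype=float)
    delays = dispersion_delay_ms(dm, freqs, freqs.max())
    samples = np.rint((t0_ms + delays) / dt_ms).astype(int)
    for chan, samp in enumerate(samples):
        if 0 <= samp < out.shape[0]:
            out[samp, chan] += amplitude
    return out
```

Lower-frequency channels receive the pulse later, which is what makes the injected signal appear as a sweep across the band.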