
Variational Classification

We present a latent variable model for classification that provides a novel probabilistic interpretation of neural network softmax classifiers. We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders, that generalises the softmax cross-entropy loss. Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency between their anticipated distribution, required for accurate label predictions, and their empirical distribution found in practice. We augment the variational objective to mitigate such inconsistency and induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer. Overall, we provide new theoretical insight into the inner workings of widely-used softmax classifiers. Empirical evaluation on image and text classification datasets demonstrates that our proposed approach, variational classification, maintains classification accuracy while the reshaped latent space improves other desirable properties of a classifier, such as calibration, adversarial robustness, robustness to distribution shift and sample efficiency useful in low data settings.
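The abstract does not spell out which latent distribution a standard softmax layer implicitly assumes, so the following is only an illustrative sketch under our own assumptions. It checks numerically one well-known reading of a linear softmax layer: if the inputs z to the layer are treated as samples of a latent variable with Gaussian class-conditionals p(z|y) = N(z; mu_y, I), then Bayes' rule gives exactly a softmax over linear logits. The identity covariance, the function names and the toy data are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def bayes_log_posterior(z, means, log_prior):
    """log p(y|z) via Bayes' rule, ASSUMING p(z|y) = N(z; mu_y, I).

    The Gaussian normalising constant is the same for every class and
    cancels inside the softmax, so it is omitted.
    """
    sq_dist = ((z[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    return log_softmax(-0.5 * sq_dist + log_prior)


def linear_softmax_log_posterior(z, means, log_prior):
    """The same posterior written as a linear layer followed by softmax.

    Weights are the class means; biases absorb the prior and -0.5*||mu_y||^2.
    The -0.5*||z||^2 term is shared by all classes and drops out.
    """
    weights = means                                  # shape (K, D)
    bias = log_prior - 0.5 * (means ** 2).sum(axis=-1)
    return log_softmax(z @ weights.T + bias)


# Hypothetical toy data: 5 latent points, 4 classes, 3 latent dimensions.
rng = np.random.default_rng(0)
z = rng.normal(size=(5, 3))
means = rng.normal(size=(4, 3))
log_prior = np.log(np.full(4, 0.25))
labels = np.array([0, 1, 2, 3, 0])

log_post_bayes = bayes_log_posterior(z, means, log_prior)
log_post_linear = linear_softmax_log_posterior(z, means, log_prior)

# Softmax cross-entropy is the average negative log-posterior of the true label.
cross_entropy = -log_post_linear[np.arange(len(labels)), labels].mean()
```

Under these assumptions the two functions return the same log-posteriors, so minimising softmax cross-entropy is equivalent to fitting this particular latent variable model. If the empirical distribution of z does not look like the assumed class-conditionals, that is the kind of mismatch the abstract calls a potential inconsistency.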

Authors: Shehzaad Dhuliawala, Mrinmaya Sachan, Carl Allen
Topics: Artificial Intelligence, Machine Learning, Computer Vision


preprint / 2024
