Submodular Benchmark Selection

Evaluating large language models across many benchmarks is expensive, yet many benchmarks are highly correlated. We formalize the selection of a small, informative subset as submodular maximization under a multivariate Gaussian model. Entropy (the log-determinant of the covariance) and mutual information between selected and remaining benchmarks arise as natural objectives. Both are submodular; entropy selection coincides with pivoted Cholesky and has spectral residual bounds, while mutual information is non-monotone in general but empirically monotone for small subsets, so we optimize it greedily. Experiments on three matrices from ten public leaderboards show that mutual information selection outperforms entropy selection for imputation at small subset sizes.
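
The two greedy selectors described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes a positive-definite covariance matrix `cov` over benchmarks and a subset size `k` smaller than the number of benchmarks. The function names (`greedy_entropy`, `gaussian_mi`, `greedy_mi`) are hypothetical. Under a Gaussian model, adding a benchmark raises the log-determinant by the log of its conditional variance given the current selection, so the entropy greedy step is the pivot rule of pivoted Cholesky.

```python
import numpy as np

def greedy_entropy(cov, k):
    """Greedy log-det (entropy) selection, written as pivoted Cholesky.

    Assumes cov is symmetric positive definite and k <= cov.shape[0].
    """
    n = cov.shape[0]
    resid = np.diag(cov).astype(float).copy()  # conditional variances given the selection
    L = np.zeros((n, k))                       # partial Cholesky factor
    selected = []
    for j in range(k):
        i = int(np.argmax(resid))              # pivot: largest residual variance
        selected.append(i)
        L[:, j] = (cov[:, i] - L[:, :j] @ L[i, :j]) / np.sqrt(resid[i])
        resid = resid - L[:, j] ** 2
        resid[selected] = -np.inf              # never pick the same benchmark twice
    return selected

def gaussian_mi(cov, S):
    """Mutual information between selected set S and the remaining variables."""
    n = cov.shape[0]
    R = [i for i in range(n) if i not in S]
    if not S or not R:
        return 0.0
    logdet = lambda idx: np.linalg.slogdet(cov[np.ix_(idx, idx)])[1]
    return 0.5 * (logdet(list(S)) + logdet(R) - logdet(list(range(n))))

def greedy_mi(cov, k):
    """Greedy mutual information selection (naive: recomputes log-dets each step)."""
    n = cov.shape[0]
    selected = []
    for _ in range(k):
        candidates = [i for i in range(n) if i not in selected]
        best = max(candidates, key=lambda i: gaussian_mi(cov, selected + [i]))
        selected.append(best)
    return selected
```

The mutual information selector above recomputes three log-determinants per candidate, which is adequate for a few dozen benchmarks but not tuned for speed. Because mutual information is not monotone in general, the greedy gain can turn negative for larger `k`; a caller may want to stop once it does.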

Authorship: Alexander Smola (Researcher)
Topics: Artificial Intelligence, Machine Learning

preprint / 2026
