Researcher profile

Antti Koskela

Antti Koskela contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2026arXiv

Membership Inference Attacks for Retrieval Based In-Context Learning for Document Question Answering

We show that remotely hosted applications employing in-context learning when augmented with a retrieval function to select in-context examples can be vulnerable to membership-inference attacks even when the service provider and users are separate parties. We propose two black-box membership inference attacks that exploit query text prefixes to distinguish member from non-member inputs. The first attack uses a reference model to estimate an otherwise unavailable loss metric. The second attack improves upon it by eliminating the reference model and instead computing a membership statistic through a simple but novel weighted-averaging scheme. Our comprehensive empirical evaluations consider a stricter case in which the adversary has a paraphrased version of the text in the queries and show that our attacks can exhibit stronger resilience to paraphrasing and outperform three prior attacks in many cases with small number of prefixes. We also adapt an existing ensemble prompting defense to our setting, demonstrating that it substantially mitigates the privacy leakage caused by our second attack.

preprint2024arXiv

Auditing Differential Privacy Guarantees Using Density Estimation

We present a novel method for accurately auditing the differential privacy (DP) guarantees of DP mechanisms. In particular, our solution is applicable to auditing DP guarantees of machine learning (ML) models. Previous auditing methods tightly capture the privacy guarantees of DP-SGD trained models in the white-box setting where the auditor has access to all intermediate models; however, the success of these methods depends on a priori information about the parametric form of the noise and the subsampling ratio used for sampling the gradients. We present a method that does not require such information and is agnostic to the randomization used for the underlying mechanism. Similarly to several previous DP auditing methods, we assume that the auditor has access to a set of independent observations from two one-dimensional distributions corresponding to outputs from two neighbouring datasets. Furthermore, our solution is based on a simple histogram-based density estimation technique to find lower bounds for the statistical distance between these distributions when measured using the hockey-stick divergence. We show that our approach also naturally generalizes the previously considered class of threshold membership inference auditing methods. We improve upon accurate auditing methods such as the $f$-DP auditing. Moreover, we address an open problem on how to accurately audit the subsampled Gaussian mechanism without any knowledge of the parameters of the underlying mechanism.

preprint2023arXiv

Improving the Privacy and Practicality of Objective Perturbation for Differentially Private Linear Learners

In the arena of privacy-preserving machine learning, differentially private stochastic gradient descent (DP-SGD) has outstripped the objective perturbation mechanism in popularity and interest. Though unrivaled in versatility, DP-SGD requires a non-trivial privacy overhead (for privately tuning the model's hyperparameters) and a computational complexity which might be extravagant for simple models such as linear and logistic regression. This paper revamps the objective perturbation mechanism with tighter privacy analyses and new computational tools that boost it to perform competitively with DP-SGD on unconstrained convex generalized linear problems.

preprint2022arXiv

Tight Accounting in the Shuffle Model of Differential Privacy

Shuffle model of differential privacy is a novel distributed privacy model based on a combination of local privacy mechanisms and a secure shuffler. It has been shown that the additional randomisation provided by the shuffler improves privacy bounds compared to the purely local mechanisms. Accounting tight bounds, however, is complicated by the complexity brought by the shuffler. The recently proposed numerical techniques for evaluating $(\varepsilon,δ)$-differential privacy guarantees have been shown to give tighter bounds than commonly used methods for compositions of various complex mechanisms. In this paper, we show how to obtain accurate bounds for adaptive compositions of general $\varepsilon$-LDP shufflers using the analysis by Feldman et al. (2021) and tight bounds for adaptive compositions of shufflers of $k$-randomised response mechanisms, using the analysis by Balle et al. (2019). We show how to speed up the evaluation of the resulting privacy loss distribution from $\mathcal{O}(n^2)$ to $\mathcal{O}(n)$, where $n$ is the number of users, without noticeable change in the resulting $δ(\varepsilon)$-upper bounds. We also demonstrate looseness of the existing bounds and methods found in the literature, improving previous composition results significantly.

preprint2020arXiv

Computing low-rank approximations of the Fréchet derivative of a matrix function using Krylov subspace methods

The Fréchet derivative $L_f(A,E)$ of the matrix function $f(A)$ plays an important role in many different applications, including condition number estimation and network analysis. We present several different Krylov subspace methods for computing low-rank approximations of $L_f(A,E)$ when the direction term $E$ is of rank one (which can easily be extended to general low-rank). We analyze the convergence of the resulting method for the important special case that $A$ is Hermitian and $f$ is either the exponential, the logarithm or a Stieltjes function. In a number of numerical tests, both including matrices from benchmark collections and from real-world applications, we demonstrate and compare the accuracy and efficiency of the proposed methods.

preprint2020arXiv

Differentially private cross-silo federated learning

Strict privacy is of paramount importance in distributed machine learning. Federated learning, with the main idea of communicating only what is needed for learning, has been recently introduced as a general approach for distributed learning to enhance learning and improve security. However, federated learning by itself does not guarantee any privacy for data subjects. To quantify and control how much privacy is compromised in the worst-case, we can use differential privacy. In this paper we combine additively homomorphic secure summation protocols with differential privacy in the so-called cross-silo federated learning setting. The goal is to learn complex models like neural networks while guaranteeing strict privacy for the individual data subjects. We demonstrate that our proposed solutions give prediction accuracy that is comparable to the non-distributed setting, and are fast enough to enable learning models with millions of parameters in a reasonable time. To enable learning under strict privacy guarantees that need privacy amplification by subsampling, we present a general algorithm for oblivious distributed subsampling. However, we also argue that when malicious parties are present, a simple approach using distributed Poisson subsampling gives better privacy. Finally, we show that by leveraging random projections we can further scale-up our approach to larger models while suffering only a modest performance loss.

preprint2020arXiv

Sampling of Stochastic Differential Equations using the Karhunen-Loève Expansion and Matrix Functions

We consider linearizations of stochastic differential equations with additive noise using the Karhunen-Loève expansion. We obtain our linearizations by truncating the expansion and writing the solution as a series of matrix-vector products using the theory of matrix functions. Moreover, we restate the solution as the solution of a system of linear differential equations. We obtain strong and weak error bounds for the truncation procedure and show that, under suitable conditions, the mean square error has order of convergence $\mathcal{O}(\frac{1}{m})$ and the second moment has a weak order of convergence $\mathcal{O}(\frac{1}{m})$, where $m$ denotes the size of the expansion. We also discuss efficient numerical linear algebraic techniques to approximate the series of matrix functions and the linearized system of differential equations. These theoretical results are supported by experiments showing the effectiveness of our algorithms when compared to standard methods such as the Euler-Maruyama scheme.

preprint2019arXiv

Computing Tight Differential Privacy Guarantees Using FFT

Differentially private (DP) machine learning has recently become popular. The privacy loss of DP algorithms is commonly reported using $(\varepsilon,δ)$-DP. In this paper, we propose a numerical accountant for evaluating the privacy loss for algorithms with continuous one dimensional output. This accountant can be applied to the subsampled multidimensional Gaussian mechanism which underlies the popular DP stochastic gradient descent. The proposed method is based on a numerical approximation of an integral formula which gives the exact $(\varepsilon,δ)$-values. The approximation is carried out by discretising the integral and by evaluating discrete convolutions using the fast Fourier transform algorithm. We give both theoretical error bounds and numerical error estimates for the approximation. Experimental comparisons with state-of-the-art techniques demonstrate significant improvements in bound tightness and/or computation time. Python code for the method can be found in Github (https://github.com/DPBayes/PLD-Accountant/).

preprint2019arXiv

Learning Rate Adaptation for Federated and Differentially Private Learning

We propose an algorithm for the adaptation of the learning rate for stochastic gradient descent (SGD) that avoids the need for validation set use. The idea for the adaptiveness comes from the technique of extrapolation: to get an estimate for the error against the gradient flow which underlies SGD, we compare the result obtained by one full step and two half-steps. The algorithm is applied in two separate frameworks: federated and differentially private learning. Using examples of deep neural networks we empirically show that the adaptive algorithm is competitive with manually tuned commonly used optimisation methods for differentially privately training. We also show that it works robustly in the case of federated learning unlike commonly used optimisation methods.

preprint2015arXiv

Krylov approximation of ODEs with polynomial parameterization

We propose a new numerical method to solve linear ordinary differential equations of the type $\frac{\partial u}{\partial t}(t,\varepsilon) = A(\varepsilon) \, u(t,\varepsilon)$, where $A:\mathbb{C}\rightarrow\mathbb{C}^{n\times n}$ is a matrix polynomial with large and sparse matrix coefficients. The algorithm computes an explicit parameterization of approximations of $u(t,\varepsilon)$ such that approximations for many different values of $\varepsilon$ and $t$ can be obtained with a very small additional computational effort. The derivation of the algorithm is based on a reformulation of the parameterization as a linear parameter-free ordinary differential equation and on approximating the product of the matrix exponential and a vector with a Krylov method. The Krylov approximation is generated with Arnoldi's method and the structure of the coefficient matrix turns out to have an independence on the truncation parameter so that it can also be interpreted as Arnoldi's method applied to an infinite dimensional matrix. We prove the superlinear convergence of the algorithm and provide a posteriori error estimates to be used as termination criteria. The behavior of the algorithm is illustrated with examples stemming from spatial discretizations of partial differential equations.