Source author record

Ayaka Sakata

Ayaka Sakata appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: cond-mat.dis-nn, cond-mat.stat-mech, Information Theory, Machine Learning, math.IT, Biological Physics, Computational Complexity, Numerical Analysis

Catalog footprint

What is connected

15 works

8 topics

4 close collaborators


preprint, 2026, arXiv

Stabilizing Private LASSO under Heterogeneous Covariates via Anisotropic Objective Perturbation

We study high-dimensional LASSO under differential privacy via objective perturbation with heterogeneous covariate scales. In practical scenarios, covariates often exhibit diverse scales; however, standard preprocessing is problematic under privacy constraints, as it consumes additional privacy budget. This heterogeneity induces effective anisotropy in the objective perturbation via the inverse Gram matrix of covariates, which can degrade the stability and accuracy of algorithms. To address this, we propose a Gram-based anisotropic objective perturbation, a "pre-distortion" strategy that counteracts the distortion from the covariate structure to restore isotropy in the estimation process. Using an Approximate Message Passing (AMP) framework and state evolution analysis, we demonstrate that our proposed perturbation significantly stabilizes convergence and improves both statistical efficiency and privacy performance compared to standard uniform noise injection. Our results provide theoretical insights into designing stable and efficient private estimators without relying on data-dependent preprocessing.

preprint, 2020, arXiv

Active pooling design in group testing based on Bayesian posterior prediction

In identifying infected patients in a population, group testing is an effective method to reduce the number of tests and correct the test errors. In the group testing procedure, tests are performed on pools of specimens collected from patients, where the number of pools is lower than that of patients. The performance of group testing heavily depends on the design of the pools and on the algorithms used to infer the infected patients from the test outcomes. In this paper, an adaptive method for designing pools based on the predictive distribution is proposed in the framework of Bayesian inference. The proposed method, executed using the belief propagation algorithm, identifies the infected patients more accurately than group testing performed on random pools determined in advance.

preprint, 2020, arXiv

Bayesian inference of infected patients in group testing with prevalence estimation

Group testing is a method of identifying infected patients by performing tests on a pool of specimens collected from patients. For the case in which the test returns a false result with finite probability, we propose Bayesian inference and a corresponding belief propagation (BP) algorithm to identify the infected patients from the results of tests performed on the pool. We show that the true-positive rate is improved by taking into account the credible interval of a point estimate of each patient. Further, the prevalence and the error probability in the test are estimated by combining an expectation-maximization method with the BP algorithm. As another approach, we introduce a hierarchical Bayes model to identify the infected patients and estimate the prevalence. By comparing these methods, we formulate a guide for practical usage.
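As a rough illustration of the inference problem described in this abstract, the sketch below computes the exact posterior probability that each patient is infected by enumerating every infection state of a very small population. This is not the authors' belief propagation algorithm; brute-force enumeration is swapped in because it is easy to verify, and it only scales to a handful of patients. The function name `posterior_marginals`, the parameters `p_tp` (probability that a truly positive pool tests positive) and `p_fp` (probability that a truly negative pool tests positive), and the example pools are all assumptions made for illustration.

```python
from itertools import product


def posterior_marginals(pools, outcomes, n_patients, prevalence, p_tp, p_fp):
    """Exact P(patient i infected | outcomes) by enumerating all 2^n states.

    Hypothetical helper for illustration only; feasible for small n_patients.
    """
    weights = [0.0] * n_patients
    total = 0.0
    for state in product((0, 1), repeat=n_patients):
        # Prior: each patient is infected independently with probability `prevalence`.
        w = 1.0
        for x in state:
            w *= prevalence if x else 1.0 - prevalence
        # Likelihood: a pool is truly positive if any member is infected;
        # the observed outcome may be flipped by test errors.
        for pool, y in zip(pools, outcomes):
            truly_positive = any(state[i] for i in pool)
            p_pos = p_tp if truly_positive else p_fp
            w *= p_pos if y else 1.0 - p_pos
        total += w
        for i, x in enumerate(state):
            if x:
                weights[i] += w
    return [w / total for w in weights]


# Four patients, three overlapping pools; the first two pools test positive.
pools = [(0, 1), (1, 2), (2, 3)]
outcomes = [1, 1, 0]
marginals = posterior_marginals(pools, outcomes, 4, prevalence=0.1, p_tp=0.95, p_fp=0.05)
```

In this toy setting patient 1 belongs to both positive pools and to no negative pool, so that patient receives the largest posterior probability. Belief propagation, as used in the paper, approximates the same marginals without the exponential enumeration.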

preprint, 2020, arXiv

Dimensional reduction in evolving spin-glass model: correlation of phenotypic responses to environmental and mutational changes

The evolution of high-dimensional phenotypes is investigated using a statistical physics model consisting of interacting spins, in which phenotypes, genotypes, and environments are represented by spin configurations, interaction matrices, and external fields, respectively. We found that phenotypic changes upon diverse environmental changes and genetic variations are highly correlated across all spins, consistent with recent experimental observations of biological systems. The dimensional reduction in phenotypic changes is shown to be a result of the evolution of robustness to thermal noise, achieved in the replica symmetric phase.

preprint, 2019, arXiv

Cross validation in sparse linear regression with piecewise continuous nonconvex penalties and its acceleration

We investigate the signal reconstruction performance of sparse linear regression in the presence of noise when piecewise continuous nonconvex penalties are used. Among such penalties, we focus on the SCAD penalty. The contributions of this study are three-fold. First, we present a theoretical analysis of the typical reconstruction performance, using the replica method, under the assumption that each component of the design matrix is given as an independent and identically distributed (i.i.d.) Gaussian variable. This clarifies the superiority of the SCAD estimator compared with $\ell_1$ in a wide parameter range, although the nonconvex nature of the penalty tends to lead to solution multiplicity in certain regions. This multiplicity is shown to be connected to replica symmetry breaking in spin-glass theory. We also show that the global minimum of the mean square error between the estimator and the true signal is located in the replica symmetric phase. Second, we develop an approximate formula that efficiently computes the cross-validation error without actually conducting the cross-validation, and which is also applicable to non-i.i.d. design matrices. It is shown that this formula is only applicable to the unique solution region and tends to be unstable in the multiple solution region. We implement instability detection procedures, which allow the approximate formula to stand alone and consequently enable us to draw phase diagrams for any specific dataset. Third, we propose an annealing procedure, called nonconvexity annealing, to obtain the solution path efficiently. Numerical simulations on simulated datasets verify the consistency of the theoretical results and the efficiency of the approximate formula. Another numerical experiment on a real-world dataset is conducted; its results are consistent with those of earlier studies using the $\ell_0$ formulation.
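For readers unfamiliar with SCAD, the following minimal sketch writes out the penalty and its scalar thresholding rule in the commonly used parametrization of Fan and Li, with threshold `lam` and shape parameter `a > 2`. The paper may use a different parametrization, so the function names `scad_penalty` and `scad_threshold` and the default `a=3.7` should be read as assumptions rather than as the authors' definitions. The thresholding rule assumes a unit-norm column, as in a coordinate-wise update.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty value for a scalar coefficient (Fan and Li form, a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Behaves like the l1 penalty near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic interpolation between the l1 region and the flat region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def scad_threshold(z, lam, a=3.7):
    """Minimizer of 0.5 * (z - theta)**2 + scad_penalty(theta, lam, a)."""
    sign = 1.0 if z >= 0 else -1.0
    t = abs(z)
    if t <= 2 * lam:
        # Soft-thresholding, identical to the l1 case.
        return sign * max(t - lam, 0.0)
    if t <= a * lam:
        # Shrinkage is gradually reduced in the intermediate region.
        return ((a - 1) * z - sign * a * lam) / (a - 2)
    # Large inputs pass through unchanged.
    return z
```

Unlike $\ell_1$ soft-thresholding, which shrinks every surviving coefficient by `lam`, this rule leaves large coefficients untouched, which is the usual intuition for why SCAD can reduce estimation bias at the cost of nonconvexity.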

preprint, 2016, arXiv

Evaluation of Generalized Degrees of Freedom for Sparse Estimation by Replica Method

We develop a method to evaluate the generalized degrees of freedom (GDF), which is a key quantity of a model selection criterion, for linear regression with sparse regularization. Using the replica method, GDF is expressed by the variables that characterize the saddle point of the free energy without depending on the form of the regularization. Within the framework of replica symmetric (RS) analysis, GDF is provided with a physical meaning as the effective density of non-zero components. The validity of our method in the RS phase is supported by the consistency of our results with previous mathematical results. The analytical results in the RS phase are numerically achieved by the belief propagation algorithm.

preprint, 2016, arXiv

Phase transitions and sample complexity in Bayes-optimal matrix factorization

We analyse the matrix factorization problem. Given a noisy measurement of a product of two matrices, the problem is to estimate back the original matrices. It arises in many applications such as dictionary learning, blind matrix calibration, sparse principal component analysis, blind source separation, low rank matrix completion, robust principal component analysis or factor analysis. It is also important in machine learning: unsupervised representation learning can often be studied through matrix factorization. We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and computational tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm. In this setting, we compute the minimal mean-squared-error achievable in principle in any computational time, and the error that can be achieved by an efficient approximate message passing algorithm. The computation is based on the asymptotic state-evolution analysis of the algorithm. The performance that our analysis predicts, both in terms of the achieved mean-squared-error, and in terms of sample complexity, is extremely promising and motivating for a further development of the algorithm.

preprint, 2015, arXiv

Replica Symmetric Bound for Restricted Isometry Constant

We develop a method for evaluating restricted isometry constants (RICs). This evaluation is reduced to the identification of the zero-points of entropy, which is defined for submatrices that are composed of columns selected from a given measurement matrix. Using the replica method developed in statistical mechanics, we assess RICs for Gaussian random matrices under the replica symmetric (RS) assumption. In order to numerically validate the adequacy of our analysis, we employ the exchange Monte Carlo (EMC) method, which has been empirically demonstrated to achieve much higher numerical accuracy than naive Monte Carlo methods. The EMC method suggests that our theoretical estimation of an RIC corresponds to an upper bound that is tighter than in preceding studies. Physical consideration indicates that our assessment of the RIC could be improved by taking into account the replica symmetry breaking.

preprint, 2014, arXiv

Error correcting codes and spatial coupling

These are notes from the lecture of Rüdiger Urbanke given at the autumn school "Statistical Physics, Optimization, Inference, and Message-Passing Algorithms", which took place in Les Houches, France from Monday, September 30th, 2013, to Friday, October 11th, 2013. The school was organized by Florent Krzakala from UPMC and ENS Paris, Federico Ricci-Tersenghi from La Sapienza Roma, Lenka Zdeborová from CEA Saclay and CNRS, and Riccardo Zecchina from Politecnico Torino. The first three sections cover the basics of polar codes and low density parity check codes. In the last three sections, we see how spatial coupling helps belief propagation decoding.

preprint, 2014, arXiv

Sample Complexity of Bayesian Optimal Dictionary Learning

We consider a learning problem of identifying a dictionary matrix $D$ (of dimension $M \times N$) from a sample set of $M$-dimensional vectors $Y = N^{-1/2} DX$, where $X$ is a sparse matrix (of dimension $N \times P$) in which the density of non-zero entries is $0 < \rho < 1$. In particular, we focus on the minimum sample size $P_c$ (sample complexity) necessary for the optimal learning scheme to perfectly identify $D$ when $D$ and $X$ are independently generated from certain distributions. By using the replica method of statistical mechanics, we show that $P_c = O(N)$ holds as long as $\alpha = M/N > \rho$ is satisfied in the limit $N \to \infty$. Our analysis also implies that the posterior distribution given $Y$ is condensed only at the correct dictionary $D$ when the compression rate $\alpha$ is greater than a certain critical value $\alpha_M(\rho)$. This suggests that belief propagation may allow us to learn $D$ with a low computational complexity using $O(N)$ samples.

preprint, 2013, arXiv

Statistical Mechanics of Dictionary Learning

Finding a basis matrix (dictionary) by which objective signals are represented sparsely is of major relevance in various scientific and technological fields. We consider a problem to learn a dictionary from a set of training signals. We employ techniques of statistical mechanics of disordered systems to evaluate the size of the training set necessary to typically succeed in the dictionary learning. The results indicate that the necessary size is much smaller than previously estimated, which theoretically supports and/or encourages the use of dictionary learning in practical situations.

preprint, 2012, arXiv

A Mean-field Approach for an Intercarrier Interference Canceller for OFDM

The similarity between the mathematical description of random-field spin systems and that of the orthogonal frequency-division multiplexing (OFDM) scheme for wireless communication is exploited in an intercarrier-interference (ICI) canceller used in the demodulation of OFDM. The translational symmetry in the Fourier domain generically concentrates the major contribution of ICI from each subcarrier in the subcarrier's neighborhood. This observation, in conjunction with a mean-field approach, leads to the development of an ICI canceller whose computational cost scales linearly with the number of subcarriers. It is also shown that the dynamics of the mean-field canceller are well captured by a discrete map of a single macroscopic variable, without taking the spatial and time correlations of estimated variables into account.

preprint, 2011, arXiv

Replica symmetry breaking in an adiabatic spin-glass model of adaptive evolution

We study evolutionary canalization using a spin-glass model with replica theory, where spins and their interactions are dynamic variables whose configurations correspond to phenotypes and genotypes, respectively. The spins are updated under temperature T_S, and the genotypes evolve under temperature T_J, according to the evolutionary fitness. It is found that adaptation occurs at T_S < T_S^{RS}, and a replica symmetric phase emerges at T_S^{RSB} < T_S < T_S^{RS}. The replica symmetric phase implies canalization, and replica symmetry breaking at lower temperatures indicates loss of robustness.

preprint, 2010, arXiv

Partial annealing of a coupled mean-field spin-glass model with an embedded pattern

A partially annealed mean-field spin-glass model with a locally embedded pattern is studied. The model consists of two dynamical variables, spins and interactions, that are in contact with thermal baths at temperatures T_S and T_J, respectively. Unlike the quenched system, characteristic correlations among the interactions are induced by the partial annealing. The model exhibits three phases: paramagnetic, ferromagnetic and spin-glass. In the ferromagnetic phase, the embedded pattern is stably realized. The phase diagram depends significantly on the ratio of the two temperatures, n=T_J/T_S. In particular, a reentrant transition from the embedded ferromagnetic phase to the spin-glass phase with decreasing T_S is found only below a certain value of n. This indicates that above the critical value n_c the embedded pattern is supported by the local field from a non-embedded region. Some equilibrium properties of the interactions in the partial annealing are also discussed in terms of frustration.

preprint, 2009, arXiv

A statistical-mechanical study of evolution of robustness in noisy environment

In biological systems, expression dynamics that can provide fitted phenotype patterns with respect to a specific function have evolved through mutations. This has been observed in the evolution of proteins for realizing folding dynamics through which a target structure is shaped. We study this evolutionary process by introducing a statistical-mechanical model of interacting spins, where a configuration of spins and their interactions $\bm{J}$ represent a phenotype and genotype, respectively. The phenotype dynamics are given by a stochastic process with temperature $T_{S}$ under a Hamiltonian with $\bm{J}$. The evolution of $\bm{J}$ is also stochastic with temperature $T_{J}$ and follows mutations introduced into $\bm{J}$ and selection based on a fitness defined for a configuration of a given set of target spins. Below a certain temperature $T_{S}^{c2}$, the interactions $\bm{J}$ that achieve the target pattern evolve, whereas another phase transition is observed at $T_{S}^{c1}<T_{S}^{c2}$. At low temperatures $T_{S}<T_{S}^{c1}$, the Hamiltonian exhibits a spin-glass-like phase, where the dynamics toward the target pattern require long time steps, and the fitness often decreases drastically as a result of a single mutation to $\bm{J}$. In the intermediate-temperature region, the dynamics to shape the target pattern proceed rapidly and are robust to mutations of $\bm{J}$. The interactions in this region have no frustration around the target pattern and result in funnel-type dynamics. We propose that the ubiquity of funnel-type dynamics, as observed in protein folding, is a consequence of evolution subjected to thermal noise beyond a certain level; this also leads to mutational robustness of the fitness.
