Researcher profile

Tomoyuki Obuchi

Tomoyuki Obuchi contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
8 topics
4 close collaborators

Published work

8 published items

preprint, 2022, arXiv

Assessing transfer entropy from biochemical data

We address the problem of evaluating the transfer entropy (TE) produced by biochemical reactions from experimentally measured data. Although these reactions are generally non-linear and non-stationary processes, which makes accurate modeling challenging, a Gaussian approximation makes the TE assessment possible merely by estimating covariance matrices from multiple simultaneously measured time series representing the activation levels of biomolecules such as proteins. Nevertheless, the non-stationary nature of biochemical signals makes it difficult to theoretically assess the sampling distributions of TE, which are necessary for evaluating the statistical confidence and significance of the data-driven estimates. We resolve this difficulty by computationally assessing the sampling distributions using techniques from computational statistics. The computational methods are tested by applying them to data generated from a theoretically tractable time-varying signal model, which leads to the development of a method to screen only statistically significant estimates. The usefulness of the developed method is examined by applying it to real biological data experimentally measured from the ERBB-RAS-MAPK system that superintends diverse cell fate decisions. A comparison between cells containing wild-type and mutant proteins exhibits a distinct difference in the time evolution of TE, while hardly any apparent difference is found in the average profiles of the raw signals. Such a comparison may help in unveiling important pathways of biochemical reactions.
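
As a rough illustration of the Gaussian approximation described in this abstract, the sketch below estimates TE at a single time point from covariance matrices computed across repeated measurements. It is a minimal sketch and not the authors' implementation: the function name `gaussian_transfer_entropy`, the one-step history length, and the layout of the inputs (one value per trial) are all assumptions made for illustration.

```python
import numpy as np

def gaussian_transfer_entropy(x_t, y_t, y_next):
    """TE from X to Y at one time point under a Gaussian approximation.

    Each argument is a 1-D array over trials, i.e. repeated measurements
    taken at the same time point (hypothetical data layout).
    """
    def logdet_cov(*cols):
        # Sample covariance across trials of the stacked variables.
        data = np.column_stack(cols)
        cov = np.atleast_2d(np.cov(data, rowvar=False))
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # TE = H(y_next | y_t) - H(y_next | y_t, x_t); for Gaussians each
    # entropy is half a log-determinant plus constants that cancel.
    return 0.5 * (
        logdet_cov(y_next, y_t) - logdet_cov(y_t)
        - logdet_cov(y_next, y_t, x_t) + logdet_cov(y_t, x_t)
    )
```

Because the covariances are taken across trials rather than along time, the same function can be called separately at every time point of a non-stationary signal.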

preprint, 2021, arXiv

Reconstructing Sparse Signals via Greedy Monte-Carlo Search

We propose a Monte-Carlo-based method for reconstructing sparse signals in the formulation of sparse linear regression in a high-dimensional setting. The basic idea of this algorithm is to explicitly select the variables or covariates used to represent a given data vector or responses, and to accept randomly generated updates of that selection if and only if the energy or cost function decreases. This algorithm is called the greedy Monte-Carlo (GMC) search algorithm. Its performance is examined via numerical experiments, which suggest that in the noiseless case, GMC can achieve perfect reconstruction in undersampling situations of a reasonable level: it can outperform the $\ell_1$ relaxation but does not reach the algorithmic limit of MC-based methods theoretically clarified by an earlier analysis. The necessary computational time is also examined and compared with that of an algorithm using simulated annealing. Additionally, experiments on the noisy case are conducted on synthetic datasets and on a real-world dataset, supporting the practicality of GMC.
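
The acceptance rule in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's algorithm as published: the energy is taken to be the residual sum of squares of a least-squares fit on the selected columns, and the random update is assumed to swap one selected variable for one unselected variable. The names `support_energy` and `greedy_mc_search` are hypothetical, and the sketch requires `k` to be smaller than the number of columns.

```python
import numpy as np

def support_energy(A, y, support):
    """Residual sum of squares of the least-squares fit on `support`."""
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    residual = y - A[:, support] @ coef
    return float(residual @ residual), coef

def greedy_mc_search(A, y, k, n_steps=5000, seed=0):
    """Search over size-k supports, accepting a move only if energy drops."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    support = rng.choice(n, size=k, replace=False)
    energy, coef = support_energy(A, y, support)
    for _ in range(n_steps):
        # Assumed move: swap one selected index for one unselected index.
        outside = np.setdiff1d(np.arange(n), support)
        candidate = support.copy()
        candidate[rng.integers(k)] = rng.choice(outside)
        new_energy, new_coef = support_energy(A, y, candidate)
        if new_energy < energy:  # greedy rule: accept only a decrease
            support, energy, coef = candidate, new_energy, new_coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat, energy
```

Since moves that raise the energy are never accepted, the search can stall in a local minimum, which is consistent with the abstract's remark that GMC does not reach the limit of MC-based methods.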

preprint, 2020, arXiv

Inferring neuronal couplings from spiking data using a systematic procedure with a statistical criterion

Recent remarkable advances in experimental techniques have provided a background for inferring neuronal couplings from point process data that include a great number of neurons. Here, we propose a systematic procedure for pre- and post-processing generic point process data in an objective manner, to handle data in the framework of a simple binary statistical model, the Ising or generalized McCulloch–Pitts model. The procedure involves two steps: (1) determining the time-bin size for transforming the point-process data into discrete-time binary data and (2) screening relevant couplings from the estimated couplings. For the first step, we decide the optimal time-bin size by introducing the null hypothesis that all neurons fire independently, then choosing the time-bin size at which the null hypothesis is rejected under the strictest criterion. The likelihood associated with the null hypothesis is analytically evaluated and used for the rejection process. For the second, post-processing step, after a certain estimator of the couplings is obtained based on the pre-processed dataset, the estimate is compared with many other estimates derived from datasets obtained by randomizing the original dataset in the time direction. We accept the original estimate as relevant only if its absolute value is sufficiently larger than those from the randomized datasets. These manipulations suppress false positive couplings induced by statistical noise. We apply this inference procedure to spiking data from synthetic and in vitro neuronal networks. The results show that the proposed procedure identifies the presence or absence of synaptic couplings, including their signs, fairly well for both the synthetic and the experimental data. In particular, the results indicate that the physical connections of the underlying systems can be inferred in favorable situations, even when using the simple statistical model.
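
The screening step (2) can be illustrated with a small randomization test. The sketch below is an assumption-laden stand-in rather than the authors' procedure: a one-step lagged covariance replaces the Ising-model coupling estimator, each neuron's binned series is permuted independently in time, and "sufficiently larger" is taken to mean larger than the maximum over all shuffles. The names `lagged_coupling` and `screen_couplings` are hypothetical.

```python
import numpy as np

def lagged_coupling(spikes):
    """Stand-in estimator: entry [i, j] is cov(s_i(t+1), s_j(t)).

    `spikes` is a (T, N) array of binned binary activity.
    """
    centered = spikes - spikes.mean(axis=0)
    return centered[1:].T @ centered[:-1] / (len(spikes) - 1)

def screen_couplings(spikes, estimator=lagged_coupling, n_shuffles=200, seed=0):
    """Keep an estimate only if it beats every time-shuffled surrogate."""
    rng = np.random.default_rng(seed)
    estimate = estimator(spikes)
    null_max = np.zeros_like(estimate)
    for _ in range(n_shuffles):
        # Permute each neuron's series independently along time.
        shuffled = np.column_stack([rng.permutation(col) for col in spikes.T])
        null_max = np.maximum(null_max, np.abs(estimator(shuffled)))
    relevant = np.abs(estimate) > null_max
    return np.where(relevant, estimate, 0.0), relevant
```

Any other coupling estimator with the same input and output shapes can be passed through the `estimator` argument.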

preprint, 2020, arXiv

Learning performance in inverse Ising problems with sparse teacher couplings

We investigate the learning performance of the pseudolikelihood maximization method for inverse Ising problems. In the teacher-student scenario, under the assumption that the teacher's couplings are sparse and the student does not know the graphical structure, the learning curve and order parameters are assessed in the typical case using the replica and cavity methods from statistical mechanics. Our formulation is also applicable to a certain class of cost functions having locality; the standard likelihood does not belong to that class. The derived analytical formulas indicate that perfect inference of the presence or absence of the teacher's couplings is possible in the thermodynamic limit, in which the number of spins $N$ goes to infinity while the dataset size $M$ is kept proportional to $N$, as long as $\alpha = M/N > 2$. Meanwhile, the formulas also show that the estimated coupling values corresponding to the truly existing ones in the teacher tend to be overestimated in absolute value, manifesting the presence of estimation bias. These results are considered to be exact in the thermodynamic limit on locally tree-like networks, such as regular random or Erdős–Rényi graphs. Numerical simulation results fully support the theoretical predictions. Additional biases in the estimators on loopy graphs are also discussed.
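
For readers unfamiliar with pseudolikelihood maximization, the sketch below fits Ising couplings by plain gradient ascent on the sum of conditional log-likelihoods $\log P(s_i \mid s_{\setminus i})$. It is a minimal sketch, assuming zero external fields, a fixed step size suited to small systems, and no sparsity-inducing regularization; the function name `fit_pseudolikelihood` and its parameters are hypothetical and do not come from the paper.

```python
import numpy as np

def fit_pseudolikelihood(samples, n_iter=500, lr=0.1):
    """Estimate couplings J from an (M, N) array of +1/-1 spins.

    Row i of J is fitted from the conditional
    P(s_i | rest) = exp(s_i * h_i) / (2 * cosh(h_i)),
    with h_i = sum_j J[i, j] * s_j and J[i, i] fixed to zero.
    """
    M, N = samples.shape
    J = np.zeros((N, N))
    for _ in range(n_iter):
        fields = samples @ J.T  # fields[m, i] = h_i for sample m
        # Gradient of the mean log-pseudolikelihood with respect to J[i, j].
        grad = (samples - np.tanh(fields)).T @ samples / M
        np.fill_diagonal(grad, 0.0)  # no self-coupling
        J += lr * grad
    return J
```

Each row is estimated independently, so the result is generally not symmetric; averaging `J` with its transpose is one common way to symmetrize it.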

preprint, 2019, arXiv

Cross validation in sparse linear regression with piecewise continuous nonconvex penalties and its acceleration

We investigate the signal reconstruction performance of sparse linear regression in the presence of noise when piecewise continuous nonconvex penalties are used. Among such penalties, we focus on the SCAD penalty. The contributions of this study are three-fold. First, we present a theoretical analysis of the typical reconstruction performance, using the replica method, under the assumption that each component of the design matrix is given as an independent and identically distributed (i.i.d.) Gaussian variable. This clarifies the superiority of the SCAD estimator compared with $\ell_1$ in a wide parameter range, although the nonconvex nature of the penalty tends to lead to solution multiplicity in certain regions. This multiplicity is shown to be connected to replica symmetry breaking in spin-glass theory. We also show that the global minimum of the mean square error between the estimator and the true signal is located in the replica symmetric phase. Second, we develop an approximate formula that efficiently computes the cross-validation error without actually conducting the cross-validation, and which is also applicable to non-i.i.d. design matrices. It is shown that this formula is only applicable to the unique solution region and tends to be unstable in the multiple solution region. We implement instability detection procedures, which allow the approximate formula to stand alone and consequently enable us to draw phase diagrams for any specific dataset. Third, we propose an annealing procedure, called nonconvexity annealing, to obtain the solution path efficiently. Numerical simulations on simulated datasets verify the consistency of the theoretical results and the efficiency of the approximate formula. Another numerical experiment on a real-world dataset is conducted; its results are consistent with those of earlier studies using the $\ell_0$ formulation.
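
For reference, the SCAD penalty in its commonly used Fan and Li parametrization can be written as a short function. This is a sketch under the assumption that the standard form with threshold `lam` and shape parameter `a > 2` applies; the abstract does not state its parametrization, so the paper's own convention may differ, and the name `scad_penalty` is hypothetical.

```python
import numpy as np

def scad_penalty(x, lam, a=3.7):
    """Elementwise SCAD penalty (Fan-Li form), assuming lam > 0 and a > 2."""
    x = np.abs(np.asarray(x, dtype=float))
    linear = lam * x                                        # |x| <= lam
    quadratic = (2 * a * lam * x - x**2 - lam**2) / (2 * (a - 1))
    constant = lam**2 * (a + 1) / 2                         # |x| > a * lam
    return np.where(x <= lam, linear,
                    np.where(x <= a * lam, quadratic, constant))
```

The penalty grows like the $\ell_1$ penalty near zero and is flat for large arguments, which is the source of its nonconvexity.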

preprint, 2019, arXiv

Empirical Bayes Method for Boltzmann Machines

In this study, we consider an empirical Bayes method for Boltzmann machines and propose an algorithm for it. The empirical Bayes method allows estimation of the values of the hyperparameters of the Boltzmann machine by maximizing a specific likelihood function, referred to as the empirical Bayes likelihood function in this study. However, the maximization is computationally hard because the empirical Bayes likelihood function involves intractable integrations of the partition function. The proposed algorithm avoids this computational problem by using the replica method and the Plefka expansion. Our method does not require any iterative procedures and is quite simple and fast, though it introduces a bias into the estimate, which exhibits unnatural behavior with respect to the size of the dataset. This peculiar behavior is presumably due to the approximate treatment by the Plefka expansion. A possible extension to overcome this behavior is also discussed.

preprint, 2018, arXiv

Mean-field theory of graph neural networks in graph partitioning

A theoretical performance analysis of the graph neural network (GNN) is presented. For classification tasks, the neural network approach has the advantage of flexibility, in that it can be employed in a data-driven manner, whereas Bayesian inference requires the assumption of a specific model. A fundamental question is then whether GNN has high accuracy in addition to this flexibility. Moreover, whether the achieved performance is predominantly a result of backpropagation or of the architecture itself is a matter of considerable interest. To gain better insight into these questions, a mean-field theory of a minimal GNN architecture is developed for the graph partitioning problem. The theory shows good agreement with numerical experiments.

preprint, 2018, arXiv

Objective and efficient inference for couplings in neuronal networks

Inferring directional couplings from the spike data of networks is desired in various scientific fields such as neuroscience. Here, we apply a recently proposed objective procedure to the spike data obtained from Hodgkin–Huxley type models and from in vitro neuronal networks cultured in a circular structure. As a result, we succeed in reconstructing synaptic connections accurately from evoked activity as well as from spontaneous activity. To obtain the results, we derive an analytic formula that approximately implements a method of screening relevant couplings. This significantly reduces the computational cost of the screening method employed in the proposed objective procedure, making it possible to treat large systems such as those in this study.