Researcher profile

Takashi Tanaka

Takashi Tanaka contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
15 works
0 followers
11 topics
4 close collaborators

Published work

15 published items

preprint, 2026, arXiv

Mean Field Analysis of Blockchain Systems

We present a novel framework for analyzing blockchain consensus mechanisms by modeling blockchain growth as a Partially Observable Stochastic Game (POSG), which we reduce to a set of Partially Observable Markov Decision Processes (POMDPs) through the use of the mean field approximation. This approach formalizes the decision-making process of miners in Proof-of-Work (PoW) systems and enables a principled examination of block selection strategies as well as steady state analysis of the induced Markov chain. By leveraging a mean field game formulation, we efficiently characterize the information asymmetries that arise in asynchronous blockchain networks. Our first main result is an exact characterization of the tradeoff between network delay and PoW efficiency, defined as the fraction of blocks that end up in the longest chain. We demonstrate that the tradeoff observed in our model at steady state aligns closely with theoretical findings, validating our use of the mean field approximation. Our second main result is a rigorous equilibrium analysis of the Longest Chain Rule (LCR). We show that the LCR is a mean field equilibrium and that it is uniquely optimal in maximizing PoW efficiency under certain mild assumptions. This result provides the first formal justification for continued use of the LCR in decentralized consensus protocols, offering both theoretical validation and practical insights. Beyond these core results, our framework supports flexible experimentation with alternative block selection strategies, system dynamics, and reward structures. It offers a systematic and scalable substitute for expensive test-net deployments or ad hoc analysis. While our primary focus is on Nakamoto-style blockchains, the model is general enough to accommodate other architectures through modifications to the underlying MDP.

preprint, 2025, arXiv

Model Predictive Path Integral Control for Roll-to-Roll Manufacturing

Roll-to-roll (R2R) manufacturing is a continuous processing technology essential for scalable production of thin-film materials and printed electronics, but precise control remains challenging due to subsystem interactions, nonlinearities, and process disturbances. This paper proposes a Model Predictive Path Integral (MPPI) control formulation for R2R systems, leveraging a GPU-based Monte-Carlo sampling approach to efficiently approximate optimal controls online. Crucially, MPPI easily handles non-differentiable cost functions, enabling the incorporation of complex performance criteria relevant to advanced manufacturing processes. A case study is presented that demonstrates that MPPI significantly improves tension regulation performance compared to conventional model predictive control (MPC), highlighting its suitability for real-time control in advanced manufacturing.
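
The sampling-and-reweighting idea behind MPPI can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the paper's formulation: the plant is a hypothetical scalar integrator rather than an R2R model, sampling runs on the CPU rather than a GPU, there is no control-cost term, and the names `rollout_cost`, `mppi_step`, `run_closed_loop` and all parameter values are invented for this example. The absolute-value cost is chosen only because it is non-differentiable at the target.

```python
import math
import random


def rollout_cost(x0, controls, target, dt):
    """Accumulate a non-differentiable |error| cost along one rollout."""
    x, cost = x0, 0.0
    for u in controls:
        x += dt * u  # hypothetical integrator model, not an R2R plant
        cost += abs(x - target)
    return cost


def mppi_step(x0, nominal, target, rng, samples=256, sigma=1.0, lam=2.0, dt=0.1):
    """Return the nominal control sequence after one sampling-based update."""
    # Draw Gaussian perturbations of the nominal control sequence.
    noises = [[rng.gauss(0.0, sigma) for _ in nominal] for _ in range(samples)]
    costs = [
        rollout_cost(x0, [u + e for u, e in zip(nominal, eps)], target, dt)
        for eps in noises
    ]
    best = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(weights)
    # Shift each nominal control by the cost-weighted average perturbation.
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, noises)) / total
        for t, u in enumerate(nominal)
    ]


def run_closed_loop(steps=50, horizon=10, target=1.0, dt=0.1, seed=0):
    """Apply the first planned control at each step, then shift the plan."""
    rng = random.Random(seed)
    x, plan = 0.0, [0.0] * horizon
    for _ in range(steps):
        plan = mppi_step(x, plan, target, rng, dt=dt)
        x += dt * plan[0]
        plan = plan[1:] + [0.0]  # receding horizon: drop the used control
    return x


final_x = run_closed_loop()
```

Because the update uses only rollout costs and never their gradients, the same loop works unchanged for any cost that can be evaluated along a trajectory.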

preprint, 2022, arXiv

A Lower-bound for Variable-length Source Coding in Linear-Quadratic-Gaussian Control with Shared Randomness

In this letter, we consider a Linear Quadratic Gaussian (LQG) control system where feedback occurs over a noiseless binary channel and derive lower bounds on the minimum communication cost (quantified via the channel bitrate) required to attain a given control performance. We assume that at every time step an encoder can convey a packet containing a variable number of bits over the channel to a decoder at the controller. Our system model provides for the possibility that the encoder and decoder have shared randomness, as is the case in systems using dithered quantizers. We define two extremal prefix-free requirements that may be imposed on the message packets; such constraints are useful in that they allow the decoder, and potentially other agents, to uniquely identify the end of a transmission in an online fashion. We then derive a lower bound on the rate of prefix-free coding in terms of directed information; in particular, we show that a previously known bound still holds in the case with shared randomness. We generalize the bound for when prefix constraints are relaxed, and conclude with a rate-distortion formulation.

preprint, 2022, arXiv

Attack Impact Evaluation by Exact Convexification through State Space Augmentation

We address the attack impact evaluation problem for control system security. We formulate the problem as a Markov decision process with a temporally joint chance constraint that forces the adversary to avoid being detected throughout the considered time period. Owing to the joint constraint, the optimal control policy depends not only on the current state but also on the entire history, which leads to the explosion of the search space and makes the problem generally intractable. It is shown that knowing whether an alarm has been triggered, in addition to the current state, is sufficient for specifying the optimal decision at each time step. Augmentation of the information to the state space induces an equivalent convex optimization problem, which is tractable using standard solvers.

preprint, 2022, arXiv

Continuous-Time Channel Gain Control for Minimum-Information Kalman-Bucy Filtering

We consider the problem of estimating a continuous-time Gauss-Markov source process observed through a vector Gaussian channel with an adjustable channel gain matrix. For a given (generally time-varying) channel gain matrix, we provide formulas to compute (i) the mean-square estimation error attainable by the classical Kalman-Bucy filter, and (ii) the mutual information between the source process and its Kalman-Bucy estimate. We then formulate a novel "optimal channel gain control problem" where the objective is to control the channel gain matrix strategically to minimize the weighted sum of these two performance metrics. To develop insights into the optimal solution, we first consider the problem of controlling a time-varying channel gain over a finite time interval. A necessary optimality condition is derived based on Pontryagin's minimum principle. For a scalar system, we show that the optimal channel gain is a piecewise constant signal with at most two switches. We also consider the problem of designing the optimal time-invariant gain to minimize the average cost over an infinite time horizon. A novel semidefinite programming (SDP) heuristic is proposed and the exactness of the solution is discussed.
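
To make the dependence of the Kalman-Bucy error on the channel gain concrete, the sketch below integrates the scalar Riccati equation for a constant gain. It assumes a textbook scalar model, dx = a x dt + dw and dy = c x dt + dv with noise intensities q and r, for which the error variance obeys dP/dt = 2aP + q - c^2 P^2 / r. The symbols a, q, c, r, the function names, and the numerical values are assumptions made for illustration; the abstract treats the vector, time-varying case and the mutual-information term, neither of which is reproduced here.

```python
import math


def riccati_rhs(p, a, q, c, r):
    """Right-hand side of the scalar Kalman-Bucy Riccati equation."""
    return 2.0 * a * p + q - (c * p) ** 2 / r


def integrate_error_variance(p0, a, q, c, r, horizon, dt=1e-3):
    """Forward-Euler integration of the error variance over [0, horizon]."""
    p = p0
    for _ in range(int(round(horizon / dt))):
        p += dt * riccati_rhs(p, a, q, c, r)
    return p


def steady_state_variance(a, q, c, r):
    """Positive root of 2aP + q - c^2 P^2 / r = 0."""
    return r * (a + math.sqrt(a * a + c * c * q / r)) / (c * c)


# Hypothetical stable source (a = -1) observed with unit gain and unit noise.
p_end = integrate_error_variance(p0=1.0, a=-1.0, q=1.0, c=1.0, r=1.0, horizon=10.0)
p_inf = steady_state_variance(a=-1.0, q=1.0, c=1.0, r=1.0)
```

Raising the gain `c` lowers the steady-state variance, which is the estimation side of the trade-off that the gain control problem weighs against information cost.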

preprint, 2022, arXiv

Optimal Sensor Gain Control for Minimum-Information Estimation of Continuous-Time Gauss-Markov Processes

We consider the scenario in which a continuous-time Gauss-Markov process is estimated by the Kalman-Bucy filter over a Gaussian channel (sensor) with a variable sensor gain. The problem of scheduling the sensor gain over a finite time interval to minimize the weighted sum of the data rate (the mutual information between the sensor output and the underlying Gauss-Markov process) and the distortion (the mean-square estimation error) is formulated as an optimal control problem. A necessary optimality condition for a scheduled sensor gain is derived based on Pontryagin's minimum principle. For a scalar problem, we show that an optimal sensor gain control is of bang-bang type, except for the possibility of taking an intermediate value when there exists a stationary point on the switching surface in the phase space of canonical dynamics. Furthermore, we show that the number of switches is at most two and the time instants at which the optimal gain must be switched can be computed from the analytical solutions to the canonical equations.

preprint, 2022, arXiv

Time-invariant prefix-free source coding for MIMO LQG control

In this work we consider discrete-time multiple-input multiple-output (MIMO) linear-quadratic-Gaussian (LQG) control where the feedback consists of variable-length binary codewords. To simplify the decoder architecture, we enforce a strict prefix constraint on the codewords. We develop a data compression architecture that provably achieves a near-minimum time-average expected bitrate for a fixed constraint on the LQG performance. The architecture conforms to the strict prefix constraint and does not require time-varying lossless source coding, in contrast to the prior art.

preprint, 2022, arXiv

Upper Bounds for Continuous-Time End-to-End Risks in Stochastic Robot Navigation

We present an analytical method to estimate the continuous-time collision probability of motion plans for autonomous agents with linear controlled Ito dynamics. Motion plans generated by planning algorithms cannot be perfectly executed by autonomous agents in reality due to the inherent uncertainties in the real world. Estimating end-to-end risk is crucial to characterize the safety of trajectories and plan risk-optimal trajectories. In this paper, we derive upper bounds for the continuous-time risk in stochastic robot navigation using the properties of Brownian motion as well as Boole's and Hunter's inequalities from probability theory. Using a ground robot navigation example, we numerically demonstrate that our method is considerably faster than the naive Monte Carlo sampling method and the proposed bounds perform better than the discrete-time risk bounds.
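
Boole's inequality, one of the tools named in the abstract, bounds the probability that at least one of several events occurs by the sum of their individual probabilities. The sketch below applies it to a handful of per-segment collision probabilities. The segment probabilities are made-up numbers and the helper names are hypothetical; the paper's continuous-time bounds and its use of Hunter's inequality are not reproduced here. The independent-event formula is included only as a reference point for how tight the bound is in that special case.

```python
def boole_upper_bound(event_probs):
    """Union bound: P(any event) <= sum of P(event), capped at 1."""
    return min(1.0, sum(event_probs))


def union_if_independent(event_probs):
    """Exact P(any event) under the extra assumption of independence."""
    none_occurs = 1.0
    for p in event_probs:
        none_occurs *= 1.0 - p
    return 1.0 - none_occurs


# Hypothetical per-segment collision probabilities along one motion plan.
segment_risk = [0.01, 0.02, 0.005]
risk_bound = boole_upper_bound(segment_risk)
risk_independent = union_if_independent(segment_risk)
```

The bound needs no assumption about how the events depend on one another, which is why it remains valid for correlated collision events along a trajectory, at the price of some conservatism.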

preprint, 2021, arXiv

SARSA(0) Reinforcement Learning over Fully Homomorphic Encryption

We consider a cloud-based control architecture in which the local plants outsource the control synthesis task to the cloud. In particular, we consider cloud-based reinforcement learning (RL), where updating the value function is outsourced to the cloud. To achieve confidentiality, we implement computations over Fully Homomorphic Encryption (FHE). We use a CKKS encryption scheme and a modified SARSA(0) reinforcement learning algorithm to incorporate the encryption-induced delays. We then give a convergence result for the delayed update rule of SARSA(0) with a blocking mechanism. We finally present a numerical demonstration by implementing the scheme on a classical pole-balancing problem.
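
For reference, the standard tabular SARSA(0) update that the abstract builds on is shown below in plaintext. This sketch deliberately omits everything specific to the paper: there is no encryption, no CKKS arithmetic, no update delay, and no blocking mechanism. The function name, the two-state table, and the step size and discount values are assumptions chosen for illustration.

```python
def sarsa0_update(q, state, action, reward, next_state, next_action,
                  alpha=0.1, gamma=0.9):
    """Apply one tabular SARSA(0) update in place and return the TD error."""
    # The target uses the action actually taken next (on-policy).
    td_error = reward + gamma * q[next_state][next_action] - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


# Hypothetical table with two states and two actions.
q_table = [[0.0, 0.0], [0.0, 0.0]]
q_table[1][0] = 1.0
delta = sarsa0_update(q_table, state=0, action=1, reward=0.5,
                      next_state=1, next_action=0)
```

The update involves only additions and multiplications of stored values, which is the kind of arithmetic that approximate homomorphic schemes are designed to evaluate on ciphertexts.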

preprint, 2020, arXiv

Closed-loop Parameter Identification of Linear Dynamical Systems through the Lens of Feedback Channel Coding Theory

This paper considers the problem of closed-loop identification of linear scalar systems with Gaussian process noise, where the system input is determined by a deterministic state feedback policy. The regularized least-squares estimate (LSE) algorithm is adopted, seeking to find the best estimate of unknown model parameters based on noiseless measurements of the state. We are interested in the fundamental limitation of the rate at which unknown parameters can be learned, in the sense of the D-optimality scalarization criterion subject to a quadratic control cost. We first establish a novel connection between a closed-loop identification problem of interest and a channel coding problem involving an additive white Gaussian noise (AWGN) channel with feedback and a certain structural constraint. Based on this connection, we show that the learning rate is fundamentally upper bounded by the capacity of the corresponding AWGN channel. Although the optimal design of the feedback policy remains challenging, we derive conditions under which the upper bound is achieved. Finally, we show that the obtained upper bound implies that super-linear convergence is unattainable for any choice of the policy.

preprint, 2020, arXiv

Linearly Solvable Mean-Field Traffic Routing Games

We consider a dynamic traffic routing game over an urban road network involving a large number of drivers in which each driver selecting a particular route is subject to a penalty that is affine in the logarithm of the number of drivers selecting the same route. We show that the mean-field approximation of such a game leads to the so-called linearly solvable Markov decision process, implying that its mean-field equilibrium (MFE) can be found simply by solving a finite-dimensional linear system backward in time. Based on this backward-only characterization, it is further shown that the obtained MFE has the notable property of strong time-consistency. A connection between the obtained MFE and a particular class of fictitious play is also discussed.
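
The "linear system backward in time" can be illustrated with the generic finite-horizon linearly solvable MDP, in which the desirability z = exp(-value) satisfies a linear recursion and the optimal transition law reweights a passive dynamics by next-step desirability. This is a sketch of that general construction only, assuming a three-node network with uniform passive transitions and made-up costs; it does not model the log-population penalty or the mean-field coupling of the routing game, and the function names are hypothetical.

```python
import math


def backward_desirability(passive, state_cost, terminal_cost, horizon):
    """Return [z_0, ..., z_T], where z_T = exp(-terminal_cost)."""
    n = len(passive)
    z = [math.exp(-c) for c in terminal_cost]
    levels = [z]
    for _ in range(horizon):
        # Linear backward step: z_t = exp(-cost) * (P @ z_{t+1}).
        z = [
            math.exp(-state_cost[s]) * sum(passive[s][j] * z[j] for j in range(n))
            for s in range(n)
        ]
        levels.insert(0, z)
    return levels


def optimal_transition(passive, z_next, s):
    """Optimal next-state distribution from s: passive law tilted by z_next."""
    weights = [p * z for p, z in zip(passive[s], z_next)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 3-node network; node 2 is the cheap destination.
passive = [[1.0 / 3.0] * 3 for _ in range(3)]
state_cost = [0.1, 0.1, 0.1]
terminal_cost = [5.0, 5.0, 0.0]
levels = backward_desirability(passive, state_cost, terminal_cost, horizon=4)
last_move = optimal_transition(passive, levels[-1], 0)
```

Each backward step is a matrix-vector product followed by an elementwise scaling, so no nonlinear Bellman minimization is needed at any stage.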

preprint, 2020, arXiv

Rationally Inattentive Path-Planning via RRT*

We consider a path-planning scenario for a mobile robot traveling in a configuration space with obstacles under the presence of stochastic disturbances. A novel path length metric is proposed on the uncertain configuration space and then integrated with the existing RRT* algorithm. The metric is a weighted sum of two terms which capture both the Euclidean distance traveled by the robot and the perception cost, i.e., the amount of information the robot must perceive about the environment to follow the path safely. The continuity of the path length function with respect to the topology of the total variation metric is shown and the optimality of the Rationally Inattentive RRT* algorithm is discussed. Three numerical studies are presented which display the utility of the new algorithm.

preprint, 2020, arXiv

Scalable Synthesis of Minimum-Information Linear-Gaussian Control by Distributed Optimization

We consider a discrete-time linear-quadratic Gaussian control problem in which we minimize a weighted sum of the directed information from the state of the system to the control input and the control cost. The optimal control and sensing policies can be synthesized jointly by solving a semidefinite programming problem. However, the existing solutions typically scale cubically with the horizon length. We leverage the structure in the problem to develop a distributed algorithm that decomposes the synthesis problem into a set of smaller problems, one for each time step. We prove that the algorithm runs in time linear in the horizon length. As an application of the algorithm, we consider a path-planning problem in a state space with obstacles under the presence of stochastic disturbances. The algorithm computes a locally optimal solution that jointly minimizes the perception and control cost while ensuring the safety of the path. The numerical examples show that the algorithm can scale to horizon lengths in the thousands and compute locally optimal solutions.

preprint, 2020, arXiv

Sequential Source Coding for Stochastic Systems Subject to Finite Rate Constraints

In this paper, we revisit the sequential source coding framework to analyze fundamental performance limitations of discrete-time stochastic control systems subject to feedback data-rate constraints in finite-time horizon. The basis of our results is a new characterization of the lower bound on the minimum total-rate achieved by sequential codes subject to a total (across time) distortion constraint and a computational algorithm that optimally allocates the rate-distortion for any fixed finite-time horizon. This characterization facilitates the derivation of analytical, non-asymptotic, and finite-dimensional lower and upper bounds in two control-related scenarios. (a) A parallel time-varying Gauss-Markov process with identically distributed spatial components that is quantized and transmitted through a noiseless channel to a minimum mean-squared error (MMSE) decoder. (b) A time-varying quantized LQG closed-loop control system, with identically distributed spatial components and with a random data-rate allocation. Our non-asymptotic lower bound on the quantized LQG control problem reveals the absolute minimum data-rates for (mean square) stability of our time-varying plant for any fixed finite time horizon. We supplement our framework with illustrative simulation experiments.

preprint, 2020, arXiv

Transfer-Entropy-Regularized Markov Decision Processes

We consider the framework of transfer-entropy-regularized Markov Decision Process (TERMDP) in which the weighted sum of the classical state-dependent cost and the transfer entropy from the state random process to the control random process is minimized. Although TERMDPs are generally formulated as nonconvex optimization problems, we derive an analytical necessary optimality condition expressed as a finite set of nonlinear equations, based on which an iterative forward-backward computational procedure similar to the Arimoto-Blahut algorithm is proposed. It is shown that every limit point of the sequence generated by the proposed algorithm is a stationary point of the TERMDP. Applications of TERMDPs are discussed in the context of networked control systems theory and non-equilibrium thermodynamics. The proposed algorithm is applied to an information-constrained maze navigation problem, whereby we study how the price of information qualitatively alters the optimal decision policies.
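
For readers unfamiliar with the Arimoto-Blahut algorithm that the abstract refers to, the sketch below shows its classical form for computing the capacity of a discrete memoryless channel. It is not the forward-backward procedure proposed for TERMDPs; it only illustrates the alternating-update pattern that procedure is said to resemble. The function name, the iteration count, and the binary symmetric channel used as input are assumptions for this example.

```python
import math


def blahut_arimoto_capacity(channel, iterations=200):
    """Return (capacity in bits, input distribution) for channel[x][y] = P(y|x)."""
    n_in, n_out = len(channel), len(channel[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution
    capacity_nats = 0.0
    for _ in range(iterations):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * channel[x][y] for x in range(n_in)) for y in range(n_out)]
        # KL divergence between each channel row and the output distribution.
        d = [
            sum(channel[x][y] * math.log(channel[x][y] / q[y])
                for y in range(n_out) if channel[x][y] > 0.0)
            for x in range(n_in)
        ]
        weights = [p[x] * math.exp(d[x]) for x in range(n_in)]
        total = sum(weights)
        capacity_nats = math.log(total)  # lower bound that converges to capacity
        p = [w / total for w in weights]
    return capacity_nats / math.log(2.0), p


# Binary symmetric channel with crossover probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
capacity_bits, input_dist = blahut_arimoto_capacity(bsc)
```

For the binary symmetric channel the result can be checked against the closed form 1 - H(0.1), where H is the binary entropy function in bits.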