Source author record

Takashi Tanaka

Takashi Tanaka appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

eess.SY, Systems and Control, math.OC, Information Theory, math.IT, Computer Science and Game Theory, Robotics, Cryptography and Security, eess.SP, Machine Learning, math.DS

Catalog footprint

What is connected

21 works

11 topics

4 close collaborators


preprint, 2026, arXiv

Mean Field Analysis of Blockchain Systems

We present a novel framework for analyzing blockchain consensus mechanisms by modeling blockchain growth as a Partially Observable Stochastic Game (POSG) which we reduce to a set of Partially Observable Markov Decision Processes (POMDPs) through the use of the mean field approximation. This approach formalizes the decision-making process of miners in Proof-of-Work (PoW) systems and enables a principled examination of block selection strategies as well as steady state analysis of the induced Markov chain. By leveraging a mean field game formulation, we efficiently characterize the information asymmetries that arise in asynchronous blockchain networks. Our first main result is an exact characterization of the tradeoff between network delay and PoW efficiency, that is, the fraction of blocks which end up in the longest chain. We demonstrate that the tradeoff observed in our model at steady state aligns closely with theoretical findings, validating our use of the mean field approximation. Our second main result is a rigorous equilibrium analysis of the Longest Chain Rule (LCR). We show that the LCR is a mean field equilibrium and that it is uniquely optimal in maximizing PoW efficiency under certain mild assumptions. This result provides the first formal justification for continued use of the LCR in decentralized consensus protocols, offering both theoretical validation and practical insights. Beyond these core results, our framework supports flexible experimentation with alternative block selection strategies, system dynamics, and reward structures. It offers a systematic and scalable substitute for expensive test-net deployments or ad hoc analysis. While our primary focus is on Nakamoto-style blockchains, the model is general enough to accommodate other architectures through modifications to the underlying MDP.

preprint, 2025, arXiv

Model Predictive Path Integral Control for Roll-to-Roll Manufacturing

Roll-to-roll (R2R) manufacturing is a continuous processing technology essential for scalable production of thin-film materials and printed electronics, but precise control remains challenging due to subsystem interactions, nonlinearities, and process disturbances. This paper proposes a Model Predictive Path Integral (MPPI) control formulation for R2R systems, leveraging a GPU-based Monte-Carlo sampling approach to efficiently approximate optimal controls online. Crucially, MPPI easily handles non-differentiable cost functions, enabling the incorporation of complex performance criteria relevant to advanced manufacturing processes. A case study is presented that demonstrates that MPPI significantly improves tension regulation performance compared to conventional model predictive control (MPC), highlighting its suitability for real-time control in advanced manufacturing.
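The sampling-and-reweighting idea behind MPPI can be sketched on a toy problem. The following is a minimal CPU sketch, not the paper's GPU formulation or its R2R model: the integrator dynamics, the absolute-error cost, and all names and parameter values (`rollout_cost`, `mppi_update`, `sigma`, `temperature`) are assumptions chosen for illustration. It shows why a non-differentiable cost poses no difficulty: the update only evaluates costs and never differentiates them.

```python
import math
import random

def rollout_cost(x0, controls, target):
    """Cost of one control sequence on the toy model x[t+1] = x[t] + u[t]."""
    x, cost = x0, 0.0
    for u in controls:
        x = x + u
        # Non-differentiable stage cost: absolute tracking error plus a small effort penalty.
        cost += abs(x - target) + 0.01 * u * u
    return cost

def mppi_update(x0, nominal, target, rng, samples=256, sigma=0.3, temperature=0.5):
    """One simplified MPPI iteration: perturb, roll out, average with softmin weights."""
    noises = [[rng.gauss(0.0, sigma) for _ in nominal] for _ in range(samples)]
    costs = [rollout_cost(x0, [u + e for u, e in zip(nominal, eps)], target)
             for eps in noises]
    best = min(costs)  # subtracting the minimum keeps the exponentials well scaled
    weights = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(weights)
    return [u + sum(w * eps[t] for w, eps in zip(weights, noises)) / total
            for t, u in enumerate(nominal)]

rng = random.Random(0)            # fixed seed so the run is repeatable
plan = [0.0] * 10                 # nominal control sequence over a 10-step horizon
initial_cost = rollout_cost(0.0, plan, target=1.0)
for _ in range(40):
    plan = mppi_update(0.0, plan, 1.0, rng)
final_cost = rollout_cost(0.0, plan, target=1.0)
```

In a receding-horizon loop one would apply only the first element of `plan`, shift the sequence, and repeat; the rollouts are independent of each other, which is what makes the method amenable to parallel sampling.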

preprint, 2022, arXiv

A Lower-bound for Variable-length Source Coding in Linear-Quadratic-Gaussian Control with Shared Randomness

In this letter, we consider a Linear Quadratic Gaussian (LQG) control system where feedback occurs over a noiseless binary channel and derive lower bounds on the minimum communication cost (quantified via the channel bitrate) required to attain a given control performance. We assume that at every time step an encoder can convey a packet containing a variable number of bits over the channel to a decoder at the controller. Our system model provides for the possibility that the encoder and decoder have shared randomness, as is the case in systems using dithered quantizers. We define two extremal prefix-free requirements that may be imposed on the message packets; such constraints are useful in that they allow the decoder, and potentially other agents, to uniquely identify the end of a transmission in an online fashion. We then derive a lower bound on the rate of prefix-free coding in terms of directed information; in particular we show that a previously known bound still holds in the case with shared randomness. We generalize the bound for when prefix constraints are relaxed, and conclude with a rate-distortion formulation.

preprint, 2022, arXiv

Attack Impact Evaluation by Exact Convexification through State Space Augmentation

We address the attack impact evaluation problem for control system security. We formulate the problem as a Markov decision process with a temporally joint chance constraint that forces the adversary to avoid being detected throughout the considered time period. Owing to the joint constraint, the optimal control policy depends not only on the current state but also on the entire history, which leads to the explosion of the search space and makes the problem generally intractable. It is shown that knowing whether an alarm has been triggered, in addition to the current state, is sufficient for specifying the optimal decision at each time step. Augmentation of the information to the state space induces an equivalent convex optimization problem, which is tractable using standard solvers.

preprint, 2022, arXiv

Continuous-Time Channel Gain Control for Minimum-Information Kalman-Bucy Filtering

We consider the problem of estimating a continuous-time Gauss-Markov source process observed through a vector Gaussian channel with an adjustable channel gain matrix. For a given (generally time-varying) channel gain matrix, we provide formulas to compute (i) the mean-square estimation error attainable by the classical Kalman-Bucy filter, and (ii) the mutual information between the source process and its Kalman-Bucy estimate. We then formulate a novel "optimal channel gain control problem" where the objective is to control the channel gain matrix strategically to minimize the weighted sum of these two performance metrics. To develop insights into the optimal solution, we first consider the problem of controlling a time-varying channel gain over a finite time interval. A necessary optimality condition is derived based on Pontryagin's minimum principle. For a scalar system, we show that the optimal channel gain is a piece-wise constant signal with at most two switches. We also consider the problem of designing the optimal time-invariant gain to minimize the average cost over an infinite time horizon. A novel semidefinite programming (SDP) heuristic is proposed and the exactness of the solution is discussed.
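The two performance metrics in this abstract can be illustrated for a scalar system. The sketch below is not the paper's formulas: it assumes the model dx = a x dt + sqrt(q) dw, dy = c(t) x dt + sqrt(r) dv, integrates the standard scalar Riccati equation with forward Euler, and accumulates the information term (1/2) c(t)^2 p(t) / r, which is what Duncan's theorem gives for the mutual information between the source and the channel output. The function name `kalman_bucy_scalar` and all parameter values are hypothetical.

```python
def kalman_bucy_scalar(a, q, r, gain, p0, horizon, dt=1e-3):
    """Euler-integrate the scalar Riccati equation and the information integral.

    Assumed model: dx = a x dt + sqrt(q) dw,  dy = c(t) x dt + sqrt(r) dv.
    Returns the final error variance p(T) and the accumulated information in nats.
    """
    p, info, t = p0, 0.0, 0.0
    steps = int(round(horizon / dt))
    for _ in range(steps):
        c = gain(t)                       # channel gain at the current time
        info += 0.5 * c * c * p / r * dt  # information rate (Duncan's theorem, assumed)
        p += (2.0 * a * p + q - c * c * p * p / r) * dt  # Riccati equation
        t += dt
    return p, info

# Constant unit gain on a stable source; p(t) settles at the positive root of
# -2p + 1 - p^2 = 0, which is sqrt(2) - 1.
p_final, info_nats = kalman_bucy_scalar(
    a=-1.0, q=1.0, r=1.0, gain=lambda t: 1.0, p0=1.0, horizon=10.0
)
```

Passing a piecewise-constant `gain` function lets one compare the weighted sum of `p_final` and `info_nats` across candidate switching schedules by brute force, which is a crude stand-in for the optimality conditions derived in the paper.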

preprint, 2022, arXiv

Optimal Sensor Gain Control for Minimum-Information Estimation of Continuous-Time Gauss-Markov Processes

We consider the scenario in which a continuous-time Gauss-Markov process is estimated by the Kalman-Bucy filter over a Gaussian channel (sensor) with a variable sensor gain. The problem of scheduling the sensor gain over a finite time interval to minimize the weighted sum of the data rate (the mutual information between the sensor output and the underlying Gauss-Markov process) and the distortion (the mean-square estimation error) is formulated as an optimal control problem. A necessary optimality condition for a scheduled sensor gain is derived based on Pontryagin's minimum principle. For a scalar problem, we show that an optimal sensor gain control is of bang-bang type, except for the possibility of taking an intermediate value when there exists a stationary point on the switching surface in the phase space of canonical dynamics. Furthermore, we show that the number of switches is at most two and the time instants at which the optimal gain must be switched can be computed from the analytical solutions to the canonical equations.

preprint, 2022, arXiv

Time-invariant prefix-free source coding for MIMO LQG control

In this work we consider discrete-time multiple-input multiple-output (MIMO) linear-quadratic-Gaussian (LQG) control where the feedback consists of variable length binary codewords. To simplify the decoder architecture, we enforce a strict prefix constraint on the codewords. We develop a data compression architecture that provably achieves a near minimum time-average expected bitrate for a fixed constraint on the LQG performance. The architecture conforms to the strict prefix constraint and does not require time-varying lossless source coding, in contrast to the prior art.

preprint, 2022, arXiv

Upper Bounds for Continuous-Time End-to-End Risks in Stochastic Robot Navigation

We present an analytical method to estimate the continuous-time collision probability of motion plans for autonomous agents with linear controlled Ito dynamics. Motion plans generated by planning algorithms cannot be perfectly executed by autonomous agents in reality due to the inherent uncertainties in the real world. Estimating end-to-end risk is crucial to characterize the safety of trajectories and plan risk optimal trajectories. In this paper, we derive upper bounds for the continuous-time risk in stochastic robot navigation using the properties of Brownian motion as well as Boole and Hunter's inequalities from probability theory. Using a ground robot navigation example, we numerically demonstrate that our method is considerably faster than the naive Monte Carlo sampling method and the proposed bounds perform better than the discrete-time risk bounds.
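Boole's inequality, one of the tools named in this abstract, can be illustrated with a small discrete-time example. This sketch does not reproduce the paper's continuous-time bounds: it only shows the union bound over sampled time steps, for a hypothetical one-dimensional deviation that behaves like Brownian motion with unit diffusion and an obstacle 3 units away. The names `gaussian_tail` and `boole_risk_bound` are illustrative.

```python
import math

def gaussian_tail(threshold, std):
    """P(X >= threshold) for X ~ N(0, std^2)."""
    return 0.5 * math.erfc(threshold / (std * math.sqrt(2.0)))

def boole_risk_bound(step_probabilities):
    """Boole's inequality: P(any violation) <= sum of per-step violation probabilities."""
    return min(1.0, sum(step_probabilities))

# Hypothetical setup: the deviation at time t has standard deviation sqrt(t),
# and a collision occurs when the deviation reaches 3.0.
times = [0.1 * k for k in range(1, 11)]
step_risks = [gaussian_tail(3.0, math.sqrt(t)) for t in times]
bound = boole_risk_bound(step_risks)
```

The bound only accounts for the sampled instants, so it says nothing about collisions between samples; closing that gap is the kind of issue a continuous-time analysis addresses.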

preprint, 2021, arXiv

SARSA(0) Reinforcement Learning over Fully Homomorphic Encryption

We consider a cloud-based control architecture in which the local plants outsource the control synthesis task to the cloud. In particular, we consider cloud-based reinforcement learning (RL), where updating the value function is outsourced to the cloud. To achieve confidentiality, we implement computations over Fully Homomorphic Encryption (FHE). We use a CKKS encryption scheme and a modified SARSA(0) reinforcement learning algorithm to incorporate the encryption-induced delays. We then give a convergence result for the delayed update rule of SARSA(0) with a blocking mechanism. We finally present a numerical demonstration by implementing it on a classical pole-balancing problem.
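For reference, the standard tabular SARSA(0) update that the abstract builds on can be written in a few lines. This is a plaintext sketch under assumed names (`sarsa0_update`, `q_table`) and made-up numbers; it includes neither the encryption nor the delay handling and blocking mechanism that the paper adds.

```python
def sarsa0_update(q, state, action, reward, next_state, next_action,
                  alpha=0.1, gamma=0.9):
    """One on-policy temporal-difference update of Q(state, action)."""
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

# Hypothetical two-entry table: the target is 1.0 + 0.9 * 2.0 = 2.8,
# so Q(s0, left) moves one tenth of the way from 0.0 toward 2.8.
q_table = {("s0", "left"): 0.0, ("s1", "right"): 2.0}
new_value = sarsa0_update(q_table, "s0", "left", 1.0, "s1", "right")
```

The update uses only additions and multiplications of stored values, which is the kind of arithmetic that approximate homomorphic schemes such as CKKS are designed to evaluate on ciphertexts.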

preprint, 2020, arXiv

Closed-loop Parameter Identification of Linear Dynamical Systems through the Lens of Feedback Channel Coding Theory

This paper considers the problem of closed-loop identification of linear scalar systems with Gaussian process noise, where the system input is determined by a deterministic state feedback policy. The regularized least-square estimate (LSE) algorithm is adopted, seeking to find the best estimate of unknown model parameters based on noiseless measurements of the state. We are interested in the fundamental limitation of the rate at which unknown parameters can be learned, in the sense of the D-optimality scalarization criterion subject to a quadratic control cost. We first establish a novel connection between a closed-loop identification problem of interest and a channel coding problem involving an additive white Gaussian noise (AWGN) channel with feedback and a certain structural constraint. Based on this connection, we show that the learning rate is fundamentally upper bounded by the capacity of the corresponding AWGN channel. Although the optimal design of the feedback policy remains challenging, we derive conditions under which the upper bound is achieved. Finally, we show that the obtained upper bound implies that super-linear convergence is unattainable for any choice of the policy.

preprint, 2020, arXiv

Linearly Solvable Mean-Field Traffic Routing Games

We consider a dynamic traffic routing game over an urban road network involving a large number of drivers in which each driver selecting a particular route is subject to a penalty that is affine in the logarithm of the number of drivers selecting the same route. We show that the mean-field approximation of such a game leads to the so-called linearly solvable Markov decision process, implying that its mean-field equilibrium (MFE) can be found simply by solving a finite-dimensional linear system backward in time. Based on this backward-only characterization, it is further shown that the obtained MFE has the notable property of strong time-consistency. A connection between the obtained MFE and a particular class of fictitious play is also discussed.
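The "linear system solved backward in time" can be illustrated with the generic finite-horizon linearly solvable MDP, in which the desirability z_t(x) = exp(-q(x)) * sum_x' p(x'|x) z_{t+1}(x') is propagated backward and the optimal transition law is the passive law reweighted by z. The sketch below is that textbook recursion, not the paper's traffic game: the three-state chain, the costs, and the name `lmdp_backward` are assumptions made for illustration.

```python
import math

def lmdp_backward(passive, state_cost, terminal_cost, horizon):
    """Backward desirability recursion for a finite-horizon linearly solvable MDP.

    passive[i][j] is the uncontrolled transition probability from i to j.
    Returns the desirability at time 0 and the optimal transition matrices,
    ordered from the first time step to the last.
    """
    n = len(state_cost)
    z = [math.exp(-c) for c in terminal_cost]
    policies = []
    for _ in range(horizon):
        expected = [sum(passive[i][j] * z[j] for j in range(n)) for i in range(n)]
        # Optimal law: passive dynamics tilted toward high-desirability successors.
        policies.append([[passive[i][j] * z[j] / expected[i] for j in range(n)]
                         for i in range(n)])
        z = [math.exp(-state_cost[i]) * expected[i] for i in range(n)]
    policies.reverse()
    return z, policies

# Hypothetical 3-state chain in which state 2 is the cheapest place to be.
passive = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
z0, policies = lmdp_backward(passive, state_cost=[1.0, 0.5, 0.0],
                             terminal_cost=[2.0, 1.0, 0.0], horizon=5)
```

Each backward step is a matrix-vector product followed by an elementwise scaling, so no maximization over actions is needed at any stage.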

preprint, 2020, arXiv

Rationally Inattentive Path-Planning via RRT*

We consider a path-planning scenario for a mobile robot traveling in a configuration space with obstacles under the presence of stochastic disturbances. A novel path length metric is proposed on the uncertain configuration space and then integrated with the existing RRT* algorithm. The metric is a weighted sum of two terms which capture both the Euclidean distance traveled by the robot and the perception cost, i.e., the amount of information the robot must perceive about the environment to follow the path safely. The continuity of the path length function with respect to the topology of the total variation metric is shown and the optimality of the Rationally Inattentive RRT* algorithm is discussed. Three numerical studies are presented which display the utility of the new algorithm.

preprint, 2020, arXiv

Scalable Synthesis of Minimum-Information Linear-Gaussian Control by Distributed Optimization

We consider a discrete-time linear-quadratic Gaussian control problem in which we minimize a weighted sum of the directed information from the state of the system to the control input and the control cost. The optimal control and sensing policies can be synthesized jointly by solving a semidefinite programming problem. However, the existing solutions typically scale cubically with the horizon length. We leverage the structure in the problem to develop a distributed algorithm that decomposes the synthesis problem into a set of smaller problems, one for each time step. We prove that the algorithm runs in time linear in the horizon length. As an application of the algorithm, we consider a path-planning problem in a state space with obstacles under the presence of stochastic disturbances. The algorithm computes a locally optimal solution that jointly minimizes the perception and control cost while ensuring the safety of the path. The numerical examples show that the algorithm can scale to horizon lengths in the thousands and compute locally optimal solutions.

preprint, 2020, arXiv

Sequential Source Coding for Stochastic Systems Subject to Finite Rate Constraints

In this paper, we revisit the sequential source coding framework to analyze fundamental performance limitations of discrete-time stochastic control systems subject to feedback data-rate constraints in finite-time horizon. The basis of our results is a new characterization of the lower bound on the minimum total-rate achieved by sequential codes subject to a total (across time) distortion constraint and a computational algorithm that optimally allocates the rate-distortion for any fixed finite-time horizon. This characterization facilitates the derivation of analytical, non-asymptotic, and finite-dimensional lower and upper bounds in two control-related scenarios. (a) A parallel time-varying Gauss-Markov process with identically distributed spatial components that is quantized and transmitted through a noiseless channel to a minimum mean-squared error (MMSE) decoder. (b) A time-varying quantized LQG closed-loop control system, with identically distributed spatial components and with a random data-rate allocation. Our non-asymptotic lower bound on the quantized LQG control problem reveals the absolute minimum data-rates for (mean square) stability of our time-varying plant for any fixed finite time horizon. We supplement our framework with illustrative simulation experiments.

preprint, 2020, arXiv

Transfer-Entropy-Regularized Markov Decision Processes

We consider the framework of transfer-entropy-regularized Markov Decision Process (TERMDP) in which the weighted sum of the classical state-dependent cost and the transfer entropy from the state random process to the control random process is minimized. Although TERMDPs are generally formulated as nonconvex optimization problems, we derive an analytical necessary optimality condition expressed as a finite set of nonlinear equations, based on which an iterative forward-backward computational procedure similar to the Arimoto-Blahut algorithm is proposed. It is shown that every limit point of the sequence generated by the proposed algorithm is a stationary point of the TERMDP. Applications of TERMDPs are discussed in the context of networked control systems theory and non-equilibrium thermodynamics. The proposed algorithm is applied to an information-constrained maze navigation problem, whereby we study how the price of information qualitatively alters the optimal decision policies.
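For readers unfamiliar with the Arimoto-Blahut algorithm mentioned above, the classical version for discrete channel capacity is sketched here. This is the textbook alternating update, not the paper's forward-backward TERMDP procedure; the function name `blahut_arimoto_capacity` and the binary symmetric channel used as input are assumptions for illustration.

```python
import math

def mutual_information_nats(p, channel):
    """I(X; Y) in nats for input distribution p and channel[x][y] = P(y | x)."""
    n_in, n_out = len(channel), len(channel[0])
    q_out = [sum(p[x] * channel[x][y] for x in range(n_in)) for y in range(n_out)]
    return sum(p[x] * channel[x][y] * math.log(channel[x][y] / q_out[y])
               for x in range(n_in) for y in range(n_out)
               if p[x] > 0.0 and channel[x][y] > 0.0)

def blahut_arimoto_capacity(channel, iterations=200):
    """Approximate the capacity (in bits) of a discrete memoryless channel."""
    n_in, n_out = len(channel), len(channel[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution
    for _ in range(iterations):
        q_out = [sum(p[x] * channel[x][y] for x in range(n_in)) for y in range(n_out)]
        # Divergence of each channel row from the current output distribution.
        d = [sum(channel[x][y] * math.log(channel[x][y] / q_out[y])
                 for y in range(n_out) if channel[x][y] > 0.0)
             for x in range(n_in)]
        unnormalized = [p[x] * math.exp(d[x]) for x in range(n_in)]
        total = sum(unnormalized)
        p = [u / total for u in unnormalized]
    return mutual_information_nats(p, channel) / math.log(2.0), p

# Binary symmetric channel with crossover probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
capacity_bits, input_dist = blahut_arimoto_capacity(bsc)
```

For a binary symmetric channel the result can be checked against the closed form 1 - H(0.1), where H is the binary entropy function; the uniform input distribution is already optimal there, so the iteration leaves it unchanged.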

preprint, 2016, arXiv

Incentivizing Truth-Telling in MPC-based Load Frequency Control

We present a mechanism for socially efficient implementation of model predictive control (MPC) algorithms for load frequency control (LFC) in the presence of self-interested power generators. Specifically, we consider a situation in which the system operator seeks to implement an MPC-based LFC for aggregated social cost minimization, but necessary information such as individual generators' cost functions is privately owned. Without appropriate monetary compensation mechanisms that incentivize truth-telling, self-interested market participants may be inclined to misreport their private parameters in an effort to maximize their own profits, which may result in a loss of social welfare. The main challenge in our framework arises from the fact that every participant's strategy at any time affects the future state of other participants; the consequences of such dynamic coupling have not been fully addressed in the literature on online mechanism design. We propose a class of real-time monetary compensation schemes that incentivize market participants to report their private parameters truthfully at every time step, which enables the system operator to implement MPC-based LFC in a socially optimal manner.

preprint, 2016, arXiv

Rate of Prefix-free Codes in LQG Control Systems

In this paper, we consider a discrete time linear quadratic Gaussian (LQG) control problem in which state information of the plant is encoded in a variable-length binary codeword at every time step, and a control input is determined based on the codewords generated in the past. We derive a lower bound of the rate achievable by the class of prefix-free codes attaining the required LQG control performance. This lower bound coincides with the infimum of a certain directed information expression, and is computable by semidefinite programming (SDP). Based on a technique by Silva et al., we also provide an upper bound of the best achievable rate by constructing a controller equipped with a uniform quantizer with subtractive dither and Shannon-Fano coding. The gap between the obtained lower and upper bounds is less than $0.754r+1$ bits per time step regardless of the required LQG control performance, where $r$ is the rank of a signal-to-noise ratio matrix obtained by SDP, which is no greater than the dimension of the state.
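The uniform quantizer with subtractive dither used in the upper-bound construction can be sketched as follows. This is a generic scalar illustration with hypothetical names and values (`dithered_quantize`, `step`, a Gaussian test signal); it omits the Shannon-Fano coding stage and the controller. The dither is drawn uniformly over one quantization cell and is assumed known to both encoder and decoder.

```python
import random

def dithered_quantize(x, step, dither):
    """Uniform quantizer with subtractive dither.

    The encoder would transmit the integer index round((x + dither) / step);
    the decoder rescales it and subtracts the same dither value.
    """
    return step * round((x + dither) / step) - dither

rng = random.Random(1)   # fixed seed so the run is repeatable
step = 0.5
errors = []
for _ in range(1000):
    x = rng.gauss(0.0, 2.0)
    d = rng.uniform(-step / 2, step / 2)   # shared by encoder and decoder
    errors.append(dithered_quantize(x, step, d) - x)
```

Every reconstruction error is at most half a step in magnitude, and with this choice of dither the error is uniformly distributed and independent of the input, which is what makes the quantization noise easy to analyze.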

preprint, 2015, arXiv

Faithful Implementations of Distributed Algorithms and Control Laws

When a distributed algorithm must be executed by strategic agents with misaligned interests, a social leader needs to introduce an appropriate tax/subsidy mechanism to incentivize agents to faithfully implement the intended algorithm so that a correct outcome is obtained. We discuss the incentive issues of implementing economically efficient distributed algorithms using the framework of indirect mechanism design theory. In particular, we show that indirect Groves mechanisms are not only sufficient but also necessary to achieve incentive compatibility. This result can be viewed as a generalization of the Green-Laffont theorem to indirect mechanisms. Then we introduce the notion of asymptotic incentive compatibility as an appropriate solution concept to faithfully implement distributed and iterative optimization algorithms. We consider two special types of optimization algorithms: dual decomposition algorithms for resource allocation and average consensus algorithms.
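As background for the second algorithm type named above, a plain average consensus iteration is sketched here. It assumes every agent follows the update faithfully, so it models none of the incentive or tax/subsidy issues that the paper studies; the ring topology, the step size, and the name `consensus_step` are illustrative assumptions.

```python
def consensus_step(values, neighbors, step_size):
    """Each agent moves toward its neighbors' values by a fixed step size."""
    return [v + step_size * sum(values[j] - v for j in neighbors[i])
            for i, v in enumerate(values)]

# Hypothetical ring of four agents; the update preserves the average of the values.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
values = [4.0, 0.0, 2.0, 6.0]
average = sum(values) / len(values)
for _ in range(60):
    values = consensus_step(values, neighbors, 0.25)
```

An agent that misreports its value or skips an update shifts the limit away from the true average, which is the kind of deviation a faithful-implementation mechanism is meant to discourage.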

preprint, 2015, arXiv

SDP-based Joint Sensor and Controller Design for Information-regularized Optimal LQG Control

We consider a joint sensor and controller design problem for linear Gaussian stochastic systems in which a weighted sum of quadratic control cost and the amount of information acquired by the sensor is minimized. This problem formulation is motivated by situations where a control law must be designed in the presence of sensing, communication, and privacy constraints. We show that the optimal joint sensor-controller design is relatively easy when the sensing policy is restricted to be linear. Namely, an explicit form of the optimal linear sensor equation, the Kalman filter, and the certainty equivalence controller that jointly solves the problem can be efficiently found by semidefinite programming (SDP). Whether the linearity assumption in our design is restrictive or not is currently an open problem.

preprint, 2014, arXiv

Optimal Output Feedback Architecture for Triangular LQG Problems

Distributed control problems under some specific information constraints can be formulated as (possibly infinite dimensional) convex optimization problems. The underlying motivation of this work is to develop an understanding of the optimal decision making architecture for such problems. In this paper, we particularly focus on the N-player triangular LQG problems and show that the optimal output feedback controllers have attractive state space realizations. The optimal controller can be synthesized using a set of stabilizing solutions to 2N linearly coupled algebraic Riccati equations, which turn out to be easily solvable under reasonable assumptions.

preprint, 2013, arXiv

A Faithful Distributed Implementation of Dual Decomposition and Average Consensus Algorithms

We consider large scale cost allocation problems and consensus seeking problems for multiple agents, in which agents are asked to collaborate in a distributed algorithm to find a solution. If agents act strategically to minimize their own individual cost rather than the global social cost, they are endowed with an incentive not to follow the intended algorithm, unless the tax/subsidy mechanism is carefully designed. Inspired by the classical Vickrey-Clarke-Groves mechanism and more recent algorithmic mechanism design theory, we propose a tax mechanism that incentivises agents to faithfully implement the intended algorithm. In particular, a new notion of asymptotic incentive compatibility is introduced to characterize a desirable property of such a class of mechanisms. The proposed class of tax mechanisms provides a sequence of mechanisms that gives agents a diminishing incentive to deviate from the suggested algorithm.
