Researcher profile

Toru Takisaka

Toru Takisaka contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2022arXiv

Dynamic Shielding for Reinforcement Learning in Black-Box Environments

It is challenging to use reinforcement learning (RL) in cyber-physical systems due to the lack of safety guarantees during learning. Although there have been various proposals to reduce undesired behaviors during learning, most of these techniques require prior system knowledge, and their applicability is limited. This paper aims to reduce undesired behaviors during learning without requiring any prior system knowledge. We propose dynamic shielding: an extension of a model-based safe RL technique called shielding using automata learning. The dynamic shielding technique constructs an approximate system model in parallel with RL using a variant of the RPNI algorithm and suppresses undesired explorations due to the shield constructed from the learned model. Through this combination, potentially unsafe actions can be foreseen before the agent experiences them. Experiments show that our dynamic shield significantly decreases the number of undesired events during training.

preprint2021arXiv

Moment Propagation of Discrete-Time Stochastic Polynomial Systems using Truncated Carleman Linearization

We propose a method to compute an approximation of the moments of a discrete-time stochastic polynomial system. We use the Carleman linearization technique to transform this finite-dimensional polynomial system into an infinite-dimensional linear one. After taking expectation and truncating the induced deterministic dynamics, we obtain a finite-dimensional linear deterministic system, which we then use to iteratively compute approximations of the moments of the original polynomial system at different time steps. We provide upper bounds on the approximation error for each moment and show that, for large enough truncation limits, the proposed method precisely computes moments for sufficiently small degrees and numbers of time steps. We use our proposed method for safety analysis to compute bounds on the probability of the system state being outside a given safety region. Finally, we illustrate our results on two concrete examples, a stochastic logistic map and a vehicle dynamics under stochastic disturbance.

preprint2020arXiv

Widest Paths and Global Propagation in Bounded Value Iteration for Stochastic Games

Solving stochastic games with the reachability objective is a fundamental problem, especially in quantitative verification and synthesis. For this purpose, bounded value iteration (BVI) attracts attention as an efficient iterative method. However, BVI's performance is often impeded by costly end component (EC) computation that is needed to ensure convergence. Our contribution is a novel BVI algorithm that conducts, in addition to local propagation by the Bellman update that is typical of BVI, global propagation of upper bounds that is not hindered by ECs. To conduct global propagation in a computationally tractable manner, we construct a weighted graph and solve the widest path problem in it. Our experiments show the algorithm's performance advantage over the previous BVI algorithms that rely on EC computation.