Researcher profile

Zhongyao Ma

Zhongyao Ma contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
3 topics
4 close collaborators

Published work

2 published items

Preprint, 2022, arXiv

Multi-ship cooperative air defense model based on queuing theory

The study of multi-ship air defense models is of great significance for simulating and evaluating actual combat processes, demonstrating air defense tactics, and improving the security of important targets. Traditional multi-ship air defense models do not consider coordination between ships, and their assumptions are often too simple to describe the capabilities of a multi-ship cooperative air defense system in realistic combat scenarios. To address these problems, this paper proposes a multi-ship cooperative air defense model that integrates the attack and defense parameters of both sides, such as missile launch rate, missile flight speed, missile launch direction, ship interception rate, ship interception range, and the number of ship interception fire units. The cooperative interception capability among ships is then modeled by task assignment. Based on queuing theory, the paper rigorously derives the penetration probability of the cooperative air defense system and provides an analytical calculation model for the analysis and design of such systems. Finally, through simulation experiments in typical scenarios, the paper compares the air defense capabilities of the system with and without coordination and verifies the superiority of the multi-ship cooperative air defense model in reducing the probability of missile penetration. It further studies how the capability of the defense system changes under different parameters, such as missile launch rate, speed, and angle, and ship interception rate, range, and number of fire units, and identifies the weak points of the defense formation, defense range settings, and interception settings.
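To give a feel for how a queuing model can yield a penetration probability, the sketch below uses the textbook Erlang B loss formula. This is an illustrative assumption, not the model derived in the paper: missiles are assumed to arrive as a Poisson stream, each fire unit is busy for an exponentially distributed time per engagement, and a missile that finds every fire unit busy is counted as penetrating. The function name `erlang_b` and all parameter values are hypothetical.

```python
def erlang_b(offered_load: float, servers: int) -> float:
    """Blocking probability of an M/M/c/c loss system (Erlang B).

    offered_load: arrival rate divided by service rate (in Erlangs).
    servers: number of fire units; no waiting room is assumed.
    """
    if offered_load < 0 or servers < 0:
        raise ValueError("offered_load and servers must be non-negative")
    blocking = 1.0  # with zero servers every arrival is blocked
    for k in range(1, servers + 1):
        # Standard numerically stable recursion on the number of servers.
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


# Hypothetical comparison: two ships, one fire unit each, one Erlang of load each.
independent = erlang_b(offered_load=1.0, servers=1)  # each ship defends alone
pooled = erlang_b(offered_load=2.0, servers=2)       # ships share both fire units
print(independent, pooled)
```

Under these toy assumptions, pooling the fire units lowers the blocking probability, which matches the direction of the coordination benefit the abstract reports; the paper's own model additionally accounts for flight speed, launch direction, and interception range.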

Preprint, 2020, arXiv

Multi-Agent Reinforcement Learning in a Realistic Limit Order Book Market Simulation

Optimal order execution is widely studied by industry practitioners and academic researchers because it determines the profitability of investment decisions and high-level trading strategies, particularly those involving large volumes of orders. However, complex and unknown market dynamics pose significant challenges for the development and validation of optimal execution strategies. In this paper, we propose a model-free approach by training Reinforcement Learning (RL) agents in a realistic market simulation environment with multiple agents. First, we configure a multi-agent historical order book simulation environment for execution tasks built on an Agent-Based Interactive Discrete Event Simulation (ABIDES) [arXiv:1904.12066]. Second, we formulate the problem of optimal execution in an RL setting where an intelligent agent can make order execution and placement decisions based on market microstructure trading signals in High Frequency Trading (HFT). Third, we develop and train an RL execution agent using the Double Deep Q-Learning (DDQL) algorithm in the ABIDES environment. In some scenarios, our RL agent converges towards a Time-Weighted Average Price (TWAP) strategy. Finally, we evaluate the simulation with our RL agent by comparing it with a market replay simulation using real market Limit Order Book (LOB) data.
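Two of the ideas named in this abstract can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' implementation and not part of ABIDES: `twap_schedule` splits a parent order into equal child orders over evenly spaced intervals, and `double_dqn_target` computes the standard Double Q-learning bootstrap target, where the online network selects the next action and the target network evaluates it. Both function names and all inputs are hypothetical.

```python
from typing import List, Sequence


def twap_schedule(total_qty: int, n_slices: int) -> List[int]:
    """Split a parent order into n_slices near-equal child orders."""
    if n_slices <= 0:
        raise ValueError("n_slices must be positive")
    base, extra = divmod(total_qty, n_slices)
    # The first `extra` slices carry one additional share so the sum is exact.
    return [base + 1 if i < extra else base for i in range(n_slices)]


def double_dqn_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool,
) -> float:
    """Double Q-learning target for a single transition."""
    if done:
        return reward  # no bootstrapping past a terminal state
    # Action selection uses the online network's estimates...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...while action evaluation uses the target network's estimate.
    return reward + gamma * q_target_next[best_action]


print(twap_schedule(1000, 3))
print(double_dqn_target(1.0, 0.5, [0.2, 0.9], [4.0, 2.0], done=False))
```

Separating selection from evaluation is what distinguishes the double variant from plain deep Q-learning, which would take the maximum over the target network's own estimates.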