Researcher profile

Zhimin Yao

Zhimin Yao contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - Baseline
2works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2013arXiv

High Performance Risk Aggregation: Addressing the Data Processing Challenge the Hadoop MapReduce Way

Monte Carlo simulations employed for the analysis of portfolios of catastrophic risk process large volumes of data. Often times these simulations are not performed in real-time scenarios as they are slow and consume large data. Such simulations can benefit from a framework that exploits parallelism for addressing the computational challenge and facilitates a distributed file system for addressing the data challenge. To this end, the Apache Hadoop framework is chosen for the simulation reported in this paper so that the computational challenge can be tackled using the MapReduce model and the data challenge can be addressed using the Hadoop Distributed File System. A parallel algorithm for the analysis of aggregate risk is proposed and implemented using the MapReduce model in this paper. An evaluation of the performance of the algorithm indicates that the Hadoop MapReduce model offers a framework for processing large data in aggregate risk analysis. A simulation of aggregate risk employing 100,000 trials with 1000 catastrophic events per trial on a typical exposure set and contract structure is performed on multiple worker nodes in less than 6 minutes. The result indicates the scope and feasibility of MapReduce for tackling the computational and data challenge in the analysis of aggregate risk for real-time use.

preprint2013arXiv

QuPARA: Query-Driven Large-Scale Portfolio Aggregate Risk Analysis on MapReduce

Stochastic simulation techniques are used for portfolio risk analysis. Risk portfolios may consist of thousands of reinsurance contracts covering millions of insured locations. To quantify risk each portfolio must be evaluated in up to a million simulation trials, each capturing a different possible sequence of catastrophic events over the course of a contractual year. In this paper, we explore the design of a flexible framework for portfolio risk analysis that facilitates answering a rich variety of catastrophic risk queries. Rather than aggregating simulation data in order to produce a small set of high-level risk metrics efficiently (as is often done in production risk management systems), the focus here is on allowing the user to pose queries on unaggregated or partially aggregated data. The goal is to provide a flexible framework that can be used by analysts to answer a wide variety of unanticipated but natural ad hoc queries. Such detailed queries can help actuaries or underwriters to better understand the multiple dimensions (e.g., spatial correlation, seasonality, peril features, construction features, and financial terms) that can impact portfolio risk. We implemented a prototype system, called QuPARA (Query-Driven Large-Scale Portfolio Aggregate Risk Analysis), using Hadoop, which is Apache's implementation of the MapReduce paradigm. This allows the user to take advantage of large parallel compute servers in order to answer ad hoc risk analysis queries efficiently even on very large data sets typically encountered in practice. We describe the design and implementation of QuPARA and present experimental results that demonstrate its feasibility. A full portfolio risk analysis run consisting of a 1,000,000 trial simulation, with 1,000 events per trial, and 3,200 risk transfer contracts can be completed on a 16-node Hadoop cluster in just over 20 minutes.