Source author record

Jianbing Ding

Jianbing Ding appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

2 works
1 topic
4 close collaborators

Published work

2 published items

Preprint, 2015, arXiv

DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams

In a data stream management system (DSMS), users register continuous queries and receive result updates as data arrive and expire. We focus on applications with real-time constraints, in which the user must receive each result update within a given period after the update occurs. To handle fast data, the DSMS is commonly placed on top of a cloud infrastructure. Because stream properties such as arrival rates can fluctuate unpredictably, cloud resources must be dynamically provisioned and scheduled accordingly to ensure real-time response. It is therefore essential for existing systems and future developments to schedule resources dynamically according to the current workload, in order to avoid wasting resources or failing to deliver correct results on time. Motivated by this, we propose DRS, a novel dynamic resource scheduler for cloud-based DSMSs. DRS overcomes three fundamental challenges: (a) how to model the relationship between the provisioned resources and query response time; (b) where to best place resources; and (c) how to measure system load with minimal overhead. In particular, DRS includes an accurate performance model based on the theory of Jackson open queueing networks and is capable of handling arbitrary operator topologies, possibly with loops, splits and joins. Extensive experiments with real data confirm that DRS achieves real-time response with close to optimal resource consumption.
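To make the queueing-network idea in this abstract concrete, the following is a minimal sketch of how a Jackson open network can relate provisioned processors to mean response time. It is not the DRS implementation: the function names, the choice to model each operator as an M/M/k station, and the fixed-point solver are all assumptions made for illustration.

```python
from math import factorial


def solve_traffic(gamma, routing, iters=10_000, tol=1e-12):
    """Solve lambda_j = gamma_j + sum_i lambda_i * routing[i][j] by fixed-point iteration.

    gamma[j] is the external arrival rate at operator j; routing[i][j] is the
    probability that a tuple leaving operator i goes to operator j (loops allowed).
    """
    n = len(gamma)
    lam = list(gamma)
    for _ in range(iters):
        new = [gamma[j] + sum(lam[i] * routing[i][j] for i in range(n))
               for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, lam)) < tol:
            return new
        lam = new
    raise RuntimeError("traffic equations did not converge")


def mmk_sojourn(lam, mu, k):
    """Mean time a tuple spends at an M/M/k station (Erlang C); inf if overloaded."""
    a = lam / mu                      # offered load
    rho = a / k                       # utilization per processor
    if rho >= 1:
        return float("inf")
    tail = a ** k / (factorial(k) * (1 - rho))
    p_wait = tail / (sum(a ** n / factorial(n) for n in range(k)) + tail)
    return p_wait / (k * mu - lam) + 1 / mu


def mean_response_time(gamma, routing, mu, servers):
    """Mean end-to-end sojourn time of the network, via Little's law."""
    lam = solve_traffic(gamma, routing)
    return sum(l * mmk_sojourn(l, m, k)
               for l, m, k in zip(lam, mu, servers)) / sum(gamma)
```

Under these assumptions, a scheduler could evaluate `mean_response_time` for candidate values of `servers` and compare the result against the real-time constraint before changing the allocation.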

Preprint, 2015, arXiv

Optimal Operator State Migration for Elastic Data Stream Processing

A cloud-based data stream management system (DSMS) handles fast data by utilizing the massively parallel processing capabilities of the underlying platform. An important property of such a DSMS is elasticity, meaning that nodes can be dynamically added to or removed from an application to match the latter's workload, which may fluctuate in an unpredictable manner. For an application involving stateful operations such as aggregates, the addition or removal of nodes necessitates the migration of operator states. Although the importance of migration has been recognized in existing systems, two key problems remain largely neglected, namely how to migrate and what to migrate, i.e., the migration mechanism that reduces synchronization overhead and result delay during migration, and the selection of the optimal task assignment that minimizes migration costs. Consequently, migration in current systems typically incurs a high spike in result delay caused by expensive synchronization barriers and suboptimal task assignments. Motivated by this, we present the first comprehensive study on efficient operator state migration, and propose designs and algorithms that enable live, progressive, and optimized migrations. Extensive experiments using real data justify our performance claims.
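The "what to migrate" question in this abstract can be illustrated with a small sketch that picks the task assignment moving the least operator state. This is an exhaustive search for tiny instances only, not the paper's algorithm: the cost definition (total state size of tasks that change node), the per-node task limit, and all names are assumptions made for illustration.

```python
from itertools import product


def migration_cost(old, new, state_size):
    """Total state size of the tasks whose node differs between old and new."""
    return sum(state_size[t] for t in old if old[t] != new[t])


def best_assignment(old, state_size, nodes, max_tasks_per_node):
    """Try every assignment of tasks to nodes and keep the cheapest feasible one.

    old maps task -> current node; nodes is the node set after scaling.
    An assignment is feasible if no node holds more than max_tasks_per_node tasks.
    """
    tasks = sorted(old)
    best, best_cost = None, float("inf")
    for choice in product(nodes, repeat=len(tasks)):
        if any(choice.count(n) > max_tasks_per_node for n in nodes):
            continue  # violates the (hypothetical) load-balance limit
        candidate = dict(zip(tasks, choice))
        cost = migration_cost(old, candidate, state_size)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


# Scale-out example: node B joins, and each node may hold at most two tasks.
old = {"t1": "A", "t2": "A", "t3": "A", "t4": "A"}
sizes = {"t1": 5, "t2": 1, "t3": 3, "t4": 2}
plan, cost = best_assignment(old, sizes, ["A", "B"], max_tasks_per_node=2)
```

In this example the two tasks with the smallest state move to the new node, so the larger states stay in place. The search is exponential in the number of tasks, so it only serves to define the objective.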