Source author record

Yogesh Simmhan

Yogesh Simmhan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Distributed, Parallel, and Cluster Computing; cs.CY; Robotics; Computational Engineering, Finance, and Science; Databases; Networking and Internet Architecture; Social and Information Networks; Systems and Control

Catalog footprint

What is connected

18 works

8 topics

4 close collaborators


Preprint, 2026, arXiv

RIPPLE++: An Incremental Framework for Efficient GNN Inference on Evolving Graphs

Real-world graphs are dynamic, with frequent updates to their structure and features due to evolving vertex and edge properties. These continual changes pose significant challenges for efficient inference in graph neural networks (GNNs). Existing vertex-wise and layer-wise inference approaches are ill-suited for dynamic graphs, as they incur redundant computations, large neighborhood traversals, and high communication costs, especially in distributed settings. Additionally, while sampling-based approaches can be adopted to approximate final layer embeddings, these are often not preferred in critical applications due to their non-determinism. These limitations hinder low-latency inference required in real-time applications. To address this, we propose RIPPLE++, a framework for streaming GNN inference that efficiently and accurately updates embeddings in response to changes in the graph structure or features. RIPPLE++ introduces a generalized incremental programming model that captures the semantics of GNN aggregation functions and incrementally propagates updates to affected neighborhoods. RIPPLE++ accommodates all common graph updates, including vertex/edge additions/deletions and vertex feature updates. RIPPLE++ supports both single-machine and distributed deployments. On a single machine, it achieves up to $56$K updates/sec on sparse graphs like Arxiv ($169$K vertices, $1.2$M edges), and about $7.6$K updates/sec on denser graphs like Products ($2.5$M vertices, $123.7$M edges), with latencies of $0.06$--$960$ms, outperforming state-of-the-art baselines by $2.2$--$24\times$ on throughput. In distributed settings, RIPPLE++ offers up to $\approx25\times$ higher throughput and $20\times$ lower communication costs compared to recomputing baselines.
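The idea of propagating only the change to affected neighbors, rather than recomputing an aggregate from scratch, can be illustrated with a minimal sketch. It assumes a single layer, scalar features and a plain sum aggregation; the class and method names (`IncrementalSumAggregator`, `add_edge`, `update_feature`, and so on) are hypothetical and are not the RIPPLE++ API.

```python
from collections import defaultdict


class IncrementalSumAggregator:
    """Maintains, for every vertex, the sum of its in-neighbors' features.

    Illustrative only: one layer, scalar features, sum aggregation.
    """

    def __init__(self):
        self.features = {}                      # vertex -> scalar feature
        self.out_neighbors = defaultdict(set)   # u -> vertices that u points to
        self.agg = defaultdict(float)           # v -> sum of in-neighbor features

    def add_vertex(self, v, feature):
        self.features[v] = feature

    def add_edge(self, u, v):
        if v in self.out_neighbors[u]:
            return
        self.out_neighbors[u].add(v)
        self.agg[v] += self.features[u]         # only v's aggregate changes

    def delete_edge(self, u, v):
        if v not in self.out_neighbors[u]:
            return
        self.out_neighbors[u].remove(v)
        self.agg[v] -= self.features[u]

    def update_feature(self, u, new_feature):
        delta = new_feature - self.features[u]
        self.features[u] = new_feature
        for v in self.out_neighbors[u]:         # push the delta to affected neighbors
            self.agg[v] += delta

    def recompute(self, v):
        """Full recomputation, used here only to check the incremental result."""
        return sum(f for u, f in self.features.items() if v in self.out_neighbors[u])


g = IncrementalSumAggregator()
for name, feat in [("a", 1.0), ("b", 2.0), ("c", 3.0)]:
    g.add_vertex(name, feat)
g.add_edge("a", "c")
g.add_edge("b", "c")
g.update_feature("a", 5.0)
g.delete_edge("b", "c")
print(g.agg["c"], g.recompute("c"))
```

Each update touches only the out-neighbors of the changed vertex, which is the property that makes the incremental approach attractive when updates are small relative to the graph.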

Preprint, 2022, arXiv

CORNET 2.0: A Co-Simulation Middleware for Robot Networks

We present a networked co-simulation framework for multi-robot systems applications. We require a simulation framework that captures both physical interactions and communications aspects to effectively design such complex systems. This is necessary to co-design the multi-robots' autonomy logic and the communication protocols. The proposed framework extends existing tools to simulate the robot's autonomy and network-related aspects. We have used Gazebo with ROS/ROS2 to develop the autonomy logic for robots and mininet-WiFi as the network simulator to capture the cyber-physical systems properties of the multi-robot system. This framework addresses the need to seamlessly integrate the two simulation environments by synchronizing mobility and time, allowing for easy migration of the algorithms to real platforms. The framework supports container-based virtualization and extends a generic robotic framework by decoupling the data plane and control plane.

Preprint, 2022, arXiv

Resilient Execution of Data-triggered Applications on Edge, Fog and Cloud Resources

The Internet of Things (IoT) is leading to the pervasive availability of streaming data about the physical world, coupled with edge computing infrastructure deployed as part of smart cities and 5G rollout. These constrained, less reliable but cheap resources are complemented by fog resources that offer federated management and accelerated computing, and pay-as-you-go cloud resources. There is a lack of intuitive means to deploy application pipelines to consume such diverse streams, and to execute them reliably on edge and fog resources. We propose an innovative application model to declaratively specify queries to match streams of micro-batch data from stream sources and trigger the distributed execution of data pipelines. We also design a resilient scheduling strategy using advanced reservation on reliable fogs to guarantee dataflow completion within a deadline while minimizing the execution cost. Our detailed experiments on over 100 virtual IoT resources and for $\approx 10k$ task executions, with comparison against baseline scheduling strategies, illustrate the cost-effectiveness, resilience and scalability of our framework.

Preprint, 2021, arXiv

DiPETrans: A Framework for Distributed Parallel Execution of Transactions of Blocks in Blockchain

Contemporary blockchains such as Bitcoin and Ethereum have miners and validators execute transactions serially and determine the Proof-of-Work (PoW). Such serial execution is unable to exploit modern multi-core resources efficiently, hence limiting the system throughput and increasing the transaction acceptance latency. The objective of this work is to increase the transaction throughput by introducing parallel transaction execution using a static analysis technique. We propose a framework DiPETrans for the distributed execution of the transactions in a block. Here, peers in the blockchain network form a community to execute the transactions and find the PoW in parallel, using a leader-follower approach. During mining, the leader statically analyzes the transactions, creates different groups (shards) of independent transactions, and distributes them to followers to execute them in parallel. After the transactions execute, the community's compute power is utilized to solve the PoW concurrently. When a block is successfully created, the leader broadcasts the proposed block to other peers in the network for validation. On receiving a block, validators re-execute the block transactions and accept the block if they reach the same state as shared by the miner. Validation can also be done as a community, in parallel, following the same leader-follower approach as mining. We report experiments using over 5 million real transactions from the Ethereum blockchain and execute them using our DiPETrans framework to empirically validate the benefits of our techniques over traditional sequential execution. We achieve a maximum speedup of 2.2x for the miner and 2.0x for the validator, with 100 to 500 transactions per block. Further, we achieve a peak of 5x end-to-end block creation speedup using a parallel miner over a serial miner when using 6 machines in the community.
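One plausible way to form groups of independent transactions is sketched below. It assumes that two transactions conflict when they touch a common account, and groups them with a union-find structure over account identifiers; this conflict rule, the function name `shard_transactions` and the input shape are illustrative assumptions, not details confirmed by the abstract.

```python
def shard_transactions(transactions):
    """Group transactions so that no account is shared across groups.

    transactions: list of (tx_id, set_of_accounts), each set non-empty.
    Returns a list of shards, each a list of tx_ids in input order.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Accounts touched by the same transaction belong to the same component.
    for _, accounts in transactions:
        ordered = sorted(accounts)
        find(ordered[0])
        for acct in ordered[1:]:
            union(ordered[0], acct)

    # Transactions whose accounts fall in the same component share a shard.
    shards = {}
    for tx_id, accounts in transactions:
        root = find(sorted(accounts)[0])
        shards.setdefault(root, []).append(tx_id)
    return list(shards.values())


txs = [
    ("t1", {"A", "B"}),
    ("t2", {"B", "C"}),   # shares B with t1
    ("t3", {"D", "E"}),
    ("t4", {"E"}),        # shares E with t3
]
print(shard_transactions(txs))
```

Shards produced this way have no accounts in common, so a leader could hand each one to a different follower without cross-shard ordering constraints, while transactions inside a shard keep their original order.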

Preprint, 2021, arXiv

Heuristic Algorithms for Co-scheduling of Edge Analytics and Routes for UAV Fleet Missions

Unmanned Aerial Vehicles (UAVs) or drones are increasingly used for urban applications like traffic monitoring and construction surveys. Autonomous navigation allows drones to visit waypoints and accomplish activities as part of their mission. A common activity is to hover and observe a location using on-board cameras. Advances in Deep Neural Networks (DNNs) allow such videos to be analyzed for automated decision making. UAVs also host edge computing capability for on-board inferencing by such DNNs. To this end, for a fleet of drones, we propose a novel Mission Scheduling Problem (MSP) that co-schedules the flight routes to visit and record video at waypoints, and their subsequent on-board edge analytics. The proposed schedule maximizes the utility from the activities while meeting activity deadlines as well as energy and computing constraints. We first prove that MSP is NP-hard and then optimally solve it by formulating a mixed integer linear programming (MILP) problem. Next, we design two efficient heuristic algorithms, JSC and VRC, that provide fast sub-optimal solutions. Evaluation of these three schedulers using real drone traces demonstrates utility-runtime trade-offs under diverse workloads.

Preprint, 2020, arXiv

A Distributed Path Query Engine for Temporal Property Graphs

Property graphs are a common form of linked data, with path queries used to traverse and explore them for enterprise transactions and mining. Temporal property graphs are a recent variant where time is a first-class entity to be queried over, and their properties and structure vary over time. These are seen in social, telecom, transit and epidemic networks. However, current graph databases and query engines have limited support for temporal relations among graph entities, no support for time-varying entities and/or do not scale on distributed resources. We address this gap by extending a linear path query model over property graphs to include intuitive temporal predicates and aggregation operators over temporal graphs. We design a distributed execution model for these temporal path queries using the interval-centric computing model, and develop a novel cost model to select an efficient execution plan from several. We perform detailed experiments of our Granite distributed query engine using both static and dynamic temporal property graphs as large as 52M vertices, 218M edges and 325M properties, and a 1600-query workload, derived from the LDBC benchmark. We often offer sub-second query latencies on a commodity cluster, which is 149x-1140x faster compared to industry-leading Neo4J shared-memory graph database and the JanusGraph / Spark distributed graph query engine. Granite also completes 100% of the queries for all graphs, compared to only 32-92% workload completion by the baseline systems. Further, our cost model selects a query plan that is within 10% of the optimal execution time in 90% of the cases. Despite the irregular nature of graph processing, we exhibit a weak-scaling efficiency >= 60% on 8 nodes and >= 40% on 16 nodes, for most query workloads.

Preprint, 2020, arXiv

A Scalable Platform for Distributed Object Tracking across a Many-camera Network

Advances in deep neural networks (DNN) and computer vision (CV) algorithms have made it feasible to extract meaningful insights from large-scale deployments of urban cameras. Tracking an object of interest across the camera network in near real-time is a canonical problem. However, current tracking platforms have two key limitations: 1) They are monolithic, proprietary and lack the ability to rapidly incorporate sophisticated tracking models; and 2) They are less responsive to dynamism across wide-area computing resources that include edge, fog and cloud abstractions. We address these gaps using Anveshak, a runtime platform for composing and coordinating distributed tracking applications. It provides a domain-specific dataflow programming model to intuitively compose a tracking application, supporting contemporary CV advances like query fusion and re-identification, and enabling dynamic scoping of the camera network's search space to avoid wasted computation. We also offer tunable batching and data-dropping strategies for dataflow blocks deployed on distributed resources to respond to network and compute variability. These balance the tracking accuracy, its real-time performance and the active camera-set size. We illustrate the concise expressiveness of the programming model for $4$ tracking applications. Our detailed experiments for a network of 1000 camera-feeds on modest resources exhibit the tunable scalability, performance and quality trade-offs enabled by our dynamic tracking, batching and dropping strategies.

Preprint, 2020, arXiv

GoCoronaGo: Privacy Respecting Contact Tracing for COVID-19 Management

The COVID-19 pandemic is imposing enormous global challenges in managing the spread of the virus. A key pillar of mitigation is contact tracing, which complements testing and isolation. Digital apps for contact tracing using Bluetooth technology available in smartphones have gained prevalence globally. In this article, we discuss various capabilities of such digital contact tracing, and its implications for community safety and individual privacy, among others. We further describe the GoCoronaGo institutional contact tracing app that we have developed, and the conscious and sometimes contrarian design choices we have made. We offer a detailed overview of the app, backend platform and analytics, and our early experiences with deploying the app to over 1000 users within the Indian Institute of Science campus in Bangalore. We also highlight research opportunities and open challenges for digital contact tracing and analytics over temporal networks constructed from them.

Preprint, 2019, arXiv

A Partition-centric Distributed Algorithm for Identifying Euler Circuits in Large Graphs

Finding the Eulerian circuit in graphs is a classic problem, but is inadequately explored for parallel computation. With such cycles finding use in neuroscience and Internet of Things for large graphs, designing a distributed algorithm for finding the Euler circuit is important. Existing parallel algorithms are impractical for commodity clusters and Clouds. We propose a novel partition-centric algorithm to find the Euler circuit, over large graphs partitioned across distributed machines and executed iteratively using a Bulk Synchronous Parallel (BSP) model. The algorithm finds partial paths and cycles within each partition, and refines these into longer paths by recursively merging the partitions. We describe the algorithm, analyze its complexity, validate it on Apache Spark for large graphs, and offer experimental results. We also identify memory bottlenecks in the algorithm and propose an enhanced design to address them.
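For readers unfamiliar with the underlying problem, the sketch below shows the classic sequential approach (Hierholzer's algorithm) on a small undirected graph. It is not the partition-centric distributed algorithm described above; it only illustrates what an Euler circuit is and how a single machine can find one. The function name `euler_circuit` and the edge-list input are illustrative choices.

```python
from collections import defaultdict


def euler_circuit(edges):
    """Return an Euler circuit as a vertex list for a non-empty, connected,
    undirected multigraph in which every vertex has even degree."""
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    if any(len(nbrs) % 2 for nbrs in adj.values()):
        raise ValueError("every vertex must have even degree")

    used = [False] * len(edges)
    stack = [edges[0][0]]
    circuit = []
    while stack:
        u = stack[-1]
        # Discard edges already traversed from the other endpoint.
        while adj[u] and used[adj[u][-1][1]]:
            adj[u].pop()
        if adj[u]:
            v, idx = adj[u].pop()
            used[idx] = True
            stack.append(v)              # extend the current trail
        else:
            circuit.append(stack.pop())  # dead end: emit vertex and backtrack

    if len(circuit) != len(edges) + 1:
        raise ValueError("graph is not connected")
    return circuit[::-1]


# Two triangles sharing vertex 0.
print(euler_circuit([(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]))
```

The sequential version needs the whole graph in one address space, which is the limitation a partitioned, iteratively merged formulation is meant to avoid.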

Preprint, 2019, arXiv

Characterizing Application Scheduling on Edge, Fog and Cloud Computing Resources

Cloud computing has grown to become a popular distributed computing service offered by commercial providers. More recently, Edge and Fog computing resources have emerged on the wide-area network as part of Internet of Things (IoT) deployments. These three resource abstraction layers are complementary, and provide distinctive benefits. Scheduling applications on clouds has been an active area of research, with workflow and dataflow models serving as a flexible abstraction to specify applications for execution. However, the application programming and scheduling models for edge and fog are still maturing, and can benefit from learnings on cloud resources. At the same time, there is also value in using these resources cohesively for application execution. In this article, we present a taxonomy of concepts essential for specifying and solving the problem of scheduling applications on edge, fog and cloud computing resources. We first characterize the resource capabilities and limitations of these infrastructures, and design a taxonomy of application models, Quality of Service (QoS) constraints and goals, and scheduling techniques, based on a literature review. We also tabulate key research prototypes and papers using this taxonomy. This survey benefits developers and researchers on these distributed resources in designing and categorizing their applications, selecting the relevant computing abstraction(s), and developing or selecting the appropriate scheduling algorithm. It also highlights gaps in literature where open problems remain.

Preprint, 2019, arXiv

ElfStore: A Resilient Data Storage Service for Federated Edge and Fog Resources

Edge and fog computing have grown popular as IoT deployments become widespread. While application composition and scheduling on such resources are being explored, there exists a gap in distributed data storage services for the edge and fog layers, with applications instead depending solely on the cloud for data persistence. Such a service should reliably store and manage data on fog and edge devices, even in the presence of failures, and offer transparent discovery and access to data for use by edge computing applications. Here, we present ElfStore, a first-of-its-kind edge-local federated store for streams of data blocks. It uses reliable fog devices as a super-peer overlay to monitor the edge resources, offers federated metadata indexing using Bloom filters, locates data within 2-hops, and maintains approximate global statistics about the reliability and storage capacity of edges. Edges host the actual data blocks, and we use a unique differential replication scheme to select edges on which to replicate blocks, to guarantee a minimum reliability and to balance storage utilization. Our experiments on two IoT virtual deployments with 20 and 272 devices show that ElfStore has low overheads, is bound only by the network bandwidth, has scalable performance, and offers tunable resilience.
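The Bloom filter mentioned above is a compact set summary that can answer "possibly present" or "definitely absent". A minimal sketch follows, assuming SHA-256-derived hash positions and string keys; the class name, parameters and the example key format are hypothetical and do not reflect how ElfStore encodes its metadata.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# A fog node could summarize the metadata of blocks it knows about
# (the key format here is made up for illustration).
index = BloomFilter()
index.add("sensor_type=temperature")
print(index.might_contain("sensor_type=temperature"))
```

Because the filter is a small bit array, summaries like this are cheap to exchange between peers; a negative answer rules a peer out, while a positive answer still has to be confirmed against the actual metadata.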

Preprint, 2016, arXiv

Introducing Distributed Dynamic Data-intensive (D3) Science: Understanding Applications and Infrastructure

A common feature across many science and engineering applications is the amount and diversity of data and computation that must be integrated to yield insights. Data sets are growing larger and becoming distributed; and their location, availability and properties are often time-dependent. Collectively, these characteristics give rise to dynamic distributed data-intensive applications. While "static" data applications have received significant attention, the characteristics, requirements, and software systems for the analysis of large volumes of dynamic, distributed data, and data-intensive applications have received relatively less attention. This paper surveys several representative dynamic distributed data-intensive application scenarios, provides a common conceptual framework to understand them, and examines the infrastructure used in support of applications.

Preprint, 2016, arXiv

Towards a Practical Architecture for the Next Generation Internet of Things

The Internet of Things, or the IoT, is a vision for a ubiquitous society wherein people and "Things" are connected in an immersively networked computing environment, with the connected "Things" providing utility to people/enterprises and their digital shadows, through intelligent social and commercial services. Translating this idea to a conceivable reality has been a work in progress for more than a decade. Current IoT architectures are predicated on optimistic assumptions on the evolution and deployment of IoT technologies. We believe many of these assumptions will not be met, consequently impeding the practical and sustainable deployment of IoT. In this article, we explore use-cases across different application domains that can potentially benefit from an IoT infrastructure, and analyze them in the context of an alternative world-view that is more grounded in reality. Despite this more conservative approach, we argue that adopting certain design paradigms when architecting an IoT ecosystem can achieve much of the promised benefits in a practical and sustainable manner.

Preprint, 2015, arXiv

An Interoperable Realization of Smart Cities with Plug and Play based Device Management

The primal problem with Internet of Things (IoT) solutions for smart cities is the lack of interoperability at various levels, and more predominantly at the device level. While there exists a multitude of platforms from multiple manufacturers, the existing ecosystem still remains highly closed. In this paper, we propose SNaaS or Sensor/Network as a Service: a service layer that enables the creation of the plug-n-play infrastructure, across platforms from multiple vendors, necessary for interoperability and successful deployment of large-scale city wide systems. In order to correctly position the new service layer, we present a high level reference IoT architecture for smart city implementations, and follow it up with the workflow details of SNaaS along with preliminary microbenchmarks.

Preprint, 2014, arXiv

Floe: A Continuous Dataflow Framework for Dynamic Cloud Applications

Applications in cyber-physical systems are increasingly coupled with online instruments to perform long running, continuous data processing. Such "always on" dataflow applications are dynamic: they need to change the application logic and performance at runtime, in response to external operational needs. Floe is a continuous dataflow framework that is designed to be adaptive for dynamic applications on Cloud infrastructure. It offers advanced dataflow patterns like BSP and MapReduce for flexible and holistic composition of streams and files, and supports dynamic recomposition at runtime with minimal impact on the execution. Adaptive resource allocation strategies allow our framework to effectively use elastic Cloud resources to meet varying data rates. We illustrate the design patterns of Floe by running an integration pipeline and a tweet clustering application from the Smart Power Grids domain on a private Eucalyptus Cloud. The responsiveness of our resource adaptation is validated through simulations for periodic, bursty and random workloads.

Preprint, 2014, arXiv

Scalable Analytics over Distributed Time-series Graphs using GoFFish

Graphs are a key form of Big Data, and performing scalable analytics over them is invaluable to many domains. As our ability to collect data grows, there is an emerging class of inter-connected data which accumulates or varies over time, and on which novel analytics - both over the network structure and across the time-variant attribute values - is necessary. We introduce the notion of time-series graph analytics and propose Gopher, a scalable programming abstraction to develop algorithms and analytics on such datasets. Our abstraction leverages a sub-graph centric programming model and extends it to the temporal dimension using an iterative BSP (Bulk Synchronous Parallel) approach. Gopher is co-designed with GoFS, a distributed storage specialized for time-series graphs, as part of the GoFFish distributed analytics platform. We examine storage optimizations for GoFS, design patterns in Gopher to leverage the distributed data layout, and evaluate the GoFFish platform using time-series graph data and applications on a commodity cluster.

Preprint, 2013, arXiv

On Using Complex Event Processing for Dynamic Demand Response Optimization in Microgrid

Demand-side load reduction is a key benefit of Smart Grids. However, existing demand response optimization (DR) programs fail to effectively leverage the near-realtime information available from smart meters and Building Area Networks to respond dynamically to changing energy use profiles. We investigate the use of semantic Complex Event Processing (CEP) patterns to model and detect dynamic situations in a campus microgrid to facilitate adaptive DR. Our focus is on demand-side management rather than supply-side constraints. Continuous data from information sources like smart meters and building sensors are abstracted as event streams. Event patterns for situations that assist with DR are detected from them. Specifically, we offer a taxonomy of event patterns that can guide operators to define situations of interest and we illustrate its efficacy for DR by applying these patterns on realtime events in the USC Campus microgrid using our CEP framework.

Preprint, 2013, arXiv

Sustainable Software Development for Next-Gen Sequencing (NGS) Bioinformatics on Emerging Platforms

DNA sequence analysis is fundamental to life science research. The rapid development of next generation sequencing (NGS) technologies, and the richness and diversity of applications it makes feasible, have created an enormous gulf between the potential of this technology and the development of computational methods to realize this potential. Bridging this gap holds possibilities for broad impacts toward multiple grand challenges and offers unprecedented opportunities for software innovation and research. We argue that NGS-enabled applications need a critical mass of sustainable software to benefit from emerging computing platforms' transformative potential. Accumulating the necessary critical mass will require leaders in computational biology, bioinformatics, computer science, and computer engineering to work together to identify core opportunity areas, critical software infrastructure, and software sustainability challenges. Furthermore, due to the quickly changing nature of both bioinformatics software and accelerator technology, we conclude that creating sustainable accelerated bioinformatics software means constructing a sustainable bridge between the two fields. In particular, sustained collaboration between domain developers and technology experts is needed to develop the accelerated kernels, libraries, frameworks and middleware that could provide the needed flexible link from NGS bioinformatics applications to emerging platforms.
