Source author record

Rupesh Nasre

Rupesh Nasre appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Distributed, Parallel, and Cluster Computing Software Engineering

Catalog footprint

What is connected

3works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2024arXiv

Code Generation for a Variety of Accelerators for a Graph DSL

Sparse graphs are ubiquitous in real and virtual worlds. With the phenomenal growth in semi-structured and unstructured data, sizes of the underlying graphs have witnessed a rapid growth over the years. Analyzing such large structures necessitates parallel processing, which is challenged by the intrinsic irregularity of sparse computation, memory access, and communication. It would be ideal if programmers and domain-experts get to focus only on the sequential computation and a compiler takes care of auto-generating the parallel code. On the other side, there is a variety in the number of target hardware devices, and achieving optimal performance often demands coding in specific languages or frameworks. Our goal in this work is to focus on a graph DSL which allows the domain-experts to write almost-sequential code, and generate parallel code for different accelerators from the same algorithmic specification. In particular, we illustrate code generation from the StarPlat graph DSL for NVIDIA, AMD, and Intel GPUs using CUDA, OpenCL, SYCL, and OpenACC programming languages. Using a suite of ten large graphs and four popular algorithms, we present the efficacy of StarPlat's versatile code generator.

preprint2020arXiv

A Fine-Grained Hybrid CPU-GPU Algorithm for Betweenness Centrality Computations

Betweenness centrality (BC) is an important graph analytical application for large-scale graphs. While there are many efforts for parallelizing betweenness centrality algorithms on multi-core CPUs and many-core GPUs, in this work, we propose a novel fine-grained CPU-GPU hybrid algorithm that partitions a graph into CPU and GPU partitions, and performs BC computations for the graph on both the CPU and GPU resources simultaneously with very small number of CPU-GPU communications. The forward phase in our hybrid BC algorithm leverages the multi-source property inherent in the BC problem. We also perform a novel hybrid and asynchronous backward phase that performs minimal CPU-GPU synchronizations. Evaluations using a large number of graphs with different characteristics show that our hybrid approach gives 80% improvement in performance, and 80-90% less CPU-GPU communications than an existing hybrid algorithm based on the popular Bulk Synchronous Paradigm (BSP) approach.

preprint2020arXiv

BOLD: An Ontology-based Log Debugger for C Programs

The different activities related to debugging such as program instrumentation, representation of execution trace and analysis of trace are not typically performed in an unified framework. We propose \textit{BOLD}, an Ontology-based Log Debugger to unify and standardize the activities in debugging. The syntactical information of programs can be represented in the from of Resource Description Framework (RDF) triples. Using the BOLD framework, the programs can be automatically instrumented by using declarative specifications over these triples. A salient feature of the framework is to store the execution trace of the program also as RDF triples called \textit{trace triples}. These triples can be queried to implement the common debug operations. The novelty of the framework is to abstract these triples as \textit{spans} for high-level reasoning. A span gives a way of examining the values of a particular variable over certain portion of the program execution. The properties of the spans are defined formally as a Web Ontology Language (OWL) ontology called \textit{Program Debug (PD) Ontology}. Using the span abstraction and PD ontology, end-users can debug a given buggy program in a standard manner. A notable feature of using ontology is that users can accurately debug in some cases of missing information, which can be practically useful. To demonstrate the feasibility of the proposed framework, we have debugged the programs in a standard bug benchmark suite Software-artifact Infrastructure Repository (SIR). Experiments show that the querying time is almost the same as in \texttt{gdb}. The reasoning time depends on the sub-language of OWL. We find that the expressibility offered by OWL-DL language is sufficient for the bugs in SIR programs; but to achieve scalability in reasoning, a restricted OWL-RL language is required.

Rupesh Nasre

What is connected

Connect this record

See the researcher in context

Building this map preview

3 published item(s)

Code Generation for a Variety of Accelerators for a Graph DSL

A Fine-Grained Hybrid CPU-GPU Algorithm for Betweenness Centrality Computations

BOLD: An Ontology-based Log Debugger for C Programs