Researcher profile

Shantenu Jha

Shantenu Jha contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
15works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

15 published item(s)

preprint2022arXiv

A Scalable Solution for Running Ensemble Simulations for Photovoltaic Energy

This chapter proposes and provides an in-depth discussion of a scalable solution for running ensemble simulation for solar energy production. Generating a forecast ensemble is computationally expensive. But with the help of Analog Ensemble, forecast ensembles can be generated with a single deterministic run of a weather forecast model. Weather ensembles are then used to simulate 11 10 KW photovoltaic solar power systems to study the simulation uncertainty under a wide range of panel configuration and weather conditions. This computational workflow has been deployed onto the NCAR supercomputer, Cheyenne, with more than 7,000 cores. Results show that, spring and summer are typically associated with a larger simulation uncertainty. Optimizing the panel configuration based on their individual performance under changing weather conditions can improve the simulation accuracy by more than 12%. This work also shows how panel configuration can be optimized based on geographic locations.

preprint2022arXiv

AI-coupled HPC Workflows

Increasingly, scientific discovery requires sophisticated and scalable workflows. Workflows have become the ``new applications,'' wherein multi-scale computing campaigns comprise multiple and heterogeneous executable tasks. In particular, the introduction of AI/ML models into the traditional HPC workflows has been an enabler of highly accurate modeling, typically reducing computational needs compared to traditional methods. This chapter discusses various modes of integrating AI/ML models to HPC computations, resulting in diverse types of AI-coupled HPC workflows. The increasing need of coupling AI/ML and HPC across scientific domains is motivated, and then exemplified by a number of production-grade use cases for each mode. We additionally discuss the primary challenges of extreme-scale AI-coupled HPC campaigns -- task heterogeneity, adaptivity, performance -- and several framework and middleware solutions which aim to address them. While both HPC workflow and AI/ML computing paradigms are independently effective, we highlight how their integration, and ultimate convergence, is leading to significant improvements in scientific performance across a range of domains, ultimately resulting in scientific explorations otherwise unattainable.

preprint2022arXiv

Coupling streaming AI and HPC ensembles to achieve 100-1000x faster biomolecular simulations

Machine learning (ML)-based steering can improve the performance of ensemble-based simulations by allowing for online selection of more scientifically meaningful computations. We present DeepDriveMD, a framework for ML-driven steering of scientific simulations that we have used to achieve orders-of-magnitude improvements in molecular dynamics (MD) performance via effective coupling of ML and HPC on large parallel computers. We discuss the design of DeepDriveMD and characterize its performance. We demonstrate that DeepDriveMD can achieve between 100-1000x acceleration for protein folding simulations relative to other methods, as measured by the amount of simulated time performed, while covering the same conformational landscape as quantified by the states sampled during a simulation. Experiments are performed on leadership-class platforms on up to 1020 nodes. The results establish DeepDriveMD as a high-performance framework for ML-driven HPC simulation scenarios, that supports diverse MD simulation and ML back-ends, and which enables new scientific insights by improving the length and time scales accessible with current computing capacity.

preprint2022arXiv

Optimal Decision Making in High-Throughput Virtual Screening Pipelines

The need for efficient computational screening of molecular candidates that possess desired properties frequently arises in various scientific and engineering problems, including drug discovery and materials design. However, the large size of the search space containing the candidates and the substantial computational cost of high-fidelity property prediction models makes screening practically challenging. In this work, we propose a general framework for constructing and optimizing a virtual screening (HTVS) pipeline that consists of multi-fidelity models. The central idea is to optimally allocate the computational resources to models with varying costs and accuracy to optimize the return-on-computational-investment (ROCI). Based on both simulated as well as real data, we demonstrate that the proposed optimal HTVS framework can significantly accelerate screening virtually without any degradation in terms of accuracy. Furthermore, it enables an adaptive operational strategy for HTVS, where one can trade accuracy for efficiency.

preprint2022arXiv

Pipeline for Automating Compliance-based Elimination and Extension (PACE2): A Systematic Framework for High-throughput Biomolecular Material Simulation Workflows

The formation of biomolecular materials via dynamical interfacial processes such as self-assembly and fusion, for diverse compositions and external conditions, can be efficiently probed using ensemble Molecular Dynamics. However, this approach requires a large number of simulations when investigating a large composition phase space. In addition, there is difficulty in predicting whether each simulation is yielding biomolecular materials with the desired properties or outcomes and how long each simulation will run for. These difficulties can be overcome by rules-based management systems which include intermittent inspection, variable sampling, premature termination and extension of the individual Molecular Dynamics simulations. The automation of such a management system can significantly reduce the overhead of managing large ensembles of Molecular Dynamics simulations. To this end, a high-throughput workflows-based computational framework, Pipeline for Automating Compliance-based Elimination and Extension (PACE2), for biomolecular materials simulations is proposed. The PACE2 framework encompasses Simulation-Analysis Pipelines. Each Pipeline includes temporally separated simulation and analysis tasks. When a Molecular Dynamics simulation completes, an analysis task is triggered which evaluates the Molecular Dynamics trajectory for compliance. Compliant Molecular Dynamics simulations are extended to the next Molecular Dynamics phase with a suitable sample rate to allow additional, detailed analysis. Non-compliant Molecular Dynamics simulations are eliminated, and their computational resources are either reallocated or released. The framework is designed to run on local desktop computers and high performance computing resources. In the future, the framework will be extended to address generalized workflows and investigate other classes of materials.

preprint2022arXiv

RADICAL-Pilot and Parsl: Executing Heterogeneous Workflows on HPC Platforms

Workflows applications are becoming increasingly important to support scientific discovery. That is leading to a proliferation of workflow management systems and, thus, to a fragmented software ecosystem. Integration among existing workflow tools can improve development efficiency and, ultimately, increase the sustainability of scientific workflow software. We describe our experience with integrating RADICAL-Pilot (RP) and Parsl as a way to enable users to develop and execute workflow applications with heterogeneous tasks on heterogeneous high-performance computing resources. We describe our approach to the integration of the two systems and detail the development of RPEX, a Parsl executor which uses RP as its workload manager. We develop an RP executor that executes heterogeneous MPI Python functions on CPU cores and GPUs. We measure the weak and strong scaling of RPEX, RP, and Parsl when providing new capabilities to two paradigmatic use cases: Colmena and Ice Wedge Polygons

preprint2022arXiv

RAPTOR: Ravenous Throughput Computing

We describe the design, implementation and performance of the RADICAL-Pilot task overlay (RAPTOR). RAPTOR enables the execution of heterogeneous tasks -- i.e., functions and executables with arbitrary duration -- on HPC platforms, providing high throughput and high resource utilization. RAPTOR supports the high throughput virtual screening requirements of DOE's National Virtual Biotechnology Laboratory effort to find therapeutic solutions for COVID-19. RAPTOR has been used on $>8000$ compute nodes to sustain 144M/hour docking hits, and to screen $\sim$10$^{11}$ ligands. To the best of our knowledge, both the throughput rate and aggregated number of executed tasks are a factor of two greater than previously reported in literature. RAPTOR represents important progress towards improvement of computational drug discovery, in terms of size of libraries screened, and for the possibility of generating training data fast enough to serve the last generation of docking surrogate models.

preprint2022arXiv

The Ghost of Performance Reproducibility Past

The importance of ensemble computing is well established. However, executing ensembles at scale introduces interesting performance fluctuations that have not been well investigated. In this paper, we trace our experience uncovering performance fluctuations of ensemble applications (primarily constituting a workflow of GROMACS tasks), and unsuccessful attempts, so far, at trying to discern the underlying cause(s) of performance fluctuations. Is the failure to discern the causative or contributing factors a failure of capability? Or imagination? Do the fluctuations have their genesis in some inscrutable aspect of the system or software? Does it warrant a fundamental reassessment and rethinking of how we assume and conceptualize performance reproducibility? Answers to these questions are not straightforward, nor are they immediate or obvious. We conclude with a discussion about the performance of ensemble applications and ruminate over the implications for how we define and measure application performance.

preprint2022arXiv

Workflows Community Summit: Tightening the Integration between Computing Facilities and Scientific Workflows

The importance of workflows is highlighted by the fact that they have underpinned some of the most significant discoveries of the past decades. Many of these workflows have significant computational, storage, and communication demands, and thus must execute on a range of large-scale computer systems, from local clusters to public clouds and upcoming exascale HPC platforms. Historically, infrastructures for workflow execution consisted of complex, integrated systems, developed in-house by workflow practitioners with strong dependencies on a range of legacy technologies. Due to the increasing need to support workflows, dedicated workflow systems were developed to provide abstractions for creating, executing, and adapting workflows conveniently and efficiently while ensuring portability. While these efforts are all worthwhile individually, there are now hundreds of independent workflow systems. The resulting workflow system technology landscape is fragmented, which may present significant barriers for future workflow users due to many seemingly comparable, yet usually mutually incompatible, systems that exist. In order to tackle some of these challenges, the DOE-funded ExaWorks and NSF-funded WorkflowsRI projects have organized in 2021 a series of events entitled the "Workflows Community Summit". The third edition of the ``Workflows Community Summit" explored workflows challenges and opportunities from the perspective of computing centers and facilities. The third summit brought together a small group of facilities representatives with the aim to understand how workflows are currently being used at each facility, how facilities would like to interact with workflow developers and users, how workflows fit with facility roadmaps, and what opportunities there are for tighter integration between facilities and workflows. More information at: https://workflowsri.org/summits/facilities/

preprint2021arXiv

A Community Roadmap for Scientific Workflows Research and Development

The landscape of workflow systems for scientific applications is notoriously convoluted with hundreds of seemingly equivalent workflow systems, many isolated research claims, and a steep learning curve. To address some of these challenges and lay the groundwork for transforming workflows research and development, the WorkflowsRI and ExaWorks projects partnered to bring the international workflows community together. This paper reports on discussions and findings from two virtual "Workflows Community Summits" (January and April, 2021). The overarching goals of these workshops were to develop a view of the state of the art, identify crucial research challenges in the workflows community, articulate a vision for potential community efforts, and discuss technical approaches for realizing this vision. To this end, participants identified six broad themes: FAIR computational workflows; AI workflows; exascale challenges; APIs, interoperability, reuse, and standards; training and education; and building a workflows community. We summarize discussions and recommendations for each of these themes.

preprint2020arXiv

Methods and Experiences for Developing Abstractions for Data-intensive, Scientific Applications

Developing software for scientific applications that require the integration of diverse types of computing, instruments, and data present challenges that are distinct from commercial software. These applications require scale, and the need to integrate various programming and computational models with evolving and heterogeneous infrastructure. Pervasive and effective abstractions for distributed infrastructures are thus critical; however, the process of developing abstractions for scientific applications and infrastructures is not well understood. While theory-based approaches for system development are suited for well-defined, closed environments, they have severe limitations for designing abstractions for scientific systems and applications. The design science research (DSR) method provides the basis for designing practical systems that can handle real-world complexities at all levels. In contrast to theory-centric approaches, DSR emphasizes both practical relevance and knowledge creation by building and rigorously evaluating all artifacts. We show how DSR provides a well-defined framework for developing abstractions and middleware systems for distributed systems. Specifically, we address the critical problem of distributed resource management on heterogeneous infrastructure over a dynamic range of scales, a challenge that currently limits many scientific applications. We use the pilot-abstraction, a widely used resource management abstraction for high-performance, high throughput, big data, and streaming applications, as a case study for evaluating the DSR activities. For this purpose, we analyze the research process and artifacts produced during the design and evaluation of the pilot-abstraction. We find DSR provides a concise framework for iteratively designing and evaluating systems. Finally, we capture our experiences and formulate different lessons learned.

preprint2020arXiv

Parallel Performance of Molecular Dynamics Trajectory Analysis

The performance of biomolecular molecular dynamics simulations has steadily increased on modern high performance computing resources but acceleration of the analysis of the output trajectories has lagged behind so that analyzing simulations is becoming a bottleneck. To close this gap, we studied the performance of parallel trajectory analysis with MPI and the Python MDAnalysis library on three different XSEDE supercomputers where trajectories were read from a Lustre parallel file system. Strong scaling performance was impeded by stragglers, MPI processes that were slower than the typical process. Stragglers were less prevalent for compute-bound workloads, thus pointing to file reading as a bottleneck for scaling. However, a more complicated picture emerged in which both the computation and the data ingestion exhibited close to ideal strong scaling behavior whereas stragglers were primarily caused by either large MPI communication costs or long times to open the single shared trajectory file. We improved overall strong scaling performance by either subfiling (splitting the trajectory into separate files) or MPI-IO with Parallel HDF5 trajectory files. The parallel HDF5 approach resulted in near ideal strong scaling on up to 384 cores (16 nodes), thus reducing trajectory analysis times by two orders of magnitude compared to the serial approach.

preprint2020arXiv

Targeting SARS-CoV-2 with AI- and HPC-enabled Lead Generation: A First Data Release

Researchers across the globe are seeking to rapidly repurpose existing drugs or discover new drugs to counter the the novel coronavirus disease (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). One promising approach is to train machine learning (ML) and artificial intelligence (AI) tools to screen large numbers of small molecules. As a contribution to that effort, we are aggregating numerous small molecules from a variety of sources, using high-performance computing (HPC) to computer diverse properties of those molecules, using the computed properties to train ML/AI models, and then using the resulting models for screening. In this first data release, we make available 23 datasets collected from community sources representing over 4.2 B molecules enriched with pre-computed: 1) molecular fingerprints to aid similarity searches, 2) 2D images of molecules to enable exploration and application of image-based deep learning methods, and 3) 2D and 3D molecular descriptors to speed development of machine learning models. This data release encompasses structural information on the 4.2 B molecules and 60 TB of pre-computed data. Future releases will expand the data to include more detailed molecular simulations, computed models, and other products.

preprint2020arXiv

Workflow Design Analysis for High Resolution Satellite Image Analysis

Ecological sciences are using imagery from a variety of sources to monitor and survey populations and ecosystems. Very High Resolution (VHR) satellite imagery provide an effective dataset for large scale surveys. Convolutional Neural Networks have successfully been employed to analyze such imagery and detect large animals. As the datasets increase in volume, O(TB), and number of images, O(1k), utilizing High Performance Computing (HPC) resources becomes necessary. In this paper, we investigate a task-parallel data-driven workflows design to support imagery analysis pipelines with heterogeneous tasks on HPC. We analyze the capabilities of each design when processing a dataset of 3,000 VHR satellite images for a total of 4~TB. We experimentally model the execution time of the tasks of the image processing pipeline. We perform experiments to characterize the resource utilization, total time to completion, and overheads of each design. Based on the model, overhead and utilization analysis, we show which design approach to is best suited in scientific pipelines with similar characteristics.

preprint2019arXiv

Adaptive Ensemble Biomolecular Simulations at Scale

Recent advances in both theory and methods have created opportunities to simulate biomolecular processes more efficiently using adaptive ensemble simulations. Ensemble-based simulations are used widely to compute a number of individual simulation trajectories and analyze statistics across them. Adaptive ensemble simulations offer a further level of sophistication and flexibility by enabling high-level algorithms to control simulations based on intermediate results. Novel high-level algorithms require sophisticated approaches to utilize the intermediate data during runtime. Thus, there is a need for scalable software systems to support adaptive ensemble-based applications. We describe the operations in executing adaptive workflows, classify different types of adaptations, and describe challenges in implementing them in software tools. We enhance Ensemble Toolkit (EnTK) -- an ensemble execution system -- to support the scalable execution of adaptive workflows on HPC systems, and characterize the adaptation overhead in EnTK. We implement two high-level adaptive ensemble algorithms -- expanded ensemble and Markov state modeling, and execute upto $2^{12}$ ensemble members, on thousands of cores on three distinct HPC platforms. We highlight scientific advantages enabled by the novel capabilities of our approach. To the best of our knowledge, this is the first attempt at describing and implementing multiple adaptive ensemble workflows using a common conceptual and implementation framework.