Source author record

Rajkumar Buyya

Rajkumar Buyya appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Distributed, Parallel, and Cluster Computing Cryptography and Security eess.SP Artificial Intelligence cs.CY Machine Learning Multiagent Systems Networking and Internet Architecture Performance Software Engineering

Catalog footprint

What is connected

52works

10topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Agentic Artificial Intelligence (AI): Architectures, Taxonomies, and Evaluation of Large Language Model Agents

Artificial Intelligence is moving from models that only generate text to Agentic AI, where systems behave as autonomous entities that can perceive, reason, plan, and act. Large Language Models (LLMs) are no longer used only as passive knowledge engines but as cognitive controllers that combine memory, tool use, and feedback from their environment to pursue extended goals. This shift already supports the automation of complex workflows in software engineering, scientific discovery, and web navigation, yet the variety of emerging designs, from simple single loop agents to hierarchical multi agent systems, makes the landscape hard to navigate. In this paper, we investigate architectures and propose a unified taxonomy that breaks agents into Perception, Brain, Planning, Action, Tool Use, and Collaboration. We use this lens to describe the move from linear reasoning procedures to native inference time reasoning models, and the transition from fixed API calls to open standards like the Model Context Protocol (MCP) and Native Computer Use. We also group the environments in which these agents operate, including digital operating systems, embodied robotics, and other specialized domains, and we review current evaluation practices. Finally, we highlight open challenges, such as hallucination in action, infinite loops, and prompt injection, and outline future research directions toward more robust and reliable autonomous systems.

preprint2026arXiv

Performance and Security Aware Distributed Service Placement in Fog Computing

The rapid proliferation of IoT applications has intensified the demand for efficient and secure service placement in Fog computing. However, heterogeneous resources, dynamic workloads, and diverse security requirements make optimal service placement highly challenging. Most solutions focus primarily on performance metrics while overlooking the security implications of deployment decisions. This paper proposes a Security and Performance-Aware Distributed Deep Reinforcement Learning (SPA-DDRL) framework for joint optimization of service response time and security compliance in Fog computing. The problem is formulated as a weighted multi-objective optimization task, minimizing latency while maximizing a security score derived from the security capabilities of Fog nodes. The security score features a new three-tier hierarchy, where configuration-level checks verify proper settings, capability-level assessments evaluate the resource security features, and control-level evaluations enforce stringent policies, thereby ensuring compliant solutions that align with performance objectives. SPA-DDRL adopts a distributed broker-learner architecture where multiple brokers perform autonomous service-placement decisions and a centralized learner coordinates global policy optimization through shared prioritized experiences. It integrates three key improvements, including Long Short-Term Memory networks, Prioritized Experience Replay, and off-policy correction mechanisms to improve the agent's performance. Experiments based on real IoT workloads show that SPA-DDRL significantly improves both service response time and placement security compared to current approaches, achieving a 16.3% improvement in response time and a 33% faster convergence rate. It also maintains consistent, feasible, security-compliant solutions across all system scales, while baseline techniques fail or show performance degradation.

preprint2026arXiv

ReinFog: A Deep Reinforcement Learning Empowered Framework for Resource Management in Edge and Cloud Computing Environments

The growing IoT landscape requires effective server deployment strategies to meet demands including real-time processing and energy efficiency. This is complicated by heterogeneous, dynamic applications and servers. To address these challenges, we propose ReinFog, a modular distributed software empowered with Deep Reinforcement Learning (DRL) for adaptive resource management across edge/fog and cloud environments. ReinFog enables the practical development/deployment of various centralized and distributed DRL techniques for resource management in edge/fog and cloud computing environments. It also supports integrating native and library-based DRL techniques for diverse IoT application scheduling objectives. Additionally, ReinFog allows for customizing deployment configurations for different DRL techniques, including the number and placement of DRL Learners and DRL Workers in large-scale distributed systems. Besides, we propose a novel Memetic Algorithm for DRL Component (e.g., DRL Learners and DRL Workers) Placement in ReinFog named MADCP, which combines the strengths of Genetic Algorithm, Firefly Algorithm, and Particle Swarm Optimization. Experiments reveal that the DRL mechanisms developed within ReinFog have significantly enhanced both centralized and distributed DRL techniques implementation. These advancements have resulted in notable improvements in IoT application performance, reducing response time by 45%, energy consumption by 39%, and weighted cost by 37%, while maintaining minimal scheduling overhead. Additionally, ReinFog exhibits remarkable scalability, with a rise in DRL Workers from 1 to 30 causing only a 0.3-second increase in startup time and around 2 MB more RAM per Worker. The proposed MADCP for DRL component placement further accelerates the convergence rate of DRL techniques by up to 38%.

preprint2022arXiv

A Strategy for Advancing Research and Impact in New Computing Paradigms

In the world of Information Technology, new computing paradigms, driven by requirements of different classes of problems and applications, emerge rapidly. These new computing paradigms pose many new research challenges. Researchers from different disciplines are working together to develop innovative solutions addressing them. In newer research areas with many unknowns, creating roadmaps, enabling tools, inspiring technological and application demonstrators offer confidence and prove feasibility and effectiveness of new paradigm. Drawing on our experience, we share strategy for advancing the field and community building in new and emerging computing research areas. We discuss how the development simulators can be cost-effective in accelerating design of real systems. We highlight strategic role played by different types of publications, conferences, and educational programs. We illustrate effectiveness of elements of our strategy with a case study on progression of cloud computing paradigm.

preprint2022arXiv

Container Orchestration in Edge and Fog Computing Environments for Real-Time IoT Applications

Resource management is the principal factor to fully utilize the potential of Edge/Fog computing to execute real-time and critical IoT applications. Although some resource management frameworks exist, the majority are not designed based on distributed containerized components. Hence, they are not suitable for highly distributed and heterogeneous computing environments. Containerized resource management frameworks such as FogBus2 enable efficient distribution of framework's components alongside IoT applications' components. However, the management, deployment, health-check, and scalability of a large number of containers are challenging issues. To orchestrate a multitude of containers, several orchestration tools are developed. But, many of these orchestration tools are heavy-weight and have a high overhead, especially for resource-limited Edge/Fog nodes. Thus, for hybrid computing environments, consisting of heterogeneous Edge/Fog and/or Cloud nodes, lightweight container orchestration tools are required to support both resource-limited resources at the Edge/Fog and resource-rich resources at the Cloud. Thus, in this paper, we propose a feasible approach to build a hybrid and lightweight cluster based on K3s, for the FogBus2 framework that offers containerized resource management framework. This work addresses the challenge of creating lightweight computing clusters in hybrid computing environments. It also proposes three design patterns for the deployment of the FogBus2 framework in hybrid environments, including 1) Host Network, 2) Proxy Server, and 3) Environment Variable. The performance evaluation shows that the proposed approach improves the response time of real-time IoT applications up to 29% with acceptable and low overhead.

preprint2022arXiv

Scheduling IoT Applications in Edge and Fog Computing Environments: A Taxonomy and Future Directions

Fog computing, as a distributed paradigm, offers cloud-like services at the edge of the network with low latency and high-access bandwidth to support a diverse range of IoT application scenarios. To fully utilize the potential of this computing paradigm, scalable, adaptive, and accurate scheduling mechanisms and algorithms are required to efficiently capture the dynamics and requirements of users, IoT applications, environmental properties, and optimization targets. This paper presents a taxonomy of recent literature on scheduling IoT applications in Fog computing. Based on our new classification schemes, current works in the literature are analyzed, research gaps of each category are identified, and respective future directions are described.

preprint2022arXiv

Securing the Future Internet of Things with Post-Quantum Cryptography

Traditional and lightweight cryptography primitives and protocols are insecure against quantum attacks. Thus, a real-time application using traditional or lightweight cryptography primitives and protocols does not ensure full-proof security. Post-quantum Cryptography is important for the Internet of Things (IoT) due to its security against Quantum attacks. This paper offers a broad literature analysis of post-quantum cryptography for IoT networks, including the challenges and research directions to adopt in real-time applications. The work draws focus towards post-quantum cryptosystems that are useful for resource-constraint devices. Further, those quantum attacks are surveyed, which may occur over traditional and lightweight cryptographic primitives.

preprint2020arXiv

A Data-Driven Frequency Scaling Approach for Deadline-aware Energy Efficient Scheduling on Graphics Processing Units (GPUs)

Modern computing paradigms, such as cloud computing, are increasingly adopting GPUs to boost their computing capabilities primarily due to the heterogeneous nature of AI/ML/deep learning workloads. However, the energy consumption of GPUs is a critical problem. Dynamic Voltage Frequency Scaling (DVFS) is a widely used technique to reduce the dynamic power of GPUs. Yet, configuring the optimal clock frequency for essential performance requirements is a non-trivial task due to the complex nonlinear relationship between the application's runtime performance characteristics, energy, and execution time. It becomes more challenging when different applications behave distinctively with similar clock settings. Simple analytical solutions and standard GPU frequency scaling heuristics fail to capture these intricacies and scale the frequencies appropriately. In this regard, we propose a data-driven frequency scaling technique by predicting the power and execution time of a given application over different clock settings. We collect the data from application profiling and train the models to predict the outcome accurately. The proposed solution is generic and can be easily extended to different kinds of workloads and GPU architectures. Furthermore, using this frequency scaling by prediction models, we present a deadline-aware application scheduling algorithm to reduce energy consumption while simultaneously meeting their deadlines. We conduct real extensive experiments on NVIDIA GPUs using several benchmark applications. The experiment results have shown that our prediction models have high accuracy with the average RMSE values of 0.38 and 0.05 for energy and time prediction, respectively. Also, the scheduling algorithm consumes 15.07% less energy as compared to the baseline policies.

preprint2020arXiv

A Drone-based Networked System and Methods for Combating Coronavirus Disease (COVID-19) Pandemic

Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. It is similar to influenza viruses and raises concerns through alarming levels of spread and severity resulting in an ongoing pandemic worldwide. Within eight months (by August 2020), it infected 24.0 million persons worldwide and over 824 thousand have died. Drones or Unmanned Aerial Vehicles (UAVs) are very helpful in handling the COVID-19 pandemic. This work investigates the drone-based systems, COVID-19 pandemic situations, and proposes an architecture for handling pandemic situations in different scenarios using real-time and simulation-based scenarios. The proposed architecture uses wearable sensors to record the observations in Body Area Networks (BANs) in a push-pull data fetching mechanism. The proposed architecture is found to be useful in remote and highly congested pandemic areas where either the wireless or Internet connectivity is a major issue or chances of COVID-19 spreading are high. It collects and stores the substantial amount of data in a stipulated period and helps to take appropriate action as and when required. In real-time drone-based healthcare system implementation for COVID-19 operations, it is observed that a large area can be covered for sanitization, thermal image collection, and patient identification within a short period (2 KMs within 10 minutes approx.) through aerial route. In the simulation, the same statistics are observed with an addition of collision-resistant strategies working successfully for indoor and outdoor healthcare operations. Further, open challenges are identified and promising research directions are highlighted.

preprint2020arXiv

A Privacy-preserving Mobile and Fog Computing Framework to Trace and Prevent COVID-19 Community Transmission

To slow down the spread of COVID-19, governments around the world are trying to identify infected people and to contain the virus by enforcing isolation and quarantine. However, it is difficult to trace people who came into contact with an infected person, which causes widespread community transmission and mass infection. To address this problem, we develop an e-government Privacy Preserving Mobile and Fog computing framework entitled PPMF that can trace infected and suspected cases nationwide. We use personal mobile devices with contact tracing app and two types of stationary fog nodes, named Automatic Risk Checkers (ARC) and Suspected User Data Uploader Node (SUDUN), to trace community transmission alongside maintaining user data privacy. Each user's mobile device receives a Unique Encrypted Reference Code (UERC) when registering on the central application. The mobile device and the central application both generate Rotational Unique Encrypted Reference Code (RUERC), which broadcasted using the Bluetooth Low Energy (BLE) technology. The ARCs are placed at the entry points of buildings, which can immediately detect if there are positive or suspected cases nearby. If any confirmed case is found, the ARCs broadcast pre-cautionary messages to nearby people without revealing the identity of the infected person. The SUDUNs are placed at the health centers that report test results to the central cloud application. The reported data is later used to map between infected and suspected cases. Therefore, using our proposed PPMF framework, governments can let organizations continue their economic activities without complete lockdown.

preprint2020arXiv

A Self-adaptive Approach for Managing Applications and Harnessing Renewable Energy for Sustainable Cloud Computing

Rapid adoption of Cloud computing for hosting services and its success is primarily attributed to its attractive features such as elasticity, availability and pay-as-you-go pricing model. However, the huge amount of energy consumed by cloud data centers makes it to be one of the fastest growing sources of carbon emissions. Approaches for improving the energy efficiency include enhancing the resource utilization to reduce resource wastage and applying the renewable energy as the energy supply. This work aims to reduce the carbon footprint of the data centers by reducing the usage of brown energy and maximizing the usage of renewable energy. Taking advantage of microservices and renewable energy, we propose a self-adaptive approach for the resource management of interactive workloads and batch workloads. To ensure the quality of service of workloads, a brownout-based algorithm for interactive workloads and a deferring algorithm for batch workloads are proposed. We have implemented the proposed approach in a prototype system and evaluated it with web services under real traces. The results illustrate our approach can reduce the brown energy usage by 21% and improve the renewable energy usage by 10%.

preprint2020arXiv

Application Management in Fog Computing Environments: A Taxonomy, Review and Future Directions

The Internet of Things (IoT) paradigm is being rapidly adopted for the creation of smart environments in various domains. The IoT-enabled Cyber-Physical Systems (CPSs) associated with smart city, healthcare, Industry 4.0 and Agtech handle a huge volume of data and require data processing services from different types of applications in real-time. The Cloud-centric execution of IoT applications barely meets such requirements as the Cloud datacentres reside at a multi-hop distance from the IoT devices. \textit{Fog computing}, an extension of Cloud at the edge network, can execute these applications closer to data sources. Thus, Fog computing can improve application service delivery time and resist network congestion. However, the Fog nodes are highly distributed, heterogeneous and most of them are constrained in resources and spatial sharing. Therefore, efficient management of applications is necessary to fully exploit the capabilities of Fog nodes. In this work, we investigate the existing application management strategies in Fog computing and review them in terms of architecture, placement and maintenance. Additionally, we propose a comprehensive taxonomy and highlight the research gaps in Fog-based application management. We also discuss a perspective model and provide future research directions for further improvement of application management in Fog computing.

preprint2020arXiv

DATESSO: Self-Adapting Service Composition with Debt-Aware Two Levels Constraint Reasoning

The rapidly changing workload of service-based systems can easily cause under-/over-utilization on the component services, which can consequently affect the overall Quality of Service (QoS), such as latency. Self-adaptive services composition rectifies this problem, but poses several challenges: (i) the effectiveness of adaptation can deteriorate due to over-optimistic assumptions on the latency and utilization constraints, at both local and global levels; and (ii) the benefits brought by each composition plan is often short term and is not often designed for long-term benefits -- a natural prerequisite for sustaining the system. To tackle these issues, we propose a two levels constraint reasoning framework for sustainable self-adaptive services composition, called DATESSO. In particular, DATESSO consists of a re ned formulation that differentiates the "strictness" for latency/utilization constraints in two levels. To strive for long-term benefits, DATESSO leverages the concept of technical debt and time-series prediction to model the utility contribution of the component services in the composition. The approach embeds a debt-aware two level constraint reasoning algorithm in DATESSO to improve the efficiency, effectiveness and sustainability of self-adaptive service composition. We evaluate DATESSO on a service-based system with real-world WS-DREAM dataset and comparing it with other state-of-the-art approaches. The results demonstrate the superiority of DATESSO over the others on the utilization, latency and running time whilst likely to be more sustainable.

preprint2020arXiv

Dynamic Scheduling for Stochastic Edge-Cloud Computing Environments using A3C learning and Residual Recurrent Neural Networks

The ubiquitous adoption of Internet-of-Things (IoT) based applications has resulted in the emergence of the Fog computing paradigm, which allows seamlessly harnessing both mobile-edge and cloud resources. Efficient scheduling of application tasks in such environments is challenging due to constrained resource capabilities, mobility factors in IoT, resource heterogeneity, network hierarchy, and stochastic behaviors. xisting heuristics and Reinforcement Learning based approaches lack generalizability and quick adaptability, thus failing to tackle this problem optimally. They are also unable to utilize the temporal workload patterns and are suitable only for centralized setups. However, Asynchronous-Advantage-Actor-Critic (A3C) learning is known to quickly adapt to dynamic scenarios with less data and Residual Recurrent Neural Network (R2N2) to quickly update model parameters. Thus, we propose an A3C based real-time scheduler for stochastic Edge-Cloud environments allowing decentralized learning, concurrently across multiple agents. We use the R2N2 architecture to capture a large number of host and task parameters together with temporal patterns to provide efficient scheduling decisions. The proposed model is adaptive and able to tune different hyper-parameters based on the application requirements. We explicate our choice of hyper-parameters through sensitivity analysis. The experiments conducted on real-world data set show a significant improvement in terms of energy consumption, response time, Service-Level-Agreement and running cost by 14.4%, 7.74%, 31.9%, and 4.64%, respectively when compared to the state-of-the-art algorithms.

preprint2020arXiv

Energy Efficient Algorithms based on VM Consolidation for Cloud Computing: Comparisons and Evaluations

Cloud Computing paradigm has revolutionized IT industry and be able to offer computing as the fifth utility. With the pay-as-you-go model, cloud computing enables to offer the resources dynamically for customers anytime. Drawing the attention from both academia and industry, cloud computing is viewed as one of the backbones of the modern economy. However, the high energy consumption of cloud data centers contributes to high operational costs and carbon emission to the environment. Therefore, Green cloud computing is required to ensure energy efficiency and sustainability, which can be achieved via energy efficient techniques. One of the dominant approaches is to apply energy efficient algorithms to optimize resource usage and energy consumption. Currently, various virtual machine consolidation-based energy efficient algorithms have been proposed to reduce the energy of cloud computing environment. However, most of them are not compared comprehensively under the same scenario, and their performance is not evaluated with the same experimental settings. This makes users hard to select the appropriate algorithm for their objectives. To provide insights for existing energy efficient algorithms and help researchers to choose the most suitable algorithm, in this paper, we compare several state-of-the-art energy efficient algorithms in depth from multiple perspectives, including architecture, modelling and metrics. In addition, we also implement and evaluate these algorithms with the same experimental settings in CloudSim toolkit. The experimental results show the performance comparison of these algorithms with comprehensive results. Finally, detailed discussions of these algorithms are provided.

preprint2020arXiv

Fog Computing for Smart Grids: Challenges and Solutions

Smart grids (SGs) enable integration of diverse power sources including renewable energy resources. They can contribute to the reduction of harmful gas emission, and support two-way information flow to enhance energy efficiency, along with small-scale Microgrids, acting as a promising solution to cope with environmental problems. However, with the emerging of mission-critical and delay-sensitive applications, traditional Cloud-based data processing mode becomes less satisfying. The use of Fog computing to empower the edge-side processing capability of Smart grid systems is considered as a potential solution to address the problem. In this chapter, we aim to offer a comprehensive analysis of application of Fog computing in Smart grids. We begin with an overview of Smart grids and Fog computing. Then, by surveying the existing research, we summarize the main Fog computing enabled Smart grid applications, key problems and the possible methods. We conclude the chapter by discussing the research challenges and future directions.

preprint2020arXiv

Green-aware Mobile Edge Computing for IoT: Challenges, Solutions and Future Directions

The development of Internet of Things (IoT) technology enables the rapid growth of connected smart devices and mobile applications. However, due to the constrained resources and limited battery capacity, there are bottlenecks when utilizing the smart devices. Mobile edge computing (MEC) offers an attractive paradigm to handle this challenge. In this work, we concentrate on the MEC application for IoT and deal with the energy saving objective via offloading workloads between cloud and edge. In this regard, we firstly identify the energy-related challenges in MEC. Then we present a green-aware framework for MEC to address the energy-related challenges, and provide a generic model formulation for the green MEC. We also discuss some state-of-the-art workloads offloading approaches to achieve green IoT and compare them in comprehensive perspectives. Finally, some future research directions related to energy efficiency in MEC are given.

preprint2020arXiv

High-Performance Mining of COVID-19 Open Research Datasets for Text Classification and Insights in Cloud Computing Environments

COVID-19 global pandemic is an unprecedented health crisis. Since the outbreak, many researchers around the world have produced an extensive collection of literatures. For the research community and the general public to digest, it is crucial to analyse the text and provide insights in a timely manner, which requires a considerable amount of computational power. Clouding computing has been widely adopted in academia and industry in recent years. In particular, hybrid cloud is gaining popularity since its two-fold benefits: utilising existing resource to save cost and using additional cloud service providers to gain assess to extra computing resources on demand. In this paper, we developed a system utilising the Aneka PaaS middleware with parallel processing and multi-cloud capability to accelerate the ETL and article categorising process using machine learning technology on a hybrid cloud. The result is then persisted for further referencing, searching and visualising. Our performance evaluation shows that the system can help with reducing processing time and achieving linear scalability. Beyond COVID-19, the application might be used directly in broader scholarly article indexing and analysing.

preprint2020arXiv

Resource-sharing Policy in Multi-tenant Scientific Workflow-as-a-Service Cloud Platform

Increased adoption of scientific workflows in the community has urged for the development of multi-tenant platforms that provide these workflow executions as a service. As a result, Workflow-as-a-Service (WaaS) concept has been created by researchers to address the future design of Workflow Management Systems (WMS) that can serve a large number of users from a single point of service. These platforms differ from traditional WMS in that they handle a workload of workflows at runtime. A traditional WMS is usually designed to execute a single workflow in a dedicated process while WaaS cloud platforms enhance the process by exploiting multiple workflows execution in a multi-tenant environment model. In this paper, we explore a novel resource-sharing policy to improve system utilization and to fulfil various Quality of Service (QoS) requirements from multiple users in WaaS cloud platforms. We propose an Elastic Budget-constrained resource Provisioning and Scheduling algorithm for Multiple workflows that can reduce the computational overhead by encouraging resource sharing to minimize workflows' makespan while meeting a user-defined budget. Our experiments show that the EBPSM algorithm can utilize the resource-sharing policy to achieve higher performance in terms of minimizing the makespan compared to the state-of-the-art budget-constraint scheduling algorithm.

preprint2020arXiv

ThermoSim: Deep Learning based Framework for Modeling and Simulation of Thermal-aware Resource Management for Cloud Computing Environments

Current cloud computing frameworks host millions of physical servers that utilize cloud computing resources in the form of different virtual machines (VM). Cloud Data Center (CDC) infrastructures require significant amounts of energy to deliver large scale computational services. Computing nodes generate large volumes of heat, requiring cooling units in turn to eliminate the effect of this heat. Thus, the overall energy consumption of the CDC increases tremendously for servers as well as for cooling units. However, current workload allocation policies do not take into account the effect on temperature and it is challenging to simulate the thermal behavior of CDCs. There is a need for a thermal-aware framework to simulate and model the behavior of nodes and measure the important performance parameters which can be affected by its temperature. In this paper, we propose a lightweight framework, ThermoSim, for modeling and simulation of thermal-aware resource management for cloud computing environments. This work presents a Recurrent Neural Network based deep learning temperature predictor for CDCs which is utilized by ThermoSim for lightweight resource management in constrained cloud environments. ThermoSim extends the CloudSim toolkit helping to analyze the performance of various key parameters such as energy consumption, SLA violation rate, number of VM migrations and temperature during the management of cloud resources for execution of workloads. Further, different energy-aware and thermal-aware resource management techniques are tested using the proposed ThermoSim framework in order to validate it against the existing framework. The experimental results demonstrate the proposed framework is capable of modeling and simulating the thermal behavior of a CDC and the ThermoSim framework is better than Thas in terms of energy consumption, cost, time, memory usage & prediction accuracy.

preprint2020arXiv

WattsApp: Power-Aware Container Scheduling

Containers are becoming a popular workload deployment mechanism in modern distributed systems. However, there are limited software-based methods (hardware-based methods are expensive requiring hardware level changes) for obtaining the power consumed by containers for facilitating power-aware container scheduling, an essential activity for efficient management of distributed systems. This paper presents WattsApp, a tool underpinned by a six step software-based method for power-aware container scheduling to minimize power cap violations on a server. The proposed method relies on a neural network-based power estimation model and a power capped container scheduling technique. Experimental studies are pursued in a lab-based environment on 10 benchmarks deployed on Intel and ARM processors. The results highlight that the power estimation model has negligible overheads for data collection - nearly 90% of all data samples can be estimated with less than a 10% error, and the Mean Absolute Percentage Error (MAPE) is less than 6%. The power-aware scheduling of WattsApp is more effective than Intel's Running Power Average Limit (RAPL) based power capping for both single and multiple containers as it does not degrade the performance of all containers running on the server. The results confirm the feasibility of WattsApp.

preprint2020arXiv

Workflow-as-a-Service Cloud Platform and Deployment of Bioinformatics Workflow Applications

Workflow management systems (WMS) support the composition and deployment of workflow-oriented applications in distributed computing environments. They hide the complexity of managing large-scale applications, which includes the controlling data pipelining between tasks, ensuring the application's execution, and orchestrating the distributed computational resources to get a reasonable processing time. With the increasing trends of scientific workflow adoption, the demand to deploy them using a third-party service begins to increase. Workflow-as-a-service (WaaS) is a term representing the platform that serves the users who require to deploy their workflow applications on third-party cloud-managed services. This concept drives the existing WMS technology to evolve towards the development of the WaaS cloud platform. Based on this requirement, we extend CloudBus WMS functionality to handle the workload of multiple workflows and develop the WaaS cloud platform prototype. We implemented the Elastic Budget-constrained resource Provisioning and Scheduling algorithm for Multiple workflows (EBPSM) algorithm that is capable of scheduling multiple workflows and evaluated the platform using two bioinformatics workflows. Our experimental results show that the platform is capable of efficiently handling multiple workflows execution and gaining its purpose to minimize the makespan while meeting the budget.

preprint2016arXiv

A Reliable and Cost-Efficient Auto-Scaling System for Web Applications Using Heterogeneous Spot Instances

Cloud providers sell their idle capacity on markets through an auction-like mechanism to increase their return on investment. The instances sold in this way are called spot instances. In spite that spot instances are usually 90% cheaper than on-demand instances, they can be terminated by provider when their bidding prices are lower than market prices. Thus, they are largely used to provision fault-tolerant applications only. In this paper, we explore how to utilize spot instances to provision web applications, which are usually considered availability-critical. The idea is to take advantage of differences in price among various types of spot instances to reach both high availability and significant cost saving. We first propose a fault-tolerant model for web applications provisioned by spot instances. Based on that, we devise novel auto-scaling polices for hourly billed cloud markets. We implemented the proposed model and policies both on a simulation testbed for repeatable validation and Amazon EC2. The experiments on the simulation testbed and the real platform against the benchmarks show that the proposed approach can greatly reduce resource cost and still achieve satisfactory Quality of Service (QoS) in terms of response time and availability.

preprint2016arXiv

Big Data Analytics = Machine Learning + Cloud Computing

Big Data can mean different things to different people. The scale and challenges of Big Data are often described using three attributes, namely Volume, Velocity and Variety (3Vs), which only reflect some of the aspects of data. In this chapter we review historical aspects of the term "Big Data" and the associated analytics. We augment 3Vs with additional attributes of Big Data to make it more comprehensive and relevant. We show that Big Data is not just 3Vs, but 32 Vs, that is, 9 Vs covering the fundamental motivation behind Big Data, which is to incorporate Business Intelligence (BI) based on different hypothesis or statistical models so that Big Data Analytics (BDA) can enable decision makers to make useful predictions for some crucial decisions or researching results. History of Big Data has demonstrated that the most cost effective way of performing BDA is to employ Machine Learning (ML) on the Cloud Computing (CC) based infrastructure or simply, ML + CC -> BDA. This chapter is devoted to help decision makers by defining BDA as a solution and opportunity to address their business needs.

preprint2016arXiv

Dynamic Selection of Virtual Machines for Application Servers in Cloud Environments

Autoscaling is a hallmark of cloud computing as it allows flexible just-in-time allocation and release of computational resources in response to dynamic and often unpredictable workloads. This is especially important for web applications whose workload is time dependent and prone to flash crowds. Most of them follow the 3-tier architectural pattern, and are divided into presentation, application/domain and data layers. In this work we focus on the application layer. Reactive autoscaling policies of the type "Instantiate a new Virtual Machine (VM) when the average server CPU utilisation reaches X%" have been used successfully since the dawn of cloud computing. But which VM type is the most suitable for the specific application at the moment remains an open question. In this work, we propose an approach for dynamic VM type selection. It uses a combination of online machine learning techniques, works in real time and adapts to changes in the users' workload patterns, application changes as well as middleware upgrades and reconfigurations. We have developed a prototype, which we tested with the CloudStone benchmark deployed on AWS EC2. Results show that our method quickly adapts to workload changes and reduces the total cost compared to the industry standard approach.

preprint2016arXiv

Fog Computing: Principles, Architectures, and Applications

The Internet of Everything (IoE) solutions gradually bring every object online, and processing data in centralized cloud does not scale to requirements of such environment. This is because, there are applications such as health monitoring and emergency response that require low latency and delay caused by transferring data to the cloud and then back to the application can seriously impact the performance. To this end, Fog computing has emerged, where cloud computing is extended to the edge of the network to decrease the latency and network congestion. Fog computing is a paradigm for managing a highly distributed and possibly virtualized environment that provides compute and network services between sensors and cloud data centers. This chapter provides background and motivations on emergence of Fog computing and defines its key characteristics. In addition, a reference architecture for Fog computing is presented and recent related development and applications are discussed.

preprint2016arXiv

iFogSim: A Toolkit for Modeling and Simulation of Resource Management Techniques in Internet of Things, Edge and Fog Computing Environments

Internet of Things (IoT) aims to bring every object (e.g. smart cameras, wearable, environmental sensors, home appliances, and vehicles) online, hence generating massive amounts of data that can overwhelm storage systems and data analytics applications. Cloud computing offers services at the infrastructure level that can scale to IoT storage and processing requirements. However, there are applications such as health monitoring and emergency response that require low latency, and delay caused by transferring data to the cloud and then back to the application can seriously impact their performances. To overcome this limitation, Fog computing paradigm has been proposed, where cloud services are extended to the edge of the network to decrease the latency and network congestion. To realize the full potential of Fog and IoT paradigms for real-time analytics, several challenges need to be addressed. The first and most critical problem is designing resource management techniques that determine which modules of analytics applications are pushed to each edge device to minimize the latency and maximize the throughput. To this end, we need a evaluation platform that enables the quantification of performance of resource management policies on an IoT or Fog computing infrastructure in a repeatable manner. In this paper we propose a simulator, called iFogSim, to model IoT and Fog environments and measure the impact of resource management techniques in terms of latency, network congestion, energy consumption, and cost. We describe two case studies to demonstrate modeling of an IoT environment and comparison of resource management policies. Moreover, scalability of the simulation toolkit in terms of RAM consumption and execution time is verified under different circumstances.

preprint2015arXiv

Agri-Info: Cloud Based Autonomic System for Delivering Agriculture as a Service

Cloud computing has emerged as an important paradigm for managing and delivering services efficiently over the Internet. Convergence of cloud computing with technologies such as wireless sensor networking and mobile computing offers new applications of cloud services but this requires management of Quality of Service (QoS) parameters to efficiently monitor and measure the delivered services. This paper presents a QoS-aware Cloud Based Autonomic Information System for delivering agriculture related information as a service through the use of latest Cloud technologies which manage various types of agriculture related data based on different domains. Proposed system gathers information from various users through preconfigured devices and manages and provides required information to users automatically. Further, Cuckoo Optimization Algorithm has been used for efficient resource allocation at infrastructure level for effective utilization of resources. We have evaluated the performance of the proposed approach in Cloud environment and experimental results show that the proposed system performs better in terms of resource utilization, execution time, cost and computing capacity along with other QoS parameters.

preprint2015arXiv

Big Data Analytics-Enhanced Cloud Computing: Challenges, Architectural Elements, and Future Directions

The emergence of cloud computing has made dynamic provisioning of elastic capacity to applications on-demand. Cloud data centers contain thousands of physical servers hosting orders of magnitude more virtual machines that can be allocated on demand to users in a pay-as-you-go model. However, not all systems are able to scale up by just adding more virtual machines. Therefore, it is essential, even for scalable systems, to project workloads in advance rather than using a purely reactive approach. Given the scale of modern cloud infrastructures generating real time monitoring information, along with all the information generated by operating systems and applications, this data poses the issues of volume, velocity, and variety that are addressed by Big Data approaches. In this paper, we investigate how utilization of Big Data analytics helps in enhancing the operation of cloud computing environments. We discuss diverse applications of Big Data analytics in clouds, open issues for enhancing cloud operations via Big Data analytics, and architecture for anomaly detection and prevention in clouds along with future research directions.

preprint2015arXiv

Multi-Cloud Resource Provisioning with Aneka: A Unified and Integrated Utilisation of Microsoft Azure and Amazon EC2 Instances

Many vendors are offering computing services on subscription basis via Infrastructure-as-a-Service (IaaS) model. Users can acquire resources from different providers and get the best of each of them to run their applications. However, deploying applications in multi-cloud environments is a complex task. Therefore, application platforms are needed to help developers to succeed. Aneka is one such platform that supports developers to program and deploy distributed applications in multi-cloud environments. It can be used to provision resources from different cloud providers and can be configured to request resources dynamically according to the needs of specific applications. This paper presents extensions incorporated in Aneka to support the deployment of applications in multi-cloud environments. The first extension shows the flexibility of Aneka architecture to add cloud providers. Specifically, we describe the addition of Microsoft Azure IaaS cloud provider. We also discuss the inclusion of public IPs to communicate resources located in different networks and the functionality of using PowerShell to automatize installation of Aneka on remote resources. We demonstrate how an application composed of independent tasks improves its total execution time when it is deployed in the multi-cloud environment created by Aneka using resources provisioned from Azure and EC2.

preprint2015arXiv

The Anatomy of Big Data Computing

Advances in information technology and its widespread growth in several areas of business, engineering, medical and scientific studies are resulting in information/data explosion. Knowledge discovery and decision making from such rapidly growing voluminous data is a challenging task in terms of data organization and processing, which is an emerging trend known as Big Data Computing; a new paradigm which combines large scale compute, new data intensive techniques and mathematical models to build data analytics. Big Data computing demands a huge storage and computing for data curation and processing that could be delivered from on-premise or clouds infrastructures. This paper discusses the evolution of Big Data computing, differences between traditional data warehousing and Big Data, taxonomy of Big Data computing and underpinning technologies, integrated platform of Big Data and Clouds known as Big Data Clouds, layered architecture and components of Big Data Cloud and finally discusses open technical challenges and future directions.

preprint2014arXiv

Big Data Computing and Clouds: Trends and Future Directions

This paper discusses approaches and environments for carrying out analytics on Clouds for Big Data applications. It revolves around four important areas of analytics and Big Data, namely (i) data management and supporting architectures; (ii) model development and scoring; (iii) visualisation and user interaction; and (iv) business models. Through a detailed survey, we identify possible gaps in technology and provide recommendations for the research community on future directions on Cloud-supported Big Data computing and analytics solutions.

preprint2013arXiv

A Taxonomy of Performance Prediction Systems in the Parallel and Distributed Computing Grids

As Grids are loosely-coupled congregations of geographically distributed heterogeneous resources, the efficient utilization of the resources requires the support of a sound Performance Prediction System (PPS). The performance prediction of grid resources is helpful for both Resource Management Systems and grid users to make optimized resource usage decisions. There have been many PPS projects that span over several grid resources in several dimensions. In this paper the taxonomy for describing the PPS architecture is discussed. The taxonomy is used to categorize and identify approaches which are followed in the implementation of the existing PPSs for Grids. The taxonomy and the survey results are used to identify approaches and issues that have not been fully explored in research.

preprint2013arXiv

Cloud-Based Augmentation for Mobile Devices: Motivation, Taxonomies, and Open Challenges

Recently, Cloud-based Mobile Augmentation (CMA) approaches have gained remarkable ground from academia and industry. CMA is the state-of-the-art mobile augmentation model that employs resource-rich clouds to increase, enhance, and optimize computing capabilities of mobile devices aiming at execution of resource-intensive mobile applications. Augmented mobile devices envision to perform extensive computations and to store big data beyond their intrinsic capabilities with least footprint and vulnerability. Researchers utilize varied cloud-based computing resources (e.g., distant clouds and nearby mobile nodes) to meet various computing requirements of mobile users. However, employing cloud-based computing resources is not a straightforward panacea. Comprehending critical factors that impact on augmentation process and optimum selection of cloud-based resource types are some challenges that hinder CMA adaptability. This paper comprehensively surveys the mobile augmentation domain and presents taxonomy of CMA approaches. The objectives of this study is to highlight the effects of remote resources on the quality and reliability of augmentation processes and discuss the challenges and opportunities of employing varied cloud-based resources in augmenting mobile devices. We present augmentation definition, motivation, and taxonomy of augmentation types, including traditional and cloud-based. We critically analyze the state-of-the-art CMA approaches and classify them into four groups of distant fixed, proximate fixed, proximate mobile, and hybrid to present a taxonomy. Vital decision making and performance limitation factors that influence on the adoption of CMA approaches are introduced and an exemplary decision making flowchart for future CMA approaches are presented. Impacts of CMA approaches on mobile computing is discussed and open challenges are presented as the future research directions.

preprint2012arXiv

Autonomic Cloud Computing: Open Challenges and Architectural Elements

As Clouds are complex, large-scale, and heterogeneous distributed systems, management of their resources is a challenging task. They need automated and integrated intelligent strategies for provisioning of resources to offer services that are secure, reliable, and cost-efficient. Hence, effective management of services becomes fundamental in software platforms that constitute the fabric of computing Clouds. In this direction, this paper identifies open issues in autonomic resource provisioning and presents innovative management techniques for supporting SaaS applications hosted on Clouds. We present a conceptual architecture and early results evidencing the benefits of autonomic management of Clouds.

preprint2012arXiv

Internet of Things (IoT): A Vision, Architectural Elements, and Future Directions

Ubiquitous sensing enabled by Wireless Sensor Network (WSN) technologies cuts across many areas of modern day living. This offers the ability to measure, infer and understand environmental indicators, from delicate ecologies and natural resources to urban environments. The proliferation of these devices in a communicating-actuating network creates the Internet of Things (IoT), wherein, sensors and actuators blend seamlessly with the environment around us, and the information is shared across platforms in order to develop a common operating picture (COP). Fuelled by the recent adaptation of a variety of enabling device technologies such as RFID tags and readers, near field communication (NFC) devices and embedded sensor and actuator nodes, the IoT has stepped out of its infancy and is the the next revolutionary technology in transforming the Internet into a fully integrated Future Internet. As we move from www (static pages web) to web2 (social networking web) to web3 (ubiquitous computing web), the need for data-on-demand using sophisticated intuitive queries increases significantly. This paper presents a cloud centric vision for worldwide implementation of Internet of Things. The key enabling technologies and application domains that are likely to drive IoT research in the near future are discussed. A cloud implementation using Aneka, which is based on interaction of private and public clouds is presented. We conclude our IoT vision by expanding on the need for convergence of WSN, the Internet and distributed computing directed at technological research community.

preprint2012arXiv

Market-Oriented Cloud Computing and the Cloudbus Toolkit

Cloud computing has penetrated the Information Technology industry deep enough to influence major companies to adopt it into their mainstream business. A strong thrust on the use of virtualization technology to realize Infrastructure-as-a-Service (IaaS) has led enterprises to leverage subscription-oriented computing capabilities of public Clouds for hosting their application services. In parallel, research in academia has been investigating transversal aspects such as security, software frameworks, quality of service, and standardization. We believe that the complete realization of the Cloud computing vision will lead to the introduction of a virtual market where Cloud brokers, on behalf of end users, are in charge of selecting and composing the services advertised by different Cloud vendors. In order to make this happen, existing solutions and technologies have to be redesigned and extended from a market-oriented perspective and integrated together, giving rise to what we term Market-Oriented Cloud Computing. In this paper, we will assess the current status of Cloud computing by providing a reference model, discuss the challenges that researchers and IT practitioners are facing and will encounter in the near future, and present the approach for solving them from the perspective of the Cloudbus toolkit, which comprises of a set of technologies geared towards the realization of Market Oriented Cloud Computing vision. We provide experimental results demonstrating market-oriented resource provisioning and brokering within a Cloud and across multiple distributed resources. We also include an application illustrating the hosting of ECG analysis as SaaS on Amazon IaaS (EC2 and S3) services.

preprint2012arXiv

SLA-Oriented Resource Provisioning for Cloud Computing: Challenges, Architecture, and Solutions

Cloud computing systems promise to offer subscription-oriented, enterprise-quality computing services to users worldwide. With the increased demand for delivering services to a large number of users, they need to offer differentiated services to users and meet their quality expectations. Existing resource management systems in data centers are yet to support Service Level Agreement (SLA)-oriented resource allocation, and thus need to be enhanced to realize cloud computing and utility computing. In addition, no work has been done to collectively incorporate customer-driven service management, computational risk management, and autonomic resource management into a market-based resource management system to target the rapidly changing enterprise requirements of Cloud computing. This paper presents vision, challenges, and architectural elements of SLA-oriented resource management. The proposed architecture supports integration of marketbased provisioning policies and virtualisation technologies for flexible allocation of resources to applications. The performance results obtained from our working prototype system shows the feasibility and effectiveness of SLA-based resource provisioning in Clouds.

preprint2011arXiv

Aneka Cloud Application Platform and Its Integration with Windows Azure

Aneka is an Application Platform-as-a-Service (Aneka PaaS) for Cloud Computing. It acts as a framework for building customized applications and deploying them on either public or private Clouds. One of the key features of Aneka is its support for provisioning resources on different public Cloud providers such as Amazon EC2, Windows Azure and GoGrid. In this chapter, we will present Aneka platform and its integration with one of the public Cloud infrastructures, Windows Azure, which enables the usage of Windows Azure Compute Service as a resource provider of Aneka PaaS. The integration of the two platforms will allow users to leverage the power of Windows Azure Platform for Aneka Cloud Computing, employing a large number of compute instances to run their applications in parallel. Furthermore, customers of the Windows Azure platform can benefit from the integration with Aneka PaaS by embracing the advanced features of Aneka in terms of multiple programming models, scheduling and management services, application execution services, accounting and pricing services and dynamic provisioning services. Finally, in addition to the Windows Azure Platform we will illustrate in this chapter the integration of Aneka PaaS with other public Cloud platforms such as Amazon EC2 and GoGrid, and virtual machine management platforms such as Xen Server. The new support of provisioning resources on Windows Azure once again proves the adaptability, extensibility and flexibility of Aneka.

preprint2011arXiv

Cost of Virtual Machine Live Migration in Clouds: A Performance Evaluation

Virtualization has become commonplace in modern data centers, often referred as "computing clouds". The capability of virtual machine live migration brings benefits such as improved performance, manageability and fault tolerance, while allowing workload movement with a short service downtime. However, service levels of applications are likely to be negatively affected during a live migration. For this reason, a better understanding of its effects on system performance is desirable. In this paper, we evaluate the effects of live migration of virtual machines on the performance of applications running inside Xen VMs. Results show that, in most cases, migration overhead is acceptable but cannot be disregarded, especially in systems where availability and responsiveness are governed by strict Service Level Agreements. Despite that, there is a high potential for live migration applicability in data centers serving modernInternet applications. Our results are based on a workload covering the domain of multi-tier Web 2.0 applications.

preprint2011arXiv

Platforms for Building and Deploying Applications for Cloud Computing

Cloud computing is rapidly emerging as a new paradigm for delivering IT services as utlity-oriented services on subscription-basis. The rapid development of applications and their deployment in Cloud computing environments in efficient manner is a complex task. In this article, we give a brief introduction to Cloud computing technology and Platform as a Service, we examine the offerings in this category, and provide the basis for helping readers to understand basic application platform opportunities in Cloud by technologies such as Microsoft Azure, Sales Force, Google App, and Aneka for Cloud computing. We demonstrate that Manjrasoft Aneka is a Cloud Application Platform (CAP) leveraging these concepts and allowing an easy development of Cloud ready applications on a Private/Public/Hybrid Cloud. Aneka CAP offers facilities for quickly developing Cloud applications and a modular platform where additional services can be easily integrated to extend the system capabilities, thus being at pace with the rapidly evolution of Cloud computing.

preprint2011arXiv

Provisioning Spot Market Cloud Resources to Create Cost-effective Virtual Clusters

Infrastructure-as-a-Service providers are offering their unused resources in the form of variable-priced virtual machines (VMs), known as "spot instances", at prices significantly lower than their standard fixed-priced resources. To lease spot instances, users specify a maximum price they are willing to pay per hour and VMs will run only when the current price is lower than the user's bid. This paper proposes a resource allocation policy that addresses the problem of running deadline-constrained compute-intensive jobs on a pool of composed solely of spot instances, while exploiting variations in price and performance to run applications in a fast and economical way. Our policy relies on job runtime estimations to decide what are the best types of VMs to run each job and when jobs should run. Several estimation methods are evaluated and compared, using trace-based simulations, which take real price variation traces obtained from Amazon Web Services as input, as well as an application trace from the Parallel Workload Archive. Results demonstrate the effectiveness of running computational jobs on spot instances, at a fraction (up to 60% lower) of the price that would normally cost on fixed priced resources.

preprint2011arXiv

Reliable Provisioning of Spot Instances for Compute-intensive Applications

Cloud computing providers are now offering their unused resources for leasing in the spot market, which has been considered the first step towards a full-fledged market economy for computational resources. Spot instances are virtual machines (VMs) available at lower prices than their standard on-demand counterparts. These VMs will run for as long as the current price is lower than the maximum bid price users are willing to pay per hour. Spot instances have been increasingly used for executing compute-intensive applications. In spite of an apparent economical advantage, due to an intermittent nature of biddable resources, application execution times may be prolonged or they may not finish at all. This paper proposes a resource allocation strategy that addresses the problem of running compute-intensive jobs on a pool of intermittent virtual machines, while also aiming to run applications in a fast and economical way. To mitigate potential unavailability periods, a multifaceted fault-aware resource provisioning policy is proposed. Our solution employs price and runtime estimation mechanisms, as well as three fault tolerance techniques, namely checkpointing, task duplication and migration. We evaluate our strategies using trace-driven simulations, which take as input real price variation traces, as well as an application trace from the Parallel Workload Archive. Our results demonstrate the effectiveness of executing applications on spot instances, respecting QoS constraints, despite occasional failures.

preprint2010arXiv

A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems

Traditionally, the development of computing systems has been focused on performance improvements driven by the demand of applications from consumer, scientific and business domains. However, the ever increasing energy consumption of computing systems has started to limit further performance growth due to overwhelming electricity bills and carbon dioxide footprints. Therefore, the goal of the computer system design has been shifted to power and energy efficiency. To identify open challenges in the area and facilitate future advancements it is essential to synthesize and classify the research on power and energy-efficient design conducted to date. In this work we discuss causes and problems of high power / energy consumption, and present a taxonomy of energy-efficient design of computing systems covering the hardware, operating system, virtualization and data center levels. We survey various key works in the area and map them to our taxonomy to guide future design and development efforts. This chapter is concluded with a discussion of advancements identified in energy-efficient computing and our vision on future research directions.

preprint2010arXiv

InterCloud: Utility-Oriented Federation of Cloud Computing Environments for Scaling of Application Services

Cloud computing providers have setup several data centers at different geographical locations over the Internet in order to optimally serve needs of their customers around the world. However, existing systems do not support mechanisms and policies for dynamically coordinating load distribution among different Cloud-based data centers in order to determine optimal location for hosting application services to achieve reasonable QoS levels. Further, the Cloud computing providers are unable to predict geographic distribution of users consuming their services, hence the load coordination must happen automatically, and distribution of services must change in response to changes in the load. To counter this problem, we advocate creation of federated Cloud computing environment (InterCloud) that facilitates just-in-time, opportunistic, and scalable provisioning of application services, consistently achieving QoS targets under variable workload, resource and network conditions. The overall goal is to create a computing environment that supports dynamic expansion or contraction of capabilities (VMs, services, storage, and database) for handling sudden variations in service demands. This paper presents vision, challenges, and architectural elements of InterCloud for utility-oriented federation of Cloud computing environments. The proposed InterCloud environment supports scaling of applications across multiple vendor clouds. We have validated our approach by conducting a set of rigorous performance evaluation study using the CloudSim toolkit. The results demonstrate that federated Cloud computing model has immense potential as it offers significant performance gains as regards to response time and cost saving under dynamic workload scenarios.

preprint2010arXiv

Service Level Agreement (SLA) in Utility Computing Systems

In recent years, extensive research has been conducted in the area of Service Level Agreement (SLA) for utility computing systems. An SLA is a formal contract used to guarantee that consumers' service quality expectation can be achieved. In utility computing systems, the level of customer satisfaction is crucial, making SLAs significantly important in these environments. Fundamental issue is the management of SLAs, including SLA autonomy management or trade off among multiple Quality of Service (QoS) parameters. Many SLA languages and frameworks have been developed as solutions; however, there is no overall classification for these extensive works. Therefore, the aim of this chapter is to present a comprehensive survey of how SLAs are created, managed and used in utility computing environment. We discuss existing use cases from Grid and Cloud computing systems to identify the level of SLA realization in state-of-art systems and emerging challenges for future research.

preprint2009arXiv

Cloudbus Toolkit for Market-Oriented Cloud Computing

This keynote paper: (1) presents the 21st century vision of computing and identifies various IT paradigms promising to deliver computing as a utility; (2) defines the architecture for creating market-oriented Clouds and computing atmosphere by leveraging technologies such as virtual machines; (3) provides thoughts on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain SLA-oriented resource allocation; (4) presents the work carried out as part of our new Cloud Computing initiative, called Cloudbus: (i) Aneka, a Platform as a Service software system containing SDK (Software Development Kit) for construction of Cloud applications and deployment on private or public Clouds, in addition to supporting market-oriented resource management; (ii) internetworking of Clouds for dynamic creation of federated computing environments for scaling of elastic applications; (iii) creation of 3rd party Cloud brokering services for building content delivery networks and e-Science applications and their deployment on capabilities of IaaS providers such as Amazon along with Grid mashups; (iv) CloudSim supporting modelling and simulation of Clouds for performance studies; (v) Energy Efficient Resource Allocation Mechanisms and Techniques for creation and management of Green Clouds; and (vi) pathways for future research.

preprint2009arXiv

High-Performance Cloud Computing: A View of Scientific Applications

Scientific computing often requires the availability of a massive number of computers for performing large scale experiments. Traditionally, these needs have been addressed by using high-performance computing solutions and installed facilities such as clusters and super computers, which are difficult to setup, maintain, and operate. Cloud computing provides scientists with a completely new model of utilizing the computing infrastructure. Compute resources, storage resources, as well as applications, can be dynamically provisioned (and integrated within the existing infrastructure) on a pay per use basis. These resources can be released when they are no more needed. Such services are often offered within the context of a Service Level Agreement (SLA), which ensure the desired Quality of Service (QoS). Aneka, an enterprise Cloud computing solution, harnesses the power of compute resources by relying on private and public Clouds and delivers to users the desired QoS. Its flexible and service based infrastructure supports multiple programming paradigms that make Aneka address a variety of different scenarios: from finance applications to computational science. As examples of scientific computing in the Cloud, we present a preliminary case study on using Aneka for the classification of gene expression data and the execution of fMRI brain imaging workflow.

preprint2009arXiv

Jeeva: Enterprise Grid-enabled Web Portal for Protein Secondary Structure Prediction

This paper presents a Grid portal for protein secondary structure prediction developed by using services of Aneka, a .NET-based enterprise Grid technology. The portal is used by research scientists to discover new prediction structures in a parallel manner. An SVM (Support Vector Machine)-based prediction algorithm is used with 64 sample protein sequences as a case study to demonstrate the potential of enterprise Grids.

preprint2008arXiv

Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities

This keynote paper: presents a 21st century vision of computing; identifies various computing paradigms promising to deliver the vision of computing utilities; defines Cloud computing and provides the architecture for creating market-oriented Clouds by leveraging technologies such as VMs; provides thoughts on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain SLA-oriented resource allocation; presents some representative Cloud platforms especially those developed in industries along with our current work towards realising market-oriented resource allocation of Clouds by leveraging the 3rd generation Aneka enterprise Grid technology; reveals our early thoughts on interconnecting Clouds for dynamically creating an atmospheric computing environment along with pointers to future community research; and concludes with the need for convergence of competing IT paradigms for delivering our 21st century vision.

preprint2006arXiv

A Case for Cooperative and Incentive-Based Coupling of Distributed Clusters

Research interest in Grid computing has grown significantly over the past five years. Management of distributed resources is one of the key issues in Grid computing. Central to management of resources is the effectiveness of resource allocation as it determines the overall utility of the system. The current approaches to superscheduling in a grid environment are non-coordinated since application level schedulers or brokers make scheduling decisions independently of the others in the system. Clearly, this can exacerbate the load sharing and utilization problems of distributed resources due to suboptimal schedules that are likely to occur. To overcome these limitations, we propose a mechanism for coordinated sharing of distributed clusters based on computational economy. The resulting environment, called \emph{Grid-Federation}, allows the transparent use of resources from the federation when local resources are insufficient to meet its users' requirements. The use of computational economy methodology in coordinating resource allocation not only facilitates the QoS based scheduling, but also enhances utility delivered by resources.

preprint2000arXiv

Nimrod/G: An Architecture of a Resource Management and Scheduling System in a Global Computational Grid

The availability of powerful microprocessors and high-speed networks as commodity components has enabled high performance computing on distributed systems (wide-area cluster computing). In this environment, as the resources are usually distributed geographically at various levels (department, enterprise, or worldwide) there is a great challenge in integrating, coordinating and presenting them as a single resource to the user; thus forming a computational grid. Another challenge comes from the distributed ownership of resources with each resource having its own access policy, cost, and mechanism. The proposed Nimrod/G grid-enabled resource management and scheduling system builds on our earlier work on Nimrod and follows a modular and component-based architecture enabling extensibility, portability, ease of development, and interoperability of independently developed components. It uses the Globus toolkit services and can be easily extended to operate with any other emerging grid middleware services. It focuses on the management and scheduling of computations over dynamic resources scattered geographically across the Internet at department, enterprise, or global level with particular emphasis on developing scheduling schemes based on the concept of computational economy for a real test bed, namely, the Globus testbed (GUSTO).

Rajkumar Buyya

What is connected

Connect this record

See the researcher in context

Building this map preview

52 published item(s)

Agentic Artificial Intelligence (AI): Architectures, Taxonomies, and Evaluation of Large Language Model Agents

Performance and Security Aware Distributed Service Placement in Fog Computing

ReinFog: A Deep Reinforcement Learning Empowered Framework for Resource Management in Edge and Cloud Computing Environments

A Strategy for Advancing Research and Impact in New Computing Paradigms

Container Orchestration in Edge and Fog Computing Environments for Real-Time IoT Applications

Scheduling IoT Applications in Edge and Fog Computing Environments: A Taxonomy and Future Directions

Securing the Future Internet of Things with Post-Quantum Cryptography

A Data-Driven Frequency Scaling Approach for Deadline-aware Energy Efficient Scheduling on Graphics Processing Units (GPUs)

A Drone-based Networked System and Methods for Combating Coronavirus Disease (COVID-19) Pandemic

A Privacy-preserving Mobile and Fog Computing Framework to Trace and Prevent COVID-19 Community Transmission

A Self-adaptive Approach for Managing Applications and Harnessing Renewable Energy for Sustainable Cloud Computing

Application Management in Fog Computing Environments: A Taxonomy, Review and Future Directions

DATESSO: Self-Adapting Service Composition with Debt-Aware Two Levels Constraint Reasoning

Dynamic Scheduling for Stochastic Edge-Cloud Computing Environments using A3C learning and Residual Recurrent Neural Networks

Energy Efficient Algorithms based on VM Consolidation for Cloud Computing: Comparisons and Evaluations

Fog Computing for Smart Grids: Challenges and Solutions

Green-aware Mobile Edge Computing for IoT: Challenges, Solutions and Future Directions

High-Performance Mining of COVID-19 Open Research Datasets for Text Classification and Insights in Cloud Computing Environments

Resource-sharing Policy in Multi-tenant Scientific Workflow-as-a-Service Cloud Platform

ThermoSim: Deep Learning based Framework for Modeling and Simulation of Thermal-aware Resource Management for Cloud Computing Environments

WattsApp: Power-Aware Container Scheduling

Workflow-as-a-Service Cloud Platform and Deployment of Bioinformatics Workflow Applications

A Reliable and Cost-Efficient Auto-Scaling System for Web Applications Using Heterogeneous Spot Instances

Big Data Analytics = Machine Learning + Cloud Computing

Dynamic Selection of Virtual Machines for Application Servers in Cloud Environments

Fog Computing: Principles, Architectures, and Applications

iFogSim: A Toolkit for Modeling and Simulation of Resource Management Techniques in Internet of Things, Edge and Fog Computing Environments

Agri-Info: Cloud Based Autonomic System for Delivering Agriculture as a Service

Big Data Analytics-Enhanced Cloud Computing: Challenges, Architectural Elements, and Future Directions

Multi-Cloud Resource Provisioning with Aneka: A Unified and Integrated Utilisation of Microsoft Azure and Amazon EC2 Instances

The Anatomy of Big Data Computing

Big Data Computing and Clouds: Trends and Future Directions

A Taxonomy of Performance Prediction Systems in the Parallel and Distributed Computing Grids

Cloud-Based Augmentation for Mobile Devices: Motivation, Taxonomies, and Open Challenges

Autonomic Cloud Computing: Open Challenges and Architectural Elements

Internet of Things (IoT): A Vision, Architectural Elements, and Future Directions

Market-Oriented Cloud Computing and the Cloudbus Toolkit

SLA-Oriented Resource Provisioning for Cloud Computing: Challenges, Architecture, and Solutions

Aneka Cloud Application Platform and Its Integration with Windows Azure

Cost of Virtual Machine Live Migration in Clouds: A Performance Evaluation

Platforms for Building and Deploying Applications for Cloud Computing

Provisioning Spot Market Cloud Resources to Create Cost-effective Virtual Clusters

Reliable Provisioning of Spot Instances for Compute-intensive Applications

A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems

InterCloud: Utility-Oriented Federation of Cloud Computing Environments for Scaling of Application Services

Service Level Agreement (SLA) in Utility Computing Systems

Cloudbus Toolkit for Market-Oriented Cloud Computing

High-Performance Cloud Computing: A View of Scientific Applications

Jeeva: Enterprise Grid-enabled Web Portal for Protein Secondary Structure Prediction

Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities

A Case for Cooperative and Incentive-Based Coupling of Distributed Clusters

Nimrod/G: An Architecture of a Resource Management and Scheduling System in a Global Computational Grid