
  • Open access
  • Published: 13 June 2023

A survey of Kubernetes scheduling algorithms

  • Khaldoun Senjab 1,
  • Sohail Abbas 1,
  • Naveed Ahmed 1 &
  • Atta ur Rehman Khan 2

Journal of Cloud Computing: Advances, Systems and Applications, volume 12, Article number: 87 (2023)


Abstract

As cloud services expand, the need to improve the performance of data center infrastructure becomes more important. High-performance computing, advanced networking solutions, and resource optimization strategies can help data centers maintain the speed and efficiency necessary to provide high-quality cloud services. Running containerized applications is one such optimization strategy, offering benefits such as improved portability, enhanced security, better resource utilization, faster deployment and scaling, and improved integration and interoperability. These benefits can help organizations improve their application deployment and management, enabling them to respond more quickly and effectively to dynamic business needs. Kubernetes is a container orchestration system designed to automate the deployment, scaling, and management of containerized applications. One of its key features is the ability to schedule the deployment and execution of containers across a cluster of nodes using a scheduling algorithm. This algorithm determines the best placement of containers on the available nodes in the cluster. In this paper, we provide a comprehensive review of various scheduling algorithms in the context of Kubernetes. We characterize and group them into four sub-categories: generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling-enabled scheduling, and identify gaps and issues that require further research.

Introduction

Kubernetes is an open-source platform for automating the deployment, scaling, and management of containerized applications. It allows developers to focus on building and deploying their applications without worrying about the underlying infrastructure. Kubernetes uses a declarative approach to managing applications, where users specify desired application states, and the system maintains them. It also provides robust tools for monitoring and managing applications, including self-healing mechanisms for automatic failure detection and recovery. Overall, Kubernetes offers a powerful and flexible solution for managing containerized applications in production environments.

Kubernetes is well-suited for microservice-based web applications, where each component can be run in its own container. Containers are lightweight and can be easily created and destroyed, providing faster and more efficient resource utilization than virtual machines, as shown in Fig.  1 . Kubernetes automates the deployment, scaling, and management of containers across a cluster of machines, making resource utilization more efficient and flexible. This simplifies the process of building and maintaining complex applications.

Figure 1. Comparison between different types of application deployments

Microservice-based architecture involves dividing an application into small, independent modules called microservices, as shown in Fig. 2. Each microservice is responsible for a specific aspect of the application, and they communicate through a message bus. This architecture offers several benefits, such as the ability to automate deployment, scaling, and management. Because each microservice is independent and can be managed and updated separately, it is easier to make changes without affecting the entire system. Additionally, microservices can be written in different languages and can run on different servers, providing greater flexibility in the development process.

Figure 2. Comparison between different application architectures

Kubernetes can quickly adapt to various types of demand intensities. For example, if a web application has few visitors at a given time, it can be scaled down to a few pods using minimal resources to reduce costs. However, if the application becomes extremely popular and receives a large number of visitors simultaneously, it can be scaled up to be serviced by a large number of pods, making it capable of handling almost any level of demand.

Kubernetes has been employed by many organizations across a diverse range of applications and has earned the reputation of being a leading option for the management and deployment of containerized applications. In terms of recent applications, Kubernetes is proving to be an invaluable resource for IT infrastructure, as it provides a sustainable path towards serverless computing that eases challenges in IT administration [1]. Serverless computing will provide end-to-end security enhancements, but will also introduce new infrastructure and security challenges, as discussed in [1].

As the computing paradigm moves towards edge and fog computing, Kubernetes is proving to be a versatile solution that provides seamless network management between cloud and edge nodes [2, 3, 4]. Kubernetes faces multiple challenges when deployed in an IoT environment, ranging from optimizing network traffic distribution [2] and flow routing policies [3] to distributing computational resources across edge devices [4].

As can be seen from the diverse range of applications and challenges associated with Kubernetes, it is imperative to study the algorithms proposed in this area to identify the state-of-the-art and future research directions. Numerous studies have focused on the development of new algorithms for Kubernetes. The main motivation for this survey is to provide a comprehensive overview of the state-of-the-art in the field of Kubernetes scheduling algorithms. By reviewing the existing literature and identifying the key theories, methods, and findings from previous studies, we aim to provide a critical evaluation of the strengths and limitations of existing approaches. We also hope to identify gaps and open questions in the existing literature and to offer suggestions for future research directions. Overall, our goal is to contribute to the advancement of knowledge in the field and to provide a useful resource for researchers and practitioners working with Kubernetes scheduling algorithms.

To the best of the authors' knowledge, there is no related survey that specifically addresses the topic at hand. The surveys found mostly target container orchestration in general (including Kubernetes), such as [5, 6, 7, 8]. These surveys address Kubernetes broadly without focusing on scheduling in depth, and some do not focus on Kubernetes at all. For example, some concentrate on scheduling in the cloud [9] and its associated concerns [10], while others target big data applications in data center networks [11] or fog computing environments [12]. The authors found two closely related and well-organized surveys, [13] and [14], that target Kubernetes scheduling in depth. However, our work differs from these two surveys in terms of taxonomy: they target different aspects and objectives of scheduling, whereas we categorize the literature into four sub-categories, namely generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling-enabled scheduling, thereby focusing specifically on a wide range of schemes related to multi-objective optimization and AI, in addition to general scheduling with autoscaling. Our categorization, we believe, is more fine-grained and novel compared to the existing surveys.

In this paper, the literature has been divided into four sub-categories: generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling-enabled scheduling. The literature pertaining to each sub-category is analyzed and summarized based on the parameters outlined in the Literature review section.

Our main contributions are as follows:

A comprehensive review of the literature on Kubernetes scheduling algorithms targeting four sub-categories: generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling-enabled scheduling.

A critical evaluation of the strengths and limitations of existing approaches.

Identification of gaps and open questions in the existing literature.

The remainder of this paper is organized as follows: In the Search methodology section, we describe the methodology used to conduct the survey. In the Literature review section, we present the literature review along with the results of our survey, including a critical evaluation of the strengths and limitations of existing approaches; a taxonomy of the identified research papers based on the literature review is presented as well. In the Discussion, challenges & future suggestions section, we discuss the implications of our findings and suggest future research directions. Finally, in the Conclusions section, we summarize the key contributions of the survey and provide our conclusions.

Search methodology

This section presents our search methodology for identifying relevant studies that are included in this review.

To identify relevant studies for our review, we conducted a comprehensive search of the literature using the following databases: IEEE, ACM, Elsevier, Springer, and Google Scholar. We used the following search terms: "Kubernetes," "scheduling algorithms," and "scheduling optimizing." We limited our search to studies published in the last 5 years and written in English.

We initially identified a total of 124 studies from the database searches, see Fig.  3 . We then reviewed the abstracts of these studies to identify those that were relevant to our review. We excluded studies that did not focus on Kubernetes scheduling algorithms, as well as those that were not original research or review articles. After this initial screening, we were left with 67 studies, see Fig.  4 .

Figure 3. Inclusion criteria

Figure 4. Exclusion criteria

We then reviewed the full text of the remaining studies to determine their eligibility for inclusion in our review. We excluded studies that did not meet our inclusion criteria, which were: (1) focus on optimizing Kubernetes scheduling algorithms, (2) provide original research or a critical evaluation of existing approaches, and (3) be written in English and published in the last 5 years. After this final screening, we included 47 studies in our review, see Fig. 4. A yearly distribution of papers can be seen in Fig. 5.

Figure 5. Detailed statistics showing the yearly breakdown of analyzed studies

We also searched the reference lists of the included studies to identify any additional relevant studies that were not captured in our database searches. We did not identify any additional studies through this process. Therefore, our review includes 47 studies on Kubernetes scheduling algorithms published in the last 5 years. These studies represent a diverse range of research methods, including surveys, experiments, and simulations.

Literature review

This section is organized into four sub-categories, i.e., generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling-enabled scheduling. A distribution of the analyzed research papers in each category can be seen in Fig. 6. The literature in each sub-category is analyzed and then summarized based on the following parameters:

Methodology/Algorithms

Experiments

Applications

Limitations

Figure 6. Detailed statistics for each category in terms of analyzed studies

Scheduling in Kubernetes

The field of Kubernetes scheduling algorithms has attracted significant attention from researchers and practitioners in recent years. A growing body of literature has explored the potential benefits and challenges of using different scheduling algorithms to optimize the performance of a Kubernetes cluster. In this section, we present a review of the key theories, methods, and findings from previous studies in this area.

One key theme in the literature is the need for efficient and effective scheduling of workloads in a Kubernetes environment. Many studies have emphasized the limitations of traditional scheduling approaches, which often struggle to handle the complex and dynamic nature of workloads in a Kubernetes cluster. As a result, there has been increasing interest in the use of advanced scheduling algorithms to enable efficient and effective allocation of computing resources within the cluster.

Another key theme in the literature is the potential benefits of advanced scheduling algorithms for Kubernetes. Many studies have highlighted the potential for these algorithms to improve resource utilization, reduce latency, and enhance the overall performance of the cluster. Additionally, advanced scheduling algorithms have the potential to support the development of new applications and services within the Kubernetes environment, such as real-time analytics, machine learning, and deep learning (see the AI-focused scheduling section).

Despite these potential benefits, the literature also identifies several challenges and limitations of Kubernetes scheduling algorithms. One key challenge is the need to address the evolving nature of workloads and applications within the cluster. Therefore, various authors have focused on improving the autoscaling feature in Kubernetes scheduling to allow for automatic adjustment of the resources allocated to pods based on current demand; a more detailed discussion can be found in the Autoscaling-enabled scheduling section. Other challenges include the need to manage and coordinate multiple scheduling algorithms, and to ensure the stability and performance of the overall system.

Overall, the literature suggests that advanced scheduling algorithms offer a promising solution to the challenges posed by the complex and dynamic nature of workloads in a Kubernetes cluster. However, further research is needed to address the limitations and challenges of these algorithms, and to explore their potential applications and benefits.
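Most of the schedulers surveyed below either extend or replace the default kube-scheduler, which selects a node for each pending pod in two phases: a filtering phase that removes infeasible nodes and a scoring phase that ranks the remaining candidates. The following minimal Python sketch illustrates this filter-and-score pattern; the node and pod fields and the "least allocated" scoring rule are illustrative assumptions, not the actual kube-scheduler implementation.

```python
# Minimal sketch of the filter-and-score scheduling pattern (illustrative only).
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    name: str
    cpu_free: float   # free CPU cores
    mem_free: float   # free memory in GiB

@dataclass
class Pod:
    name: str
    cpu_req: float
    mem_req: float

def filter_nodes(pod: Pod, nodes: List[Node]) -> List[Node]:
    """Filtering phase: keep only nodes that can fit the pod's resource requests."""
    return [n for n in nodes if n.cpu_free >= pod.cpu_req and n.mem_free >= pod.mem_req]

def score_node(pod: Pod, node: Node) -> float:
    """Scoring phase: a simple 'least allocated' style score that prefers nodes
    with more spare capacity remaining after the pod is placed."""
    cpu_left = (node.cpu_free - pod.cpu_req) / node.cpu_free
    mem_left = (node.mem_free - pod.mem_req) / node.mem_free
    return (cpu_left + mem_left) / 2

def schedule(pod: Pod, nodes: List[Node]) -> Optional[Node]:
    feasible = filter_nodes(pod, nodes)
    if not feasible:
        return None  # pod stays pending until resources free up or the cluster scales out
    return max(feasible, key=lambda n: score_node(pod, n))

cluster = [Node("node-a", cpu_free=2.0, mem_free=4.0),
           Node("node-b", cpu_free=6.0, mem_free=16.0)]
print(schedule(Pod("web-1", cpu_req=1.0, mem_req=2.0), cluster).name)  # prints "node-b"
```

The surveyed works typically keep this overall structure but change what is filtered and scored, for example by adding network latency, GPU topology, disk I/O, or energy terms to the scoring step.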

In Santos et al. [15], the authors suggest a network-aware scheduling method for container-based applications deployed in smart cities. Their strategy is implemented as an extension to the default scheduler of Kubernetes, an open-source orchestrator for the automatic management and deployment of microservices. Using container-based smart city applications, the authors assess the performance of the suggested scheduling approach and contrast it with that of Kubernetes' built-in default scheduling mechanism. Compared to the default technique, they find that the suggested solution reduces network latency by almost 80%.

In Chung et al. [16], the authors propose a new cluster scheduler called Stratus that is specialized for orchestrating batch job execution on virtual clusters in public Infrastructure-as-a-Service (IaaS) platforms. Stratus focuses on minimizing dollar costs by aggressively packing tasks onto machines based on runtime estimates, i.e., to save money, allocated machines are kept either mostly full or nearly empty so that underutilized machines can be released. Using workload traces from TwoSigma and Google, the authors evaluate Stratus and establish that it reduces cost by 17–44% compared to virtual cluster scheduling benchmarks.
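As a rough illustration of the packing intuition behind Stratus (keep machines either mostly full or nearly empty so that idle machines can be released), the toy heuristic below places each task on the feasible machine that ends up most utilized. It is a simplified sketch under assumed data structures, not the authors' algorithm, and it ignores Stratus' runtime-aware grouping and scale-out logic.

```python
# Toy "pack tightly so other VMs can drain and be released" placement heuristic.
def pick_machine(task_cpu, machines):
    """machines: list of dicts like {"name": ..., "cpu_free": ..., "cpu_total": ...}"""
    feasible = [m for m in machines if m["cpu_free"] >= task_cpu]
    if not feasible:
        return None  # would trigger acquiring a new VM instead
    # Choose the machine with the highest post-placement utilization (aggressive packing).
    return max(feasible,
               key=lambda m: (m["cpu_total"] - m["cpu_free"] + task_cpu) / m["cpu_total"])

machines = [{"name": "vm-1", "cpu_free": 3, "cpu_total": 8},
            {"name": "vm-2", "cpu_free": 7, "cpu_total": 8}]
print(pick_machine(2, machines)["name"])  # vm-1: vm-2 stays nearly empty and releasable
```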

In Le et al. [ 17 ], the authors propose a new scheduling algorithm called AlloX for optimizing job performance in shared clusters that use interchangeable resources such as CPUs, GPUs, and other accelerators. AlloX transforms the scheduling problem into a min-cost bipartite matching problem and provides dynamic fair allocation over time. The authors demonstrate theoretically and empirically that AlloX performs better than existing solutions in the presence of interchangeable resources, and they show that it can reduce the average job completion time significantly while providing fairness and preventing starvation.
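The core building block of AlloX, assigning jobs to interchangeable resources via min-cost bipartite matching, can be illustrated with a standard assignment-problem solver. The cost matrix of estimated completion times below is invented for exposition, and AlloX's dynamic fair allocation over time is not shown.

```python
# Illustrative min-cost bipartite matching of jobs to CPU/GPU slots.
import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i][j] = estimated completion time of job i on resource slot j
# columns: [cpu-slot-0, cpu-slot-1, gpu-slot-0]
cost = np.array([
    [10.0, 10.0, 2.0],   # job 0 speeds up a lot on the GPU
    [ 4.0,  4.0, 3.0],   # job 1 gains only modestly on the GPU
    [ 5.0,  5.0, 5.0],   # job 2 is indifferent
])

rows, cols = linear_sum_assignment(cost)   # Hungarian-style minimum-cost matching
for job, slot in zip(rows, cols):
    print(f"job {job} -> slot {slot} (estimated time {cost[job, slot]})")
print("total estimated time:", cost[rows, cols].sum())  # job 0 gets the GPU; total is 11.0
```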

In Zhong et al. [ 18 ], the authors propose a heterogeneous task allocation strategy for cost-efficient container orchestration in Kubernetes-based cloud computing infrastructures with elastic compute resources. The proposed strategy has three main features: support for heterogeneous job configurations, cluster size adjustment through autoscaling algorithms, and a rescheduling mechanism to shut down underutilized VM instances and reallocate relevant jobs without losing task progress. The authors evaluate their approach using the Australian National Cloud Infrastructure (Nectar) and show that it can reduce overall cost by 23–32% compared to the default Kubernetes framework.

In Thinakaran et al. [19], the authors combine their proposed GPU-aware resource orchestration layer, Knots, with the Kubernetes container orchestrator to create Kube-Knots. Through dynamic container orchestration, Kube-Knots harvests unused computing cycles, enabling the co-location of batch and latency-critical applications and increasing overall resource utilization. The authors demonstrate that the proposed scheduling strategies increase average and 99th percentile cluster-wide GPU usage by up to 80% for HPC workloads when scheduling datacenter-scale workloads with Kube-Knots on a ten-node GPU cluster. In addition, the suggested schedulers reduce energy consumption throughout the cluster by an average of 33% for three separate workloads and improve the average job completion times of deep learning workloads by up to 36% when compared to modern schedulers.

In Townend et al. [ 20 ], the authors propose a holistic scheduling system for Kubernetes that replaces the default scheduler and considers both software and hardware models to improve data center efficiency. The authors claim that by introducing hardware modeling into a software-based solution, an intelligent scheduler can make significant improvements in data center efficiency. In their initial deployment, the authors observed power consumption reductions of 10–20%.

In the work by Menouer [21], the author describes KCSS, a new Kubernetes container scheduling strategy. The purpose of KCSS is to improve performance in terms of makespan and power consumption by scheduling user-submitted containers as efficiently as possible. For each newly submitted container, KCSS chooses the best node based on a number of criteria linked to the cloud infrastructure and the user's requirements, using a multi-criteria decision analysis technique. The author implements KCSS in the Go programming language and shows that it outperforms alternative container scheduling methods in a variety of situations.

In Song et al. [22], the authors present a topology-based GPU scheduling framework for Kubernetes. The framework is based on the traditional Kubernetes GPU scheduling algorithm, but introduces the concept of a GPU cluster topology, which is represented as a GPU cluster resource access cost tree. This allows for more efficient scheduling across different GPU resource application scenarios. The proposed framework has been used in production at Tencent and has reportedly improved the resource utilization of GPU clusters by about 10%.

In Ogbuachi et al. [ 23 ], the authors propose an improved design for Kubernetes scheduling that takes into account physical, operational, and network parameters in addition to software states in order to enable better orchestration and management of edge computing applications. They compare the proposed design to the default Kubernetes scheduler and show that it offers improved fault tolerance and dynamic orchestration capabilities.

In the work by Beltre et al. [24], the authors outline a scheduling policy for Kubernetes clusters that utilizes fairness metrics including dominant resource fairness, resource demand, and average waiting time. KubeSphere, a policy-driven meta-scheduler created by the authors, enables tasks to be scheduled according to each user's overall resource requirements and current consumption. According to experimental findings, the proposed policy increased fairness in a multi-tenant cluster.

In Haja et al. [ 25 ], the authors propose a custom Kubernetes scheduler that takes into account delay constraints and edge reliability when making scheduling decisions. The authors argue that this type of scheduler is necessary for edge infrastructure, where applications are often delay-sensitive, and the infrastructure is prone to failures. The authors demonstrate their Kubernetes extension and release the solution as open source.

In Wojciechowski et al. [26], the authors propose a unique method for scheduling Kubernetes pods that takes advantage of dynamic network measurements gathered by the Istio service mesh. According to the authors, this fully automated approach can save up to 50% of inter-node bandwidth and up to 37% of application response time, which is crucial for the adoption of Kubernetes in 5G use cases.

In Cai et al. [27], the authors propose a feedback control method for elastic container provisioning in Kubernetes-based systems. The method uses a combination of a varying-processing-rate queuing model and a linear model to improve the accuracy of output errors. The authors compare their approach with several existing algorithms on a real Kubernetes cluster and find that it achieves the lowest percentage of service level agreement (SLA) violations and the second-lowest cost.
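To convey the feedback-control idea without reproducing the authors' queuing and linear models, the sketch below adjusts the replica count in proportion to the relative error between the measured and target response times. The controller gain, bounds, and sample measurements are arbitrary assumptions.

```python
# Generic proportional feedback loop for elastic container provisioning (illustrative).
import math

def next_replicas(current, measured_rt_ms, target_rt_ms, gain=2.0, min_r=1, max_r=50):
    error = (measured_rt_ms - target_rt_ms) / target_rt_ms   # relative error signal
    proposed = current + gain * error * current              # proportional control action
    return max(min_r, min(max_r, math.ceil(proposed)))

replicas = 4
for rt in (450, 380, 290, 210):   # measured response times over successive control intervals
    replicas = next_replicas(replicas, rt, target_rt_ms=250)
    print("scale to", replicas, "replicas")
```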

In Ahmed et al. [28], the authors present a dynamic scheduling framework for Kubernetes that manages the deployment of Docker containers in a heterogeneous cluster with CPU and GPU resources. The platform, known as KubCG, takes the Kubernetes pod timeline and historical data about previous container executions into account to optimize the deployment of new containers. In the studies the authors conducted to validate their algorithm, KubCG reduced job completion time by up to 64%.

In Ungureanu et al. [29], the authors propose a hybrid shared-state scheduling framework for Kubernetes that combines the advantages of centralized and distributed scheduling. The framework uses distributed scheduling agents to delegate most tasks, and a scheduling correction function to process unprioritized and unscheduled tasks. Scheduling decisions are made based on the entire cluster state, which is then synchronized and updated by the master-state agent. The authors performed experiments to test the behavior of their proposed scheduler and found that it performed well in different scenarios, including failover and recovery. They also found that other centralized scheduling frameworks may not perform well in situations such as collocation interference or priority preemption.

In Yang et al. [ 30 ], the authors present the design and implementation of KubeHICE, a performance-aware container orchestrator for heterogeneous-ISA architectures in cloud-edge platforms. KubeHICE extends Kubernetes with two functional approaches, AIM (Automatic Instruction Set Architecture Matching) and PAS (Performance-Aware Scheduling), to handle heterogeneous ISA and schedule containers according to the computing capabilities of cluster nodes. The authors performed experiments to evaluate KubeHICE and found that it added no additional overhead to container orchestration and was effective in performance estimation and resource scheduling. They also demonstrated the advantages of KubeHICE in several real-world scenarios, showing for example a 40% increase in CPU utilization when eliminating heterogeneity.

In Li et al. [ 31 ], the authors propose two dynamic scheduling algorithms, Balanced-Disk-IO-Priority (BDI) and Balanced-CPU-Disk-IO-Priority (BCDI), to address the issue of Kubernetes' scheduler not taking the disk I/O load of nodes into account. BDI is designed to improve the disk I/O balance between nodes, while BCDI is designed to solve the issue of load imbalance of CPU and disk I/O on a single node. The authors perform experiments to evaluate the algorithms and find that they are more effective than the Kubernetes default scheduling algorithms.

In Fan et al. [ 32 ], the authors propose an algorithm for optimizing the scheduling of pods in the Serverless framework on the Kubernetes platform. The authors argue that the default Kubernetes scheduler, which operates on a pod-by-pod basis, is not well-suited for the rapid deployment and running of pods in the Serverless framework. To address this issue, the authors propose an algorithm that uses simultaneous scheduling of pods to improve the efficiency of resource scheduling in the Serverless framework. Through preliminary testing, the authors found that their algorithm was able to greatly reduce the delay in pod startup while maintaining a balanced use of node resources.

In Bestari et al. [ 33 ], the authors propose a scheduler for distributed deep learning training in Kubeflow that combines features from existing works, including autoscaling and gang scheduling. The proposed scheduler includes modifications to increase the efficiency of the training process, and weights are used to determine the priority of jobs. The authors evaluate the proposed scheduler using a set of Tensorflow jobs and find that it improves training speed by over 26% compared to the default Kubernetes scheduler.

In Dua et al. [ 34 ], the authors present an alternative algorithm for load balancing in distributed computing environments. The algorithm uses task migration to balance the workload among processors of different capabilities and configurations. The authors define labels to classify tasks into different categories and configure clusters dedicated to specific types of tasks.

The above-mentioned schemes are summarized in Table 1 .

Scheduling using multi-objective optimization

Multi-objective optimization scheduling takes into account multiple objectives or criteria when deciding how to allocate resources and schedule containers on nodes in the cluster. This approach is particularly useful in complex distributed systems where there are multiple competing objectives that need to be balanced to achieve the best overall performance. In a multi-objective optimization scheduling approach, the scheduler considers multiple objectives simultaneously, such as minimizing response time, maximizing resource utilization, and reducing energy consumption. The scheduler uses optimization algorithms to find the optimal solution that balances these objectives.

Multi-objective optimization scheduling can help improve the overall performance and efficiency of Kubernetes clusters by taking into account multiple objectives when allocating resources and scheduling containers. This approach can result in better resource utilization, improved application performance, reduced energy consumption, and lower costs.

Some examples of multi-objective optimization scheduling algorithms used in Kubernetes include genetic algorithms, Ant Colony Optimization, and particle swarm optimization. These algorithms can help optimize different objectives, such as response time, resource utilization, energy consumption, and other factors, to achieve the best overall performance and efficiency in the Kubernetes cluster.
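A common way to make such trade-offs concrete is a weighted-sum score over normalized objectives, which metaheuristics like those above then optimize over the space of possible placements. The sketch below scores two candidate nodes in this way; the metrics, normalization, and weights are illustrative assumptions rather than any particular surveyed scheme.

```python
# Weighted-sum multi-objective node scoring (lower is better for every metric).
def node_score(metrics, weights):
    # All metrics are assumed normalized to [0, 1].
    return sum(weights[k] * metrics[k] for k in weights)

candidates = {
    "edge-1":  {"latency": 0.2, "imbalance": 0.6, "energy": 0.7},
    "cloud-1": {"latency": 0.8, "imbalance": 0.3, "energy": 0.4},
}
weights = {"latency": 0.5, "imbalance": 0.3, "energy": 0.2}

best = min(candidates, key=lambda n: node_score(candidates[n], weights))
print(best)  # "edge-1": under these weights, low latency dominates
```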

In this section, multi-objective scheduling proposals are discussed.

In Kaur et al. [ 35 ], the authors propose a new controller for managing containers on edge-cloud nodes in Industrial Internet of Things (IIoT) systems. The controller, called Kubernetes-based energy and interference driven scheduler (KEIDS), is based on Google Kubernetes and is designed to minimize energy utilization and interference in IIoT systems. KEIDS uses integer linear programming to formulate the task scheduling problem as a multi-objective optimization problem, taking into account factors such as energy consumption, carbon emissions, and interference from other applications. The authors evaluate KEIDS using real-time data from Google compute clusters and find that it outperforms existing state-of-the-art schemes.

In Lin et al. [ 36 ], the authors propose a multi-objective optimization model for container-based microservice scheduling in cloud architectures. They present an ant colony algorithm for solving the scheduling problem, which takes into account factors such as computing and storage resource utilization, the number of microservice requests, and the failure rate of physical nodes. The authors evaluate the proposed algorithm using experiments and compare its performance to other related algorithms. They find that the proposed algorithm achieves better results in terms of cluster service reliability, cluster load balancing, and network transmission overhead.

In Wei-guo et al. [ 37 ], the authors propose an improved scheduling algorithm for Kubernetes by combining ant colony optimization and particle swarm optimization to better balance task assignments and reduce resource costs. The authors implemented the algorithm in Java and tested it using the CloudSim tool, showing that it outperformed the original scheduling algorithm.

In the work by Oleghe [38], the author discusses container placement and migration in edge servers, as well as the scheduling models created for this purpose. According to the author, the majority of scheduling models are based mostly on heuristic algorithms and use multi-objective optimization models or graph network models. The study also points out the lack of research on container scheduling models that take dispersed edge computing activities into account and predicts that future studies in this field will concentrate on scheduling containers for mobile edge nodes.

In Carvalho et al. [39], the authors offer an extension to the Kubernetes scheduler that uses Quality of Experience (QoE) measurements to make cloud management Service Level Objectives (SLOs) more accurate. In the context of video streaming services that are co-located with other services, the authors assess the suggested architecture using the QoE metric from the ITU P.1203 standard. According to the findings, resource rescheduling increases average QoE by 135%, while the proposed scheduler increases it by 50% when compared to other schedulers.

The above-mentioned schemes are summarized in Table 2 .

AI-focused scheduling

Many large companies have recently started to provide AI-based services. For this purpose, they have installed machine/deep learning clusters composed of tens to thousands of CPUs and GPUs for training their deep learning models in a distributed manner. Different machine learning frameworks are used, such as MXNet [40], TensorFlow [41], and Petuum [42]. Training a deep learning model is usually very resource-hungry and time-consuming. In such a setting, efficient scheduling is crucial in order to fully utilize the expensive deep learning cluster and expedite the model training process. Different strategies have been used to schedule tasks in this arena. For example, general-purpose schedulers have been customized to tackle distributed deep learning tasks, e.g., [43] and [44]; however, they statically allocate resources and do not adjust them under different load conditions, which leads to poor resource utilization. Others propose dynamic allocation of resources after carefully analyzing the workloads, e.g., [45] and [46].

In this section, deep learning focused schedulers are surveyed.

In Peng et al. [ 46 ], the authors propose a customized job scheduler for deep learning clusters called Optimus. The goal of Optimus is to minimize the time required for deep learning training jobs, which are resource-intensive and time-consuming. Optimus employs performance models to precisely estimate training speed as a function of resource allocation and online fitting to anticipate model convergence during training. These models inform how Optimus dynamically organizes tasks and distributes resources to reduce job completion time. The authors put Optimus into practice on a deep learning cluster and evaluate its efficiency in comparison to other cluster schedulers. They discover that Optimus beats conventional schedulers in terms of job completion time and makespan by roughly 139% and 63%, respectively.
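To illustrate how a fitted performance model can drive allocation decisions in the spirit of Optimus, the sketch below assumes a diminishing-returns training-speed model per job and greedily hands workers to the job with the largest estimated marginal speed-up. Both the model form and the parameters are assumptions made for exposition, not the models fitted by Optimus.

```python
# Greedy worker allocation driven by an assumed per-job speed model (illustrative).
def speed(a, b, workers):
    """Assumed diminishing-returns model: speed = a * w / (w + b)."""
    return a * workers / (workers + b) if workers > 0 else 0.0

def allocate(jobs, total_workers):
    alloc = {name: 0 for name in jobs}
    for _ in range(total_workers):
        # Marginal speed gain of giving one more worker to each job.
        gains = {name: speed(*params, alloc[name] + 1) - speed(*params, alloc[name])
                 for name, params in jobs.items()}
        best = max(gains, key=gains.get)
        alloc[best] += 1
    return alloc

jobs = {"resnet-job": (100.0, 4.0), "lstm-job": (60.0, 2.0)}   # (a, b) per job, made up
print(allocate(jobs, total_workers=8))
```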

In Mao et al. [ 47 ], the authors propose using modern machine learning techniques to develop highly efficient policies for scheduling data processing jobs on distributed compute clusters. They present their system, called Decima, which uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms. Decima is designed to be scalable and able to handle complex job dependency graphs. The authors report that their prototype integration with Spark on a 25-node cluster improved average job completion time by at least 21% over existing hand-tuned scheduling heuristics, with up to 2 × improvement during periods of high cluster load.

In Chaudhary et al. [48], the authors present a distributed fair-share scheduler for GPU clusters used for deep learning training, termed Gandivafair. This GPU cluster scheduler offers performance isolation between users and is designed to strike a balance between the competing demands of fairness and efficiency. Despite cluster heterogeneity, Gandivafair is the first scheduler to fairly distribute GPU time among all active users. The authors demonstrate that Gandivafair delivers both fairness and efficiency under realistic multi-user workloads by evaluating it using a prototype implementation on a heterogeneous 200-GPU cluster.

In Fu et al. [ 49 ], the authors propose a new container placement scheme called ProCon for scheduling jobs in a Kubernetes cluster. ProCon uses an estimation of future resource usage to balance resource contentions across the cluster and reduce the completion time and makespan of jobs. The authors demonstrate through experiments that ProCon decreases completion time by up to 53.3% for a specific job and enhances general performance by 23.0%. In addition, ProCon shows a makespan improvement of up to 37.4% in comparison to Kubernetes' built-in default scheduler.

In Peng et al. [ 50 ], the authors propose DL2, a deep learning-based scheduler for deep learning clusters that aims to improve global training job expedition by dynamically resizing resources allocated to jobs. The authors implement DL2 on Kubernetes and evaluate its performance against a fairness scheduler and an expert heuristic scheduler. The results show that DL2 outperforms the other schedulers in terms of average job completion time.

In Mao et al. [ 51 ], the authors propose a new container scheduler called SpeCon optimized for short-lived deep learning applications. SpeCon is designed to improve resource utilization and job completion times in a Kubernetes cluster by analyzing the progress of deep learning training processes and speculatively migrating slow-growing models to release resources for faster-growing ones. The authors conduct experiments that demonstrate that SpeCon improves individual job completion times by up to 41.5%, improves system-wide performance by 14.8%, and reduces makespan by 24.7%.

In Huang et al. [ 52 ], for scheduling independent batch jobs across many federated cloud computing clusters, the authors suggest a deep reinforcement learning-based job scheduler dubbed RLSK. The authors put RLSK into use on Kubernetes and tested its performance through simulations, demonstrating that it can outperform conventional scheduling methods.

The work by Wang et al. [53] describes MLFS, a feature-based task scheduling system for machine learning clusters that can run both data-parallel and model-parallel jobs. To determine task priority for work queue ordering, MLFS uses a heuristic scheduling method. The data from this method is then used to train a deep reinforcement learning model for job scheduling. In comparison to existing schedulers, the proposed system is shown to reduce job completion time by up to 53%, makespan by up to 52%, and increase accuracy by up to 64%. The system is tested using real experiments and large-scale simulations based on real traces.

In Han et al. [ 54 ], the authors present KaiS, an edge-cloud Kubernetes scheduling framework based on learning. KaiS models system state data using graph neural networks and a coordinated multi-agent actor-critic method for decentralized request dispatch. Research indicates that when compared to baselines, KaiS can increase average system throughput rate by 14.3% and decrease scheduling cost by 34.7%.

In Casquero et al. [55], the authors propose a custom scheduler that distributes the Kubernetes orchestrator's scheduling task among processing nodes using a Multi-Agent System (MAS). According to the authors, this method is faster than the centralized scheduling approach employed by the default Kubernetes scheduler.

In Yang et al. [ 56 ], the authors propose a method for optimizing Kubernetes' container scheduling algorithm by combining the grey system theory with the LSTM (Long Short-Term Memory) neural network prediction method. They perform experiments to evaluate their approach and find that it can reduce the resource fragmentation problem of working nodes in the cluster and increase the utilization of cluster resources.

In Zhang et al. [57], the authors propose Zeus, a highly scalable cluster scheduling system for Kubernetes. The main feature of Zeus is that it schedules best-effort jobs based on actual server utilization. It has the ability to adaptively divide resources between workloads of two different classes, and is meant to enable the safe colocation of best-effort jobs and latency-sensitive services. The authors test Zeus in a real-world setting and discover that it can raise average CPU utilization from 15% to 60% without violating Service Level Objectives (SLOs).

In Liu et al. [58], the authors suggest a scheduling strategy for deep learning tasks on Kubernetes that takes into account the tasks' resource usage characteristics. To increase task execution efficiency and load balancing, the suggested paradigm, dubbed FBSM, includes modules for a GPU sniffer and a balance-aware scheduler. According to the authors' evaluation, the resulting system, known as KubFBS, accelerates the execution of deep learning tasks and improves the load balancing capabilities of the cluster.

In Rahali et al. [ 59 ], the authors propose a solution for resource allocation in a Kubernetes infrastructure hosting network service. The proposed solution aims to avoid resource shortages and protect the most critical functions. The authors use a statistical approach to model and solve the problem, given the random nature of the treated information.

The above-mentioned schemes are summarized in Table 3 .

Autoscaling-enabled scheduling

Autoscaling is an important feature in Kubernetes scheduling because it allows for automatic adjustment of the resources allocated to pods based on current demand. It enables efficient resource utilization, improved performance, cost savings, and high availability of the application. Autoscaling and scheduling are related in that autoscaling can be used to ensure that there are always enough resources available to handle the tasks that are scheduled. For example, if the scheduler assigns a new task to a worker node, but that node does not have enough resources to execute the task, the autoscaler can add more resources to that node or spin up a new node to handle the task. In this way, autoscaling and scheduling work together to ensure that a distributed system is able to handle changing workloads and optimize resource utilization. Some of the schemes related to this category are surveyed below.
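For reference, the reactive baseline that most of these works augment or replace is the Horizontal Pod Autoscaler's replica calculation, which scales the replica count in proportion to the ratio of the observed metric to its target (desired = ceil(current × observed / target)). The sketch below implements that rule; the 10% tolerance is an assumption based on the HPA's commonly cited default.

```python
# HPA-style replica calculation: desired = ceil(current * observed / target).
import math

def desired_replicas(current, observed_metric, target_metric, tolerance=0.1):
    ratio = observed_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:   # within tolerance: leave the replica count unchanged
        return current
    return max(1, math.ceil(current * ratio))

print(desired_replicas(current=5, observed_metric=90.0, target_metric=60.0))  # 8 (scale out)
print(desired_replicas(current=5, observed_metric=30.0, target_metric=60.0))  # 3 (scale in)
```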

In Taherizadeh et al. [60], the authors propose a new dynamic multi-level (DM) autoscaling method for container-based cloud applications. The DM method uses both infrastructure- and application-level monitoring data to determine when to scale up or down, and its thresholds are dynamically adjusted based on workload conditions. The authors compare the performance of the DM method to seven existing autoscaling methods using synthetic and real-world workloads. They find that the DM method has better overall performance than the other methods, particularly in terms of response time and the number of instantiated containers. The SWITCH system was used to implement the DM method for time-critical cloud applications.

In Rattihalli et al. [ 61 ], the authors propose a new resource management system called RUBAS that can dynamically adjust the allocation of containers running in a Kubernetes cluster. RUBAS incorporates container migration to improve upon the Kubernetes Vertical Pod Autoscaler (VPA) system non-disruptively. The authors evaluate RUBAS using multiple scientific benchmarks and compare its performance to Kubernetes VPA. They find that RUBAS improves CPU and memory utilization by 10% and reduces runtime by 15% with an overhead for each application ranging from 5–20%.

In Toka et al. [ 62 ], the authors present a Kubernetes scaling engine that uses machine learning forecast methods to make better autoscaling decisions for cloud-based applications. The engine's short-term evaluation loop allows it to adapt to changing request dynamics, and the authors introduce a compact management parameter for cloud tenants to easily set their desired level of resource over-provisioning vs. service level agreement (SLA) violations. The proposed engine is evaluated in simulations and with measurements on Web trace data, and the results show that it results in fewer lost requests and slightly more provisioned resources compared to the default Kubernetes baseline.

In Balla et al. [ 63 ], the authors propose an adaptive autoscaler called Libra, which automatically detects the optimal resource set for a single pod and manages the horizontal scaling process. Libra is also able to adapt the resource definition for the pod and adjust the horizontal scaling process if the load or underlying virtualized environment changes. The authors evaluate Libra in simulations and show that it can reduce the average CPU and memory utilization by up to 48% and 39%, respectively, compared to the default Kubernetes autoscaler.

In another work by Toka et al. [ 64 ], the authors propose a Kubernetes scaling engine that uses multiple AI-based forecast methods to make autoscaling decisions that are better suited to handle the variability of incoming requests. The authors also introduce a compact management parameter to help application providers easily set their desired resource over-provisioning and SLA violation trade-off. The proposed engine is evaluated in simulations and with measurements on web traces, showing improved fitting of provisioned resources to service demand.

In Wu et al., the authors propose a new active Kubernetes autoscaling mechanism based on the prediction of pod replicas. They demonstrate that their proposed autoscaler has a faster response speed compared to existing scaling strategies in Kubernetes.

In Wang et al. [ 65 ] the authors propose an improved automatic scaling scheme for Kubernetes that combines the advantages of different types of nodes in the scaling process. They found that their scheme improves the performance of the system under rapid load pressure and reduces instability within running clusters compared to the default auto scaler.

In Kang et al. [ 66 ], the authors propose a method for improving the reliability of virtual networks by using optimization models and heuristic algorithms to allocate virtual network functions (VNFs) to suitable locations. The authors also develop function scheduler plugins for the Kubernetes system, which allows for the automatic deployment and management of containerized applications. The proposed method is demonstrated to be effective in allocating functions and running service functions correctly. This work was published in the 2021 edition of the IEEE Conference on Decision and Control.

In Vu et al. [67], the authors propose a hybrid autoscaling method for containerized applications that combines vertical and horizontal scaling capabilities to optimize resource utilization and ensure quality of service (QoS) requirements. The proposed method uses a predictive approach based on machine learning to forecast future demand and a burst identification module to make scaling decisions. The authors evaluate the proposed method and find that it improves response time and resource utilization compared to existing methods that use only a single scaling mode.

The above-mentioned schemes are summarized in Table 4 .

Discussion, challenges & future suggestions

In the Literature review section, a comprehensive review covering four sub-categories of Kubernetes scheduling was presented. This section provides a brief discussion of that categorized literature, along with the associated challenges and suggestions for future work.

In the area of multi-objective optimization-based scheduling in Kubernetes, several research studies have been conducted to optimize various objectives, such as minimizing energy consumption and cost while maximizing resource utilization and meeting application performance requirements. These studies employ different optimization techniques such as genetic algorithms, particle swarm optimization, and ant colony optimization. Some studies also incorporate machine learning-based approaches to predict workload patterns and make scheduling decisions. However, several challenges still need to be addressed. First, the multi-objective nature of the problem poses a significant challenge in finding optimal solutions that balance conflicting objectives. Second, the dynamic nature of the cloud environment requires real-time adaptation of scheduling decisions to changing conditions. Overall, the research in multi-objective optimization-based scheduling in Kubernetes shows great potential for achieving efficient and effective resource management, but further work is needed to address these challenges and to validate the effectiveness of the proposed approaches in real-world scenarios.

On the other hand, AI-based scheduling in Kubernetes has been a popular area of research in recent years. Many studies have proposed different approaches to optimize scheduling decisions using machine learning and other AI techniques. One of the key accomplishments in this area is the development of scheduling algorithms that can handle complex workloads in a dynamic environment. These algorithms can consider various factors, such as resource availability, task dependencies, and application requirements, to make optimal scheduling decisions. Some studies have proposed reinforcement learning-based scheduling algorithms, which can adapt to changing workload patterns and learn from experience to improve scheduling decisions. Other studies have proposed deep learning-based approaches, which can capture complex patterns in the workload data and make accurate predictions. Overall, these studies have demonstrated that AI-based scheduling can improve the efficiency and performance of Kubernetes clusters. However, there are still some challenges that need to be addressed in this area. One of the main challenges is the lack of real-world datasets for training and evaluation of AI-based scheduling algorithms. Most studies use synthetic or simulated datasets, which may not reflect the complexities of real-world workloads. Another challenge is the trade-off between accuracy and computational complexity. Future research in this area could focus on developing more efficient and scalable AI-based scheduling algorithms that can handle large-scale, real-world workloads. This could involve exploring new machine learning and optimization techniques that can improve scheduling accuracy while reducing computational complexity.

Lastly, autoscaling-enabled scheduling is an emerging research area that aims to optimize resource utilization and improve application performance by combining autoscaling and scheduling techniques. Several research studies have been published in this area in recent years. The analysis of these studies reveals that autoscaling-enabled scheduling can lead to significant improvements in resource utilization and application performance: it can help reduce resource wastage, minimize the risk of under-provisioning, and improve application response times. However, despite these promising results, there are still some challenges that need to be addressed. One of the main challenges is the complexity of designing effective autoscaling-enabled scheduling algorithms; developing algorithms that can adapt to dynamic workload changes and optimize resource utilization while maintaining application performance is a non-trivial task. Furthermore, there is a need for more research on the practical implementation of autoscaling-enabled scheduling in real-world scenarios, since most of the existing studies have been conducted in controlled experimental settings. Future research in this area should focus on addressing these challenges, including algorithm design, standardization, and practical deployment, and on developing more effective and practical autoscaling-enabled scheduling techniques.

The surveyed papers use diverse algorithms and custom schedulers to enhance Kubernetes scheduling, including ProCon, DL2, DRF, Optimus, CBP, PP, RLSK, KubeHICE, BDI, BCDI, and Zeus. These are tested on a wide range of platforms and environments, such as Spark, MXNet, Kubernetes clusters (including failover and recovery scenarios), GPU clusters at Google and TwoSigma, Google compute traces, heterogeneous CPU-GPU clusters, the Australian National Cloud Infrastructure, CloudSim simulations implemented in Java, GaiaGPU clusters at Tencent, real workload and web traces, TensorFlow jobs, default autoscalers, video streaming workloads, and latency-sensitive services. Some papers do not specify the details of the algorithms they use or the platforms and environments on which they are evaluated.

As can be seen in the previous sections, the survey extensively analyzes the current literature, and composes a taxonomy to not only effectively analyze the current state-of-the-art but also identify the challenges and future directions. Based on the analysis, the following areas have been identified as potential future research in the field:

As Kubernetes becomes more popular, there will be a growing need for advanced computation optimization techniques. In the future, Kubernetes may benefit from the development of more sophisticated algorithms for workload scheduling and resource allocation, potentially using AI or machine learning. Additionally, integrating Kubernetes with emerging technologies like serverless computing could lead to even more efficient resource usage by enabling dynamic scaling without pre-provisioned infrastructure. Ultimately, the future of computation optimization in Kubernetes is likely to involve a combination of cutting-edge algorithms, innovative technologies, and ongoing advancements in cloud computing.

Testing and implementation to reveal limitations of current learning algorithms for scheduling, and potential improvements on large-scale clusters. One important focus is on improving the tooling and automation around testing and deployment, including the development of new testing frameworks and the integration of existing tools into the Kubernetes ecosystem. Another key area is the ongoing refinement of Kubernetes' implementation and development process, with a focus on streamlining workflows, improving documentation, and fostering greater collaboration within the open-source community. Additionally, there is a growing emphasis on developing more comprehensive testing and validation strategies for Kubernetes clusters, including the use of advanced techniques like chaos engineering to simulate real-world failure scenarios. Overall, the future of testing and implementation in Kubernetes is likely to involve ongoing innovation, collaboration, and a continued commitment to driving the platform forward.

A number of methods employ learning algorithms for resource balancing inside and outside the cluster. Even though these methods have given encouraging results, new learning algorithms could be developed to improve the scheduler, especially on large-scale clusters.

Limitations and potential improvements in specific contexts, e.g., Green Computing. Minimizing the carbon footprint of a cluster is an ongoing challenge. Advanced schedulers need to be proposed in order to reduce the energy consumption and carbon footprint of clusters in IIoT setups. There is a huge opportunity for improving the existing methods and proposing new methods in this area.

Future research in Kubernetes resource management. Kubernetes resource management mostly relies on optimization modelling frameworks and heuristic-based algorithms, so the potential for improving and proposing new resource management algorithms makes this a very promising area of research. Future research in Kubernetes resource management may focus on addressing the challenges of managing complex, dynamic workloads across distributed, heterogeneous environments. This may involve developing more sophisticated algorithms and techniques for workload placement, resource allocation, and load balancing, as well as exploring new approaches to containerization and virtualization. Additionally, there may be opportunities to leverage emerging technologies like edge computing and 5G networks to enable more efficient and scalable resource management in Kubernetes.

Most of the work done in the area of Kubernetes scheduling has been evaluated on small clusters. However, results obtained on small clusters might not generalize to larger deployments. One future research direction in Kubernetes scheduling is therefore to use larger cluster sizes for algorithm evaluation. While Kubernetes has been shown to be effective in managing clusters of up to several thousand nodes, there is a need to evaluate its performance in even larger cluster sizes. This includes evaluating the scalability of the Kubernetes scheduler, identifying potential bottlenecks, and proposing solutions to address them. Additionally, there is a need to evaluate the impact of larger cluster sizes on application performance and resource utilization. This research could lead to the development of more efficient scheduling algorithms and better management strategies for large-scale Kubernetes deployments.

Scheduling should not only be considered from the static infrastructure point of view, but rather advanced context-aware scheduling algorithms may be proposed that could focus on developing new approaches to resource allocation and scheduling that take into account a broader range of contextual factors, such as user preferences, application dependencies, and environmental conditions. This may involve exploring new machine learning techniques and optimization algorithms that can dynamically adapt to changing conditions and prioritize resources based on real-time feedback and analysis. Other potential areas of research may include developing new models and frameworks for managing resources in Kubernetes clusters, improving container orchestration and load balancing, and enhancing monitoring and analytics capabilities to enable more effective use of context-aware scheduling algorithms.

As can be seen from the diversity of future directions, the potential for new research in Kubernetes is ripe with challenges of myriad levels of difficulty and effort. It provides future researchers with exciting opportunities to pursue and problems to tackle. We hope that this survey will facilitate future researchers in selecting a suitable challenge and solve new problems to expand the state-of-the-art in the area of Kubernetes.

Conclusions

In conclusion, the survey on Kubernetes scheduling provides a comprehensive overview of the current state of the field. It covers the objectives, methodologies, algorithms, experiments, and results of various research efforts in this area. The survey highlights the importance of scheduling in Kubernetes and the need for efficient and effective scheduling algorithms. The results of the experiments show that there is still room for improvement in this area, and future work should focus on developing new algorithms and improving existing ones. Overall, the survey provides valuable insight into the current state of Kubernetes scheduling and points to promising directions for future research.

Availability of data and materials

The supporting data are available from the corresponding author on request.


Acknowledgements

The author(s) received no financial support for the research and publication of this article.

Author information

Authors and affiliations

Department of Computer Science, College of Computing and Informatics, University of Sharjah, Sharjah, UAE

Khaldoun Senjab, Sohail Abbas & Naveed Ahmed

College of Engineering and Information Technology, Ajman University, Ajman, UAE

Atta ur Rehman Khan


Contributions

The research was supervised by Sohail Abbas and Naveed Ahmed. Data collection, material preparation, and analysis were performed by Khaldoun Senjab. Conceptualization and revisions were carried out by Sohail Abbas, Naveed Ahmed, and Atta ur Rehman Khan. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Sohail Abbas.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

The authors provide consent for publication.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Senjab, K., Abbas, S., Ahmed, N. et al. A survey of Kubernetes scheduling algorithms. J Cloud Comp 12, 87 (2023). https://doi.org/10.1186/s13677-023-00471-1


Received: 27 January 2023

Accepted: 01 June 2023

Published: 13 June 2023

DOI: https://doi.org/10.1186/s13677-023-00471-1


Keywords

  • Cloud services
  • Data center infrastructure
  • Resource optimization
  • Containerized applications
  • Container orchestration
  • Scheduling algorithm

cloud computing research papers 2023

cloud computing research papers 2023

Advances, Systems and Applications

Volumes and issues

Volume 13 december 2024.

  • December 2024, issue 1

Volume 12 December 2023

  • December 2023, issue 1

Volume 11 December 2022

  • December 2022, issue 1

Volume 10 December 2021

  • December 2021, issue 1

Volume 9 December 2020

  • December 2020, issue 1

Volume 8 December 2019

  • December 2019, issue 1

Volume 7 December 2018

  • December 2018, issue 1

Volume 6 December 2017

  • December 2017, issue 1

Volume 5 December 2016

  • December 2016, issue 1

Volume 4 December 2015

  • December 2015, issue 1

Volume 3 December 2014

  • December 2014, issue 1

Volume 2 December 2013

  • December 2013, issue 1

Volume 1 December 2012

  • December 2012, issue 1

For authors

  • Find a journal
  • Publish with us
  • Track your research

Subscribe to the PwC Newsletter

Join the community, add a new evaluation result row, cloud computing.

83 papers with code • 0 benchmarks • 0 datasets

Benchmarks Add a Result

Latest papers, privacy-preserving deep learning using deformable operators for secure task learning.

cloud computing research papers 2023

To address these challenges, we propose a novel Privacy-Preserving framework that uses a set of deformable operators for secure task learning.

IMPaCT: Interval MDP Parallel Construction for Controller Synthesis of Large-Scale Stochastic Systems

kiguli/impact • 7 Jan 2024

This paper is concerned with developing a software tool, called IMPaCT, for the parallelized verification and controller synthesis of large-scale stochastic systems using interval Markov chains (IMCs) and interval Markov decision processes (IMDPs), respectively.

LiPar: A Lightweight Parallel Learning Model for Practical In-Vehicle Network Intrusion Detection

Through experiments, we prove that LiPar has great detection performance, running efficiency, and lightweight model size, which can be well adapted to the in-vehicle environment practically and protect the in-vehicle CAN bus security.

CloudEval-YAML: A Practical Benchmark for Cloud Configuration Generation

alibaba/cloudeval-yaml • 10 Nov 2023

We develop the CloudEval-YAML benchmark with practicality in mind: the dataset consists of hand-written problems with unit tests targeting practical scenarios.

Deep learning based Image Compression for Microscopy Images: An Empirical Study

In the end, we hope the present study could shed light on the potential of deep learning based image compression and the impact of image compression on downstream deep learning based image analysis models.

MLatom 3: Platform for machine learning-enhanced computational chemistry simulations and workflows

dralgroup/mlatom • 31 Oct 2023

MLatom 3 is a program package designed to leverage the power of ML to enhance typical computational chemistry simulations and to create complex workflows.

Federated learning compression designed for lightweight communications

Federated Learning (FL) is a promising distributed method for edge-level machine learning, particularly for privacysensitive applications such as those in military and medical domains, where client data cannot be shared or transferred to a cloud computing server.

Recoverable Privacy-Preserving Image Classification through Noise-like Adversarial Examples

Extensive experiments demonstrate that 1) the classification accuracy of the classifier trained in the plaintext domain remains the same in both the ciphertext and plaintext domains; 2) the encrypted images can be recovered into their original form with an average PSNR of up to 51+ dB for the SVHN dataset and 48+ dB for the VGGFace2 dataset; 3) our system exhibits satisfactory generalization capability on the encryption, decryption and classification tasks across datasets that are different from the training one; and 4) a high-level of security is achieved against three potential threat models.

FastAiAlloc: A real-time multi-resources allocation framework proposal based on predictive model and multiple optimization strategies

marcosd3souza/fast-ai-alloc • Future Generation Computer Systems 2023

To integrate these steps, this work proposes a framework based on the following strategies, widely used in the literature: Genetic Algorithms (GA), Particle Swarm Optimization (PSO) and Linear Programming, besides our Heuristic approach.

Technical note: ShinyAnimalCV: open-source cloud-based web application for object detection, segmentation, and three-dimensional visualization of animals using computer vision

Moreover, the rapid expansion of precision livestock farming is creating a growing need to educate and train animal science students in CV.

cloud computing research papers 2023

The Fourteenth International Conference on Cloud Computing, GRIDs, and Virtualization

Cloud computing 2023, june 26, 2023 to june 30, 2023 - nice, saint-laurent-du-var, france.

Deadlines differ for special tracks. Please consult the conference home page for special tracks Call for Papers (if any).

cloud computing research papers 2023

CLOUD COMPUTING 2023 - The Fourteenth International Conference on Cloud Computing, GRIDs, and Virtualization

June 26, 2023 - June 30, 2023

Cloud computing is a normal evolution of distributed computing combined with Service-oriented architecture, leveraging most of the GRID features and Virtualization merits. The technology foundations for cloud computing led to a new approach of reusing what was achieved in GRID computing with support from virtualization.

CLOUD COMPUTING 2023 is intended as an event to prospect the applications supported by the new paradigm and validate the techniques and the mechanisms. A complementary target is to identify the open issues and the challenges to fix them, especially on security, privacy, and inter- and intra-clouds protocols.

We solicit both academic, research, and industrial contributions. We welcome technical papers presenting research and practical results, position papers addressing the pros and cons of specific proposals, such as those being discussed in the standard fora or in industry consortia, survey papers addressing the key problems and solutions on any of the above topics short papers on work in progress, and panel proposals.

Industrial presentations are not subject to the format and content constraints of regular submissions. We expect short and long presentations that express industrial position and status.

Tutorials on specific related topics and panels on challenging areas are encouraged.

The topics suggested by the conference can be discussed in term of concepts, state of the art, research, standards, implementations, running experiments, applications, and industrial case studies. Authors are invited to submit complete unpublished papers, which are not under review in any other conference or journal in the following, but not limited to, topic areas.

All topics and submission formats are open to both research and industry contributions.

CLOUD COMPUTING 2023 conference tracks:

TRENDS: New trends

Fog-computing; Mobile Edge Computing; Cloudlets; Hosted Cloud services (WebRTC, Containers, Cloud micro-services); Cloud computing and SDN/NFV; Cloud computing and 5G; Cloud computing and LTE Pro 4.5; Cloud computing ad Big Data; High performance computing (HPC) in the Cloud; Superfluid Clouds; Mobile Apps to the public Clouds; Vehicular Cloud networks; Cloud orchestration features; Converged edge systems; Cloud federation; Micro-cloud provider federation; Open-implementation Cloud infrastructures; Untrusted Cloud environments; Multiple Clouds and data centers; Power Constrained VMs; Cloud Green abstraction layer; Managing applications in the clouds (CloudOps)

CLOUD: Cloud computing

Cloud economics; Core cloud services; Cloud technologies; Cloud computing; On-demand computing models; Hardware-as-a-service; Software-as-a-service [SaaS applications]; Platform-as-service; Storage as a service in cloud; Data-as-a-Service; Service-oriented architecture (SOA); Cloud computing programming and application development; Scalability, discovery of services and data in Cloud computing infrastructures; Trust and clouds; Client-cloud computing challenges; Geographical constraints for deploying clouds

CLOUD: Challenging features

Privacy, security, ownership and reliability issues; Performance and QoS; Dynamic resource provisioning; Power-efficiency and Cloud computing; Load balancing; Application streaming; Cloud SLAs; Business models and pricing policies; Cloud service subscription models; Cloud standardized SLA; Cloud-related privacy; Cloud-related control; Managing applications in the clouds; Mobile clouds; Roaming services in Clouds; Agent-based cloud computing; Cloud brokering; Cloud contracts (machine readable); Cloud security; Security and assurance properties in cloud environments; Big Data Analytics in clouds; Cloud computing back-end solutions; Cloud applications portability; Cloud-native application design; Security by design for cloud services; Data privacy guarantee at run-time

CLOUD: Platforms, Infrastructures and Applications

Custom platforms; Large-scale compute infrastructures; Data centers; Processes intra- and inter-clouds; Content and service distribution in Cloud computing infrastructures; Multiple applications can run on one computer (virtualization a la VMWare); Grid computing (multiple computers can be used to run one application); Cloud-computing vendor governance and regulatory compliance; Enterprise clouds; Enterprise-centric cloud computing; Interaction between vertical clouds; Public, Private, and Hybrid clouds; Cloud computing testbeds

GRID: Grid networks, services and applications

GRID theory, frameworks, methodologies, architecture, ontology; GRID infrastructure and technologies; GRID middleware; GRID protocols and networking; GRID computing, utility computing, autonomic computing, metacomputing; Programmable GRID; Data GRID; Context ontology and management in GRIDs; Distributed decisions in GRID networks; GRID services and applications; Virtualization, modeling, and metadata in GRID; Resource management, scheduling, and scalability in GRID; GRID monitoring, control, and management; Traffic and load balancing in GRID; User profiles and priorities in GRID; Performance and security in GRID systems; Fault tolerance, resilience, survivability, robustness in GRID; QoS/SLA in GRID networks; GRID fora, standards, development, evolution; GRID case studies, validation testbeds, prototypes, and lessons learned

VIRTUALIZATION: Computing in virtualization-based environments

Principles of virtualization; Virtualization platforms; Thick and thin clients; Data centers and nano-centers; Open virtualization format; Orchestration of virtualization across data centers; Dynamic federation of compute capacity; Dynamic geo-balancing; Instant workload migration; Virtualization-aware storage; Virtualization-aware networking; Virtualization embedded-software-based smart mobile phones; Trusted platforms and embedded supervisors for security; Virtualization management operations /discovery, configuration, provisioning, performance, etc.; Energy optimization and saving for green datacenters; Virtualization supporting cloud computing; Applications as pre-packaged virtual machines; Licensing and support policies

INSTRUCTION FOR THE AUTHORS

Authors of selected papers will be invited to submit extended versions to one of the IARIA Journals .

Publisher: XPS (Xpert Publishing Services) Archived: ThinkMind TM Digital Library (free access) Prints available at Curran Associates, Inc. How to submit to appropriate indexes .

Only .pdf or .doc files will be accepted for paper submission. All received submissions will be acknowledged via an automated system.

Contribution types

  • regular papers [in the proceedings, digital library]
  • short papers (work in progress) [in the proceedings, digital library]
  • ideas: two pages [in the proceedings, digital library]
  • extended abstracts: two pages [in the proceedings, digital library]
  • posters: two pages [in the proceedings, digital library]
  • posters: slide only [slide-deck posted on www.iaria.org ]
  • presentations: slide only [slide-deck posted on www.iaria.org ]
  • demos: two pages [posted on www.iaria.org ]

Final author manuscripts will be 8.5" x 11", not exceeding 6 pages; max 4 extra pages allowed at additional cost. Helpful information for paper formatting can be found here . Latex templates are also available.

Slides-based contributions can use the corporate/university format and style.

Your paper should also comply with the additional editorial rules .

Once you receive the notification of contribution acceptance, you will be provided by the publisher an online author kit with all the steps an author needs to follow to submit the final version. The author kits URL will be included in the letter of acceptance.

We would recommend that you should not use too many extra pages, even if you can afford the extra fees. No more than 2 contributions per event are recommended, as each contribution must be separately registered and paid for. At least one author of each accepted paper must register to ensure that the paper will be included in the conference proceedings and in the digital library, or posted on the www.iaria.org (for slide-based contributions).

CONTRIBUTION TYPE

Regular Papers (up to 6-10 page article -6 pages covered the by regular registration; max 4 extra pages allowed at additional cost- ) (oral presentation) These contributions could be academic or industrial research, survey, white, implementation-oriented, architecture-oriented, white papers, etc. They will be included in the proceedings, posted in the free-access ThinkMind digital library and sent for indexing. Please submit the contributions following the instructions for the regular submissions using the "Submit a Paper" button and selecting the appropriate contribution type. 12-14 presentation slides are suggested.

Short papers (work in progress) (up to 4 pages long) (oral presentation) Work-in-progress contributions are welcome. These contributions represent partial achievements of longer-term projects. They could be academic or industrial research, survey, white, implementation-oriented, architecture-oriented, white papers, etc. Please submit the contributions following the instructions for the regular submissions using the "Submit a Paper" button and selecting the contribution type as work in progress. Contributors must follow the conference deadlines, describing early research and novel skeleton ideas in the areas of the conference topics. The work will be published in the conference proceedings, posted in the free-access ThinkMind digital library and sent for indexing. For more details, see the Work in Progress explanation page. 12-14 presentation slides are suggested.

Ideas contributions (2 pages long) (oral presentation) This category is dedicated to new ideas in their very early stage. Idea contributions are expression of yet to be developed approaches, with pros/cons, not yet consolidated. Ideas contributions are intended for a debate and audience feedback. Please submit the contributions following the instructions for the regular submissions using the "Submit a Paper" button and selecting the contribution type as Idea. Contributors must follow the conference deadlines, describing early research and novel skeleton ideas in the areas of the conference topics. The work will be published in the conference proceedings, posted in the free-access ThinkMind digital library and sent for indexing. For more details, see the Ideas explanation page. 12-14 presentation slides are suggested.

Extended abstracts (2 pages long) (oral presentation) Extended abstracts summarize a long potential publication with noticeable results. It is intended for sharing yet to be written, or further on intended for a journal publication. Please submit the contributions following the instructions for the regular submissions using the "Submit a Paper" button and selecting the contribution type as Extended abstract. Contributors must follow the conference deadlines, describing early research and novel skeleton ideas in the areas of the conference topics. The work will be published in the conference proceedings, posted in the free-access ThinkMind digital library and sent for indexing. 12-14 presentation slides are suggested.

Posters (paper-based, two pages long) (oral presentation) Posters are intended for ongoing research projects, concrete realizations, or industrial applications/projects presentations. The poster may be presented during sessions reserved for posters, or mixed with presentation of articles of similar topic. A two-page paper summarizes a presentation intended to be a POSTER. This allows an author to summarize a series of results and expose them via a big number of figures, graphics and tables. Please submit the contributions following the instructions for the regular submissions using the "Submit a Paper" button and selecting the contribution type as Poster Two Pages. Contributors must follow the conference deadlines, describing early research and novel skeleton ideas in the areas of the conference topics. The work will be published in the conference proceedings, posted in the free-access ThinkMind digital library and sent for indexing. 8-10 presentation slides are suggested. Also a big Poster is suitable, used for live discussions with the attendees, in addition to the oral presentation.

Posters (slide-based, only) (oral presentation) Posters are intended for ongoing research projects, concrete realizations, or industrial applications/projects presentations. The poster may be presented during sessions reserved for posters, or mixed with presentation of articles of similar topic. The slides must have comprehensive comments. This type of contribution only requires a 8-10 slide-deck. Please submit the contributions following the instructions for the regular submissions using the "Submit a Paper" button and selecting the contribution type as Poster (slide-only). The slide-deck will be posted, post-event, on www.iaria.org . 8-10 presentation slides are suggested. Also a big Poster is suitable, used for live discussions with the attendees, additionally to the oral presentation.

Presentations (slide-based, only) (oral presentation) These contributions represent technical marketing/industrial/business/positioning presentations. This type of contribution only requires a 12-14 slide-deck. Please submit the contributions following the submission instructions by using the "Submit a Paper" button and selecting the contribution type as Presentation (slide-only). The slide-deck will be posted, post-event, on www.iaria.org . 12-14 presentation slides are suggested.

Demos (two pages) [posted on www.iaria.org ] Demos represent special contributions where a tool, an implementation of an application, or a freshly implemented system is presented in its alfa/beta version. It might also be intended for thsoe new application to gather the attendee opinion. A two-page summary for a demo is intended to be. It would be scheduled in special time spots, to ensure a maximum attendance from the participants. Please submit the contributions following the submission instructions by using the "Submit a Paper" button and selecting the contribution type as Demos. The Demos paper will be posted, post-event, on www.iaria.org .

Tutorial proposals Tutorials provide overviews of current high interest topics. Proposals should be for 2-3 hour long. Proposals must contain the title, the summary of the content, and the biography of the presenter(s). The tutorial slide decks will be posted on the IARIA site. Please send your proposals to tutorial proposal

Panel proposals The organizers encourage scientists and industry leaders to organize dedicated panels dealing with controversial and challenging topics and paradigms. Panel moderators are asked to identify their guests and manage that their appropriate talk supports timely reach our deadlines. Moderators must specifically submit an official proposal, indicating their background, panelist names, their affiliation, the topic of the panel, as well as short biographies. The panel slide deck will be posted on the IARIA site. Please send your proposals to panel proposal

cloud computing research papers 2023

Copyright (c) 2006-2024, IARIA

Cloud computing study 2023

cloud computing research papers 2023

  • cloud computing

In its 10th year, Foundry’s 2023 Cloud Computing research was conducted to measure cloud computing trends among technology decision-makers, including adoption plans, spending, business drivers, challenges, and top cloud growth areas, such as AI. While cloud adoption is continuing at pace, there are signs that the frenzied activity that took place during and following the pandemic period is easing somewhat. This year’s study found that 57% of organizations have accelerated their cloud migration over the past 12 months, however this was 69% last year.   

Organizations recognize the value of cloud computing, as half of IT decision-makers (ITDMs) state that cloud capabilities have helped their organization achieve sustainable revenue growth over the past 12 months. However, adoption and migration do not come without challenges that cloud providers must address. When asked about the biggest obstacles to implementing their cloud strategy, the top three stated by ITDMs are controlling cloud costs, data privacy and security challenges, and lack of cloud security skills/expertise.

Our study explores these challenges in more depth and also highlights what technology buyers need from their current and future cloud providers in order to successfully advance their cloud strategy.

Key takeaways for tech marketers:

  • Cloud budgets continue to increase – IT decision-makers report that 31% of their overall IT budget will go towards cloud computing and two-thirds expect their cloud budget to increase in the next 12 months.  
  • Organizations are defaulting to cloud-based services when upgrading or purchasing new technical capabilities. It’s important to have a grasp on what business objectives are driving these cloud investments.
  • IT decision-makers have plans for artificial intelligence and cloud computing, as the majority say AI/ML is the top cloud growth area this year and it is the number one cloud capability ITDMs plan to adopt.
  • Due to an increase in cloud investments, organizations have added new cloud roles and functions. Understand their business needs and responsibilities when creating your messaging.
  • Despite the benefits organizations see from the cloud, a variety of challenges still get in their way, mostly around cost control, security expertise and a skills gap. Provide solutions to your customers and prospects to combat these challenges.

View the sample slides below for further insights on the cloud computing research:

cloud computing research papers 2023

Click to enlarge

Additional cloud resources.

cloud computing research papers 2023

About the research

Foundry’s 2023 Cloud Computing Survey is the 10th year of this research and was conducted to measure cloud computing trends among technology decision-makers including: adoption plans, spending, business drivers, challenges, and top cloud growth areas, such as AI. The study was fielded throughout August 2023 and is based on the responses of 893 global IT decision-makers that are involved in the purchase process for cloud computing and their organization has, or plans to have, at least one application, or a portion of their infrastructure, in the cloud.

cloud computing research papers 2023

Security priorities: a look ahead

cloud computing research papers 2023

Inside the C-suite: CIOs share how the role is evolving

cloud computing research papers 2023

On the road: Richard O’Connor, B2B Marketing

Related content.

cloud computing research papers 2023

Role & Influence of the Technology Decision-Maker white paper 2023

Insights to help tech vendors understand the decisions made around their products and services as they plan their strategies for the future and gain an advantage in this competitive landscape.

By Foundry • Research topic

Jun 15, 2023 • 10 mins

Cloud_study_2023

Cloud computing executive summary 2023

Insight into cloud computing trends There has been a massive shift to moving IT environments to the cloud over the past three years as organizations navigated a remote-first world due to the pandemic. According to…

Sep 14, 2023 • 1 mins

State-of-the-CIO-2024

State of the CIO Study 2024

Foundry’s 23rd annual State of the CIO research details how the role of the CIO continues to evolve and elevate in today’s business climate and defines the CIO agenda for 2024.

Jan 24, 2024 • 5 mins

CCGRID 2023

Accepted Papers

At ccgrid 2023 & co-located workshops.

Last updated on 21 April, 2023

Full Research Papers

Hardware systems and networking track.

  • hsSpMV : A Heterogeneous and SPM-aggregated SpMV for SW26010-Pro many-core processor , Jingshan Pan, Lei Xiao, Min Tian, Li Wang, Chaochao Yang, Renjiang Chen, Zenghui Ren, Anjun Liu and Guanghui Zhu
  • Rethinking Design Paradigm of Graph Processing System with a CXL-like Memory Semantic Fabric, Xu Zhang, Yisong Chang, Tianyue Lu, Ke Zhang and Mingyu Chen
  • RoUD: Scalable RDMA over UD in Lossy Data Center Networks, Zhiqiang He and Yuxin Chen
  • An Optical Transceiver Reliability Study based on SFP Monitoring and OS-level Metric Data, Paolo Notaro, Soroush Haeri, Qiao Yu, Jorge Cardoso and Michael Gerndt ★  

Software Systems and Platforms Track

  • Runway: In-transit Data Compression on Heterogeneous HPC Systems, John Ravi, Michela Becchi and Suren Byna
  • LayerCake: Efficient Inference Serving with Cloud and Mobile Resources, Samuel Ogden and Tian Guo
  • An Asynchronous Dataflow-Driven Execution Model For Distributed Accelerator Computing, Philip Salzmann, Fabian Knorr, Peter Thoman, Philipp Gschwandtner, Biagio Cosenza and Thomas Fahringer
  • Predicting the Performance-Cost Trade-off of Applications Across Multiple Systems, Amir Nassereldine, Safaa Diab, Mohammed Baydoun, Kenneth Leach, Maxim Alt, Dejan Milojicic and Izzat El Hajj
  • KalpaVriksh: Efficient and Cost-effective GUI Application Hosting using Singleton Snapshots, Sumaiya Shaikh, Saurabh Kumar and Debadatta Mishra
  • Designing and Optimizing a GPU-aware MPI Library for Intel GPUs: Early Experiences, Chen-Chun Chen, Kawthar Shafie Khorassani, Goutham Kalikrishna Reddy Kuncham, Rahul Vaidya, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni and Dhabaleswar K. Panda
  • A Case Study of Data Management Challenges Presented in Large-Scale Machine Learning Workflows, Claire Lee, Wei-Keng Liao, V Hewes, Giuseppe Cerati, Jim Kowalkowski, Adam Aurisano, Alok Choudhary and Ankit Agrawal
  • HeROfake: Heterogeneous Resources Orchestration in a Serverless Cloud: An Application to Deepfake Detection, Vincent Lannurien, Laurent D’Orazio, Olivier Barais, Esther Bernard, Olivier Weppe, Laurent Beaulieu, Amine Kacete, Stéphane Paquelet and Jalil Boukhobza

ML for Systems, and Systems for ML Track

  • FreeTrain: A Framework to Utilize Unused Supercomputer Nodes for Training Neural Networks, Zhengchun Liu, Rajkumar Kettimuthu, Michael Papka and Ian Foster
  • Overcoming Noisy Labels in Federated Learning Through Local Self-Guiding, Daokuan Bai, Shanshan Wang, Wenyue Wang, Chuan Zhao and Zhenxiang Chen
  • Measuring the Impact of Gradient Accumulation on Cloud-based Distributed Training, Zimeng Huang, Bo Jiang, Tian Guo and Yunzhuo Liu
  • HDFL: A Heterogeneity and Client Dropout-Aware Federated Learning Framework, Syed Zawad, Ali Anwar, Yi Zhou, Nathalie Baracaldo and Feng Yan
  • Scavenger: A Cloud Service For Optimizing Cost and Performance of ML Training, Sahil Tyagi and Prateek Sharma
  • A Deep Learning Pipeline Parallel Optimization Method, Tiantian Lv, Lu Wu, Zhigang Zhao, Chunxiao Wang and Chuantao Li
  • Chronica: A Data-Imbalance-Aware Scheduler for Data-Parallel Distributed Deep Learning, Sanha Maeng, Euhyun Moon and Sungyong Park
  • ScaMP: Scalable Meta-Parallelism for Deep Learning Search, Quentin Anthony, Lang Xu, Aamir Shafi, Hari Subramoni and Dhabaleswar Panda

Future Compute Continuum Track

  • Bottleneck identification and failure prevention with procedural learning in 5G RAN, Tobias Sundqvist, Monowar Bhuyan and Erik Elmroth
  • Soft Reliability Aware Scheduling of Real-time Applications on Cloud with MTTF constraints, Manojit Ghose, Aryabartta Sahu, Krishna Prabin Pandey and Niyati Chaudhari
  • AggFirstJoin: Optimizing Geo-Distributed Joins using Aggregation-Based Transformations, Dhruv Kumar, Sohaib Ahmad, Abhishek Chandra and Ramesh K. Sitaraman   ★
  • CacheIn: A Secure Distributed Multi-layer Mobility-Assisted Edge Intelligence based Caching for Internet of Vehicles, Ankur Nahar, Himani Sikarwar, Sanyam Jain and Debasis Das
  • PrivFlow: Secure and Privacy Preserving Serverless Workflows on Cloud, Surabhi Garg, Meena Singh Dilip Thakur, Dr. Rajan M A, Lakshmipadmaja Maddali and Ramachandran Vigneswaran

Applications and Translation Track

  • Use of Cost Surface Analysis and Stream Order Analysis for Computing Shortest Paths, Yogesh Dasgaonkar
  • WiDual: User Identified Gesture Recognition Using Commercial WiFi, Miaoling Dai, Chenhong Cao, Tong Liu, Meijia Su, Yufeng Li and Jiangtao Li
  • Enabling Fast, Effective Visualization of Voluminous Gridded Spatial Datasets, Paahuni Khandelwal, Menuka Warushavithana, Sangmi Lee Pallickara and Shrideep Pallickara
  • Blockchain Proportional Governance Reconfiguration: Mitigating a Governance Oligarchy, Deepal Tennakoon and Vincent Gramoli
  • Serverless Approach to Sensitivity Analysis of Computational Models, Piotr Kica, Magdalena Otta, Krzysztof Czechowicz, Karol Zając, Piotr Nowakowski, Andrew Narracott, Ian Halliday and Maciej Malawski
  • Scheduling DNN Inferencing on Edge and Cloud for Personalized UAV Fleets, Suman Raj, Harshil Gupta and Yogesh Simmhan
  • CrossLedger: A Pioneer Cross-chain Asset Transfer Protocol, Lokendra Vishwakarma, Amritesh Kumar and Debasis Das
  • Towards Improving Reverse Time Migration Performance by High-speed Lossy Compression, Yafan Huang, Kai Zhao, Sheng Di, Guanpeng Li, Maxim Dmitriev, Thierry-Laurent D. Tonellot and Franck Cappello
  • Mixed Precision Based Parallel Optimization of Tensor Mathematical Operations on a New-generation Sunway Processor, Shuwei Fan, Yao Liu, Juliang Su, Xianyou Wu and Qiong Jiang
  • Congestion Minimization using Fog-deployed DRL-Agent Feedback enabled Traffic Light Cooperative Framework, Anuj Sachan, Nisha Singh Chauhan and Neetesh Kumar
  • Accelerating Hybrid DFT Simulations Using Performance Modeling on Supercomputers, Yosuke Oyama, Takumi Honda, Atsushi Ishikawa and Koichi Shirahata
  • A Cloud-Fog Architecture for Video Analytics on Large Scale Camera Networks Using Semantic Scene Analysis, Kunal Jain, Kishan Sairam Adapa, Kunwar Shaanjeet Singh Singh Grover, Ravi Kiran Sarvadevabhatla and Suresh Purini
  • Speaker recognition system of flexible throat microphone using contrastive learning, Weiliang Zheng, Zhenxiang Chen, Yang Li, Xiaoqing Jiang and Xueyang Cao

Research Posters

  • TwMiner: Mining Relevant Tweets of News Articles, Roshni Chakraborty and Nilotpal Chakraborty
  • Privacy Regulations-Aware Storage Service  Selection in Multi-Cloud, Rishabh Kumar, Pankaj Sahu, Shubhro Roy, Sutapa Mondal, Mangesh Gharote and Sachin Lodha
  • Evaluating Kubernetes at the Edge for Fault Tolerant Multi-Camera Computer Vision Applications, Owen Heckmann and Arun Ravindran
  • Hierarchical Clustering Architecture for Metaverse Applications, Santhosh Kumar, Goutam Sanyal and Prabhakar M
  • Optimizing Memory Allocation in a Serverless Architecture through Function Scheduling, Manish Pandey and Young Woo Kwon
  • Sequence-based System Call Filtering for Enhanced Container Security, is it beneficial? Somin Song, Sahil Suneja, Michael Le and Byungchul Tak
  • Power to the Applications: The Vision of Continuous Decentralized Autoscaling, Martin Straesser, Stefan Geißler, Tobias Hoßfeld and Samuel Kounev
  • 3D CNN Acceleration using Block Circulant Matrix in Frequency Domain, Rui Han, Huarong Xu, Peng Jiang, Xiongwei Jiang and Jiaming Qian
  • Distributed In-band Network Telemetry, Dhyey Thummar, Iram Nawab and Sameer Kulkarni
  • Sparse-HeteroCL: From Sparse Tensor Algebra to Highly Customized Accelerators on FPGAs, Jize Pang, Lei Gong, Chao Wang and Xuehai Zhou
  • Query Latency Optimization by Resource-Aware Task Placement in Fog, Fatima Abdullah, Limei Peng and Byungchul Tak
  • Satin Bowerbird optimization based classification model for heart disease prediction using deep learning in E-Healthcare, Kamal Kumar Gola and Shikha Arya
  • Knowledge Enhanced Digital Objects: a Data Lake Approach, Yu Luo and Beth Plale

Early Career and Students’ Showcase Posters

  • Performance Modelling of Graph Neural Networks , Pranjal Naman and Yogesh Simmhan
  • Towards efficient scheduling of concurrent DNN training and inferencing on accelerated edge devices, Prashanthi S K, Vinayaka Hegde and Yogesh Simmhan
  • Privacy-preserving Job Scheduler for GPU Sharing, Aritra Ray, Kyle Lafata, Zhaobo Zhang, Ying Xiong and Krishnendu Chakrabarty
  • Mobile app platform for Personalized UAV Fleets using Edge and Cloud, Suman Raj and Yogesh Simmhan
  • Scalable, High-Quality Scheduling of Data Center Workloads, Meghana Thiyyakat, Subramaniam Kalambur and Dinkar Sitaram
  • GA-PSO-Based Cryptographic Technique for Cloud Data Security, Amit Sanger and Rahul Johari
  • Microservice-based in-network security framework for FPGA NICs, Mayank Rawat, Lasani Hussain, Neeraj Kumar Yadav, Sumit Darak, Praveen Tammana and Rinku Shah
  • Deep Learning based Approach for Fast, Effective Visualization of Voluminous Gridded Spatial Observations, Paahuni Khandelwal, Menuka Warushavithana, Sangmi Lee Pallickara and Shrideep Pallickara
  • Importance-driven In situ Analysis and Visualization, Muzafar Ahmad Wani and Preeti Malakar
  • Unique Prefix vs. Unique Mask for Minimizing SDN Flows with Transparent Edge Access, Josef Hammer and Hermann Hellwagner
  • Toward Next-Generation Distributed Rate-limiters, Iram Nawab and Sameer Kulkarni
  • Distributed and Dependable Software-Defined Storage Control Plane for HPC, Mariana Miranda
  • A Lightweight, Mobility-Aware, Geospatial & Temporal Data Store for Multi-UAV Systems, Shashwat Jaiswal, Suman Raj, Subhajit Sidhanta and Yogesh Simmhan
  • Extending the Interval-centric Distributed Computing Model for Incremental Graphs, Varad Kulkarni, Ruchi Bhoot and Yogesh Simmhan
  • Comparing the Orchestration of Quantum Applications on Hybrid Clouds, Rajiv Sangle, Tuhin Khare and Yogesh Simmhan
  • A flexible dataflow CNN accelerator on FPGA, Haoran Li, Lei Gong, Chao Wang and Xuehai Zhou
  • To Think Like a Vertex (or Not) for Distributed Training of Graph Neural Networks, Varad Kulkarni, Akarsh Chaturevedi, Pranjal Naman and Yogesh Simmhan
  • Scavenger: A Cloud Service for Optimizing Cost and Performance of DL Training, Sahil Tyagi
  • An Improved PBFT-Based Consensus Protocol for Industrial IoT, Roshan Singh and Sukumar Nandi
  • AggFirstJoin: Optimizing Geo-Distributed Joins using Aggregation-Based Transformations, Dhruv Kumar, Sohaib Ahmad, Abhishek Chandra and Ramesh Sitaraman

TCSC SCALE Challenge

  • Highly Scalable Large-Scale Asynchronous Graph Processing using Actors , Youssef Elmougy, Akihiro Hayashi, Vivek Sarkar
  • ParaDiS: a Parallel and Distributed framework for Significant pattern mining, Jyoti, Sriram Kailasam, Aleksey Buzmakov
  • BigHOST: Automatic Grading System for Big Data Assignments, Vishal Ramesha, Sachin Shankar, Suhas Thalanki, Supreeth Kurpad, Prafullata Auradkar

1st Workshop on Data and Service modeling for Edge Computing (DASMEC) →

  • Edge Computing Solutions Supporting Voice Recognition Services for Speakers with Dysarthria, Davide Mulfari, Lorenzo Carnevale, Antonino Galletta, Massimo Villari
  • Gateway-based certification approach to include IoT nodes in a trusted Edge/Cloud environment, Valeria Lukaj, Francesco Martella, Maria Fazio, Antonino Galletta, Antonio Celesti, Massimo Villari

6th Workshop on Security Trust Privacy for Cyber-Physical Systems (STP-CPS) →

  • N ext Generation Financial Services: Role of Blockchain Enabled Federated Learning and Metaverse , P ushpita Chatterjee, Debashis Das, Danda B Rawat
  • Generative AI for Cyber Threat-Hunting in 6G-Enabled IoT Networks, Mohamed Amine Ferrag, Merouane Debbah, Muna Al-Hawawreh
  • FL-PSO: A Federated Learning Approach with Particle Swarm Optimization for Brain Stroke Prediction, N ancy Victor, Sweta Bhattacharya, Praveen Kumar Reddy Maddikunta, Fasial Mohammed Alotaibi, Thippa Reddy Gadekallu, Rutvij H. Jhaveri
  • Cognitive Health Assessment of Decentralized Smart Home Activities using Federated Learning, Javed Abdul Rehman, Jerry Chun-Wei Lin, Gautam Srivastava
  • A Blockchain-Based Security Management Framework for Cyber-Physical Systems, D ebashis Das, Sourav Banerjee, Rakhi Chakraborty, Kousik Dasgupta, Pushpita Chatterjee, Uttam Ghosh
  • S ecuring Multi-IRS mmWave Communications with Eavesdroppers and Friendly and Unfriendly Jammers in Vehicular Cyber Physical Systems , Reham Alsabet, Danda B. Rawat
  • S AS-UNet: Modified Encoder-Decoder Network for the Segmentation of Obscenity in Images , S onali Samal, Thippa Reddy Gadekellu, Pankaj Rajput, Yu-Dong Zhang, Bunil Kumar Balabantaray
  • A Study on Secure Network Slicing in 5G , Pranav Kumar Singh, Maharaj Brahma, Panchanan Nath, Uttam Ghosh
  • A Review of Colonial Pipeline Ransomware Attack, Jack Beerman, David Berent, Zach Falter, Suman Bhunia
  • D OC-NAD: A Hybrid Deep One-Class Classifier for Network Anomaly Detection , Mohanad Sarhan, Gayan Kulatilleke, Wai Weng Lo, Siamak Layeghy, Marius Portmann

1st Workshop on Cloud, Edge, and Fog for smart industries (CLEF) →

  • Controlling Air Pollution in Data Centers using Green Data Centers, Sweta Dey, Sujata Pal
  • API Traffic Anomaly Detection in Microservice Architecture, Sowmya M, Ankith J Rai, Spoorthi V, MD Irfan, Prasad B Honnavalli, Nagasundari S
  • AI-Powered Interfaces for Extended Reality to Support Remote Maintenance, A kos Nagy, George Amponis, Konstantinos Kyranou, Thomas Lagkas, Alexandros Apostolos Boulogeorgos, Panagiotis Sarigiannidis, Vasileios Argyriou
  • Improving Industry 4.0 Readiness: Monolith Application Refactoring using Graph Attention Networks, T anisha Rathod, Christina Joseph, John Martin

3rd Workshop on Advances in Data Systems Management, Engineering, and Analytics (MegaData) →

  • MUAR: Maximizing Utilization of Available Resources for Query Processing, Mayankkumar Patel, Minal Bhise
  • Information-Theoretically Secure Multi-Party Linear Regression and Logistic Regression, Hengcheng Zhou
  • Disease Prediction using Chest X-ray Images in Serverless Data Pipeline Framework, Vikas Singh, Neha Singh, Mainak Adhikari
  • Spark Based Parallel Frequent Pattern Rules for Social Media Data Analytics, Shubhangi Chaturvedi, Sri Khetwat Saritha, Animesh Chaturvedi
  • OPERA-gSAM: Big Data Processing Framework for UMI Sequencing at High Scalability and Efficiency, Pablo Vazquez Caderno, Feras M. Awaysheh, Yolanda Colino-Sanguino, Laura Rodriguez de la Fuente, Fatima Valdes-Mora, José C. Cabaleiro, Tomas F.,Pena, David Gallego-Ortega

4th Workshop on Secure IoT, Edge and Cloud Systems (SioTEC) →

  • E fficient Design for Smart Environment using Raspberry Pi with Blockchain and IoT (BRIoT) , S unil K. Ponugumati, Kamran Ali, Zaid Zahoor, Aboubaker Lasebae, Anum T. Kiyani, Ali Khoshkholghi, Latha Maddu
  • B lockchains’ Federation: Developing Personal Health Trajectory-Centered Health Systems , J avier Rojo, Jose García, Javier Berrocal, Luca Foschini, Paolo Bellavista, Juan Hernández, Juan M. Murillo
  • A n Isolation-Aware Online Virtual Network Embedding via Deep Reinforcement Learning , Ali Gohar, Chunming Rong , Sanghwan Lee
  • I nformation Fusion-Based Cybersecurity Threat Detection for Intelligent Transportation System , Abdullahi Chowdhury, Ranesh Naha , Shahriar Kaisar, Mohammad Ali Khoshkholghi, Kamran Ali, Antonino Galletta
  • Transformer Inference Acceleration in Edge Computing Environment , Mingchu Li, Wenteng Zhang , Dexin Xia
  • RISC-V Core for Ethical Intelligent IoT Edge: Analysis & Design Choice , Hari Babu P, Sasirekha Gvk , Madhav Rao, Jyotsna Bapat, Debabrata Das
  • Securing Serverless Workflows on the Cloud Edge Continuum , Gabriele Morabito, Christian Sicari , Lorenzo Carnevale, Antonino Galletta, Massimo Villari
  • Web Services Relocation and Reallocation for Data Residency Compliance , Pankaj Sahu, Shubhro Roy , Mangesh Gharote, Sachin Lodha
  • Competitor Attack Model for Privacy-Preserving Deep Learning , Dongdong Zhao, Songsong Liao , Huanhuan Li, Jianwen Xiang
  • ML based D3R : Detecting DDoS using Random Forest , Anagha Ramesh, Ramza Haris , Sumedha Arora
  • On the timing of applying resource separation during DDoS attacks , Anmol Kumar, Gaurav Somani
  • Malware Detection using API Calls Visualisations and Convolutional Neural Networks , Rafael Bonilla, Carlos Jimenez , Jaime Pizarro, Joseph Avila, Joangie Márquez


Cloud Security: Recently Published Documents


A Review on AWS - Cloud Computing Technology

Abstract: Cloud computing can be defined simply as accessing technology services such as computing power, storage, and databases without maintaining one's own data centers and servers; Amazon Web Services (AWS) is one such cloud computing platform. It is an established model that is already popular among almost all enterprises, built on the concept of on-demand services in which cloud resources are used and scaled as demand requires. AWS cloud computing is a cost-effective model, and the major concerns in this model are security and storage in the cloud, which is one of the main reasons many enterprises choose AWS. This paper provides a review of security research in the field of cloud security and the storage services of the AWS cloud platform. After security and storage, we present the working of AWS cloud computing. AWS is a widely trusted cloud provider that offers strong cloud security alongside excellent cloud storage services. The main aim of this paper is to make storage and security a core operation of cloud computing rather than an add-on. As the number of service providers and related companies increases, the AWS cloud platform plays a vital role in service industries through its web services, so choosing cloud service providers wisely is a basic need of the industry; we therefore examine how AWS fulfills these specific needs. Keywords: Trusted Computing, AWS, Information-Centric Security, Cloud Storage, S3, EC2, Cloud Computing

  • Deep Learning Approaches to Cloud Security
  • Genetic Algorithm-Based Pseudo Random Number Generation for Cloud Security
  • Cloud Security Service for Identifying Unauthorized User Behaviour
  • QoS-Based Cloud Security Evaluation Using a Neuro-Fuzzy Model
  • Azure Cloud Security for Absolute Beginners
  • Mitigating Theft-of-Service Attack: Ensuring Cloud Security on Virtual Machines

Cloud Computing Security Requirements: A Review

Abstract: Cloud computing is a new technology that is undergoing tremendous development today. People who use it are often unable to separate the reasonable from the unreasonable arguments that come with its security requirements. The claim that cloud computing is inherently insecure is as absurd as the claim that cloud computing creates no new security problems. Cloud computing is a way to dynamically increase resources without needing in-depth knowledge of a brand-new infrastructure, without training new workers, and without designing new software solutions. The article analyses the different cloud security issues and models of cloud architectures. Some of the main problems with security in virtualization, concerns about storing data in the cloud, and the assessment of risk tolerance in cloud computing are presented. Legal and regulatory issues for the protection of personal data are also addressed.

The Vulnerabilities of Cloud Computing: A Review

A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. Cloud computing is the dynamic provisioning of IT capabilities (hardware, software, or services) from third parties over a network. However, this technology is still in its early stages of development and suffers from threats and vulnerabilities that prevent users from trusting it. Various malicious activities by illegitimate users have threatened this technology, such as data misuse, inflexible access control, and limited monitoring. These threats may result in damage to, or illegal access of, critical and confidential user data. This article describes the impact of those vulnerabilities and threats to create awareness among organisations and users, so that they can adopt the technology with trust and choose a trusted provider with sound security policies. We define cloud-specific vulnerabilities and cloud-feature vulnerabilities, and propose a reference vulnerability architecture for cloud computing together with the related threats. Cloud security and privacy play an important role in avoiding cloud threats: cloud privacy concerns adherence to various legal and non-legal norms regarding the right to a private life, while cloud security concerns the confidentiality, availability, and reliability of data. As cloud computing develops, security has become a top priority. In this article we discuss the characteristics of vulnerabilities, cloud vulnerabilities, and cloud threats, as well as how we can overcome or avoid them and keep our data safe.

Security and Privacy in Cloud Computing: Technical Review

Advances in the usage of information and communication technologies (ICT) have given rise to the popularity and success of cloud computing. Cloud computing offers advantages and opportunities for business users to migrate and leverage the scalability of the pay-as-you-go price model. However, outsourcing information and business applications to the cloud or a third party raises security and privacy concerns, which have become critical to adopting cloud implementations and services. Researchers and affected organisations have proposed different security approaches in the literature to tackle the present security flaws. The literature also provides an extensive review of security and privacy issues in cloud computing. Unfortunately, the works provided in the literature lack the flexibility to mitigate multiple threats without conflicting with cloud security objectives. The literature has further focused only on highlighting security and privacy issues without providing adequate technical approaches to mitigate them. Conversely, studies that offer technical solutions to security threats have failed to explain how such threats arise. This paper aims to introduce security and privacy issues that demand an adaptive solution approach without conflicting with existing or future cloud security. It reviews different works in the literature, taking into account their adaptiveness in mitigating future recurring threats and showing how cloud security conflicts have invalidated their proposed models. The article further presents the security threats surrounding cloud computing from a user perspective using the STRIDE approach. Additionally, it analyses different inefficient solutions in the literature and offers recommendations for implementing a secure, adaptive cloud environment.


Top 10 Cloud Computing Research Topics of 2024


Cloud computing is a fast-growing area of the technical landscape, and looking ahead to 2024, several new research topics are gaining traction among researchers and practitioners. These topics range from new developments in security and privacy, to the use of AI and ML in the cloud, to new cloud-based applications for specific domains or industries. In this article, we investigate some of the top cloud computing research topics for 2024 and explore what researchers and cloud practitioners can get out of them.

Why is Cloud Computing Important for Data-driven Businesses?

Cloud computing is crucial for data-driven businesses because it provides scalable and cost-effective ways to store and process huge amounts of data. Cloud-based storage and analytics platforms help businesses easily access their data whenever required, irrespective of where it is physically located. This helps businesses make good decisions about their products and marketing plans.

Cloud computing can also help businesses improve the security of their data. Cloud providers offer features such as data encryption and access control so that customers can protect their data from unauthorized access.

A few benefits of cloud computing are listed below:

  • Scalability: Cloud computing provides scalable applications suited to large-scale production systems for businesses that store and process large sets of data.
  • Cost-effectiveness: Cloud computing is a cost-effective solution compared to traditional on-premises data storage and analytics solutions, because capacity scales with demand and reduces IT costs.
  • Security: Cloud providers offer security features, including data encryption and access control, that help businesses protect their data from unauthorized access.
  • Reliability: Cloud providers ensure high reliability for their customers based on their SLAs, which is useful for data-driven businesses that operate 24x7.

Top 10 Cloud Computing Research Topics

1. Neural network based multi-objective evolutionary algorithm for dynamic workflow scheduling in cloud computing

This paper suggests a neural-network-based multi-objective evolutionary algorithm (NN-MOEA) for dynamic workflow scheduling in cloud computing. Scheduling workflows in the cloud is difficult due to the dynamic nature of cloud resources and the numerous competing objectives that need to be optimized. The NN-MOEA algorithm utilizes neural networks to optimize multiple objectives, such as makespan, cost, and resource utilization. This research focuses on cloud computing and its potential to enhance the efficiency and effectiveness of businesses' cloud-based workflows.

The algorithm predicts workflow completion time using a feedforward neural network based on input and output data sizes and cloud resources. It generates a balanced schedule by taking into account conflicting objectives and projected execution time. It also includes an evolutionary algorithm for future improvement.

The proposed NN-MOEA algorithm has several benefits, such as the capacity to manage dynamic changes in cloud resources and the capacity to simultaneously optimize multiple objectives. The algorithm is also capable of handling a variety of workflows and is easily expandable to include additional goals. The algorithm's use of neural networks to forecast task execution times is a crucial component because it enables the algorithm to generate better schedules and more accurate predictions.

The paper concludes by presenting a novel neural-network-based multi-objective evolutionary approach to dynamic workflow scheduling in cloud computing. The proposed NN-MOEA algorithm shows encouraging results in optimizing multiple objectives, such as makespan and cost, and in achieving a better balance between them.
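To make the idea more concrete, here is a minimal Python sketch of the two ingredients such an approach combines: a small feedforward network that predicts a task's runtime from its data sizes and the resources of a candidate VM, and a Pareto filter that keeps only the schedules that are not dominated on makespan and cost. The network weights, VM types, and prices below are illustrative placeholders rather than values from the paper, and a real NN-MOEA would train the predictor on execution traces and evolve whole workflow schedules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feedforward predictor: maps (input GB, output GB, vCPUs) -> predicted runtime.
# Random weights stand in for a network trained on historical execution data.
W1, b1 = rng.normal(size=(3, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def predict_runtime(features: np.ndarray) -> float:
    h = np.maximum(0, features @ W1 + b1)      # ReLU hidden layer
    return (h @ W2 + b2).item()

# Candidate placements: hypothetical VM types with (vCPUs, $/hour).
vm_types = [(2, 0.05), (4, 0.10), (8, 0.20)]
task = (10.0, 2.0)                             # (input GB, output GB)

candidates = []
for vcpus, price in vm_types:
    runtime = abs(predict_runtime(np.array([task[0], task[1], vcpus])))
    cost = runtime * price
    candidates.append((runtime, cost, vcpus))

def non_dominated(points):
    # Keep placements that no other placement beats on both makespan and cost.
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)]

for runtime, cost, vcpus in non_dominated(candidates):
    print(f"{vcpus} vCPUs: predicted runtime {runtime:.2f}h, cost ${cost:.2f}")
```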

Key insights and Research Ideas:

Investigate the use of different neural network architectures for predicting the future positions of optimal solutions. Explore the use of different multi-objective evolutionary algorithms for solving dynamic workflow scheduling problems. Develop a cloud-based workflow scheduling platform that implements the proposed algorithm and makes it available to researchers and practitioners.

2. A systematic literature review on cloud computing security: threats and mitigation strategies 

This paper addresses security research in the cloud computing paradigm. The authors provide a systematic literature review of studies, published between 2010 and 2020, that address security threats to cloud computing and mitigation techniques. They list and classify the risks and defense mechanisms covered in the literature, as well as the frequency and distribution of these subjects over time.

The paper finds that data breaches, insider threats, and DDoS attacks are the most discussed threats to cloud computing security, while identity and access management, encryption, and intrusion detection and prevention systems are the most frequently discussed mitigation techniques. The authors suggest that future developments in machine learning and artificial intelligence may help cloud computing mitigate these risks.

The paper offers a thorough overview of security risks and mitigation techniques in cloud computing, and it emphasizes the need for more research and development in this field to address the constantly evolving security issues of cloud computing. This research could help businesses understand and mitigate the most common threats to their cloud deployments.

Explore the use of blockchain technology to improve the security of cloud computing systems. Investigate the use of machine learning and artificial intelligence to detect and prevent cloud computing attacks. Develop new security tools and technologies for cloud computing environments. 

3. Spam Identification in Cloud Computing Based on Text Filtering System

A text filtering system is suggested in the paper "Spam Identification in Cloud Computing Based on Text Filtering System" to help identify spam emails in cloud computing environments. Spam emails are a significant issue in cloud computing because they can use up computing resources and jeopardize the system's security. 

To detect spam emails, the suggested system combines text filtering methods with machine learning algorithms. The email content is first pre-processed by the system, which eliminates stop words and stems the remaining words. The preprocessed text is then subjected to several filters, including a blacklist filter and a Bayesian filter, to identify spam emails.

In order to categorize emails as spam or non-spam based on their content, the system also employs machine learning algorithms like decision trees and random forests. The authors use a dataset of emails gathered from a cloud computing environment to train and test the system. They then assess its performance using metrics like precision, recall, and F1 score.
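As a rough illustration of the Bayesian filtering and machine-learning classification step, the sketch below trains a naive Bayes classifier on a tiny hand-made corpus using scikit-learn (assumed to be installed); the real system described in the paper would add stemming, a blacklist filter, additional classifiers such as random forests, and a much larger email dataset collected from the cloud environment.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny illustrative corpus; labels are invented for the example (1 = spam, 0 = ham).
emails = [
    "win a free prize now", "cheap loans click here", "urgent offer limited time",
    "meeting agenda for monday", "please review the attached report", "lunch tomorrow at noon",
]
labels = [1, 1, 1, 0, 0, 0]

# Stop-word removal stands in for the fuller preprocessing (stemming, etc.) described in the paper.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(emails)

clf = MultinomialNB().fit(X, labels)          # the Bayesian filter

test = ["free prize offer", "report for the monday meeting"]
pred = clf.predict(vectorizer.transform(test))
print(dict(zip(test, pred)))                   # expected: 1 (spam) for the first, 0 (ham) for the second
```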

The findings demonstrate the effectiveness of the proposed system in detecting spam emails, achieving high precision and recall rates. By contrasting their system with other spam identification systems, the authors also show how accurate and effective it is. 

The method presented in the paper for identifying spam emails in cloud computing environments has the potential to improve the overall security and performance of cloud computing systems, and it is an interesting current research direction for protecting cloud-based email against spam-related threats.

Create a stronger spam filtering system that can recognize spam emails even when they are crafted to evade common spam filters. Examine the application of artificial intelligence and machine learning to evaluate the accuracy of spam filtering systems. Create a more effective spam filtering system that can handle a large volume of emails quickly and accurately.

4. Blockchain data-based cloud data integrity protection mechanism 

The "Blockchain data-based cloud data integrity protection mechanism" paper suggests a method for safeguarding the integrity of cloud data and which is one of the Cloud computing research topics. In order to store and process massive amounts of data, cloud computing has grown in popularity, but issues with data security and integrity still exist. For the proposed mechanism to guarantee the availability and integrity of cloud data, data redundancy and blockchain technology are combined.

The mechanism consists of a data redundancy layer, a blockchain layer, and a verification and recovery layer. The data redundancy layer replicates the cloud data across multiple cloud servers so that it remains available in the event of server failure. The blockchain layer stores the hash values of the cloud data along with metadata such as access-control information, and the verification and recovery layer checks stored data against the recorded hashes and restores corrupted copies from the redundant replicas.
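A minimal sketch of the blockchain layer's role is shown below: a hash-chained ledger records the SHA-256 digest of each cloud object, and an integrity check compares the current copy against the recorded digest. This illustrates the general idea only; the paper's actual mechanism also covers redundancy, recovery, and a real blockchain network, which are omitted here.

```python
import hashlib, json, time

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Hash-chained ledger standing in for the blockchain layer: each block records the
# hash of a cloud object plus the hash of the previous block.
ledger = [{"index": 0, "prev": "0" * 64, "object_hash": "", "ts": time.time()}]

def record(object_id: str, data: bytes) -> dict:
    prev_block = ledger[-1]
    block = {
        "index": prev_block["index"] + 1,
        "prev": sha256(json.dumps(prev_block, sort_keys=True).encode()),
        "object_id": object_id,
        "object_hash": sha256(data),
        "ts": time.time(),
    }
    ledger.append(block)
    return block

def verify(object_id: str, data: bytes) -> bool:
    # Integrity check: the current copy of the object must match the hash on the ledger.
    latest = next(b for b in reversed(ledger) if b.get("object_id") == object_id)
    return latest["object_hash"] == sha256(data)

record("report.csv", b"row1,row2")
print(verify("report.csv", b"row1,row2"))   # True: data is intact
print(verify("report.csv", b"tampered"))    # False: tampering detected
```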

Using a dataset of cloud data, the authors assess the performance of the suggested mechanism and compare it to other cloud data protection mechanisms. The findings demonstrate that the suggested mechanism offers high levels of data availability and integrity and is superior to other mechanisms in terms of processing speed and storage space.

Overall, the paper offers a promising strategy for using blockchain technology to guarantee the availability and integrity of cloud data. The suggested mechanism may assist in addressing cloud computing's security issues and enhancing the dependability of cloud data processing and storage. This research could help businesses to protect the integrity of their cloud-based data from unauthorized access and manipulation.

Create a data integrity protection system based on blockchain that is capable of detecting and preventing data tampering in cloud computing environments. For enhancing the functionality and scalability of blockchain-based data integrity protection mechanisms, look into the use of various blockchain consensus algorithms. Create a data integrity protection system based on blockchain that is compatible with current cloud computing platforms. Create a safe and private data integrity protection system based on blockchain technology.

5. A survey on internet of things and cloud computing for healthcare

This article examines how recent technology trends like the Internet of Things (IoT) and cloud computing could transform the healthcare industry. These emerging technologies open exciting possibilities by enabling remote patient monitoring, personalized care, and efficient data management.

The authors categorize the research into IoT-based systems, cloud-based systems, and integrated systems using both IoT and the cloud, and discuss the benefits of real-time data collection, improved care coordination, and automated diagnosis and treatment.

However, the authors also acknowledge concerns around data security, privacy, and the need for standardized protocols and platforms. Widespread adoption of these technologies faces challenges in ensuring they are implemented responsibly and ethically.

Overall, the paper provides a comprehensive overview of this rapidly developing field, highlighting opportunities to revolutionize how healthcare is delivered. New devices, systems and data analytics powered by IoT, and cloud computing could enable more proactive, preventative and affordable care in the future. But careful planning and governance will be crucial to maximize the value of these technologies while mitigating risks to patient safety, trust and autonomy. This research could help businesses to explore the potential of IoT and cloud computing to improve healthcare delivery.

Examine how IoT and cloud computing affect patient outcomes in various healthcare settings, including hospitals, clinics, and home care. Analyze how well various IoT devices and cloud computing platforms perform at real-time patient data collection, archival, and analysis. Assess the security and privacy risks connected to IoT devices and cloud computing in the healthcare industry and develop mitigation strategies.

6. Targeted influence maximization based on cloud computing over big data in social networks

The paper "Targeted Influence Maximization based on Cloud Computing over Big Data in Social Networks" proposes a targeted influence maximization algorithm to identify the most influential users in a social network. Influence maximization is the process of identifying a group of users in a social network who can have a significant impact or spread information widely.

The suggested algorithm consists of four steps: data preprocessing, feature extraction, classification, and influence maximization. During the data preprocessing stage, the authors gather and preprocess social network data, such as user profiles and interaction data. During the feature extraction stage, they extract features from the data using machine learning methods like text mining and sentiment analysis. Overall, the paper offers a promising strategy for targeted influence maximization using big data and cloud computing. The suggested algorithm could assist companies and organizations in targeting their marketing or communication strategies at the most influential members of a social network.
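The sketch below illustrates only the influence maximization step, using a greedy seed selection over a simulated independent cascade; networkx (assumed to be installed) and its built-in karate club graph stand in for a real social network, and the activation probability and seed count are illustrative choices rather than the paper's setup.

```python
import random
import networkx as nx

random.seed(0)
G = nx.karate_club_graph()          # stand-in social graph
P = 0.1                             # activation probability for the independent cascade

def simulate_spread(seeds, trials=200):
    """Estimate the expected number of activated users for a given seed set."""
    total = 0
    for _ in range(trials):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v in G.neighbors(u):
                    if v not in active and random.random() < P:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / trials

# Greedy influence maximization: repeatedly add the node with the largest marginal gain.
seeds = []
for _ in range(3):
    best = max((n for n in G if n not in seeds),
               key=lambda n: simulate_spread(seeds + [n]))
    seeds.append(best)

print("selected seed users:", seeds)
```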

Key insights and Research Ideas: 

Develop a cloud-based targeted influence maximization algorithm that can effectively identify and influence a small number of users in a social network to achieve a desired outcome. Investigate the use of different cloud computing platforms to improve the performance and scalability of cloud-based targeted influence maximization algorithms. Develop a cloud-based targeted influence maximization algorithm that is compatible with existing social network platforms. Design a cloud-based targeted influence maximization algorithm that is secure and privacy-preserving.

7. Security and privacy protection in cloud computing: Discussions and challenges

This paper provides an overview of the challenges and discussions surrounding security and privacy protection in cloud computing. The authors highlight the importance of protecting sensitive data in the cloud, along with the potential risks and threats to data privacy and security. The article explores various security and privacy issues that arise in cloud computing, including data breaches, insider threats, and regulatory compliance.

The article also explores the challenges associated with implementing these security measures and highlights the need for effective risk management strategies.

The paper concludes by discussing some of the emerging trends in cloud security and privacy, including the use of artificial intelligence and machine learning to enhance security and the emergence of new regulatory frameworks designed to protect data in the cloud, making this an important research topic to watch in the security domain.

Develop a more comprehensive security and privacy framework for cloud computing. Explore the options with machine learning and artificial intelligence to enhance the security and privacy of cloud computing. Develop more robust security and privacy mechanisms for cloud computing. Design security and privacy policies for cloud computing that are fair and transparent. Educate cloud users about security and privacy risks and best practices.

8. Intelligent task prediction and computation offloading based on mobile-edge cloud computing

The paper "Intelligent Task Prediction and Computation Offloading Based on Mobile-Edge Cloud Computing" proposes a task prediction and computation offloading mechanism to improve the performance of mobile applications.

The suggested mechanism has two main parts: a task prediction model and a computation offloading algorithm. The task prediction model employs machine learning techniques to forecast the mobile application's upcoming tasks based on its usage patterns. This prediction is then used to decide whether to execute a specific task locally on the mobile device or to offload its computation to the cloud.
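As an illustration of the offloading decision itself, the sketch below compares the estimated latency and energy of running a task locally against uploading it and running it in the cloud; the device, network, and server parameters, and the 50/50 weighting, are hypothetical values chosen for the example, not figures from the paper.

```python
from dataclasses import dataclass

@dataclass
class Task:
    cycles: float        # CPU cycles required
    data_mb: float       # data to upload if offloaded

# Hypothetical device/cloud parameters; real values would come from profiling.
LOCAL_FREQ = 1.0e9       # cycles/s on the mobile device
CLOUD_FREQ = 8.0e9       # cycles/s on the edge/cloud server
UPLINK_MBPS = 20.0       # uplink bandwidth
LOCAL_J_PER_CYCLE = 1e-9 # energy cost of local computation
TX_J_PER_MB = 0.5        # energy cost of transmission

def offload_decision(task: Task) -> str:
    local_time = task.cycles / LOCAL_FREQ
    local_energy = task.cycles * LOCAL_J_PER_CYCLE
    offload_time = task.data_mb * 8 / UPLINK_MBPS + task.cycles / CLOUD_FREQ
    offload_energy = task.data_mb * TX_J_PER_MB
    # Weighted cost combining latency and energy; the weights are tunable.
    local_cost = 0.5 * local_time + 0.5 * local_energy
    offload_cost = 0.5 * offload_time + 0.5 * offload_energy
    return "offload" if offload_cost < local_cost else "local"

print(offload_decision(Task(cycles=5e9, data_mb=1.0)))   # compute-heavy task -> "offload"
print(offload_decision(Task(cycles=1e8, data_mb=50.0)))  # data-heavy task    -> "local"
```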

Using a dataset of mobile application usage patterns, the authors assess the performance of the suggested mechanism and compare it to other computation offloading mechanisms. The findings demonstrate that the suggested mechanism performs better in terms of energy usage, response time, and network usage.

The authors also go over the difficulties in putting the suggested mechanism into practice, including the need for real-time task prediction and the trade-off between offloading computation and network usage. Additionally, they outline future research directions for mobile-edge cloud computing applications, including the use of edge caching and the integration of blockchain technology for security and privacy. 

Overall, the paper offers a promising strategy for enhancing mobile application performance through mobile-edge cloud computing. The suggested mechanism could improve the user experience for mobile users while lowering the energy consumption and response time of mobile applications, and it opens up many avenues for further innovation.

Develop an accurate task prediction model considering mobile device and cloud dynamics. Explore machine learning and AI for efficient computation offloading. Create a robust framework for diverse tasks and scenarios. Design a secure, privacy-preserving computation offloading mechanism. Assess computation offloading effectiveness in real-world mobile apps.

9. Cloud Computing and Security: The Security Mechanism and Pillars of ERPs on Cloud Technology

Enterprise resource planning (ERP) systems face particular security challenges when moved to the cloud, and the paper "Cloud Computing and Security: The Security Mechanism and Pillars of ERPs on Cloud Technology" discusses these challenges and suggests a security mechanism and pillars for protecting ERP systems on cloud technology.

The authors begin by going over the benefits of ERP systems and cloud computing as well as the security issues with cloud computing, such as data breaches and insider threats. They then present a security framework for cloud-based ERP systems built around four pillars: access control, data encryption, data backup and recovery, and security monitoring. The access control pillar restricts user access, while the data encryption pillar secures sensitive data. The data backup and recovery pillar ensures that lost or corrupted data can be restored, and security monitoring continuously watches the ERP system for threats. The authors also discuss interoperability challenges and the need for standardization in securing ERP systems on the cloud, and they propose future research directions, such as applying machine learning and artificial intelligence to security analytics.
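The snippet below sketches just two of the four pillars, access control and data encryption, using the cryptography package's Fernet primitive (assumed to be installed); the role names, permissions, and invoice data are invented for illustration and are not taken from the paper.

```python
from cryptography.fernet import Fernet

# Encryption pillar: sensitive ERP fields are encrypted at rest with a managed key.
key = Fernet.generate_key()
fernet = Fernet(key)

# Access-control pillar: a minimal role check before any decryption happens.
ROLE_PERMISSIONS = {"finance": {"read_invoices"}, "intern": set()}

def read_invoice(user_role: str, ciphertext: bytes) -> str:
    if "read_invoices" not in ROLE_PERMISSIONS.get(user_role, set()):
        raise PermissionError(f"role '{user_role}' may not read invoices")
    return fernet.decrypt(ciphertext).decode()

token = fernet.encrypt(b"invoice #1001: $12,400")
print(read_invoice("finance", token))       # allowed: decrypts the record
# read_invoice("intern", token)             # would raise PermissionError
```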

Overall, the paper outlines a thorough strategy for safeguarding ERP systems in the cloud and emphasizes the significance of addressing the security issues related to this technology. By implementing these security pillars and mechanisms, organizations can protect their ERP systems and ensure the security and privacy of their data.

Investigate the application of blockchain technology to enhance the security of cloud-based ERP systems. Look into the use of machine learning and artificial intelligence to identify and stop security threats in cloud-based ERP systems. Create fresh security measures that are intended only for cloud-based ERP systems. By more effectively managing access control and data encryption, cloud-based ERP systems can be made more secure. Inform ERP users about the security dangers that come with cloud-based ERP systems and how to avoid them.

10. Optimized data storage algorithm of IoT based on cloud computing in distributed system

The article proposes an optimized data storage algorithm for Internet of Things (IoT) devices that runs on cloud computing in a distributed system. The algorithm aims to improve storage efficiency and speed up data retrieval in IoT applications, which typically generate huge amounts of data from many devices.

The algorithm proposed includes three main components: Data Processing, Data Storage, and Data Retrieval. The Data Processing module preprocesses IoT device data by filtering or compressing it. The Data Storage module distributes the preprocessed data across cloud servers using partitioning and stores it in a distributed database. The Data Retrieval module efficiently retrieves stored data in response to user queries, minimizing data transmission and enhancing query efficiency. The authors evaluated the algorithm's performance using an IoT dataset and compared it to other storage and retrieval algorithms. Results show that the proposed algorithm surpasses others in terms of storage effectiveness, query response time, and network usage. 
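A minimal sketch of the storage and retrieval modules might look like the following: readings are compressed before being written, and a hash partitioner spreads keys across a set of simulated servers. The server count, key format, and compression choice are illustrative assumptions rather than details from the paper, which also covers filtering and a full distributed database.

```python
import hashlib
import zlib
from collections import defaultdict

N_SERVERS = 4
servers = defaultdict(dict)   # simulated distributed store: server_id -> {key: compressed blob}

def partition(key: str) -> int:
    # Hash partitioning spreads IoT records roughly evenly across the cloud servers.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % N_SERVERS

def store(key: str, reading: str) -> None:
    # Data Processing + Data Storage: compress the reading, then place it on its partition.
    servers[partition(key)][key] = zlib.compress(reading.encode())

def retrieve(key: str) -> str:
    # Data Retrieval: go straight to the owning server, decompress, and return the reading.
    return zlib.decompress(servers[partition(key)][key]).decode()

store("sensor-42/2024-01-01T00:00", "temp=21.5,humidity=40")
print(retrieve("sensor-42/2024-01-01T00:00"))
```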

They suggest future directions such as leveraging edge computing and blockchain technology for optimizing data storage and retrieval in IoT applications. In conclusion, the paper introduces a promising method to improve data archival and retrieval in distributed cloud based IoT applications, enhancing the effectiveness and scalability of IoT applications.

Create a data storage algorithm capable of storing and managing large amounts of IoT data efficiently. Examine the use of cloud computing to improve the performance and scalability of data storage algorithms for IoT. Create a secure and privacy-preserving data storage algorithm. Assess the performance and effectiveness of data storage algorithms for IoT in real-world applications.

How to Write a Perfect Research Paper?

  • Choose a topic: Select a topic that interests you, so that you can present it to the reader clearly and with good content.
  • Do your research: Read books, articles, and websites on your topic. Take notes and gather evidence to support your arguments.
  • Write an outline: This will help you organize your thoughts and make sure your paper flows smoothly.
  • Start your paper: Begin with an introduction that grabs the reader's attention. Then state your thesis and support it with evidence from your research. Finally, write a conclusion that summarizes your main points.
  • Edit and proofread your paper: Check for grammatical errors and spelling mistakes.

Cloud computing is a rapidly evolving area, with interesting research topics continually gaining traction among researchers and practitioners. Cloud providers conduct their own research to keep customer data secure, covering areas such as encryption algorithms, improved access control, and mitigation of DDoS (Denial of Service) attacks.

With improvements in AI and ML, new capabilities are being developed to improve the performance, efficiency, and security of cloud computing systems. Research topics in this area include developing new algorithms for resource allocation, optimizing cloud workflows, and detecting and mitigating cyberattacks.

Cloud computing is being used in industries such as healthcare, finance, and manufacturing. Some of the research topics in this area include developing new cloud-based medical imaging applications, building cloud-based financial trading platforms, and designing cloud-based manufacturing systems.

Frequently Asked Questions (FAQs)

Data security and privacy problems, vendor lock-in, complex cloud management, a lack of standardization, and the risk of service provider disruptions are all current issues in cloud computing. Because data is housed on third-party servers, data security and privacy are key considerations. Vendor lock-in makes transferring providers harder and increases reliance on a single one. Managing many cloud services complicates things. Lack of standardization causes interoperability problems and restricts workload mobility between providers. 

Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) are the cloud computing service models that industries are focusing on right now.

The six major components of cloud infrastructure are compute, storage, networking, security, management and monitoring, and database. These components enable cloud-based processing and execution, data storage and retrieval, communication between components, security measures, management and monitoring of the infrastructure, and database services.  


Vinoth Kumar P

Vinoth Kumar P is a Cloud DevOps Engineer at Amadeus Labs. He has over 7 years of experience in the IT industry and specializes in DevOps, GitOps, DevSecOps, MLOps, Chaos Engineering, and the Cloud and Cloud Native landscapes. He has published articles and blogs on recent tech trends and best practices on GitHub, Medium, and LinkedIn, and has delivered a DevSecOps 101 talk to the developer community and a GitOps with Argo CD webinar for the DevOps community. He has helped multiple enterprises with their cloud migration, cloud native design, CI/CD pipeline setup, and containerization journey.


OPINION article

This article is part of the research topic: Smart Sustainable Development: Exploring Innovative Solutions and Sustainable Practices for a Resilient Future.

Action Learning for Change Management in Digital Transformation (Provisionally Accepted)

  • 1 Frankfurt University of Applied Sciences, Germany

The final, formatted version of the article will be published soon.

Digital Transformation is not only a technology endeavour but affects the whole organisation, whether a company or a non-profit organisation (Tabrizi et al., 2019). Technologies like Artificial Intelligence, Data Science, or Cloud Computing are relevant (Sebastian et al., 2017) but rather enable improvements (Pasqual et al., 2023; Vogelsang et al., 2019). Real benefits can only be achieved by new business models or innovative products that change the way value is created in a company (Matt et al., 2015). As this also implies structural changes, succeeding in such a journey requires skills and competencies in conducting changes in an organisation.

Education in university courses on Digital Transformation aims to prepare students for conducting such changes within an organisation, both from a technological and from a management perspective. However, there are challenges in teaching change management, as the topic and the consequences of a change in a corporate environment are still quite abstract for students. While individual students have managed personal changes in their lives, the challenges in a large organisation are hard to convey in words alone. Change projects, and therefore a Digital Transformation that revolutionises the business model of a company, change the organisational structure, affect people and their careers, and may cause uncertainty (Kotter, 2012).

The paper presents a case study on applying Action Learning (AL) to simulate the situation during a change and how to facilitate it. The objective is to let students experience changes in organisations in order to develop a better understanding of the need for, and how to deal with, resistance from employees or stakeholders during a digital transformation. AL, an experience-based learning method, is described as, for example, learning by doing, collaborating, sharing ideas, lifelong learning, and reflecting on practice (Zuber-Skerrit, 2002, p. 114). It focusses on taking action on important issues or problems (Hauser et al., 2023, p. 117). In addition, it is "a framework for a group of people to learn and develop through open and trusting interaction" (Pedler et al., 2005 in Hauser et al., 2023, p. 116). The basis of AL is the concept of the question. By asking questions, AL becomes a social process in which many people start to learn with and from each other, and a learning community comes into being (Revans, 1982, pp. 66, 69, 70).

Like AL, sustainable education is a cultural shift in how education and learning are understood (Sterling, 2008, p. 65). If the method is applied in higher education, it changes the learning and teaching culture. While the main objective remains knowledge transfer, experience as well as soft skills become more important, including planning and organising one's own learning process. AL can be used as a method to encourage students to be more independent.

An AL project starts with a specific (real) problem without a (simple) solution at hand; lecturers accompany the learning. Addressing the problem that confronts participants necessitates a decision-making process within the group. In this project, the primary objective is to make knowledge from the lecture permanently available in the students' minds and also to motivate them to learn more independently, reflect, and think critically. The achievement of this objective is assessed during the oral exams at the end of the semester.
The postgraduate course on Business Information Systems at the Frankfurt University of Applied Sciences (Germany) has a focus on Digital Transformation. A dedicated module on Strategic Process Management teaches methods and tools for optimising processes in the course of a transformation, including change management. While teaching, it became clear that most students have never been subject to a significant corporate change and cannot assess the necessity of facilitating such a change and dealing with resistance from employees or stakeholders. The class was therefore in danger of just learning words by heart (written in textbooks on change management) without ever understanding what being part of such a change feels like. Hence, the teachers introduced one session using Action Learning to achieve sustainability in learning by experiencing change. The second author, who is the professor in charge, has no active role during the AL training session and is deliberately not in the room: as the examining and grading person, the assumption is that their presence could hinder the training. The professor therefore acts as the Learning Process Facilitator (Robertson and Heckroodt, 2022, p. 81). The first author accompanies the process as a participant and takes on the role of observer. Two external facilitators guide the students through the training.

This training integrated work and learning, which is the basis of AL (Maltiba and Marsick, 2008 in Cho and Egan, 2009, p. 441). The (learning) success was due to the systematic approach of this AL session as well as to the guidance of the trainers. Learning from experience needs structure; otherwise it can be inefficient (Zuber-Skerrit, 2002, p. 115).

A professional training company with experience in change management and personal development was hired. Two trainers from this company prepared a curriculum on how to motivate a change and, at the same time, confronted the students with a tough situation: at the end of the training, they would have to break a wooden board with their bare hands. Shocked by this prospect, students listened to the trainers while they talked about facilitation as well as motivation and explained everything in terms of breaking the board. The whole training took around five hours, and at the end, each participant broke the board with their bare hands. In the pursuit of insights, data was collected through a combination of observations of the students, discussions, and reflective exchanges with them. The master students were hesitant in the beginning: they were expecting a lecture and got a quite different setting, visible through a circle of chairs, a flip chart instead of PowerPoint, and two unfamiliar people in front. The students were intimidated, unsure, and initially quiet. Over the day, the students thawed out and participated. At first, they could not make the connection to their lecture. The trainers supported the students in building the bridge to change management in the work context; this guidance from the trainers was necessary. Students were encouraged to ask questions and think of examples from their professional contexts; if they did not have any, they were asked to draw on their personal lives or volunteer work.
By the end of the day, students were open, asking questions, exchanging knowledge and experience, loosened up, and having fun. As the students were also emotionally involved in the training (because of the challenge), they developed an empathic understanding of how employees feel when subject to change. This is one of the intended results, since Action Learning has a "dual mission": people development and business impact (Cho and Egan, 2009, p. 441). They were able to experience transformation and change.

It was a deliberate decision not to include the examiner in the training, because the observer also noticed that the students were somewhat restrained and sometimes looked at her. The observer was only known to the students from a greeting, and she also had the feeling that this made some people feel inhibited. For this reason, the external trainers, who ensure confidentiality, were ideal. The participative observation could have influenced the students' later statements.

This case study is only transferable to a limited extent, since it is very specific: it only includes postgraduate students from one degree programme who mostly completed their undergraduate studies at the same university. In addition, for German universities, it is a rather small study group (10-16 students). Another special feature is the background of the external trainers, business information specialists and instructors in Jiu-Jitsu, both of which influence the case study and the training.

At the end of the semester, the module was concluded with an oral exam. In the past, the second author had often experienced students here who reproduced knowledge but had limited understanding of what it meant and had difficulty giving examples. This year, things were different: the students were able to give a lively account of change management based on the training and were able to substantiate the contents of the lecture with practical examples. The primary objective, as stated previously, can be seen as achieved, as almost all students were able to reflect on the challenges that come with change. One student struggled in the oral exam to explain reasons for resistance against changes in a company and just repeated words from the lecture slides. In this case the professor switched back to the role of a learning companion and encouraged the student to reflect on how they felt when confronted with the wooden board challenge. Now the technical knowledge was connected to the emotional side, and struggles with changes were explained in a livelier way.

Action Learning as an innovative teaching method has not only advantages but also disadvantages in higher education settings. The following disadvantages, and how we have tried to mitigate them, should be mentioned: applying AL is time consuming, and it has to fit into the university's schedule. We met this challenge through early and transparent (semester) planning. For AL, the students were scheduled for a whole day, and the session took longer than the usual lecture and exercise slot in the timetable. In addition, the different seating in the lecture room (a circle of chairs), the extended length of the day, and the involvement of external facilitators made it clear that this was not a normal lecture.

The case study makes the authors quite optimistic that Action Learning could be integrated into the curriculum, in order to gain more time for the implementation and to enable a sustainable learning effect.
Notably, certain factors have emerged as influential in promoting success in our case: the necessity of implementing AL in smaller group settings, the acquisition of external facilitators, the proactive scheduling of additional time slots within the semester plan, and the clear, advance communication of these schedule adjustments to enable students to align their plans accordingly. Importantly, there was an active expression of interest from some students for more sessions of this nature. In future, we are also planning to try out shorter formats to test whether AL could also be implemented in a regular course, i.e. 90 minutes.

Keywords: Action Learning, Change Management, higher education, Teaching, digital transformation

Received: 21 Feb 2024; Accepted: 11 Apr 2024.

Copyright: © 2024 Ruhland and Jung. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Mx. Anja Ruhland, Frankfurt University of Applied Sciences, Frankfurt, Germany
