Decomposition Analysis and Machine Learning in a Workflow-Forecast Approach to the Task Scheduling Problem for High-Loaded Distributed Systems

  •  Andrey Gritsenko    
  •  Nikita Demurchev    
  •  Vladimir Kopytov    
  •  Andrey Shulgin    


This paper describes a machine-learning-based scheduling approach for high-loaded distributed systems whose workflow contains recurrent patterns of tasks/queries. The core of the approach is to predict the future workflow of the system from previous tasks/queries using supervised learning. First, the workflow is analyzed using hierarchical clustering to reveal sets of tasks/queries. These sets are then restructured to represent patterns of recurrent tasks/queries. The patterns then become the object of a forecasting process performed with a neural network. The resource management system (RMS) uses information on the predicted tasks/queries to build an efficient schedule. To estimate the performance of the described method, it was first implemented as a module of the simulation tool Alea, which models the work of high-performance distributed systems, and then compared with other state-of-the-art scheduling algorithms. The simulation was run on two datasets: in one experiment the proposed method showed the best results, and in the other it was inferior to only one method, while still performing much better than commonly used standard scheduling algorithms.
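
The pipeline outlined above (cluster the workflow, restructure clusters into patterns, forecast the next pattern) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: a task is reduced to a hypothetical `(arrival_time, task_type)` pair, the hierarchical clustering is single-linkage on arrival times cut at a hypothetical `max_gap` threshold (for one-dimensional data this is equivalent to splitting the sorted arrivals at every gap larger than the threshold), and the neural network is replaced by a simple transition-frequency table as a stand-in forecaster. All function names are illustrative.

```python
from collections import Counter, defaultdict


def cluster_by_gap(tasks, max_gap):
    """Group (arrival_time, task_type) pairs into clusters.

    Equivalent to single-linkage hierarchical clustering on arrival
    times, cut at distance max_gap (an assumed threshold).
    """
    clusters = []
    for arrival, kind in sorted(tasks):
        # Start a new cluster when the gap to the previous task is too large.
        if clusters and arrival - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append((arrival, kind))
        else:
            clusters.append([(arrival, kind)])
    return clusters


def to_patterns(clusters):
    """Restructure each cluster into a pattern: its ordered task types."""
    return [tuple(kind for _, kind in cluster) for cluster in clusters]


def fit_transitions(patterns):
    """Count which pattern follows which (stand-in for the neural network)."""
    table = defaultdict(Counter)
    for prev, nxt in zip(patterns, patterns[1:]):
        table[prev][nxt] += 1
    return table


def predict_next(table, last_pattern):
    """Return the most frequent follower of last_pattern, or None if unseen."""
    followers = table.get(last_pattern)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical workflow log: bursts of A,B separated by single C tasks.
log = [(0, "A"), (1, "B"), (10, "C"), (20, "A"), (21, "B"),
       (30, "C"), (40, "A"), (41, "B")]
patterns = to_patterns(cluster_by_gap(log, max_gap=2))
table = fit_transitions(patterns)
forecast = predict_next(table, patterns[-1])  # ("C",)
```

In this toy log the last observed pattern is `("A", "B")`, which has always been followed by `("C",)`, so that is the forecast a scheduler could use to reserve resources in advance. A real system would replace the frequency table with a trained supervised model and use richer task features than arrival time and type.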

This work is licensed under a Creative Commons Attribution 4.0 License.