Authors:
(1) Limeng Zhang, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia;
(2) M. Ali Babar, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia. Table of Links Abstract and 1 Introduction 1.1 Configuration Parameter Tuning Challenges and 1.2 Contributions 2 Tuning Objectives 3 Overview of Tuning Framework 4 Workload Characterization and 4.1 Query-level Characterization 4.2 Runtime-based Characterization 5 Feature Pruning and 5.1 Workload-level Pruning 5.2 Configuration-level Pruning 5.3 Summary 6 Knowledge from Experience 7 Configuration Recommendation and 7.1 Bayesian Optimization 7.2 Neural Network 7.3 Reinforcement Learning 7.4 Search-based Solutions 8 Experimental Setting 9 Related Work 10 Discussion and Conclusion, and References 7.4 Search-based Solutions Search-based methods iteratively search for optimal configuration parameters by evaluating the system’s performance under different configurations. The core of search-based methods is to design the search strategy to handle the tradeoff between exploration(searching broadly across the configuration space to discover new and potentially promising regions) and exploitation (intensively searching in regions believed to be promising based on the available information). BestConfig [10] recommends configuration parameters through an iterative process consisting of the divide & diverge sampling for sample collection and the recursive bound & search for configuration parameter recommendation. Specifically, it first discretizes the parameter ranges for each parameter into several intervals (divide), and then selects a set of samples by considering each interval of a parameter once (diverge). In the recursive bound & search phase, it starts by searching the point with the best performance among the selected samples. Then, it searches from its surrounding points for better performance (bound). If such point are found, it conducts sampling around the new sample. This optimization process is iteratively conducted until no point with better performance is found. AutoTune [11] is an automatic parameter tuning system that aims to optimize application execution time on big data analytics frameworks. It executes experiments on both the production cluster and on a smaller-scale testbed to perform more test runs. It consists of three main steps. First, AutoTune executes the application using different data sizes and numbers of machines to build a parametric model. The model is used to select the data and cluster size for running the experiments within a time budget. The second step involves (i) exploration via using latin hypercube sampling (LHS) for generating different configurations to execute and (ii) exploitation via building and employing a Random Forest model for finding promising configurations to execute. In the final step, the best q configurations are used for executing the application on the production and determining the best one. UDAO [20] considers the tuning problem as a principled multi-objective optimization (MOO) approach that takes multiple, possibly conflicting, objectives, computes a Pareto optimal set of configurations (i.e., not dominated by any other configuration in all objectives), and returns one from the set that best suits the objectives. To be specific, it incrementally transforming a MOO problem to a set of constrained optimization (CO) problems, where each CO problem can be solved individually to return a Pareto optimal point. This paper is available on arxiv under CC BY 4.0 DEED. Authors: (1) Limeng Zhang, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia; (2) M. Ali Babar, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia. Authors: Authors: (1) Limeng Zhang, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia; (2) M. Ali Babar, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia. Table of Links Abstract and 1 Introduction Abstract and 1 Introduction 1.1 Configuration Parameter Tuning Challenges and 1.2 Contributions 1.1 Configuration Parameter Tuning Challenges and 1.2 Contributions 2 Tuning Objectives 2 Tuning Objectives 3 Overview of Tuning Framework 3 Overview of Tuning Framework 4 Workload Characterization and 4.1 Query-level Characterization 4 Workload Characterization and 4.1 Query-level Characterization 4.2 Runtime-based Characterization 4.2 Runtime-based Characterization 5 Feature Pruning and 5.1 Workload-level Pruning 5 Feature Pruning and 5.1 Workload-level Pruning 5.2 Configuration-level Pruning 5.2 Configuration-level Pruning 5.3 Summary 5.3 Summary 6 Knowledge from Experience 6 Knowledge from Experience 7 Configuration Recommendation and 7.1 Bayesian Optimization 7 Configuration Recommendation and 7.1 Bayesian Optimization 7.2 Neural Network 7.2 Neural Network 7.3 Reinforcement Learning 7.3 Reinforcement Learning 7.4 Search-based Solutions 7.4 Search-based Solutions 8 Experimental Setting 8 Experimental Setting 9 Related Work 9 Related Work 10 Discussion and Conclusion, and References 10 Discussion and Conclusion, and References 7.4 Search-based Solutions Search-based methods iteratively search for optimal configuration parameters by evaluating the system’s performance under different configurations. The core of search-based methods is to design the search strategy to handle the tradeoff between exploration(searching broadly across the configuration space to discover new and potentially promising regions) and exploitation (intensively searching in regions believed to be promising based on the available information). BestConfig [10] recommends configuration parameters through an iterative process consisting of the divide & diverge sampling for sample collection and the recursive bound & search for configuration parameter recommendation. Specifically, it first discretizes the parameter ranges for each parameter into several intervals (divide), and then selects a set of samples by considering each interval of a parameter once (diverge). In the recursive bound & search phase, it starts by searching the point with the best performance among the selected samples. Then, it searches from its surrounding points for better performance (bound). If such point are found, it conducts sampling around the new sample. This optimization process is iteratively conducted until no point with better performance is found. AutoTune [11] is an automatic parameter tuning system that aims to optimize application execution time on big data analytics frameworks. It executes experiments on both the production cluster and on a smaller-scale testbed to perform more test runs. It consists of three main steps. First, AutoTune executes the application using different data sizes and numbers of machines to build a parametric model. The model is used to select the data and cluster size for running the experiments within a time budget. The second step involves (i) exploration via using latin hypercube sampling (LHS) for generating different configurations to execute and (ii) exploitation via building and employing a Random Forest model for finding promising configurations to execute. In the final step, the best q configurations are used for executing the application on the production and determining the best one. UDAO [20] considers the tuning problem as a principled multi-objective optimization (MOO) approach that takes multiple, possibly conflicting, objectives, computes a Pareto optimal set of configurations (i.e., not dominated by any other configuration in all objectives), and returns one from the set that best suits the objectives. To be specific, it incrementally transforming a MOO problem to a set of constrained optimization (CO) problems, where each CO problem can be solved individually to return a Pareto optimal point. This paper is available on arxiv under CC BY 4.0 DEED. This paper is available on arxiv under CC BY 4.0 DEED. available on arxiv

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

Search, Tune, Repeat: The Search-Based Approach to Perfecting Cloud Performance

About Author

Comments

TOPICS

THIS ARTICLE WAS FEATURED IN

Related Stories

A New Testing Platform and Testing Platform Roles

The Noonification: Feature Optimization for Price Prediction (11/26/2023)

10 Ways to Optimize Your Database

10 Essential Computer Skills for Data Mining

10 Most Evolving Big Data Technologies to Catch Up on in 2022

Top 10 JavaScript Charting Libraries for Every Data Visualization Need

A New Testing Platform and Testing Platform Roles

The Noonification: Feature Optimization for Price Prediction (11/26/2023)

10 Ways to Optimize Your Database

10 Essential Computer Skills for Data Mining

10 Most Evolving Big Data Technologies to Catch Up on in 2022

Top 10 JavaScript Charting Libraries for Every Data Visualization Need

Light-Mode

Classic

Newspaper

Minty

Dark-Mode

Neon Noir

Minty

HN StartUps