Authors:
(1) Limeng Zhang, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia;
(2) M. Ali Babar, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia.
1.1 Configuration Parameter Tuning Challenges and 1.2 Contributions
3 Overview of Tuning Framework
4 Workload Characterization and 4.1 Query-level Characterization
4.2 Runtime-based Characterization
5 Feature Pruning and 5.1 Workload-level Pruning
5.2 Configuration-level Pruning
7 Configuration Recommendation and 7.1 Bayesian Optimization
10 Discussion and Conclusion, and References
Neural network (NN)-based solutions for automatic tuning use artificial neural networks to model the complex relationships between configuration parameters and system performance, enabling efficient tuning without exhaustive search or manual intervention.
7.2.1 NN-based solutions
Tan et al. propose iBTune [15] (individualized buffer tuning), which automatically reduces the buffer pool size of an individual database instance while maintaining the quality of service for its response time. It leverages information from similar workloads to determine the tolerable miss ratio of each instance. It then uses the relationship between miss ratio and allocated memory size, based on a large deviation analysis of the LRU caching model, to optimize the target buffer pool size of each instance individually. To meet the service level agreement (SLA), it designs a pairwise deep neural network (DNN) that predicts an upper bound on the request response time by learning features from measurements on pairs of instances. Specifically, in the training phase, the network takes as input the performance metrics (logical reads, I/O reads, queries per second (QPS), CPU usage, and miss ratio) of the target database instance and of a similar database instance, the response time (RT) of the similar instance, and an encoding of the current time. The output of the network is the predicted RT. In the testing phase, the previously observed sample of the target is selected as the similar instance. If the predicted RT meets the requirement, the newly computed buffer pool size is sent for execution.
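The input layout and the final gating step described above can be illustrated with a minimal sketch. It is not the iBTune implementation: the `InstanceMetrics` class, the one-hot hour encoding, and the `predict_rt_upper` callable (standing in for the trained pairwise DNN) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class InstanceMetrics:
    """Per-instance performance metrics named in the text (hypothetical container)."""
    logical_reads: float
    io_reads: float
    qps: float
    cpu_usage: float
    miss_ratio: float

    def as_list(self) -> List[float]:
        return [self.logical_reads, self.io_reads, self.qps,
                self.cpu_usage, self.miss_ratio]


def encode_hour(hour: int) -> List[float]:
    """One-hot hour of day. The actual time encoding is not specified here;
    this choice is an assumption."""
    vec = [0.0] * 24
    vec[hour % 24] = 1.0
    return vec


def build_pair_features(target: InstanceMetrics, similar: InstanceMetrics,
                        similar_rt: float, hour: int) -> List[float]:
    """Concatenate target metrics, similar-instance metrics, the similar
    instance's response time, and the time encoding into one input vector."""
    return target.as_list() + similar.as_list() + [similar_rt] + encode_hour(hour)


def decide_buffer_size(predict_rt_upper: Callable[[List[float]], float],
                       features: List[float], rt_limit: float,
                       new_size: int, current_size: int) -> int:
    """Apply the newly computed buffer pool size only if the predicted
    response-time upper bound stays within the required limit."""
    predicted = predict_rt_upper(features)
    return new_size if predicted <= rt_limit else current_size
```

In this sketch the resize is rejected simply by keeping the current size; how a real deployment reacts to a rejected size is outside what the text describes.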
The authors of [14] propose OtterTune-DNN, which adopts the OtterTune framework [8] and replaces its Gaussian process (GP) model with a neural network model. As in OtterTune [8], the raw data for each previous workload is collected and compared with the target workload. Data from the most similar previous workload are then merged with the target workload data to train a neural network model, which takes the place of the GP model in the original OtterTune. Finally, the algorithm recommends the next configuration to run by optimizing the model. The neural network model is a stacked two-layer network with 64 neurons, using ReLU as the activation function, with one dropout layer between the two hidden layers. Additionally, Gaussian noise is added to the neural network parameters to control the trade-off between exploration and exploitation; exploitation is increased over the course of the tuning session by reducing the scale of the noise.
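The role of the parameter noise can be sketched in plain Python. This is an illustrative toy under several assumptions, not the OtterTune-DNN code: the hidden width of 64 is applied to both hidden layers, the weights are random rather than trained, dropout is omitted because it is only active during training, and the linear decay in `noise_schedule` is one possible way of "reducing the scale of the noise". All function names are hypothetical.

```python
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (untrained toy model)."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def perturb(layer, sigma, rng):
    """Return a copy of the layer with Gaussian noise added to every parameter."""
    weights, bias = layer
    return ([[w + rng.gauss(0.0, sigma) for w in row] for row in weights],
            [b + rng.gauss(0.0, sigma) for b in bias])


def predict(layers, x, noise_sigma=0.0, rng=None):
    """Forward pass through two ReLU hidden layers and a linear output.
    With noise_sigma > 0 the parameters are perturbed first, so repeated
    calls disagree and the optimizer is pushed to explore."""
    if noise_sigma > 0.0:
        layers = [perturb(layer, noise_sigma, rng) for layer in layers]
    hidden = relu(dense(x, layers[0]))
    hidden = relu(dense(hidden, layers[1]))
    return dense(hidden, layers[2])[0]


def noise_schedule(step, total_steps, sigma_start=0.1, sigma_end=0.0):
    """Linearly shrink the noise scale over the session (assumed schedule)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return sigma_start + (sigma_end - sigma_start) * frac


rng = random.Random(0)
n_knobs = 4  # hypothetical number of tuned knobs
layers = [init_layer(n_knobs, 64, rng), init_layer(64, 64, rng), init_layer(64, 1, rng)]
```

Early in a session, a large `noise_sigma` makes the predicted performance of a configuration vary from call to call, which favors exploration; as the scale approaches zero, predictions become deterministic and the recommendation step exploits the learned model.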
LITE [21] tunes the configuration of Spark applications by using neural networks to analyze the application code semantics as well as the scheduling information gathered during execution. Its framework consists of an offline training phase, which learns to estimate the performance of a Spark application for a given knob configuration, and an online recommendation phase, which recommends appropriate knobs for a given Spark application. In the offline training phase, LITE proposes a code learning framework, Neural Estimator via Code and Scheduler representation (NECS), which extracts stage-level code and Directed Acyclic Graph (DAG) scheduler information as training features. These features are then concatenated with configuration features (knob values), data features, and computing environment features and fed to a Multi-Layer Perceptron (MLP) module that predicts execution time. In the online recommendation phase, LITE introduces an Adaptive Candidate Generation module to construct an initial configuration set. The module first uses a Random Forest Regression (RFR) model to determine the center of the promising knob range, based on factors such as the input data size and the specific application under consideration. It then establishes the knob search boundary, derived from the input data. Once these boundaries are defined, the module randomly samples a small number of candidates from the selected search space. Finally, it evaluates each candidate configuration using the NECS model and recommends the knob values with the best estimated performance. Meanwhile, an Adaptive Model Update module periodically fine-tunes the NECS model through adversarial learning once a predefined batch of new instances has been collected.
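The online recommendation steps can be sketched as follows. This is a simplified illustration rather than LITE's implementation: the center and half-width of the search range are passed in directly (in LITE they would come from the RFR model and the input data), `estimate_time` is a hypothetical stand-in for the NECS estimator, and the knob names in the usage lines are invented for the example.

```python
import random
from typing import Callable, Dict, List, Tuple

Knobs = Dict[str, float]


def sample_candidates(center: Knobs, half_width: Knobs,
                      limits: Dict[str, Tuple[float, float]],
                      n: int, rng: random.Random) -> List[Knobs]:
    """Uniformly sample n configurations from a box around the predicted
    center, clipped to each knob's legal range."""
    candidates = []
    for _ in range(n):
        cand = {}
        for name, mid in center.items():
            low = max(limits[name][0], mid - half_width[name])
            high = min(limits[name][1], mid + half_width[name])
            cand[name] = rng.uniform(low, high)
        candidates.append(cand)
    return candidates


def recommend(candidates: List[Knobs],
              estimate_time: Callable[[Knobs], float]) -> Knobs:
    """Return the candidate with the lowest estimated execution time."""
    return min(candidates, key=estimate_time)


# Usage with made-up knobs and a toy estimator in place of NECS.
rng = random.Random(42)
center = {"executor_memory_gb": 8.0, "executor_cores": 4.0}
half_width = {"executor_memory_gb": 2.0, "executor_cores": 1.0}
limits = {"executor_memory_gb": (1.0, 9.0), "executor_cores": (1.0, 8.0)}


def toy_estimate(knobs: Knobs) -> float:
    # Pretend the best setting is 7 GB and 4 cores.
    return abs(knobs["executor_memory_gb"] - 7.0) + abs(knobs["executor_cores"] - 4.0)


candidates = sample_candidates(center, half_width, limits, n=20, rng=rng)
best = recommend(candidates, toy_estimate)
```

Restricting the sampling to a small box keeps the number of estimator calls low, which is the point of generating candidates adaptively instead of searching the full knob space.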
This paper is available on arXiv under CC BY 4.0 DEED.