Runtime-Based Workload Characterization in DBMS Tuning

Authors:

(1) Limeng Zhang, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia;

(2) M. Ali Babar, Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, Australia.

Table of Links

Abstract and 1 Introduction

1.1 Configuration Parameter Tuning Challenges and 1.2 Contributions

2 Tuning Objectives

3 Overview of Tuning Framework

4 Workload Characterization and 4.1 Query-level Characterization

4.2 Runtime-based Characterization

5 Feature Pruning and 5.1 Workload-level Pruning

5.2 Configuration-level Pruning

5.3 Summary

6 Knowledge from Experience

7 Configuration Recommendation and 7.1 Bayesian Optimization

7.2 Neural Network

7.3 Reinforcement Learning

7.4 Search-based Solutions

8 Experimental Setting

9 Related Work

10 Discussion and Conclusion, and References

4.2 Runtime-based Characterization

Besides its queries, a workload can also be characterized by its runtime behavior. Modern DBMSs expose extensive information about how a workload runs [8]. For example, MySQL’s InnoDB engine [24] provides statistics on the number of pages read/written, query cache utilization, and locking overhead. OtterTune [8] characterizes a workload with such numeric metrics to reflect various aspects of its runtime behavior. Researchers can also define new performance indicators tailored to specific requirements. For instance, RelM [19] profiles the memory allocation of a BDAF application under different configuration parameters, together with customized performance metrics that capture memory management decisions at multiple levels: the resource management level, the container level, the application level, and inside the Java Virtual Machine.
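
As a rough illustration of turning runtime statistics into a numeric workload signature, the sketch below converts two snapshots of cumulative counters into per-second rates. It is not OtterTune's actual pipeline. The function name `runtime_feature_vector`, the choice of counters, and all values are assumptions made for this example; the dictionary keys are only modeled on MySQL status variable names.

```python
from typing import Dict, List

# Cumulative counters used as features. The keys are modeled on MySQL status
# variable names; the selection itself is an assumption for this sketch.
COUNTER_METRICS: List[str] = [
    "Innodb_pages_read",
    "Innodb_pages_written",
    "Innodb_row_lock_time",
    "Qcache_hits",
]


def runtime_feature_vector(
    before: Dict[str, float],
    after: Dict[str, float],
    elapsed_seconds: float,
) -> List[float]:
    """Turn two counter snapshots into per-second rates (hypothetical helper)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # Counters only grow, so the difference is the activity in the window.
    return [
        (after[name] - before[name]) / elapsed_seconds
        for name in COUNTER_METRICS
    ]


# Made-up snapshots taken 10 seconds apart while a workload is running.
before = {
    "Innodb_pages_read": 1000,
    "Innodb_pages_written": 200,
    "Innodb_row_lock_time": 40,
    "Qcache_hits": 300,
}
after = {
    "Innodb_pages_read": 1500,
    "Innodb_pages_written": 260,
    "Innodb_row_lock_time": 90,
    "Qcache_hits": 800,
}

features = runtime_feature_vector(before, after, elapsed_seconds=10.0)
print(features)  # [50.0, 6.0, 5.0, 50.0]
```

Using rates rather than raw counter values makes vectors from observation windows of different lengths comparable, which is one plausible reason to normalize before comparing workloads.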


Apart from workload-related and runtime-related features, other factors associated with workload execution can also be considered, such as features of the execution environment and of the input data. LITE [21], which generates configurations for Spark applications, illustrates how to incorporate code semantics features and scheduler features alongside input data features (such as column number, rows, iteration number, and partition number) and cluster environment factors related to nodes, memory, CPU frequency, and bandwidth.
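
To make the idea of combining feature groups concrete, the sketch below concatenates input data features and cluster environment features into one flat vector. It is a minimal sketch, not LITE's actual encoding. The class names, the helper `build_context_vector`, the units, and the sample values are all assumptions, and the code semantics and scheduler features are left out because their encoding is not described here.

```python
from dataclasses import astuple, dataclass
from typing import List


@dataclass
class InputDataFeatures:
    # Field names follow the features listed in the text.
    column_number: int
    rows: int
    iteration_number: int
    partition_number: int


@dataclass
class ClusterFeatures:
    # Units (GB, GHz, Gbps) are assumptions made for this sketch.
    nodes: int
    memory_gb: float
    cpu_frequency_ghz: float
    bandwidth_gbps: float


def build_context_vector(
    data: InputDataFeatures, cluster: ClusterFeatures
) -> List[float]:
    """Concatenate both feature groups into one flat numeric vector."""
    return [float(v) for v in astuple(data)] + [
        float(v) for v in astuple(cluster)
    ]


# Made-up values for a single application run.
vector = build_context_vector(
    InputDataFeatures(
        column_number=20, rows=1_000_000, iteration_number=10, partition_number=64
    ),
    ClusterFeatures(
        nodes=4, memory_gb=32.0, cpu_frequency_ghz=2.4, bandwidth_gbps=10.0
    ),
)
print(len(vector))  # 8
```

In practice the groups differ widely in scale (row counts versus CPU frequency), so a model consuming such a vector would typically need some form of normalization first.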


This paper is available on arXiv under a CC BY 4.0 license.