Adaptive Action Masking: Accelerating Decision-Making in Database Tuning

Written by instancing | Published 2025/12/24
Tech Story Tags: deep-reinforcement-learning | td3-td-swar-model | adaptive-action-masking | index-selection-problem | action-space-pruning | database-optimization-strategy | continuous-control-drl | workload-execution-efficiency

TL;DR: The TD3-TD-SWAR model advances database optimization by framing index selection as a DRL problem with adaptive action masking for faster training.

Abstract and 1. Introduction

  2. Related Works

    2.1 Traditional Index Selection Approaches

    2.2 RL-based Index Selection Approaches

  3. Index Selection Problem

  4. Methodology

    4.1 Formulation of the DRL Problem

    4.2 Instance-Aware Deep Reinforcement Learning for Efficient Index Selection

  5. System Framework of IA2

    5.1 Preprocessing Phase

    5.2 RL Training and Application Phase

  6. Experiments

    6.1 Experimental Setting

    6.2 Experimental Results

    6.3 End-to-End Performance Comparison

    6.4 Key Insights

  7. Conclusion and Future Work, and References

4 Methodology

In this section, we outline our novel approach to the Index Selection Problem (ISP), significantly advancing the application of Deep Reinforcement Learning (DRL). Our methodology not only frames the ISP within a DRL context but also introduces a groundbreaking RL model, the Twin Delayed Deep Deterministic Policy Gradients-Temporal Difference-State-Wise Action Refinery (TD3-TD-SWAR). Drawing on the innovative work of [12] and [15], TD3-TD-SWAR is specifically designed to efficiently manage the ISP’s complex solution space. A key novelty of our approach is the adaptive action masking mechanism, which accelerates training and sharpens decision-making by filtering out less beneficial actions based on current conditions, achieving optimal index selection with minimal decision steps.

4.1 Formulation of the DRL Problem

Deep Reinforcement Learning (DRL) offers a method for sequential decision-making to optimize database index configurations, addressing the Index Selection Problem (ISP) [7, 13]. The goal in ISP is for an agent to find an optimal index set 𝐼∗ from candidates to reduce workload execution costs. Given budget constraints, the agent strategically sequences new index additions, navigating a dynamic environment to find configurations that improve database performance over time. This framing casts the ISP as a fundamental DRL challenge of strategic decision-making under constraints.
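To make the formulation concrete, the sketch below models the ISP as a minimal episodic environment: the state is a binary vector of built candidate indexes, an action builds one candidate, and the reward is the resulting cost saving under a storage budget. The per-index `savings` values and the budget accounting are illustrative assumptions for this sketch, not the paper's cost model; a real system would query the optimizer's what-if estimates instead.

```python
import numpy as np

class IndexSelectionEnv:
    """Minimal MDP sketch of the Index Selection Problem.

    State:  binary vector marking which candidate indexes are built.
    Action: integer picking one candidate index to build.
    Reward: simulated cost saving for the chosen index (hypothetical
            values; a real agent would use optimizer what-if costs).
    """

    def __init__(self, n_candidates=8, budget=3, seed=0):
        rng = np.random.default_rng(seed)
        self.n = n_candidates
        self.budget = budget          # max number of indexes to build
        self.savings = rng.uniform(0.0, 10.0, size=self.n)
        self.reset()

    def reset(self):
        self.built = np.zeros(self.n, dtype=bool)
        return self.built.astype(np.float32)

    def step(self, action):
        assert 0 <= action < self.n
        reward = 0.0
        if not self.built[action]:
            # Building a new index yields its (simulated) cost saving.
            self.built[action] = True
            reward = float(self.savings[action])
        # Episode ends once the storage budget is exhausted.
        done = bool(self.built.sum() >= self.budget)
        return self.built.astype(np.float32), reward, done
```

An agent interacts with this environment in the usual loop: observe the state, pick an index, receive the cost saving as reward, and stop once the budget is spent.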

While basic Deep Reinforcement Learning (DRL) techniques are applicable, the challenge of a large action space, i.e., too many index candidates, necessitates an advanced solution. Our Twin Delayed Deep Deterministic Policy Gradients-Temporal Difference-State-Wise Action Refinery (TD3-TD-SWAR) model addresses this by refining action space pruning, ensuring efficiency and precision in tackling the complexities of index selection under storage and training limitations.
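The core pruning idea can be sketched as state-dependent action masking: before the agent commits to an action, candidates that are infeasible in the current state (already built, or too large for the remaining storage budget) are invalidated, and the best surviving action is chosen. The masking criteria below are assumed for illustration and are simpler than the paper's TD-SWAR module.

```python
import numpy as np

def masked_action(logits, built, remaining_budget, index_sizes):
    """Pick the best action after adaptive masking (illustrative).

    logits:           per-candidate scores from the policy/critic.
    built:            boolean vector of already-built indexes.
    remaining_budget: storage budget still available.
    index_sizes:      estimated storage size of each candidate.
    """
    # Feasible = not yet built AND fits in the remaining budget.
    mask = (~built) & (index_sizes <= remaining_budget)
    if not mask.any():
        return None  # no feasible index left; stop selecting
    # Infeasible actions are driven to -inf so argmax ignores them.
    masked = np.where(mask, logits, -np.inf)
    return int(np.argmax(masked))
```

Because the mask depends on the current state (built set and remaining budget), it adapts at every decision step, shrinking the effective action space the agent must explore.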

Authors:

(1) Taiyi Wang, University of Cambridge, Cambridge, United Kingdom ([email protected]);

(2) Eiko Yoneki, University of Cambridge, Cambridge, United Kingdom ([email protected]).


This paper is available on arXiv under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International) license.

