Authors:
(1) Ankur Nath, Department of Computer Science and Engineering, Texas A&M University;
(2) Alan Kuhnle, Department of Computer Science and Engineering, Texas A&M University.
We use the publicly available implementations of ECO-DQN[2] and GNNSAT[3] as our code base. To ensure a fair comparison, we employ the pretrained models from the original papers. However, for S2V-DQN applied to ER graphs with |V| = 60, we train from scratch because the pretrained model could not be loaded with the hyperparameters provided in the original paper. We provide all our experimental results, code, and data at this link[4].
For training on skewed graphs for the Max-Cut problem, we adopt the hyperparameters outlined in the original paper, modifying only the dataset and keeping all other settings intact.
Experimental Setup. We run all experiments on two NVIDIA RTX A4000 (16 GB memory) GPUs with Intel Xeon(R) w5-2445 CPUs at 3.10 GHz and 64 GB of memory.
In this section, we present the tables and plots omitted from the main paper due to space constraints.
Figures 4 and 5 present an analysis of the intra-episode behavior of the ECO-DQN and SoftTabu agents across a diverse range of distributions.
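For readers reproducing this kind of analysis, the sketch below shows how per-step traces can be recorded. It is a minimal illustration only: the environment and agent interface (env.reset(), env.step(action) returning the current cut value, agent.act(obs)) is an assumed simplification, not the exact API of the ECO-DQN code base.

```python
# Minimal sketch of recording an intra-episode best-cut trace (cf. Figures 4
# and 5). The env/agent interface is an assumed simplification, not the exact
# API of the ECO-DQN code base.
def record_episode(env, agent, max_steps=120):
    """Roll out one episode and record the best cut value seen at each step."""
    obs = env.reset()
    best_cut = float("-inf")
    trace = []
    for _ in range(max_steps):
        action = agent.act(obs)
        obs, cut_value, done, _ = env.step(action)  # cut value of current state
        best_cut = max(best_cut, cut_value)
        trace.append(best_cut)
        if done:
            break
    return trace  # one curve per episode; curves are averaged over test graphs
```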
Understanding the generalization of agents is crucial, as it determines their practical utility in real-world applications where the data distribution may vary or evolve over time. Tables 4 and 5 provide a detailed analysis of how well these agents adapt and perform in scenarios beyond their training data.
Table 4: Generalisation of agents trained on ER graphs of size |V| = 40 to unseen graph sizes and structures.
Table 5: Generalisation of agents trained on BA graphs of size |V| = 40 to unseen graph sizes and structures.
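As a concrete illustration of this evaluation setup, the snippet below generates out-of-distribution ER and BA test graphs with networkx. The edge probability and attachment parameter are placeholders, not necessarily the values used in our experiments.

```python
import networkx as nx

# Placeholder generation parameters; the exact experimental settings may differ.
ER_EDGE_PROB = 0.15   # ER connection probability (assumed)
BA_ATTACHMENT = 4     # BA edges added per new node (assumed)

def make_test_graphs(sizes, n_per_size=50, seed=0):
    """Generate unseen ER and BA graphs for generalization tests."""
    graphs = {"ER": [], "BA": []}
    for i, n in enumerate(sizes):
        for j in range(n_per_size):
            s = seed + i * n_per_size + j
            graphs["ER"].append(nx.erdos_renyi_graph(n, ER_EDGE_PROB, seed=s))
            graphs["BA"].append(nx.barabasi_albert_graph(n, BA_ATTACHMENT, seed=s))
    return graphs

# Agents trained on |V| = 40 are then evaluated on larger, unseen sizes.
test_graphs = make_test_graphs(sizes=[60, 100, 200, 500])
```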
Table 6 presents a performance analysis of agents trained on skewed graphs with |V| = 200. The primary focus is on how these agents perform on established benchmark tasks, with the best results shown in bold.
Table 6: Average performance of agents trained on skewed graphs of size |V| = 200 on known benchmarks (best in bold).
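On the metric itself: assuming "average performance" denotes the mean approximation ratio, i.e., the achieved cut value divided by the best-known cut value for each instance, it can be computed as below. This definition is an assumption stated for reproducibility, not a quote from the benchmark specification.

```python
def mean_approx_ratio(achieved, best_known):
    """Mean of per-instance ratios achieved/best-known (assumed metric)."""
    return sum(a / b for a, b in zip(achieved, best_known)) / len(achieved)
```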
In Table 7, we compare the performance of the learned heuristics with the WalkSAT algorithm. For WalkSAT, we set the noise parameter p to 0.5, following the experimental setup of GNNSAT, and we also tune the algorithm to find the optimal value of p.
Table 7: Performance of the learned heuristics and WalkSAT. For WalkSAT, we set p = 0.5 following the experimental setup of GNNSAT and also tune the algorithm for the optimal value of p. Each cell reports three metrics (top to bottom): the ratio of the average number of steps, the median number of steps, and the percentage of instances solved. *Values as reported in GNNSAT, for reference.
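For reference, a minimal WalkSAT sketch with noise parameter p follows. It is a simplified variant: on the greedy branch it flips the variable that minimizes the total number of unsatisfied clauses, whereas the canonical solver uses break counts and a "freebie" rule. It is an illustration of the procedure, not the implementation benchmarked above.

```python
import random

def walksat(clauses, n_vars, p=0.5, max_flips=10_000, seed=0):
    """Simplified WalkSAT; clauses are lists of DIMACS-style nonzero ints."""
    rng = random.Random(seed)
    assign = [rng.choice([False, True]) for _ in range(n_vars + 1)]  # 1-indexed

    def lit_true(lit):
        return assign[lit] if lit > 0 else not assign[-lit]

    def num_unsat():
        return sum(1 for c in clauses if not any(lit_true(l) for l in c))

    for step in range(max_flips):
        unsat = [c for c in clauses if not any(lit_true(l) for l in c)]
        if not unsat:
            return assign, step            # satisfying assignment found
        clause = rng.choice(unsat)
        if rng.random() < p:
            var = abs(rng.choice(clause))  # noise move: flip a random variable
        else:
            def unsat_after_flip(v):       # greedy move (simplified heuristic)
                assign[v] = not assign[v]
                n = num_unsat()
                assign[v] = not assign[v]
                return n
            var = min({abs(l) for l in clause}, key=unsat_after_flip)
        assign[var] = not assign[var]
    return None, max_flips                 # flip budget exhausted

# Example: a tiny satisfiable instance.
# model, steps = walksat([[1, -2], [-1, 2], [2, 3]], n_vars=3)
```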
[2] Code available at: https://github.com/tomdbar/eco-dqn
[3] Code available at: https://github.com/emreyolcu/sat
[4] Code available at: https://tinyurl.com/52ykxtaj
This paper is available on arxiv under CC 4.0 DEED license.