This paper is available on arxiv under CC 4.0 DEED license.
Authors:
(1) Leah Chrestien, Czech Technical University in Prague;
(2) Tomáš Pevný, Czech Technical University in Prague, and Gen Digital, Inc.;
(3) Stefan Edelkamp, Czech Technical University in Prague;
(4) Antonín Komenda, Czech Technical University in Prague.
Let’s now design a loss function minimizing the number of violated conditions in Definition 1 for a problem instance (Γ, s0, S ∗ ) and its optimal plan π. Assuming a heuristic function h(s, θ) with parameters θ ∈ Θ, the number of violated conditions can be counted as
where
While optimizing the surrogate loss might look like optimizing a different problem, it has been shown that, for certain classes of functions, minimizing the surrogate loss solves the original problem (3) as the number of training samples approaches infinity [35, 50].
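As a rough illustration (not the paper's exact conditions from Definition 1), the idea of replacing a non-differentiable 0/1 count of ranking violations by a smooth surrogate can be sketched as follows; the pairwise simplification, the function names, and the logistic surrogate are assumptions for the sketch:

```python
import math

def l01(f_on_plan, f_off_plan):
    """0/1 count of violated ranking conditions: every off-plan state
    should receive a strictly larger f-value than every on-plan state
    (a hypothetical pairwise simplification of Definition 1)."""
    return sum(
        1
        for f_on in f_on_plan
        for f_off in f_off_plan
        if f_off <= f_on  # violated: off-plan state would be expanded first
    )

def surrogate(f_on_plan, f_off_plan):
    """Differentiable logistic upper bound on the 0/1 count above,
    suitable for gradient-based training of h(s, theta)."""
    return sum(
        math.log1p(math.exp(f_on - f_off))
        for f_on in f_on_plan
        for f_off in f_off_plan
    )
```

The surrogate upper-bounds the 0/1 count pair by pair, so driving it to zero also removes every violated condition.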
Cost-to-goal provides a false sense of optimality It has already been mentioned above that A* with h∗ is strictly optimally efficient only up to resolving ties among states with the same value of g(s) + h∗ (s). This means that for problems with a large (even exponential) number of optimal solutions of equal cost (see [23] for examples), A* will be very inefficient unless some tie-breaking mechanism is used. Moreover, since learning is inherently imprecise, a heuristic that is close but not equal to h∗ will likely be less efficient than one that is close to a perfect-ranking heuristic.
Size of the solution set The set of all functions satisfying Equation (5) is likely smaller than the set satisfying Equation (3), because the solution set of (3) is invariant to transformations by strictly monotone functions, whereas the solution set of (5) is not. From this, we conjecture that finding a solution of (5) is more difficult. A precise theorem is difficult to prove, since a solution of (5) is not necessarily a solution of (3) due to the tie resolution in Definition 1.
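The invariance claim can be checked on toy values: a strictly increasing transform preserves every pairwise ordering f(a) < f(b), so solutions of the ranking conditions come in whole monotone-equivalence classes, while an exact-value condition like h(s) = h∗(s) is destroyed by the same transform. The states and numbers below are illustrative, not from the paper:

```python
# Toy heuristic values for three hypothetical states.
f = {"s1": 2.0, "s2": 5.0, "s3": 7.0}

# Strictly increasing on these values, so all pairwise orderings survive.
phi = lambda x: x ** 3 + 1.0

order_before = sorted(f, key=f.get)
order_after = sorted(f, key=lambda s: phi(f[s]))
# order_before == order_after: the ranking is unchanged,
# but phi(f[s]) != f[s], so any exact-value condition now fails.
```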
Conflicts in training set Unlike the regression loss, the proposed family of ranking losses is potentially sensitive to conflicts in the training set, which occur when the training set contains examples of two (or more) different solution paths for the same problem instance. Such conflicts may prevent achieving zero loss. Their effect on the heuristic depends on the composition of the training set: either the solution paths that are more frequent in the training set will be preferred, or there will be ties, but their resolution does not affect optimal efficiency. Appendix 8.5 provides an illustrative example.
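A minimal sketch of such a conflict, with hypothetical states "x" and "y": two optimal plans for the same instance induce contradictory pairwise preferences, so no assignment of f-values can satisfy both and the 0/1 ranking loss stays nonzero.

```python
pairs_plan_a = [("x", "y")]  # plan A wants x expanded before y
pairs_plan_b = [("y", "x")]  # plan B wants y expanded before x

def violations(f, pairs):
    # A pair (good, bad) is violated when the preferred state does not
    # receive the strictly smaller f-value.
    return sum(1 for good, bad in pairs if f[good] >= f[bad])

f = {"x": 1.0, "y": 2.0}
total = violations(f, pairs_plan_a) + violations(f, pairs_plan_b)
# Whatever values f takes, at least one of the two pairs is violated,
# so the minimum achievable 0/1 loss is 1, not 0.
```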