Partial Ordering: An enigma wrapped inside of a riddle

Written by barryrevzin | Published 2017/02/25
Tech Story Tags: programming | cplusplus | partial-ordering | enigma-wrapped-riddle | cplus

TLDRvia the TL;DR App

In my opinion, the most complex rules in the C++ language have to do with the ordering of function templates and class template specializations. They are dense, difficult to follow, underspecified, involve jumping from section to section in the standard, and, on top of all of that, the major compilers don’t even agree with the standard anyway. Let’s explore what I mean.

When doing overload resolution, there’s a lot of steps we go through to determine which function to actually call. We rank different kinds of conversions, we prefer some kinds of references to others, we prefer non-templates to templates, all in careful order. The very last tie-breaker, before we give up and consider the call ambiguous, is to determine if one function template is more specialized than another. A simple example:

Obviously, #2 is more specialized, since pointer types are a strict subset of all types. That’s all well and good that we can figure this out — but what do the rules say about how the compiler is supposed to reason about this?

The process is, roughly, the compiler synthesizes new types for each template parameter of one function template and attempts to perform template deduction against the other function template. It then does this in the reverse order. If deduction succeeds in both directions, or neither direction, neither function template is more specialized than the other. If deduction only succeeds in one direction, then that is the reverse direction of specialization.

In this simple example, we first attempt to deduce #1 from #2. We synthesize some new type, call it UniqueA, and attempt to perform deduction:

Function: template <class T> void foo(T ); // #1Arguments: UniqueA*

This deduction succeeds, with T=UniqueA*. Now, we do the reverse. We synthesize another new type, call it UniqueB, and attempt to perform deduction:

Function: template <class T> void foo(T* ); // #2Arguments: UniqueB

This deduction fails. UniqueB is not a pointer type. Since deduction only succeeded in one direction, #2 → #1, that makes #2 the more specialized function template. This whole process answers our original question: which function do we call? #2.

It’s worth going through the entire process for a simple case, because we’re about to add a wrinkle that will tear our world apart: non-deduced contexts. We understand non-deduced contexts in the normal case of template deduction: template parameters in non-deduced contexts are not deduced (as the name may suggest!). Such parameters must either be provided explicitly or deduced from other arguments. If neither happens, deduction fails. In this example:

Determining what happens is straightforward. #1 is a viable candidate, deduction succeeds with T=int. In #2, T appears only in a non-deduced context and is not provided. Hence, deduction fails. We only have one viable candidate, so we’re done. This calls #1. But what happens if we add a wrinkle:

Now, both candidates are viable, and both are exact matches. As before, we’re down to the final tiebreaker: partial ordering. Which function template is more specialized? Unlike the earlier case of T vs T*, we don’t have any clear intuition here as to which should be more specialized. At least, I don’t. So let’s just follow the rules and find out. We start by synthesizing new types and perform deduction in both directions. Let’s start by deducing #1 from #2:

Function: template <class T> void foo(T );Arguments: identity_t<UniqueA>

Now, identity_t<UniqueA> isn’t UniqueA. It’s just some type. We don’t perform any instantiation in this context. We can think of this as some other unique type, call it UniqueA'. In this direction, deduction succeeds, with T=UniqueA'. What about the other direction:

Function: template <class T> void foo(identity_t<T> );Arguments: UniqueB

We have a non-deduced context, so deduction fails. Since deduction only succeeds in one direction, #2 → #1, #2 is considered the more specialized function template, so that’s the one that gets invoked. (In case there’s any doubt, every compiler agrees with this interpretation. )

Does this jibe with intuition? I’m not sure. Is identity_t<T> a subset of types of T? But hold on Dorothy, this ride’s about to get more fun. Let’ s add another parameter.

What if we add one more argument (I originally asked about this example on StackOverflow):

Here, even though T is a non-deduced context in the second parameter of #2, it’s deduced from the first parameter as int, so deduction succeeds without having to explicitly provide anything. How’s our intuition going to work for here? If identity_t<T> is more specialized than T, then surely specialization should work under composition and (T, identity_t<T>) should be more specialized than (T,T)? Well, intuition is overrated, let’s do the work.

Let’s deduce #1 from #2:

Function: template <class T> void foo(T, T);Arguments: UniqueA, identity_t<UniqueA> (or UniqueA, UniqueA')

This deduction fails, we deduce two different types for T: UniqueA and UniqueA'. What about the other direction?

Function: template <class T> void foo(T, identity_t<T> );Arguments: UniqueB, UniqueB

Deduction succeeds in the first argument, but what do we actually do in the second? What do we do in partial ordering when the parameter is a non-deduced context? The only guidance the standard gives us is in [temp.deduct.partial]:

In most cases, all template parameters must have values in order for deduction to succeed, but for partial ordering purposes a template parameter may remain without a value provided it is not used in the types being used for partial ordering. [ Note: A template parameter used in a non-deduced context is considered used. — end note ] [ Example:

template <class T> T f(int); // #1template <class T, class U> T f(U); // #2

void g() {f<int>(1); // calls #1}

— end example ]

What does that mean? T is a non-deduced context in #2, so in partial ordering, we could deduce U=int but we can’t deduce T. This point indicates that even though we can’t deduce T, we just consider that we could. Template parameters appearing only in non-deduced contexts count as if they were deduced. I think. Maybe.

This means, in our original example, that deduction succeeds #1 → #2 (we ignore the non-deduced context) but not #2 → #1 (we deduce different types for T), so we call #1: the (T,T) function, which is the opposite of what my intuition would have suggested. But here we also have our first divergence: while gcc, clang, and icc all agree that #1 is more specialized, MSVC considers the call ambiguous. Does that make MSVC wrong? I don’t actually know. Interestingly, MSVC does not consider the quoted standard example to be ambiguous. Why the difference in reasoning?

Let’s add another wrinkle on top of this old man skin. Class template partial specialization. How do we pick which class template partial specialization to use? We use the partial ordering rules. From [temp.class.order]:

For two class template partial specializations, the first is more specialized than the second if, given the following rewrite to two function templates, the first function template is more specialized than the second according to the ordering rules for function templates (14.5.6.2):

(1.1) — Each of the two function templates has the same template parameters as the corresponding partial specialization.

(1.2) — Each function template has a single function parameter whose type is a class template specialization where the template arguments are the corresponding template parameters from the function template for each template argument in the template-argument-list of the simple-template-id of the partial specialization.

Consider this example (which I also originally asked about on StackOverflow):

void_t is a popular way to write type traits, so seems appropriate to feature it in this post somehow. To answer the question of what this program prints, we have to know which specialization of A is chosen. And, as specified, that question is answered by determining which of these function is most specialized by the partial ordering rules:

Comparing #1 to #2 and #3, it’s clear that we can deduce #2 → #1 and #3 → #1 but not the reverse, so it’s between #2 and #3. Let’s try #3 → #2:

Function: template <class T> void foo(A<T, void_t<T>> ); // #2Arguments: A<UniqueA*, void>

This succeeds. We can match T=UniqueA* from the first parameter and then, as before, we ignore the non-deduced context. And #2 → #3:

Function: template <class T> void foo(A<T*, void> ); // #3Arguments: A<UniqueB, void_t<UniqueB>>

This fails. void_t<UniqueB> is not void (despite what the name and Walter Brown may tell you!) Since deduction only succeeds in one direction, great! #3 is the most specialized. We also have compiler unity on this: gcc, clang, icc, and MSVC all agree: #3 is the most specialized function here.

So back to the original question. What does A<int*, void>::value yield? It obviously yields 2 right, we just went through the process of determining this, all the compiler agreed with us!

Wrong. Every compiler consider this an ambiguous specialization between #2 and #3. This, to me, is quite baffling. The rule expressed in the standard is clear: the most specialized class template specialization is the one that gives the most specialized synthesized function template. All the compilers agree that #3 meets the bill for the latter, and yet they don’t agree that #3 is the most specialized class template specialization. This is a fascinating bit of compiler unity. I’m always hesitant to file a compiler bug even when I find one. But usually there’s there’s one compiler that’s right and one that’s wrong, I don’t even know what to do in this case. File bugs for everybody? How did this situation even arise?

Ultimately, this seems like a problem of underspecification. It would be nice if the standard gave more clarity on this subject. What do we actually do with non-deduced contexts in partial ordering? Especially as void_t and similar tools are used more commonly in code, it’s important to have clear guidance on what the code is actually supposed to do. SFINAE isn’t even well defined in class specializations and yet it’s prolific in it’s use there.

Regardless, there’s one thing that’s abundantly clear: C++ is really complicated.


Published by HackerNoon on 2017/02/25