Introduction:
Today's fundamental question is the same as last time's, with one alteration: we'll be focusing on techniques for implementing structs and classes rather than functions. The big question we'll be answering is: When writing code that can be instantiated with many types, how do I reuse code for some types but select different fundamental implementations for others? What if there are patterns to the "exceptions" to the "default code"?
Why is this important? It facilitates composition: I can have a function foo
templated over any type T
and know that I can construct/use a MyType<T>
even if the behavior is different from one type to the next. I don't need to call a MyTypeForNumerics<T>
when T
is a numeric type like int
or double
and a MyTypeForStrings<T>
when T
is some string type.
We'll be analyzing how the technique we learned in the last article applies to structs and what its limitations are in that context. We'll then look at a technique that lets us overcome those limitations, but that has a few of its own.
Case study:
The case study we'll be using today is std::unique_ptr
. The short story on that class is that it wraps a pointer and properly deletes it when that wrapper object destructs (falls out of scope). The key fragment of that description is 'properly deletes it'. If the pointer referred to a single value, we need to call delete ptr;
, but if it referred to an array, we must call delete[] ptr;
, otherwise, the implementations are essentially identical. In other words: I want to present a single interface, std::unique_ptr
that takes single values (like unique_ptr<int>
) and array types (like unique_ptr<int[]>
) for which a portion of the implementation is different. Today we'll be focusing on the particular fragment of unique_ptr
's implementation that enables this difference in freeing resources.
So, beginning with discussing our last technique and moving on to a new one, let's start answering the questions from this article's prelude.
Q: What was the goal of overloading/tag dispatch? Why was it useful for implementing multiple fundamental implementations of a particular function?
A: The goal of overloading was to allow us to write 2 distinct function bodies to perform a single conceptual task; without overloading those functions would have to have different names. The goal of tag dispatch was to select between those overloads; we used type information in combination with overload resolution to perform that selection.
Q: Can we apply what we learned in the last article to provide multiple fundamental implementations of a struct? What are the limitations of overloading/tag dispatch when it comes to writing structs?
A: To an extent. If you'll recall from the first article in the series, you cannot overload a struct, only functions. The bright side is that overloading works just fine with member functions. Here's a quick example of tag dispatch with a member function:
struct IntegralValueCounter {
unsigned integralValuesSeen() { return integralCounter; }
template<typename T>
void readValue(const T& value) {
//No runtime hit to select an overload
readValueImpl(value, typename std::is_integral<T>::type{});
}
private:
template<typename T>
void readValueImpl(const T& value, std::true_type) { ++integralCounter; }
template<typename T>
void readValueImpl(const T& value, std::false_type) {} //No-op
unsigned integralCounter = 0u;
};
Nothing there should look too foreign by now. If something's confusing, I recommend revisiting the last article and/or posting below in the comments section.
One nice property of doing this in a struct is that we can hide our Impl functions nicely in the private section, so that's a small victory.
The downside to using tag dispatch with structs is that you might need to do it for several member functions... or all of them. If I had 10 member functions and the behavior of those functions was completely different for 2 different kinds of types, tag dispatch holds less appeal. In such a case, tag dispatch requires writing 3 function bodies for each function I want in my interface (not even counting helpers/private functions). That gets unwieldy pretty fast.
In context of unique_ptr
:
For our purposes, most of the unique_ptr
implementation is actually the same over both kinds of types (single values and arrays), but tag dispatch isn't directly applicable because the part that varies is exposed directly to the user and subject to their input. By default unique_ptr
assumes you allocated the memory the pointer refers to with new
, and so it should call either delete
or delete[]
, but it's possible that you'd allocate some memory with malloc
instead (perhaps to avoid initializing it/calling constructors), in which case you'd need to call free
. unique_ptr
facilitates this possibility by templating over the deleter. In particular, the declaration of unique_ptr
is:
template<typename T, typename Deleter = std::default_delete<T>>
class unique_ptr;
Because there are potentially infinitely many "fundamental implementations" of unique_ptr
, given by potentially infinitely many deleters a client might instantiate it with, tag dispatch isn't going to help us. This interface for std::unique_ptr
is actually pretty cool; we brought users into the discussion of how our object should behave. As long as they provide a type that accepts a T*
when called like a function, it's compatible as a deleter.
The bit we'll be implementing today is the std::default_delete
struct, which serves as a functor whose function operator overload calls the correct form of delete. We could use tag dispatch to implement this, but since the behavior of all ofdefault_delete<T>
is different from all ofdefault_delete<T[]>
, it seems more appropriate to use a new technique to generate less clutter.
Q: What technique allows us to similarly divide one concept, in this case, a type, into multiple implementations?
A: Partial template specialization. Essentially this technique says, "If the type(s) I'm templated over look like this, select this version". The interesting part is that unlike using tag dispatch for member functions, with this technique the whole struct is distinct/different [1]. Here's what that looks likes [2]:
template<typename T>
struct default_delete {
void operator()(T* ptr) const {
delete ptr;
}
};
template<typename T>
struct default_delete<T[]> {
void operator()(T* ptr) const {
delete[] ptr;
}
};
Things to pay attention to:
1) There are two entire copies of the struct. If I wrote a member foo()
in the first, but not the second, then you'd only be able to call foo()
when default_delete
was instantiated with a non-array. We replace everything.
2) A side-effect of the first item is that this implementation detail, using partial template specialization, is not something you can easily hide from your users; you can't tuck it away in some Impl function like you can with tag dispatch. The code responsible for partial template specialization is part of your interface. Most of the time this is a non-issue, but it's worth being aware of.
3) Each version calls the appropriate form of delete.
4) The line struct default_delete<T[]> {
. This is where the partial template specialization "happens". We say: for types where T[]
matches the whole template argument that was passed to us, select this implementation. It's a pattern matching system. Even though the non-specialized implementation would also match types of this form, the C++ language prefers/selects more specific/specialized templates.
And that's how they do it for unique_ptr
[2]. (How anti-climactic)
Q: In what ways can you select implementations using partial template specialization? For example, how might I apply the technique to single cases that must be handled specially (like vector<bool>
being different from all other implementations of vector
) vs. patterns of differences (as in how the implementation of unique_ptr<T>
varies from unique_ptr<T[]>
)?
A: You've already seen how the technique applies to unique_ptr
. Let's take a look at what other kinds of pattern matching are possible with partial template specialization:
template<typename T>
struct Foo {};
template<>
struct Foo<int> {};
This has a specialized version for the type int
. Notice how we 'fixed' the type in the specialization to a specific type. It will only accept the type int
(const int
would not go to the partial specialization, however). Because the type is not able to vary, we do not include a typename T
in the compile-time (template) arguments. The template<>
portion is significant and must be present to tell the compiler that you're talking about a specialization on a templated struct. Note: This is similar to how the vector<bool>
specialization is done.
template<typename T>
struct Bar {};
template<typename T>
struct Bar<vector<T>> {};
This has a specialized version for when you instantiate with a vector
holding any type T
, as in Bar<vector<int>>
.
template<typename T1, typename T2>
struct Baz {};
template<typename T>
struct Baz<T, int> {};
template<>
struct Baz<double, int> {};
This has 2 partial specializations.
The first specialization is used when T1
is anything and T2
is int
. Note how the template portion refers to only one varying type. Also see how that varying type must be repeated after the struct name to disambiguate whether it is the first or second template parameter that is being specialized. I chose to change the name of the templated type to T
in the specialization only to show that the name itself has no meaning, feel free to continue calling it T1
if you prefer that.
The second specialization is used only when you instantiate Baz with precisely a double
and an int
in that order. Note that the template portion is still necessary and should be empty. Even though the first partial template specialization would also match this case, because this version is more constrained/precise, it is the one that gets used when instantiating Baz<double, int>
.
template<typename T, bool = std::is_integral<T>::value>
struct Widget {};
template<typename T>
struct Widget<T, true> {};
Here the partial specialization accepts any integral type (int
, char
, long long
, etc). This one was a bit more complicated, we artificially introduced a value of type bool to the compile-time (template) arguments whose default value indicates whether or not T
is integral. We then partially specialized over that value, so that we can have an implementation that is distinct for all integral types. Our hope is that the user ignores the bool in the interface, though you could hide this implementation detail by adding an Impl class or factory to create a layer of indirection to hide it in.
For some of you, having a value rather than a type as a template parameter will be an unusual sight. I like to think of the template part of the code as a set of "compile time parameters", of which some happen to be types. Think of it this way and it should hopefully make more sense. Note: I could have also used typename std::is_integral<T>::type
, and overloaded on std::true_type
but personally I find the code is a little more ugly that way.
Q: What are the limitations on the scope of partial template specialization? It may be useful to review the answer to the first question of this article when answering this.
A: The opposite of our problem with tag dispatch: We must provide a completely different implementation of our struct/class in each partial specialization. If there are things they would share in common, in the naive solution those pieces must be duplicated. With tag dispatch we change small portions of the whole, with partial template specialization we change everything. [1]
We actually already saw one way to overcome this limitation earlier in this article: move the portion of the code that is different into a separate struct, then partially specialize that struct and use it as a data member inside the larger whole (as std::unique_ptr
does with std::default_delete
).
We will explore other ways of overcoming this limitation in future articles.
Recap:
We answered the fundamental question: When writing code that can be instantiated with many types, how do I reuse code for some types but select different fundamental implementations for others? What if there are patterns to the "exceptions" to the "default code"? Specifically, we answered that question in the context of objects: structs and classes.
We reused the technique of tag dispatch from the last article to demonstrate how to have multiple versions of member functions, selected by type. We then noted that for an object where most of the behavior changed when it was templated with one type versus another, that tag dispatch became a little unwieldy and came with an increase in code-size that varied linearly with the number of functions.
It became apparent that if we want to bring users into the conversation of "how can the behavior of this object vary?", that tag dispatch is too limiting; it does not allow users of our code to provide their own implementations of the varying behavior. So we looked at what the STL does with unique_ptr
: it separates out the varying behavior into its own struct which becomes the default argument to a template parameter on unique_ptr
. This is great because it means that the unique_ptr
implementation for single values and arrays is the same; it's only the deleter that varies!
We learned a new technique for code reuse when working with objects: partial template specialization. This facilitates reuse because it allows the user of our code to remain ignorant of implementation details and just use unique_ptr
, lets say, over any type, without worrying too much about what type we give it. This is especially great because it facilitates composition: we can have a function that accepts a template type V
and then makes a unique_ptr<V>
, knowing that that line will compile regardless of whether or not V
is an array type.
We noticed that partial template specialization is a technique that shows up in your interface. It's harder to disguise.
We also examined what kinds of patterns can be used with partial template specialization, including complex ones using type traits.
Lastly, we talked about the limitations of partial template specialization: we noticed that it requires that you change the content of the entire struct. You cannot trivially opt to only substitute the implementation of a few functions with this technique. We discussed one way to overcome this limitation, but left more complex strategies as subject matter for future articles.
Footnotes:
[1] For completeness' sake, there is actually something called "total specialization", which lets you specialize individual member functions of a class template. If you can specify what all of the template types are for your function implementation "totally specializing it" on exactly some set of types, you can do this. It looks like the following:
template<typename T1, typename T2>
struct Foo {
void bar() {}
};
template<> void Foo<int, int>::bar() {
cout << "Foo<int, int>::bar is totally specialized and does something special" << endl;
}
I don't find this to be particularly useful, as it cannot be any more general than this. I can't leave one of my two template types variable and use this technique. It's all or nothing.
[2] A proper implementation of std::default_delete
also has constructors that vary between the two implementations. The revised C++17 version also separately templates the type of the pointer argument the function operator overload.