Sunday, February 17, 2013

Concept Based Polymorphism

Problem

We often wish to collect data into containers in order to manage large numbers of things. To put it less correctly but more clearly, we want collections of objects that represent a family. This is wrong because fundamental types aren't often thought of as objects, and we aren't really talking about families, but rather about things to be processed a certain way. Why this distinction matters will, I hope, become clear.

Base Class Solution

The almost reflexive solution is to derive everything from a common base class. Ideally this is an interface class that is pure virtual. There are a number of problems with this. Many experts call inheritance the most abused feature of C++, and I agree.

Fundamental types and STL types do not inherit from anything, so they would have to be wrapped. Objects often bridge responsibilities, requiring either multiple interfaces or overly broad interfaces (kitchen sink classes).
Classes are often shoehorned into an interface, as in the famous birds analogy. When ostriches and emus refuse to fly, are they no longer birds?

Inheritance is often not a good solution.

Internal Polymorphism

Classes could be internally polymorphic. They could rely on internal flags to determine what kind of thing they are today, and this one, amorphous blob class can, internally, be whatever it wants. This is too horrific to consider further.

Adaptor and Decorator Classes

We could wrap the class. How much do we wrap? Do we need every function adapted? Doesn't that start to bring back the same problem? Wrapping can also be tedious, as it is difficult to automate. What if a class needs to behave as if it has functionality, but it doesn't? What if the classes have different memory sizes? We are getting close, but aren't quite there.

Solution

Wrapping data has some definite benefits. We don't need to modify the classes. They can be written in a way that best expresses their use without imposing design from outside. But how do we solve the wrapping problems? We can use Concepts and non-member functions.

Concepts

A Concept is a generic programming term that describes the operations a type must support. It's like an interface class, but there is no inheritance and, as of now, no compiler support, but that's OK. We can still use them. A Concept is really just a generic programmer telling the programmers who use their code what functions and signatures are required. We've been using these all along. When we increment an iterator, there is no base class that defines that functionality. It's part of the documentation. Therefore, we can pass iterators into generic algorithms that expect that functionality.
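As a minimal sketch of a Concept in this documentation-only sense, consider the template below. The name SumRange and its stated requirements are my own illustration, not taken from any library. It only requires that its iterator can be compared with !=, incremented with ++, and dereferenced with *. No base class is involved, so a raw pointer, a vector iterator, and a list iterator all qualify.

```cpp
#include <list>
#include <vector>

// Hypothetical example. Requirements on It (the "concept"), stated only in
// this comment and checked by the compiler when the template is instantiated:
//   a != b, ++a and *a must be valid, and *a must be convertible to int.
template <typename It>
int SumRange(It first, It last)
{
    int total = 0;
    for (; first != last; ++first) total += *first;
    return total;
}

// Three unrelated iterator types with no common base class.
inline int SumArray()
{
    int raw[] = {1, 2, 3};
    return SumRange(raw, raw + 3);            // int* models the concept
}

inline int SumVector()
{
    std::vector<int> v = {4, 5, 6};
    return SumRange(v.begin(), v.end());      // so does a vector iterator
}

inline int SumList()
{
    std::list<int> l = {7, 8, 9};
    return SumRange(l.begin(), l.end());      // and a list iterator
}
```

A type that lacks one of the required operations simply fails to compile when it is passed in, which is the only enforcement a documented Concept gets.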

Non-Member Functions Increase Encapsulation

I know. This is just wrong! Wrong, wrong, wrong, wrong, wrong, wrong, wrong!

Except it's not.

The reason we encapsulate things is to protect us against change. If I directly manipulate the data in a struct, that would not be well encapsulated. On the opposite extreme, it's hard to get better encapsulation than an object whose entire functionality resides in its constructor and destructor.

Encapsulation is measured by the number of functions that have to be changed if we change the implementation of our class. A class with n functions is more encapsulated than a class with n+1 functions. Non-member, non-friend functions do not increase n.

Here's an even bigger one. By using non-member functions, we effectively have polymorphism via function overloading. Rather than object->Draw(), and needing VTables, we can Draw(object) and leave it to compile time. This makes the problem much easier to solve.
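Here is a small sketch of that idea, assuming two made-up types, Circle and Label, that share no base class. The Draw overloads are non-member, non-friend functions, and the call Draw(object, out) is resolved at compile time from the static type of the argument.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical types with no common base class and no virtual functions.
struct Circle { double radius; };
struct Label  { std::string text; };

// Non-member, non-friend overloads. They only use the public interface,
// so they do not add to the number of functions that see the internals.
void Draw(const Circle& c, std::ostream& out) { out << "circle r=" << c.radius << '\n'; }
void Draw(const Label& l,  std::ostream& out) { out << "label " << l.text << '\n'; }

// The overload is picked at compile time; no VTable is involved.
template <typename T>
std::string Render(const T& object)
{
    std::ostringstream out;
    Draw(object, out);
    return out.str();
}
```

Calling Render with a type that has no Draw overload is a compile error rather than a run-time surprise.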

Really, The Solution

Always start by writing the code you wish to write.
class MyClass
{
};

void Draw(const MyClass& mc, ostream& out, size_t position)
{
  out << "MyClass" << endl;
}

int main()
{
  Document document;
  document.reserve(5);

  document.emplace_back(0);
  document.emplace_back(string("Hello"));
  document.emplace_back(document);
  document.emplace_back(MyClass());

  reverse(begin(document), end(document));
  reverse(begin(document), end(document));

  Draw(document, cout, 0);
  return 0;
} 
We have a collection of things that have no relation whatsoever to one another, yet we are adding them to a common container. How does this work?

Here's the whole thing.
#pragma once
#include <iostream>
#include <vector>
#include <string>
#include <memory>

template <typename T>
void Draw(const T& x, std::ostream& out, size_t position)
{
 out << std::string(position, ' ') << x << std::endl;
}

class DrawConcept
{
public:
 template <typename T>
 DrawConcept(const T& x) : m_object(new model<T>(x))
 {
  std::cout << "Ctor" << std::endl;
 }

 DrawConcept(const DrawConcept& x) : m_object(x.m_object->copy_())
 {
  std::cout << "Copy" << std::endl;
 }

 DrawConcept(DrawConcept&& x) : m_object(std::move(x.m_object)) {;}

 DrawConcept& operator=(DrawConcept x)
 {
  m_object = std::move(x.m_object); return *this;
 }

 friend void Draw(const DrawConcept& x, std::ostream& out, size_t position)
 {
  x.m_object->draw_(out, position);
 }

private:
 struct concept_t
 {
  virtual ~concept_t() {;}
  virtual concept_t* copy_() = 0;
  virtual void draw_(std::ostream& out, size_t position) const = 0;
 };

 template <typename T>
 struct model : concept_t
 {
  model(const T& x) : m_data(x) {;}
  concept_t* copy_() { return new model(*this); }
  void draw_(std::ostream& out, size_t position) const
  {
   Draw(m_data, out, position);
  }
   T m_data;
 };
 std::unique_ptr<concept_t> m_object;
};

typedef std::vector<DrawConcept> Document;

void Draw(const Document& x, std::ostream& out, size_t position)
{
 out << std::string(position, ' ') << "<document>"<< std::endl;
 for(auto& e : x) Draw(e, out, position + 2);
 out << std::string(position, ' ') << "</document>"<< std::endl;
}
To see how it works, let's take it apart.
struct concept_t
  {
    virtual ~concept_t() {;}
    virtual concept_t* copy_() = 0;
    virtual void draw_(std::ostream& out, size_t position) const = 0;
  };
This is just a simple interface class. Note the copy_(). This is to ensure we get a copy of the actual derived class, not the base class. This function is also commonly named clone().

The derived class is next.
template <typename T>
  struct model : concept_t
  {
    model(const T& x) : m_data(x) {;}
    concept_t* copy_() { return new model(*this); }
    void draw_(std::ostream& out, size_t position) const
    {
      Draw(m_data, out, position);
    }
     T m_data;
  };
This is the wrapper around the draw-able object. By deriving from concept_t we get an object that we can hold without knowing its type. By templatizing the class, we can hold anything. copy_() returns a copy of the derived class, but the real trick is the draw function. Here we take the virtual member draw_() and forward it to the non-member Draw(), allowing compile-time polymorphism.

There's just one more piece.
template <typename T>
 DrawConcept(const T& x) : m_object(new model<T>(x))
 {
    std::cout << "Ctor" << std::endl;
 } 
Even though the constructor is templatized, this is not a template class. That's important, because if it were, every DrawConcept<T> would be a different type and they could not share one container.

That's it.

The template function at the top is optional. It provides a common way to draw, but can be overloaded if an object has a different way to draw such as MyClass above.

The rest is just copy and assignment code common to most classes.

Conclusion

There are alternatives to typical inheritance. By using concept based polymorphism, you can leave your class interface alone to be the best representation of your abstraction and not a collection of externally forced design. Returning to those birds that refuse to fly, simply don't add them to a fly concept. If you do, rather than have a silent error running in your game, the compiler will squawk (pun intended) and the error gets fixed.
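To make the bird example concrete, here is a reduced sketch that reuses the DrawConcept structure for flying. The names FlyConcept, Sparrow, Kite, Ostrich and FlyAll are hypothetical and exist only for this illustration. Ostrich has no Fly overload, so trying to add one to the flock is rejected by the compiler.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Sparrow {};
struct Kite {};
struct Ostrich {};                        // deliberately has no Fly overload

std::string Fly(const Sparrow&) { return "sparrow flies"; }
std::string Fly(const Kite&)    { return "kite flies"; }

// Same shape as DrawConcept, reduced to the essentials.
class FlyConcept
{
public:
    template <typename T>
    FlyConcept(const T& x) : m_object(new model<T>(x)) {}
    FlyConcept(const FlyConcept& x) : m_object(x.m_object->copy_()) {}
    FlyConcept(FlyConcept&&) = default;

    friend std::string Fly(const FlyConcept& x) { return x.m_object->fly_(); }

private:
    struct concept_t
    {
        virtual ~concept_t() {}
        virtual concept_t* copy_() const = 0;
        virtual std::string fly_() const = 0;
    };
    template <typename T>
    struct model : concept_t
    {
        model(const T& x) : m_data(x) {}
        concept_t* copy_() const { return new model(*this); }
        std::string fly_() const { return Fly(m_data); }   // non-member call
        T m_data;
    };
    std::unique_ptr<concept_t> m_object;
};

std::vector<std::string> FlyAll()
{
    std::vector<FlyConcept> flock;
    flock.emplace_back(Sparrow());
    flock.emplace_back(Kite());
    // flock.emplace_back(Ostrich());   // does not compile: no Fly(const Ostrich&)

    std::vector<std::string> log;
    for (const auto& flier : flock) log.push_back(Fly(flier));
    return log;
}
```

Ostrich itself is untouched by any of this. It is still a perfectly good class, it just never claims to fly.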

References

Sean Parent's excellent talk

Adobe Run Time Concepts

Tuesday, January 15, 2013

Swaptimizations - exploiting copy elision and RVO

Performance critical code for years has relied on passing everything by pointers to heap allocated objects. Often this is not the best, or even the fastest way to do this. Passing by value and returning by value use stack based entities which are much faster to allocate, de-allocate, and don't cause fragmentation of memory.
"C++ is my favorite garbage collected language because it generates so little garbage" - Bjarne Stroustrup
This is true if you write in such a way as to make this true. Smart pointers made it pretty easy to eliminate memory leaks, but it can be even better. Using RVO and copy elision we can pass by value pretty much all the time without the overhead of temporaries. Consider this code:
Texture MakeBigTexture()
{
  Texture bigTexture;   // note: "Texture bigTexture();" would declare a function
  // ... do stuff to bigTexture ...
  return bigTexture;
}

void Init()
{
  Texture temp = MakeBigTexture();
}
This code seems wrong. What about the giant temporary? It doesn't exist. Compilers have long taken advantage of Return Value Optimization or RVO. The way it works is that the compiler knows the return is temporary. It also knows where it will ultimately end up. Why not put the temporary where it will ultimately end up? And that's what it does. 
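One way to check this claim is to count copies. The sketch below uses a stand-in Texture of my own (the real class is not shown in this post) whose copy constructor increments a counter. With a C++11 compiler the local is either constructed in place or, at worst, moved, so the counter stays at zero.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for Texture that makes deep copies observable.
struct Texture
{
    static int copies;
    std::vector<unsigned char> pixels;

    Texture() : pixels(1024 * 1024) {}
    Texture(const Texture& other) : pixels(other.pixels) { ++copies; }
    Texture(Texture&& other) noexcept : pixels(std::move(other.pixels)) {}
};
int Texture::copies = 0;

Texture MakeBigTexture()
{
    Texture bigTexture;
    bigTexture.pixels[0] = 255;   // "do stuff"
    return bigTexture;            // elided, or moved if the compiler declines
}

// Returns how many deep copies the Init() pattern above performed.
int CopiesAfterInit()
{
    Texture::copies = 0;
    Texture temp = MakeBigTexture();
    (void)temp;
    return Texture::copies;
}
```

CopiesAfterInit() reports zero deep copies of the megabyte buffer, which is the point of the paragraph above.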
Copy elision is the opposite direction: a temporary passed by value is constructed directly in the parameter. Here's an example:
string FlipString(string str)
{
  reverse(str.begin(), str.end());
  return str;
}

string Source()
{
  return "Flip This String";
}

int main()
{
  cout << FlipString(Source()) << endl;
}
The output of Source is a temporary so again the compiler moves the temporary to its final destination. In this case str.
Now the tricky part. You do not get RVO here. The reason is kind of simple. The compiler can put a temporary in one place or the other, but not both. So we will get a copy. But what if that copy is too expensive? Do we have to revert to pointers and references? No. We can do Swaptimization. Here's how it works.
string FlipString(string str)
{
  string result;
  result.swap(str);
  reverse(result.begin(), result.end());
  return result;
}

string Source()
{
  return "Flip This String";
}

int main()
{
  cout << FlipString(Source()) << endl;
}
result uses RVO and str uses copy elision. C++11 gives us rvalue references, so we don't need to do this, but that is for another time.
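As a brief preview of the C++11 behavior mentioned above, here is a sketch under my own assumptions. Tracked is a hypothetical string wrapper that counts deep copies. A by-value parameter is still not eligible for RVO, but in C++11 the return statement treats it as an rvalue, so it is moved instead of copied and the swap trick is no longer needed.

```cpp
#include <algorithm>
#include <string>
#include <utility>

// Hypothetical string wrapper that counts deep copies.
struct Tracked
{
    static int copies;
    std::string text;

    explicit Tracked(const char* s) : text(s) {}
    Tracked(const Tracked& o) : text(o.text) { ++copies; }
    Tracked(Tracked&& o) noexcept : text(std::move(o.text)) {}
};
int Tracked::copies = 0;

// No swap needed: returning the parameter moves it.
Tracked FlipMove(Tracked str)
{
    std::reverse(str.text.begin(), str.text.end());
    return str;
}

Tracked Source() { return Tracked("Flip This String"); }

// Returns the flipped text and reports the number of deep copies made.
std::string FlipAndCount(int& copiesOut)
{
    Tracked::copies = 0;
    Tracked flipped = FlipMove(Source());
    copiesOut = Tracked::copies;
    return flipped.text;
}
```

The temporary from Source() is built directly in str, and str is moved out on return, so no deep copy of the text is made along the way.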

Saturday, December 8, 2012

Safe_Printf

Making a safe printf().
This is quickly stubbed in so I can get the code up for the Ultracode meet up. I'll write a detailed explanation soon. There also seems to be a Microsoft bug in this, so std::string doesn't properly convert to a C-style string. This is not a complete example, but finishing it would be a pretty easy task.

#pragma once
#include <type_traits>
#include <string>
#include <stdexcept>
#include <assert.h>
#include <cstdio>
#include <iostream>

template<class _Ty>
struct _is_char : std::false_type {};

template<>
struct _is_char<char> : std::true_type {};

template <class _Ty>
struct _is_c_string : std::false_type {};

template<class _Ty>
struct _is_c_string<_Ty *> : _is_char<typename std::remove_cv<_Ty>::type> {};

template<class _Ty>
struct is_c_string : _is_c_string<typename std::remove_cv<_Ty>::type> {};

// Integral arguments are widened to long. printf's %d expects int, so a
// complete version would also adjust the length modifier.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, long>::type normalizeArg(T arg)
{std::cout << "LongType\n"; return arg;}

template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, double>::type normalizeArg(T arg)
{std::cout << "FloatType\n";return arg;}

template <typename T>
typename std::enable_if<std::is_pointer<T>::value, T>::type normalizeArg(T arg)
{std::cout << "PointerType\n";return arg;}

const char* normalizeArg(const std::string& arg)
{std::cout << "StringType\n";return arg.c_str();}

// Base case: no arguments left, so any remaining specifier is an error.
// std::runtime_error is used because std::exception has no standard
// constructor that takes a message.
void check_printf(const char* f)
{
 for (; *f; ++f)
 {
  if (*f != '%' || *++f == '%') continue;
  throw std::runtime_error("Too many format specifiers");
 }
}

// Recursive case: match the next specifier against the type of t.
template <typename T, typename... Ts>
void check_printf(const char* f, const T& t, const Ts&... ts)
{
 for (; *f; ++f)
 {
  if (*f != '%' || *++f == '%') continue;

  switch (*f)
  {
  default: throw std::runtime_error("Invalid format");
  case 'd':
   if(!std::is_integral<T>::value)
   {
    throw std::runtime_error("T is not an integral");
   }
   break;
  case 'f': case 'g':
   if(!std::is_floating_point<T>::value)
   {
    throw std::runtime_error("T is not a float");
   }
   break;
  case 's':
   if(!is_c_string<T>::value)
   {
    throw std::runtime_error("T is not a c string");
   }
   break;
  }
  return check_printf(++f, ts...);
 }
 throw std::runtime_error("Too few format specifiers");
}

template <typename... Ts>
int safe_printf(const char* f, const Ts&... ts)
{
 check_printf(f, normalizeArg(ts)...);
 return printf(f, normalizeArg(ts)...);
}

Here is how to use it. Pretty basic.

#include "SafePrintf.h"
#include <string>
#include <iostream>

int main()
{
 std::string str("Try this out.");
 const int i = 5;
 float f = 0.123f;

 try
 {
  safe_printf("This is from safe printf. int %d, string %s, and a float %f", i, str, f);
 }
 catch(const std::exception& e)
 {
  std::cout << e.what();
 }
}


Wednesday, December 5, 2012

Move Semantics and ctors and assignment


Move Semantics

OK, so what are they? Consider this example.

const BigThing bigThing = makeBigThing();

Anyone worried about performance is going to cringe. Look at the wasted temporaries. We create data on the heap in the function, data in the intermediate temporary, and finally data again in bigThing. So we do this instead.

BigThing bigThing;
makeBigThing(bigThing);

But that's not what we want at all, and it's harder to understand. Of course we could do this all with pointers, but that causes huge problems and syntax often looks weird, especially when using operators.

C++ now has move semantics, and perfect forwarding. This is a perfect example of why it was created. What do we know about those temporary objects? Well, they're temporary. So why not steal their guts? Turn them into organ donors. That's what move semantics does. Today I'll explain one of the ways we can use move semantics using rValue references. I'll show how we can take advantage of them to make a highly efficient assignment operator.

So how do we write them? What do they look like? What is an rValue anyway?

Let's go through these and others with a very common example. Let's do an rValue reference copy constructor, usually called a move constructor. We will assume a class Foo that holds a class Bar by pointer and we want everything to be deep copies. Here is our class Bar:

class Bar
{
    SomeBigThing m_Big;
public:
    Bar() = default;
    Bar(const Bar&) = default;
    ~Bar() = default;
    Bar& operator=(const Bar&) = default;
};

We're using the default keyword here. It may seem a bit strange to say you are using the functions that are created for you anyway, but on the other hand, isn't it nice that you now know what I intend that class to do? It's an efficient way to tell everyone you want those default functions. delete removes the function so you can easily prevent copying.

So now we get to the rValue copy constructor.

class Foo
{
    Bar* m_bar;
public:
    Foo(Foo&& rhs) : m_bar(rhs.m_bar) { rhs.m_bar = nullptr; }
};

So what the hell is Foo&& ?????
That is the rValue reference. OK. It looks a bit weird, but it's not too bad. The real question is what is an rValue. The full, 100% correct answer is a bit tedious, but fortunately the concept is easy enough to grasp with a few examples and by applying two questions to an object. First some examples.

    int i; // lValue
    int j = i; // lValue
    int k = someFunc(); // k is an lValue, 
                        // the return value of someFunc is an rValue
    42; // rValue
    4 * 5; // rValue

rValues are the temporary, unnamed objects we are trying to get rid of. That leads to a very easy way of thinking about rValues. They have two properties that are simple to test for.
  • They do not have a name.
  • You cannot get their address.
Simple.

OK? There is one slight hitch, and I promise I'm not trying to make this difficult. The parameter coming in to Foo(Foo&&) is an rValue, but... rhs is an lValue. How do we know this? It has a name and you can get its address. But it's OK. We know it came in as an rValue so we can use it as an organ donor. 
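The hitch is easy to see with a pair of overloads that report what they were given. This is only a sketch, and Thing, Kind and the helper functions are names I made up for it. Inside a function, a parameter declared as Thing&& has a name, so the expression rhs picks the lValue overload until std::move casts it back.

```cpp
#include <string>
#include <utility>

struct Thing {};

// Overloads that report which kind of argument they received.
std::string Kind(const Thing&) { return "lValue"; }
std::string Kind(Thing&&)      { return "rValue"; }

Thing MakeThing() { return Thing(); }

// rhs is declared as Thing&&, but it has a name and an address,
// so the expression `rhs` is an lValue.
std::string KindOfNamedParameter(Thing&& rhs) { return Kind(rhs); }

// std::move casts it back to an rValue so it can be stolen from.
std::string KindAfterMove(Thing&& rhs) { return Kind(std::move(rhs)); }
```

This is why move constructors that forward a member or a whole object onward have to write std::move explicitly.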

So let's look at that code again.

class Foo
{
    Bar* m_bar;
public:
    Foo(Foo&& rhs) : m_bar(rhs.m_bar) { rhs.m_bar = nullptr; }
};


What's going on here? Well, we simply make a shallow copy of the pointer and set rhs.m_bar to null. This is important! The destructor WILL be called on rhs, and if rhs.m_bar is not set to null, the Bar will be deleted, defeating the whole point of doing the rValue copy in the first place.

In the last post, I talked about the assignment operator. I also wrote that that old paradigm was now different because of rValue references and move semantics. If you remember we created a temporary and then immediately deleted it. What a waste, but in the past there really was no way around it. But what is

    Foo(rhs); 

It's an rValue! And what do we know about rValues? They are temporary and you can steal from them. So our assignment operator becomes this:

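    Foo& operator=(const Foo& rhs)
    {
        // Taking Foo by value here would be ambiguous with operator=(Foo&&).
        *this = Foo(rhs); // Foo(rhs) is an rValue, so this calls operator=(Foo&&)
        return *this;
    }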

Now all that's left is to write the rValue ref version of the assignment operator.

    Foo& operator=(Foo&& rhs)
    {
        delete m_bar;
        m_bar = rhs.m_bar;
        rhs.m_bar = nullptr;
        return *this;
    }

We know that calling delete on a nullptr is fine. It won't do anything, so we don't need a conditional. We steal something that is temporary anyway, so that is fast. And we make sure what we have isn't deleted by setting where we stole it from to nullptr. Now we have exception safe code with no extra copies.

We just need an lValue copy constructor (remember, we can't steal from an lValue because its owner is still using it).

    Foo(const Foo& rhs) : m_bar((rhs.m_bar) ? new Bar(*rhs.m_bar) : nullptr) {;} 

And here's everything together.

class Bar
{
    int i;
public:
//     Bar() = default;
};

class Foo
{
    Bar* m_bar;
public:
    Foo() : m_bar(new Bar()) {;}

    Foo(const Foo& rhs) : m_bar((rhs.m_bar) ? new Bar(*rhs.m_bar) : nullptr) {;}

    Foo(Foo&& rhs) : m_bar(rhs.m_bar) { rhs.m_bar = nullptr; }

    Foo& operator=(Foo&& rhs)
    {
        delete m_bar;
        m_bar = rhs.m_bar;
        rhs.m_bar = nullptr;
        return *this;
    }

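    Foo& operator=(const Foo& rhs)
    {
        // Taking Foo by value here would be ambiguous with operator=(Foo&&).
        *this = Foo(rhs); // Foo(rhs) is an rValue, so this calls operator=(Foo&&)
        return *this;
    }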
    ~Foo() { delete m_bar; }
};
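To see the class in action, here is a self-contained sketch with a few checks. It is my own variation rather than the exact listing: Bar gets a public value, get() is an accessor added purely for the demonstration, the lValue assignment takes const Foo& so that it does not compete with the rValue overload, and the move operations carry noexcept and a self-move guard as extra assumptions.

```cpp
#include <utility>

struct Bar { int value = 0; };

class Foo
{
    Bar* m_bar;
public:
    Foo() : m_bar(new Bar()) {}
    Foo(const Foo& rhs) : m_bar(rhs.m_bar ? new Bar(*rhs.m_bar) : nullptr) {}
    Foo(Foo&& rhs) noexcept : m_bar(rhs.m_bar) { rhs.m_bar = nullptr; }

    Foo& operator=(Foo&& rhs) noexcept
    {
        if (this != &rhs)            // guard added for this sketch
        {
            delete m_bar;
            m_bar = rhs.m_bar;
            rhs.m_bar = nullptr;
        }
        return *this;
    }

    Foo& operator=(const Foo& rhs)
    {
        *this = Foo(rhs);            // copy first, then steal from the copy
        return *this;
    }

    ~Foo() { delete m_bar; }

    Bar* get() const { return m_bar; }   // demonstration-only accessor
};

// A copy owns its own Bar with the same contents.
bool CopyIsDeep()
{
    Foo a;
    a.get()->value = 7;
    Foo b(a);
    return b.get() != a.get() && b.get()->value == 7;
}

// A move takes over the original Bar and leaves the source empty.
bool MoveSteals()
{
    Foo a;
    Bar* original = a.get();
    Foo b(std::move(a));
    return b.get() == original && a.get() == nullptr;
}

// Assignment from an lValue also produces an independent Bar.
bool AssignIsDeep()
{
    Foo a;
    a.get()->value = 3;
    Foo b;
    b = a;
    return b.get() != a.get() && b.get()->value == 3;
}
```

All three checks return true: copies are independent, moves transfer ownership, and the moved-from object is left holding nullptr so its destructor is harmless.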

Saturday, December 1, 2012

Rule of Three, or how not to blow your leg off.


It's been far too long and there has been too much good stuff not to write about. This is going to be all about Move semantics. Writing safe, clean, fast code can be easy if you follow some basic principles. Memory leaks are very often minimized or eliminated if some basic principles are applied. The first is to use shared_ptr and make_shared instead of new and raw pointers. shared_ptr is such an old pattern that compiler writers and chip designers already account for it. Now that it is standard they can do even more.
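A minimal sketch of that first principle, assuming a made-up Bitmap type that counts live instances: the object is created with make_shared, shared by two pointers, and released automatically when the last pointer goes out of scope. There is no new and no delete in sight.

```cpp
#include <memory>

// Hypothetical resource type that tracks how many instances are alive.
struct Bitmap
{
    static int alive;
    Bitmap()  { ++alive; }
    ~Bitmap() { --alive; }
};
int Bitmap::alive = 0;

// Returns the number of live Bitmaps while two shared_ptrs refer to one.
int AliveInsideScope()
{
    std::shared_ptr<Bitmap> first = std::make_shared<Bitmap>();
    std::shared_ptr<Bitmap> second = first;   // shares ownership, no second Bitmap
    (void)second;
    return Bitmap::alive;
}   // both pointers go away here and the Bitmap is destroyed exactly once
```

After AliveInsideScope() returns, Bitmap::alive is back to zero without any cleanup code written by hand.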

Sometimes we need raw pointers. Maybe we need to make our own container or represent a bitmap. That being said, my use of raw pointers is extremely rare. However, if you do need them, or want to ignore the advice and use them anyway, you should be aware of the "rule of three."

Rule of Three

The rule of three is simple. Copy constructor, assignment operator, destructor; if you write one, you should write the other two or disable them by setting them to delete. delete and default are new C++11 features.

class Bar
{
public:
    Bar() = default;
    Bar(const Bar&) = delete;
};

Default says to use the plain old version and delete says don't make one. While it may seem silly to explicitly say I'm using the one that automatically gets created, it's important to let others know your intent.

The reason for this rule is simple. Generally we write the destructor because we have allocated a resource. Maybe it's a file or memory or whatever. If we do, we need to make sure it gets properly taken care of. If you don't delete the assignment operator, you're in for a rude surprise: the compiler-generated version copies the pointer, so two objects end up owning the same resource and both destructors release it.
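Here is a short sketch of the rule applied, using a hypothetical Owner class. It writes a destructor because it owns a raw pointer, so it also disables the copy constructor and the assignment operator. The static_asserts confirm that the compiler now refuses to copy it.

```cpp
#include <type_traits>

struct Bar {};

// Owns a raw pointer, so it needs a destructor...
class Owner
{
    Bar* m_Bar;
public:
    Owner() : m_Bar(new Bar()) {}
    ~Owner() { delete m_Bar; }

    // ...and therefore has to deal with the other two. Here they are disabled.
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;
};

static_assert(!std::is_copy_constructible<Owner>::value, "copying is disabled");
static_assert(!std::is_copy_assignable<Owner>::value, "assignment is disabled");

// Without the two deleted lines the following would compile, both objects
// would hold the same pointer, and both destructors would delete it:
//   Owner a;
//   Owner b = a;   // shallow copy, then a double delete at scope exit
```

Disabling the operations is the cheap way to obey the rule. Writing them correctly is the harder way.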

Writing an assignment operator is not an easy thing to do and is my standard whiteboard question. I have had very few candidates get it correct. Getting it right can be very tricky, unless you know the trick. The best news is that trick just got better with move semantics, but that will be another post.

So how hard could it be?

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        m_Bar = rhs.m_Bar;
        return *this;
    }
};

This is just a shallow copy, exactly what the compiler would do only wrong. We want a deep copy. We'll pretend we didn't delete the copy constructor.

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        m_Bar = new Bar(*rhs.m_Bar);
        return *this;
    }
};

Well, that's a problem because we probably have a memory leak. Let's try again.

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        delete m_Bar;
        m_Bar = new Bar(*rhs.m_Bar);
        return *this;
    }
};

OK, but what if rhs.m_Bar == nullptr? <sigh>

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        delete m_Bar;
        m_Bar = nullptr; // don't keep a dangling pointer if rhs is empty
        if(rhs.m_Bar)
        {
            m_Bar = new Bar(*rhs.m_Bar);
        }
        return *this;
    }
};

What if I do something really stupid? With references and aliases, you know it will happen.

    Foo f1;
    // ... do stuff ...
    f1 = f1;

<Are you kidding?>

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        if (this == &rhs) return *this;
        delete m_Bar;
        m_Bar = nullptr; // don't keep a dangling pointer if rhs is empty
        if(rhs.m_Bar)
        {
            m_Bar = new Bar(*rhs.m_Bar);
        }
        return *this;
    }
};

OK, but <now what?!?!?!> What if new throws an exception? <@#$*&(#@>
The whole body can be written in 3 lines of code.

class Foo
{
    Bar* m_Bar;
public:
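    Foo& operator=(const Foo& rhs)
    {
        Foo temp(rhs);                 // deep copy via Foo's copy constructor; if it throws, *this is untouched
        std::swap(m_Bar, temp.m_Bar);  // temp now owns our old Bar and deletes it on exit
        return *this;
    }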
};

That is the second important rule

Copy and Swap
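The sketch below shows why the idiom answers every complaint raised above. Foo and Bar are filled in with my own assumptions: Bar carries a value and a failNextCopy test hook that makes its copy constructor throw on demand, and Foo has a deep-copying copy constructor. If the copy throws, the target is untouched. Self-assignment and leaks need no special handling.

```cpp
#include <stdexcept>
#include <utility>

struct Bar
{
    static bool failNextCopy;   // hypothetical test hook
    int value;

    explicit Bar(int v) : value(v) {}
    Bar(const Bar& other) : value(other.value)
    {
        if (failNextCopy) { failNextCopy = false; throw std::runtime_error("copy failed"); }
    }
};
bool Bar::failNextCopy = false;

class Foo
{
    Bar* m_Bar;
public:
    explicit Foo(int v) : m_Bar(new Bar(v)) {}
    Foo(const Foo& rhs) : m_Bar(rhs.m_Bar ? new Bar(*rhs.m_Bar) : nullptr) {}
    ~Foo() { delete m_Bar; }

    Foo& operator=(const Foo& rhs)
    {
        Foo temp(rhs);                  // may throw; *this is still untouched
        std::swap(m_Bar, temp.m_Bar);   // swapping two pointers cannot throw
        return *this;                   // temp's destructor frees the old Bar
    }

    int value() const { return m_Bar->value; }   // assumes a non-empty Foo
};

bool AssignmentIsStronglySafe()
{
    Foo target(1), source(2);

    Bar::failNextCopy = true;
    try { target = source; } catch (const std::runtime_error&) {}
    bool unchangedAfterThrow = (target.value() == 1);

    target = source;                    // succeeds this time
    Foo& alias = target;
    target = alias;                     // self-assignment needs no special case
    return unchangedAfterThrow && target.value() == 2;
}
```

The three lines in operator= replace the delete, the null check, the self-assignment test, and the exception worry from the earlier attempts.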

Now you know it, time to forget it because C++11 has something even better!

Saturday, March 31, 2012

Why Windows 8 will change everything.

I finally get out of bed. My alarm went off 2 hours earlier than normal. I pick up my phone and see I have a Skype call I forgot about. While getting dressed I glance down at my tablet and see how many mails and messages I have, and that I have a flight tomorrow. I also notice it's going to rain... again. I haven't even signed in yet. I sign in and continue getting ready. I see who the mail and messages are from, but I'll check them later. I haven't opened a single app. I head to my PC and sign in. Skype is telling me I have a conference call. I realize everything for the call is still at work. So I launch Photoshop and open the files from my work computer. I note the new project delivery dates and we change some product features. After the call I head off to work. With my pad I check the project status. The new tasks are in the project and I check on the project progress. There's a new web service we have to use so I create an account. Finally at work. I sign into my computer and open Chrome. I go to the web service and I am already logged into my account. I open Visual Studio and all my new tasks are ready for me to accept. Before I start I see some friends on Facebook are having people for dinner. I accept. Near the end of the day my phone alerts me that I have a dinner party to go to.

Maybe that doesn't seem so different to what can be done now. But the story is not what was done, but rather what was not done. I did not log in to see if I had mail. I did not open mail after logging in to see who wrote me and what the subject line was. I did not even set my alarm clock. I did not copy files from my work computer to my home computer and then open them in Photoshop, I opened them in Photoshop from my work computer. Photoshop did not add new code to get this behavior. I did not tell Skype I had a call. I did not sign into a website twice even though I went to it on two different machines. I did not tell my phone that I had accepted a Facebook event. This is amazing stuff and it's only going to get better.

We are now a user moving through a world of devices. Every device we sign into knows all our preferences, all our needed information. I think very soon we will not log in at all but use biometrics. Some already do. If I finish reading 30 pages of a book at home, my phone knows what page I'm on if I continue. I don't have to go through every level of a game because I have moved to another computer. I am a user moving through a world of devices. The devices conform to me.

On the coding side, most programs will just work like this. They don't even have to know about each other. When programmers want new interactions they provide small, simple apps that extend the functionality of every application I have. No coding required from the other apps. 

My favorite part is that we no longer need to learn a new language for a new technology. All languages are treated equally. All languages can interoperate. I can write a class in C++ that is called by C# or JavaScript. I can call C# from C++. There are minor extensions to C++, and some proposed extensions to HTML5, but unlike the past, MS is working with the organizations to extend the standard rather than having MS versions of what is standard.

To some this will be a frightening threat to personal liberties. I am so glad I will no longer have to remember all the apps that need to know the same stupid piece of information, and all the devices I use and which apps are on them.

Microsoft, My baby's all grown up!

Most people who know me would never accuse me of being a Microsoft evangelist. Several would accuse me of being a Microsoft heathen. I gave MS a hard time, not because I hate MS, rather I knew they could do better but didn't. Well My Baby's All Grown Up! The excuses have been left behind, the sleazy business models have been tossed in the trash, and MS is out to kick ass!

Windows 8 is simply the best thing to happen to computers since Windows 95, and will be as big a change to all our lives. Anyone who remembers dual booting to play a game knows what I mean. To everyone else, it was a big, big deal.

I'm going to write a few posts about Windows 8 and my thoughts on what it means. I'll write about the business impact, the design impact, and the programming impact. The world is about to change. Are you ready?