Thinking In Code - How to design, approach, and solve programming problems.: December 2012

Saturday, December 8, 2012

Safe_Printf

Making a safe printf().
This is quickly stubbed in so I can get the code up for the Ultracode meet up. I'll write a detailed explanation soon. There also seems to be a Microsoft bug in this, so std::string doesn't properly convert to a C-style string. This is not a complete example, but finishing it would be a pretty easy task.

#pragma once
#include <type_traits>
#include <string>
#include <exception>
#include <stdexcept>
#include <assert.h>
#include <cstdio>
#include <iostream>

// Trait: is the type (ignoring const/volatile) a pointer to char, i.e. a C string?
template<class _Ty>
struct _is_char : std::false_type {};

template<>
struct _is_char<char> : std::true_type {};

template <class _Ty>
struct _is_c_string : std::false_type {};

template<class _Ty>
struct _is_c_string<_Ty *> : _is_char<typename std::remove_cv<_Ty>::type> {};

template<class _Ty>
struct is_c_string : _is_c_string<typename std::remove_cv<_Ty>::type> {};

// normalizeArg converts each argument into a type printf understands.
// The std::cout lines are debug traces only.
// Note: integrals are widened to long, but check_printf only accepts %d.
// That matches on MSVC, where int and long are both 32 bits; elsewhere
// the conversion would have to be %ld.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, long>::type normalizeArg(T arg)
{std::cout << "LongType\n"; return arg;}

template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, double>::type normalizeArg(T arg)
{std::cout << "FloatType\n";return arg;}

template <typename T>
typename std::enable_if<std::is_pointer<T>::value, T>::type normalizeArg(T arg)
{std::cout << "PointerType\n";return arg;}

// inline because this is a non-template function defined in a header
inline const char* normalizeArg(const std::string& arg)
{std::cout << "StringType\n";return arg.c_str();}

// Base case: no arguments left, so any remaining specifier is an error.
// std::runtime_error is used because std::exception only takes a string
// in its constructor as an MSVC extension.
inline void check_printf(const char* f)
{
 for (; *f; ++f)
 {
  if (*f != '%' || *++f == '%') continue;
  throw std::runtime_error("Too many format specifiers");
 }
}

// Recursive case: match the next specifier against the type of the next argument.
template <typename T, typename... Ts>
void check_printf(const char* f, const T& t, const Ts&... ts)
{
 for (; *f; ++f)
 {
  if (*f != '%' || *++f == '%') continue;

  switch (*f)
  {
  default: throw std::runtime_error("Invalid format");
  case 'd':
   if(!std::is_integral<T>::value)
   {
    throw std::runtime_error("T is not an integral");
   }
   break;
  case 'f': case 'g':
   if(!std::is_floating_point<T>::value)
   {
    throw std::runtime_error("T is not a float");
   }
   break;
  case 's':
   if(!is_c_string<T>::value)
   {
    throw std::runtime_error("T is not a c string");
   }
   break;
  }
  return check_printf(++f, ts...);
 }
 throw std::runtime_error("Too few format specifiers");
}

// Validate the format against the normalized arguments, then forward to printf.
template <typename... Ts>
int safe_printf(const char* f, const Ts&... ts)
{
 check_printf(f, normalizeArg(ts)...);
 return std::printf(f, normalizeArg(ts)...);
}

Here is how to use it. Pretty basic.

#include "SafePrintf.h"
#include <string>
#include <iostream>

int main()
{
 std::string str("Try this out.");
 const int i = 5;
 float f = 0.123f;

 try
 {
  safe_printf("This is from safe printf. int %d, string %s, and a float %f", i, str, f);
 }
 catch(const std::exception& e)
 {
  std::cout << e.what();
 }
}
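The interesting part is what happens when the arguments do not match the format. The block below is a rough, self-contained sketch of that behaviour, not the header above: check_format and format_error are hypothetical names, the checker only knows %d and %f, and it throws std::runtime_error so that it builds on any compiler.

```cpp
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical, trimmed-down checker. Base case: no arguments left.
inline void check_format(const char* f)
{
    for (; *f; ++f)
    {
        if (*f != '%' || *++f == '%') continue;   // skip plain text and "%%"
        throw std::runtime_error("Too many format specifiers");
    }
}

// Recursive case: compare the next specifier with the type of the next argument.
template <typename T, typename... Ts>
void check_format(const char* f, const T&, const Ts&... ts)
{
    for (; *f; ++f)
    {
        if (*f != '%' || *++f == '%') continue;
        switch (*f)
        {
        case 'd':
            if (!std::is_integral<T>::value)
                throw std::runtime_error("T is not an integral");
            break;
        case 'f':
            if (!std::is_floating_point<T>::value)
                throw std::runtime_error("T is not a float");
            break;
        default:
            throw std::runtime_error("Invalid format");
        }
        return check_format(++f, ts...);
    }
    throw std::runtime_error("Too few format specifiers");
}

// Returns the error message, or an empty string when format and arguments agree.
template <typename... Ts>
std::string format_error(const char* f, const Ts&... ts)
{
    try { check_format(f, ts...); }
    catch (const std::exception& e) { return e.what(); }
    return "";
}
```

With this sketch, format_error("int %d", 0.5) reports "T is not an integral" instead of letting printf read garbage.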

Wednesday, December 5, 2012

Move Semantics and ctors and assignment

Move Semantics

OK, so what are they? Consider this example.

const BigThing bigThing = makeBigThing();

Anyone worried about performance is going to cringe. Look at the wasted temporaries. We create data on the heap in the function, data in the intermediate temporary, and finally data again in the bigThing. So we do this instead.

BigThing bigThing;
makeBigThing(bigThing);

But that's not what we want at all, and it's harder to understand. Of course we could do this all with pointers, but that causes huge problems and syntax often looks weird, especially when using operators.

C++ now has move semantics and perfect forwarding. This is a perfect example of why they were created. What do we know about those temporary objects? Well, they're temporary. So why not steal their guts? Turn them into organ donors. That's what move semantics does. Today I'll explain one of the ways we can use move semantics using rValue references. I'll show how we can take advantage of them to make a highly efficient assignment operator.
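To see the organ donation happen with a type you already have, here is a small sketch using std::vector. MoveReport and moveBigVector are hypothetical names invented for this illustration; the point is only that the destination ends up owning the very buffer the source allocated.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical report of what a move did.
struct MoveReport
{
    bool bufferWasStolen;   // destination reuses the source's heap buffer
    bool sourceIsEmpty;     // the donor was left with nothing
    std::size_t size;       // element count in the destination
};

inline MoveReport moveBigVector()
{
    std::vector<int> source(1000000, 7);
    const int* originalBuffer = source.data();

    // std::move casts source to an rValue, so the move constructor runs
    // and no elements are copied.
    std::vector<int> destination(std::move(source));

    return { destination.data() == originalBuffer,
             source.empty(),
             destination.size() };
}
```

A million ints change owner and not one of them is copied.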

So how do we write them? What do they look like? What is an rValue anyway?

Let's go through these and others with a very common example. Let's do an rValue reference copy constructor, better known as a move constructor. We will assume a class Foo that holds a class Bar by pointer, and we want everything to be deep copies. Here is our class Bar:

class Bar
{
    SomeBigThing m_Big;
public:
    Bar() = default;
    Bar(const Bar&) = default;
    ~Bar() = default;
    Bar& operator=(const Bar&) = default;
};

We're using the default keyword here. It may seem a bit strange to say you are using the functions that are created for you anyway, but on the other hand, isn't it nice that you now know what I intend that class to do? It's an efficient way to tell everyone you want those default functions. delete removes the function so you can easily prevent copying.
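Here is a quick way to convince yourself what default and delete actually do, sketched with two hypothetical classes (Copyable and NonCopyable are made-up names) and the standard type traits.

```cpp
#include <type_traits>

// Hypothetical class names, used only for this illustration.
class Copyable
{
public:
    Copyable() = default;
    Copyable(const Copyable&) = default;             // "yes, I want the compiler's version"
    Copyable& operator=(const Copyable&) = default;
};

class NonCopyable
{
public:
    NonCopyable() = default;
    NonCopyable(const NonCopyable&) = delete;            // copying is now a compile error
    NonCopyable& operator=(const NonCopyable&) = delete;
};

// The compiler agrees with the stated intent.
static_assert(std::is_copy_constructible<Copyable>::value, "Copyable can be copied");
static_assert(!std::is_copy_constructible<NonCopyable>::value, "NonCopyable cannot be copied");
static_assert(!std::is_copy_assignable<NonCopyable>::value, "NonCopyable cannot be assigned");
```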

So now we get to the rValue copy constructor.

class Foo
{
    Bar* m_Bar;
public:
    Foo(Foo&& rhs) : m_Bar(rhs.m_Bar) { rhs.m_Bar = nullptr; }
};

So what the hell is Foo&& ?????
That is the rValue reference. OK. It looks a bit weird, but it's not too bad. The real question is what is an rValue. The full, 100% correct answer is a bit tedious, but fortunately the concept is easy enough to grasp with a few examples and by applying two questions to an object. First some examples.

    int i = 0; // lValue
    int j = i; // lValue
    int k = someFunc(); // k is an lValue,
                        // the return value of someFunc is an rValue
    42; // rValue
    4 * 5; // rValue

rValues are the temporary, unnamed objects we are trying to get rid of. That leads to a very easy way of thinking about rValues. They have two properties that are simple to test for.

They do not have a name.
You cannot get their address.

Simple.

OK? There is one slight hitch, and I promise I'm not trying to make this difficult. The parameter coming in to Foo(Foo&&) is an rValue, but... rhs is an lValue. How do we know this? It has a name and you can get its address. But it's OK. We know it came in as an rValue so we can use it as an organ donor.
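You can let overload resolution answer the lValue or rValue question for you. This is a small sketch with made-up function names (kind, kindOfNamedVariable, viaNamedParameter, viaMove); it also shows the hitch, that a named rValue reference parameter is itself an lValue until you std::move it.

```cpp
#include <string>
#include <utility>

// Two overloads: the compiler picks one based on the value category of the argument.
inline std::string kind(int&)  { return "lValue"; }
inline std::string kind(int&&) { return "rValue"; }

inline std::string kindOfNamedVariable()
{
    int i = 0;
    return kind(i);              // i has a name and an address
}

inline std::string kindOfLiteral()
{
    return kind(4 * 5);          // unnamed temporary
}

inline std::string viaNamedParameter(int&& rhs)
{
    return kind(rhs);            // rhs came in as an rValue, but it has a name now
}

inline std::string viaMove(int&& rhs)
{
    return kind(std::move(rhs)); // std::move turns it back into an rValue
}
```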

So let's look at that code again.

class Foo
{
    Bar* m_Bar;
public:
    Foo(Foo&& rhs) : m_Bar(rhs.m_Bar) { rhs.m_Bar = nullptr; }
};

What's going on here? Well, we simply make a shallow copy of the pointer and set rhs.m_Bar to null. This is important! The destructor WILL be called on rhs and if m_Bar is not set to null, it will be deleted, destroying the whole point of doing the rValue copy in the first place.
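To check that the donor really is emptied and nothing gets deleted twice, here is a sketch with a little instrumentation. The alive counter, the get() accessor and moveStealsPointer are additions of mine for the test, not part of the class in the post.

```cpp
#include <utility>

struct Bar
{
    static int alive;              // hypothetical counter of live Bar objects
    Bar()           { ++alive; }
    Bar(const Bar&) { ++alive; }
    ~Bar()          { --alive; }
};
int Bar::alive = 0;

class Foo
{
    Bar* m_Bar;
public:
    Foo() : m_Bar(new Bar()) {}
    Foo(Foo&& rhs) : m_Bar(rhs.m_Bar) { rhs.m_Bar = nullptr; }   // steal, then null the donor
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
    ~Foo() { delete m_Bar; }                                     // delete on nullptr is a no-op

    const Bar* get() const { return m_Bar; }                     // accessor added for the check
};

// True if the move handed over the very same Bar, emptied the donor,
// and did not create a second Bar along the way.
inline bool moveStealsPointer()
{
    Foo donor;
    const Bar* original = donor.get();
    Foo thief(std::move(donor));
    return thief.get() == original && donor.get() == nullptr && Bar::alive == 1;
}
```

When both objects go out of scope, the single Bar is deleted exactly once and the counter drops back to zero.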

In the last post, I talked about the assignment operator. I also wrote that that old paradigm was now different because of rValue references and move semantics. If you remember we created a temporary and then immediately deleted it. What a waste, but in the past there really was no way around it. But what is

    Foo(rhs);

It's an rValue! And what do we know about rValues? They are temporary and you can steal from them. So our assignment operator becomes this:

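    // Taken by const reference: a by-value operator=(Foo) would be ambiguous
    // with operator=(Foo&&) whenever the argument is an rValue.
    Foo& operator=(const Foo& rhs)
    {
        *this = Foo(rhs); // Foo(rhs) is an rValue, so this calls operator=(Foo&&)
        return *this;
    }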

Now all that's left is to write the rValue ref version of the assignment operator.

    Foo& operator=(Foo&& rhs)
    {
        delete m_Bar;
        m_Bar = rhs.m_Bar;
        rhs.m_Bar = nullptr;

        return *this;
    }

We know that calling delete on a nullptr is fine. It won't do anything, so we don't need a conditional. We steal something that is temporary anyway, so that is fast. And we make sure what we have isn't deleted by setting where we stole it from to nullptr. Now we have exception safe code with no extra copies.

We just need an lValue copy constructor (remember, we can't use rValue because we're stealing).

    Foo(const Foo& rhs) : m_Bar((rhs.m_Bar) ? new Bar(*rhs.m_Bar) : nullptr) {}

And here's everything together.

class Bar
{
    int i;
public:
//     Bar() = default;
};

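class Foo
{
    Bar* m_Bar;
public:
    Foo() : m_Bar(new Bar()) {}

    // lValue copy constructor: deep copy
    Foo(const Foo& rhs) : m_Bar((rhs.m_Bar) ? new Bar(*rhs.m_Bar) : nullptr) {}

    // rValue copy (move) constructor: steal the pointer and null the donor
    Foo(Foo&& rhs) : m_Bar(rhs.m_Bar) { rhs.m_Bar = nullptr; }

    Foo& operator=(Foo&& rhs)
    {
        delete m_Bar;
        m_Bar = rhs.m_Bar;
        rhs.m_Bar = nullptr;
        return *this;
    }

    // Taken by const reference: a by-value operator=(Foo) would be ambiguous
    // with operator=(Foo&&) whenever the argument is an rValue.
    Foo& operator=(const Foo& rhs)
    {
        *this = Foo(rhs); // Foo(rhs) is an rValue, so this calls operator=(Foo&&)
        return *this;
    }
    ~Foo() { delete m_Bar; }
};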
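To see where the copies go, here is a sketch that counts them. The copies counter, makeFoo and the two copiesFor... functions are hypothetical additions for the experiment. The sketch declares the copy assignment as taking const Foo& so that it does not compete with operator=(Foo&&) for rValue arguments.

```cpp
struct Bar
{
    static int copies;                 // hypothetical instrumentation: counts Bar copies
    Bar() {}
    Bar(const Bar&) { ++copies; }
};
int Bar::copies = 0;

class Foo
{
    Bar* m_Bar;
public:
    Foo() : m_Bar(new Bar()) {}
    Foo(const Foo& rhs) : m_Bar(rhs.m_Bar ? new Bar(*rhs.m_Bar) : nullptr) {}
    Foo(Foo&& rhs) : m_Bar(rhs.m_Bar) { rhs.m_Bar = nullptr; }

    Foo& operator=(Foo&& rhs)          // steal from a temporary
    {
        delete m_Bar;
        m_Bar = rhs.m_Bar;
        rhs.m_Bar = nullptr;
        return *this;
    }

    Foo& operator=(const Foo& rhs)     // copy first, then reuse the move assignment
    {
        *this = Foo(rhs);
        return *this;
    }

    ~Foo() { delete m_Bar; }
};

inline Foo makeFoo() { return Foo(); }

// Assigning from a named object has to make one deep copy.
inline int copiesForLvalueAssignment()
{
    Foo a, b;
    Bar::copies = 0;
    a = b;
    return Bar::copies;
}

// Assigning from a temporary copies nothing: the Bar just changes owner.
inline int copiesForRvalueAssignment()
{
    Foo a;
    Bar::copies = 0;
    a = makeFoo();
    return Bar::copies;
}
```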

Saturday, December 1, 2012

Rule of Three, or how not to blow your leg off.

It's been far too long, and there has been too much good stuff not to write about. This is going to be all about move semantics. Writing safe, clean, fast code can be easy if you follow some basic principles, and memory leaks are very often minimized or eliminated along the way. The first principle is to use shared_ptr and make_shared instead of new and raw pointers. shared_ptr is such an old pattern that compiler writers and chip designers already account for it. Now that it is standard they can do even more.
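For anyone who hasn't used them yet, here is a minimal sketch of shared_ptr and make_shared. Bitmap is a stand-in type and the two function names are hypothetical; the point is that nobody writes new or delete, and the object goes away when its last owner does.

```cpp
#include <memory>

struct Bitmap { int width; int height; };   // hypothetical resource

// Copying a shared_ptr shares ownership instead of copying the Bitmap.
inline long ownersWhileShared()
{
    std::shared_ptr<Bitmap> first = std::make_shared<Bitmap>();
    std::shared_ptr<Bitmap> second = first;
    return second.use_count();
}

// The Bitmap is destroyed as soon as the last owner goes out of scope.
inline bool releasedWhenLastOwnerLeaves()
{
    std::weak_ptr<Bitmap> observer;          // watches without owning
    {
        auto owner = std::make_shared<Bitmap>();
        observer = owner;
    }                                        // owner is gone here
    return observer.expired();
}
```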

Sometimes we need raw pointers. Maybe we need to make our own container or represent a bitmap. That being said, my use of raw pointers is extremely rare. However, if you do need them, or want to ignore the advice and use them anyway, you should be aware of the "rule of three."

Rule of Three

The rule of three is simple. Copy constructor, assignment operator, destructor; if you write one, you should write the other two or disable them by setting them to delete. delete and default are new C++11 features.

class Bar
{
public:
    Bar() = default;
    Bar(const Bar&) = delete;
};

Default says to use the plain old version and delete says don't make one. While it may seem silly to explicitly say I'm using the one that automatically gets created, it's important to let others know your intent.

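The reason for this rule is simple. Generally we write the destructor because we have allocated a resource. Maybe it's a file or memory or whatever. If we do, we need to make sure it gets properly taken care of. If you don't write or delete the copy constructor and the assignment operator as well, you're in for a rude surprise: the compiler-generated versions copy the pointer or handle, and two objects end up releasing the same resource.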
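Here is a sketch of that surprise, made safe to run. Shallow is a hypothetical class that owns a heap int but deliberately has no destructor, so we can look at what the compiler-generated copy does without actually crashing.

```cpp
// Hypothetical class: owns a heap int, but has NO destructor so the demo stays safe.
struct Shallow
{
    int* m_Value;
    Shallow() : m_Value(new int(42)) {}
    // No copy constructor or assignment operator: the compiler generates shallow ones.
};

// After a copy, both objects point at the same int.
inline bool copySharesPointer()
{
    Shallow a;
    Shallow b(a);
    bool shared = (a.m_Value == b.m_Value);
    delete a.m_Value;            // clean up exactly once, by hand
    return shared;
}

// After an assignment, they share a pointer too, and b's original int would leak
// if we had not saved its address first.
inline bool assignmentSharesPointer()
{
    Shallow a;
    Shallow b;
    int* old = b.m_Value;
    b = a;
    bool shared = (a.m_Value == b.m_Value);
    delete old;
    delete a.m_Value;
    return shared;
}
```

Add the obvious destructor, ~Shallow() { delete m_Value; }, and both objects would delete the same int.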

Writing an assignment operator is not an easy thing to do and is my standard whiteboard question. I have had very few candidates get it correct. Getting it right can be very tricky, unless you know the trick. The best news is that trick just got better with move semantics, but that will be another post.

So how hard could it be?

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        m_Bar = rhs.m_Bar;
        return *this;
    }
};

This is just a shallow copy: exactly what the compiler would generate, and exactly wrong. We want a deep copy. We'll pretend we didn't delete the copy constructor.

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        m_Bar = new Bar(*rhs.m_Bar);
        return *this;
    }
};

Well, that's a problem because we probably have a memory leak. Let's try again.

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        delete m_Bar;
        m_Bar = new Bar(*rhs.m_Bar);
        return *this;
    }
};

OK, but what if rhs.m_Bar == nullptr? <sigh>

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        delete m_Bar;
        m_Bar = nullptr; // don't keep pointing at the Bar we just deleted
        if(rhs.m_Bar)
        {
            m_Bar = new Bar(*rhs.m_Bar);
        }
        return *this;
    }
};

What if I do something really stupid? With references and aliases, you know it will happen.

    Foo f1;
    // ... do stuff
    f1 = f1;

<Are you kidding?>

class Foo
{
    Bar* m_Bar;
public:
    Foo& operator=(const Foo& rhs)
    {
        if (this == &rhs) return *this;
        delete m_Bar;
        m_Bar = nullptr; // don't keep pointing at the Bar we just deleted
        if(rhs.m_Bar)
        {
            m_Bar = new Bar(*rhs.m_Bar);
        }
        return *this;
    }
};

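OK, but <now what?!?!?!> What if new throws an exception? <@#$*&(#@> We have already deleted our own Bar, so the object has lost its old value and gained nothing.
The fix is to make the copy first and only then swap it in. The whole body can be written in 3 lines of code.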

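class Foo
{
    Bar* m_Bar;
public:
    // Assumes Foo also has a deep-copying copy constructor and a destructor that deletes m_Bar.
    Foo& operator=(const Foo& rhs)
    {
        Foo temp(rhs);                // deep copy; may throw, but *this is still untouched
        std::swap(m_Bar, temp.m_Bar); // needs <utility>; swapping raw pointers cannot throw
        return *this;                 // temp's destructor deletes our old Bar
    }
};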
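Here is a sketch that puts the exception question to the test. The failNextCopy switch, the value() accessor and valueAfterAssignment are hypothetical additions that let us force the Bar copy to throw and then check what the target still holds.

```cpp
#include <stdexcept>
#include <utility>

struct Bar
{
    int value;
    static bool failNextCopy;     // hypothetical switch that simulates a failing copy
    explicit Bar(int v) : value(v) {}
    Bar(const Bar& rhs) : value(rhs.value)
    {
        if (failNextCopy)
        {
            failNextCopy = false;
            throw std::runtime_error("copy failed");
        }
    }
};
bool Bar::failNextCopy = false;

class Foo
{
    Bar* m_Bar;
public:
    explicit Foo(int v) : m_Bar(new Bar(v)) {}
    Foo(const Foo& rhs) : m_Bar(rhs.m_Bar ? new Bar(*rhs.m_Bar) : nullptr) {}
    ~Foo() { delete m_Bar; }

    // Copy and swap: all the risky work happens before *this is modified.
    Foo& operator=(const Foo& rhs)
    {
        Foo temp(rhs);                  // may throw; nothing of ours has changed yet
        std::swap(m_Bar, temp.m_Bar);   // cannot throw
        return *this;                   // temp's destructor frees our old Bar
    }

    int value() const { return m_Bar->value; }
};

// Assigns source (2) to target (1) and reports what target holds afterwards.
inline int valueAfterAssignment(bool makeCopyFail)
{
    Foo target(1);
    Foo source(2);
    Bar::failNextCopy = makeCopyFail;
    try { target = source; }
    catch (const std::runtime_error&) {}
    return target.value();
}
```

When the copy succeeds the target holds 2; when it throws, the target still holds its old 1. Self-assignment needs no special case either, it just costs one extra copy.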

That is the second important rule

Copy and Swap

Now you know it, time to forget it because C++11 has something even better!