Thinking In Code - How to design, approach, and solve programming problems.: Getting Started

This is a problem that I think shows how I approach programming in a pretty simple example.

Write the insert function for a doubly linked list. First I'll start with what I often see from new programmers. Don't worry, we'll clean it up.

void Insert(int location, Data* data)
{
    ListNode* ptr = m_Head;  
    if (!ptr)  
    {  
      m_Head = new ListNode();  
      m_Head->m_Data = data;  
      m_Head->m_Prev = null;  
      m_Head->m_Next = null;  
      return;  
    }  
    while(location && ptr->m_Next)  
    {  
      ptr = ptr->m_Next;  
      --location;  
    }  
    ListNode* oldNext = ptr->Next;  
    ptr->m_Next= new ListNode();  
    ptr->m_Next->m_Data = data;  
    ptr->m_Next->m_Prev = ptr;  
    ptr->m_Next->m_Next->m_Next = oldNext;  
}

OK. I think that has it. There are some bugs in there and some really tweeky ways of doing things. We won't fix this code. We will look at how to write it in a way there's nothing to fix.

Let's start at the beginning:

 void Insert(int location, Data* data)

First, it returns void so there really isn't any way to know if there are errors. And there can be no errors because the code just blasts them away by making assumptions. This restricts the ability of code use in larger projects and in projects you haven't even thought of yet.

Second, ListNode is treated like a struct. We're in C++. Even structs can do things for us.

 m_Head = new ListNode(data);

should have been all that was necessary.

But, I want to draw attention to the fact that this function does 4 things. When I write functions they do only one thing. Sometimes that one thing is to assemble other functions. Here are the 4 things as I see it.

Create a new ListNode and assign its data.
Find a location in the list.
Validating the position.
Add the node to the location.

So how can we write this better?

void Insert(int location, ListNode data)  
 {  
    List back = Split(location);  
    PushBack(data);  
    PushBack(back);  
 }

So that is much cleaner and has just as many bugs and quirkiness as before. But, which would be easier to fix? Let's fix it.

Iterator Insert(Iterator location, Data data)  
 {
    List back = Split(location);
    PushBack(data);
    Iterator rVal = end();
    --rVal;
    PushBack(back);
    return rVal;
 }

First, to be clear, this is not the right way to make a list class, but the methodology is the same as the STL. If you read through that code you will see many, many small functions with each changing often only a single element and passing the result down the chain. You might wonder if all I did was move the same code to another function. Even if that was all, it's an improvement. We now have three new useful functions for our list class. But let's look at what we now have.

If the location is the same as end() then Split() returns an empty list. So we have no conditionals in our code for end() checking.
PushBack() knows it is the end so it doesn't bother checking if there is anything else to connect.
The second version of PushBack can handle empty Lists in whatever manner best suits the implementation.

Any checks that remain are needed in other functionality anyway and others have been removed. The biggest thing is the code is easier to read, use, and maintain.

There are a few notes for code I have seen far too often. Notice I do not name one function PushBackNode() and the other PushBackList(). This is unnecessary and makes the code harder to use. Second I don't have the function create the iterator or the node. PushBack() handles data to Node conversion. In the case of the node you could use a single argument constructor on ListNode and not declare it explicit. This will automatically convert data to ListNode. Validating the iterator is not the responsibility of this class. Add a hierarchy of calls to add finding and validating above the actual function that does the work. Finally, and this is very important, don't worry about function call overhead. Working in games I hear performance as an excuse for poor code far too often. It's nonsense. Good code is fast. If your code is slow, your algorithm is slow. Compilers and computers have gotten very good at dealing with many small functions. If you need to go even faster then look for dumb performance mistakes like initializing a variable unnecessarily. After you have exhausted all else, start at the top and look again.

I hope this was useful. Feel free to comment, suggest, ask questions, or request topics.

2 comments:

AndrewJanuary 3, 2012 at 4:03 AM
Good article, especially the last big paragraph.

I am a C# programmer, so I did not completely follow (or try to follow) all of your code here, however I found your insight on the purpose of functions (to do only one thing) and that it's okay to use a lot of smaller functions to be helpful. I suggest and request that you try to make future posts more about concepts than C++ code examples if possible.

I also have a specific request that I would like you to make one or more posts about if possible:

A lot of times I find myself thinking through how I will make an entire class (for example) ahead of time, but then when I start to code it I make slight changes here and there and I completely forget how some parts of it are supposed to work. Is there a more efficient approach I can take to write a lot of code that is all dependent on the rest of the code or something?

Thanks!

Thinking In Code - How to design, approach, and solve programming problems.

Sunday, November 13, 2011

Getting Started - Inserting Into A Doubly Linked List

2 comments: