Friday, September 13, 2013

Coding for readability

I've read many books, sites and blog posts that explore the topic of what it means to write "good code". As a general rule they seem to mostly come down to "write the most efficient code possible". While as a rule of thumb I feel you should keep an eye on efficiency, I would argue that a more important point is writing code for readability. 

The reality is that over time even the most inefficiently written code will run better on faster hardware. Horribly inefficient code written 10 years ago runs blazingly fast today. That's not to say that you should sacrifice efficiency because it doesn't matter, but if you sacrifice readability for efficiency you end up causing problems later on. Let me give you an example.

    public class HelperClass
    {
        public HelperClass ()
        {
        }

        public void All(int x, int y)
        {
            for (int i =x; i<=y; i++) 
                Console.WriteLine (i);
        }

        public void Evens(int x, int y)
        {
            for (int i =x; i<=y; i++) 
                
//if its modulus is 0 it must be even
                if (i % 2 == 0) 
                    Console.WriteLine (i);
        }

        public void Odds(int x, int y{
            for (int i =x; i<=y; i++) 
            
//if its modulus is not 0 it must be odd
            if (i % 2 != 0) 
            Console.WriteLine (i);
        }

        public static void DoIt(int x, int y)
        {
            
HelperClass hlpr = new HelperClass();
            hlpr.All(x, y);
            hlpr.Evens(x, y);
            hlpr.Odds(x, y);
        }
   }


Look its a simple example, I know. I'm not trying to show off some elite coding skill, that's not the point. I want this to be useful to as many people as possible. So let me bring up a few things that could seriously increase readability.

Naming Your Classes

public class HelperClass

The word helper is REALLY tempting to use. After all, if you are coding very quickly (and lets be honest with deadlines we all take shortcuts), it can be tempting to just say "I'm going to be using this code in more than one place, because I'm trying to write good DRY (Don't Repeat Yourself) code". This leads to two problems though. The first is that when you go to actually call your code you end up with calls like this:

    var hlpr = new HelperClass();
    hlpr.All(x, y);

Ugly. Taken out of context, what the fuck is this actually doing? When you are trying to rip through this code in three weeks time trying to debug something, you are going to waste time circling back and reminding yourself what it is you were trying to write. The Second problem it brings up is that when you name something helper, it almost IMMEDIATELY becomes a dumping ground. It's amazing what ends up in those classes after a team of developers has had one of those around for a while. After a while they become a huge amorphous blob of crap and cruft that has to then has to be refactored thereby changing HUGE underpinnings of your application with it. It's not pretty I don't recommend it. So whats the easiest way to solve this? Call it what it is! In this case, public class NumberPrinter!

    NumberPrinter prtr = new NumberPrinter();
    prtr.All(x, y);

That's better but I think this leads to my next point

Naming Your Variables

    NumberPrinter prtr = new NumberPrinter();
    prtr.All(x, y);

Guys, you really aren't saving anything other than keystrokes if you don't give good descriptive variable names. The compiler doesn't care what you call your variable, but you know who will? The next developer who comes along. Changing this variable name to numberPrinter goes along way to making people happy.

    NumberPrinter numberPrinter = new NumberPrinter();
    
numberPrinter.All(x, y);

Well that's good, but now look at the line numberPrinter.All(x, y). I mean that doesn't exactly read well. Plus going back to my previous point of dumping grounds, the word All in this case is a little generic...

Naming Your Methods

        public void All(int x, int y)
        {
            for (int i =x; i<y; i++) 
                Console.WriteLine (i);
        }


What is this method actually doing? All is not very descriptive. How about giving it a more accurate name, like PrintAllNumbersBetweenBeginningNumberAndEndNumber? Sure its a lot longer but lets look at it in use:

    public void PrintAllNumbersBetweenBeginningNumberAndEndNumber(int x, int y)
    {
        for (int i =x; i<y; i++) 
            Console.WriteLine (i);
    }

That parts not too bad.

    NumberPrinter numberPrinter = new NumberPrinter();
    
numberPrinter.PrintAllNumbersBetweenBeginningNumberAndEndNumber(x, y);

Holy crap! I now know what that is doing without looking at the code! YAY! But wait, which parameter is the beginning and which one is the end... I damn, guess that needs fixing too!

Naming Your Parameters

        public void PrintAllOddBetweenBeginningNumberAndEndNumber(int x, int y)
        {
            for (int i =x; i<=y; i++) 
                
//if its modulus is not 0 it must be odd
                if (i % 2 != 0) 
                    Console.WriteLine (i);
        }

Anyone want to guess what this method is doing? Oh wait, you don't have to guess! You can tell from the name! How awesome is that! Unfortunately we don't know which parameter is the beginning and which is the end. Naming a parameter follows the same idea as both variables methods and classes! Be descriptive! Lets label our parameters with what they actually are to the method! So x becomes beginningNumber, and y becomes endingNumber! Brilliant!

    public void PrintAllOddBetweenBeginningNumberAndEndNumber(int beginningNumber, 
                                                          int endingNumber)
    {
        for (int i = beginningNumber; <= endingNumber; i++) 
            //if its modulus is not 0 it must be odd
            if (i % 2 != 0) 
                Console.WriteLine (i);
    }

Well that IS better. I mean look:

NumberPrinter numberPrinter = new NumberPrinter();numberPrinter.PrintAllNumbersBetweenBeginningNumberAndEndNumber(beginningNumber, 
                                                                endingNumber);

But Alex, you are thinking what about the i variable in the for loop! That's not descriptive. There is so many ways you can describe the counter in a for loop. counter being one of them. If you want to get creative that's fine, but most people learn this structure using i, so generally this is a pretty safe use for a crappy variable name. But you know whats bothering me about the method code? Lack of curly braces...

Use White space, and available Lexemes!

What to I mean by Lexeme? Well in this case I'm going to referring to Curly braces! We all know that in C languages you don't HAVE to use them all the time, there are exceptions. But in this case I think we may want to look and make sure we are making our code readable:

public void PrintAllOddBetweenBeginningNumberAndEndNumber(int beginningNumber, 
                                                          int endingNumber) {
            for (int i =beginningNumber; i<=endingNumber; i++) 
            
//if its modulus is not 0 it must be odd
            if (i % 2 != 0) 
            Console.WriteLine (i);
        }


Look Ma! I moved my white space and curly braces wherever I felt like! Guess what the computer can read that correctly! Other developers, will probably fuck up, and then get pissed off when can't figure out why there code isn't working. Just because you don't HAVE to use something, doesn't mean you shouldn't. I don't HAVE to use the turning signal on my car, but generally I feel its a pretty good idea. It makes it a bunch easier for other drivers on the road to know what it is I am actually trying to do... So lets fix this!

public void PrintAllOddBetweenBeginningNumberAndEndNumber(int beginningNumber, 
                                                          int endingNumber) {
            for (int i =beginningNumber; i<=endingNumber; i++) 
            {                //if its modulus is not 0 it must be odd
                if (i % 2 != 0)
                { 
                    Console.WriteLine (i);
                }
             }        }

Well that's better! But does anyone see an issue here? Were using two different styles for our curly braces! That's a recipe for future WTFs. 

Be Consistent

Use your lexemes in a consistent manner. If you want to put curly braces at the end of your if statement, use that for all statements! If you work with other developers make sure you ALL agree on a usage. Having differences in White spacing and statement terminators is a quick way to create a debugging nightmare.



Code Comments

That's right I put a code comment in there! Generally I'm not a huge fan of code comments. I think to many people rely on them for really poorly written code. If you write more comments than code, then quite frankly, you are doing it wrong. Now I realize in this case I put a comment over a line with modulus. Honestly that was the simplest thing I could think of that people may not recognize. I've met a few developers who have never come across the modulus operator, it happens. So keeping it short, I am basically telling the next developer "If you don't know what the % symbol means, it means modulus, and if you don't know what that is, google it!"

So Where did we end up?


public class NumberPrinter
{
    public NumberPrinter ()
    {
    }
 
    public void PrintAllNumbersBetweenBeginningNumberAndEndNumber(
                                                            int beginningNumber, 
                                                            int endingNumber)
    {
        for (int i =beginningNumber; i<=endingNumber; i++) 
        {
            Console.WriteLine (i);
        }
    }

    public void PrintAllEvenNumbersBetweenBeginningNumberAndEndNumber(
                                                            int beginningNumber, 
                                                            int endingNumber)
    {
        for (int i =beginningNumber; i<=endingNumber; i++) 
        {
           //if its modulus is 0 it must be even
           if (i % 2 == 0) 
           {
               Console.WriteLine (i);
           }
        }
    }

    public void PrintAllOddBetweenBeginningNumberAndEndNumber(
                                                            int beginningNumber, 
                                                            int endingNumber)
    {
        for (int i =beginningNumber; i<=endingNumber; i++) 
        {
            //if its modulus is not 0 it must be odd
            if (i % 2 != 0) 
            {
                Console.WriteLine (i);
            }
        }
    }

    public static void PrintAllNumbersEvenNumbersAndOddNumbersBetweenBeginningNumberAndEndingNumber(
                                                            int beginningNumber, 
                                                            int endingNumber)
    {
        NumberPrinter numberPrinter = new NumberPrinter();
        numberPrinter.All(beginningNumber, y);
        numberPrinter.Evens(beginningNumber, y);
        numberPrinter.Odds(beginningNumber, endingNumber);
    }
}

Well that's certainly more human readable, and you know what we didn't do a damn thing to the efficiency! Win! It's as simple as this, if you use a compiler, then any code you write can be read by a computer. Honestly that's the easy part. Writing code that future developers can read, understand and extend? That's hard. So many times we get lost in the sea of efficiency of our code from the machines perspective, but we need to make sure we don't lose sight of the efficiency of bring a new hire up to speed. Or reading through code that hasn't been touched in a long time because we need to update or extend its functionality. Or god forbid even having to dig into the deep dark reaches of the code for a heisenbug! The bottom line is don't neglect the human aspect of efficiency. After all, humans don't get faster hardware every few years.

2 comments:

  1. Readability is king, which is why we should all code in Python. ;-)

    I think your overall point is bang on, but I will take some issue with the "code written 10 years ago runs blazingly fast today". If the workload is the same, that statement is generally true. However, workloads tend to grow to suit the capacity of systems, and over time not everything follows Moore's Law. Specifically, any that is latency sensitive tends to need tweaking, because the trade-offs between CPU/memory/disk *have* changed over time. A good example of this are file systems. FAT was actually pretty efficient for the problem it set out to address, but it's lousy bad for today's much larger disks and files.

    ReplyDelete
    Replies
    1. I love me some Python ;-) I've also seen some awfully written Python that made me wish people actually followed some of the above advice!

      I will concede that there are some cases where my 10 year statement does not hold true. As you mentioned latency and increased system capacity can definitely cause issues and tend to require tweaks to keep up. The purpose of the statement was more to make the point of don't over think what you are coding, especially in terms of efficiency. After all as you mentioned. FAT was an efficient solution at the time, but now could probably use a refresh because things have changed. That will be a much harder proposition if the code isn't clearly written.

      Delete