Code comments - taking a closer look

Should I use comments in my code? This is probably one of the most debated subjects in software development it's been going for a while. There are developers that insist on commenting everything under the sun and then some. You've seen code like this before, where there is a sea of green (if you're working in Visual Studio) and sometimes the comments may even outnumber the actual code. On the other hand, you have that prefer to leave the code totally undocumented. No green insight, anywhere! But no way to decipher the code either. So what is the best practice when it comes to comments?

To comment or not to comment, that's the question

Stale Comments

Some developers argue that your code should be well documented so that when working against a specific part of the codebase you can quickly understand what's happening. Imagine joining a new team or starting a new job and being faced with thousands of lines of code. The ultimate goal for any developer, is to be able to decipher any code and many would argue that this is exactly why you need comments. However, there is one fundamental problem with comments. Unless they are treated like live code, then they get stale. Below is an sample code where the code has moved on but the comments haven't

///Delete a user by user name
public void DeleteUser(int userId)
{
    // ...omitted for clarity
}

Usually developers change code in response to bug or a feature request but they rarely change the associated comments as part of the same change. I've certainly done it in the past and I'm sure many others have. Eventually, after a few iterations the code becomes totally disconnected from the comments which end up not only being irrelevant and "noisy" but they have the the potential to cause issues by referring to something wrong or deprecated. Example below:

///Delete a user by userid
public void DeleteUser(int userId)
{
	// lots of comments
    // and more comments
    
    var user = UserRepository.GetUser(userId);
    
    // we don't delete the user anymore
    // instead we execute a soft delete.
    // We shouldn't use this anymore
    // user.Delete();
}

How easy is it to spot the fact the user.Delete() has been deprecated and replaced by a soft delete? Someone new to the code would assume that calling this method would delete the user? Instead of being helpful, the comments become an impediment to productivity and can be the source of many wasted hours.

Over-commenting

Another big problem with comments is over-commenting. This is particularly common among junior developers that join a new team and they are told that they need comment their code. However, instead of being told how to properly do it, they're left to their 'own devices' and they go overboard. They comment everything! Unfortunately, I see the same with some MSDN articles and walkthroughs where they also tend to over-comment code. Some may argue that the comments are required to make the code samples easy to understand, but I would argue that even in this case, not all comments are necessary and this practice may be setting a bad example. With so many developers visiting the MSDN pages on a daily basis, it's easy to see how some developers may thing this is good practice. I would prefer if the
articles were structured in 2 sections:

  • The first section should contain a sample implementation of the method/feature/property described
  • The second section, if necessary, should explain parts of the code that are not easy to understand

This way, developers get familiar with the clean code pattern through the examples and the article remains a great reference material by explaining what the code actually does. Any analysis of the code should happen outside the code sample providing clear separation of concerns.

Does the following sample really need comments? You'll be the judge of that:

public static void Main()
{
	// The path to save the file
	string path = @"c:\temp\MyTest.txt";

	// Check if the file exists 
	if (!File.Exists(path))
	{
		// It doesn't, so let's create a file and write to it. 
		string createText = "Hello and Welcome" + Environment.NewLine;
		File.WriteAllText(path, createText);
	}

	// Append some test
	// Append makes the file longer over time 
	// if it is not deleted. 
	string appendText = "This is extra text" + Environment.NewLine;
	File.AppendAllText(path, appendText);

	// Open the file to read from. 
	string readText = File.ReadAllText(path);
	
	// Output to screen
	Console.WriteLine(readText);
}

Comment Noise

Over-commenting and stale comments have another implication: "noise". When the code is littered with stale comments or comments that don't add real value e.g. //this is a string parameter developers learn to ignore them totally. The sea of green becomes useless and anything important is lost within the junk. Even if there is a comment that could help you in troubleshooting or fixing something, there is a very good chance that you will miss it because it's obfuscated by the noise.

Where do we go from here?

The biggest act of kindness you can do to yourself and fellow developers and hope that the next person taking over your project/codebase doesn't hunt you down and kill you (because you left a pile of unworkable mess) is to write meaningful code. Your goal is to write beautiful, functional and verbose code. In the words of Uncle Bob "your code should read like an article. There should be a beginning, middle and end, it should be self-documenting and should follow a logical order. By using verbose and purposeful wording you can ensure that your code is self-explanatory. Let's look at the following interface for example:

public interface IDocumentManager
{
	void FormatDocumentForPrinting(SomeObject documentToFormat);
	void SaveDocumentDataToDisk(string locationToSave, string fileName, string dataToSave);
	void PrintDocumentAsPdf(SomeObject documentToPrint);
	string ConverDocumentToHtml(SomeObject documenToFormat);
	void MergeDocuments(SomeObject[] documentsToMerge);
}

In the example above, we have an imaginary interface that manages an imaginary document object SomeObject and defines 5 methods that act on that object. The methods have clearly defined functionality (print, merge, format etc) as the name indicates. Obviously it's the responsibility of the developer to make sure that each method is implemented to do "what the label says", but anyone looking at this code should be able to quickly infer full functionality. There is no need to comment the methods because they already tells us what they do! Below, I've implemented one of the methods as and example:

public void SaveDocumentToDisk(string locationToSave, string fileName, string dataToSave)
{
	if (!Directory.Exists(locationToSave))
	{
		Directory.Create(locationToSave);
	}
	
	string fullFilePath = Path.Combine(locationToSave, fileName);
	File.WriteAllText(fullFilePath, dataToSave);
}

The above method doesn't need comments because the code simply describes itself. There is also a good, logical flow:

  1. Check if the directory exists
  2. Create it if it's not there
  3. Get the full path to the desired location we want to save the file
  4. Write the data to the file

I agree, this is an oversimplified example but any code you write should look like this. Concise, clean and readable.

A comment is an apology

If you believe that there is a valid case for a comment because the code performs something really complicated or executes a very convoluted set of steps then your comment sounds like an apology. Instead of fixing the issue, you chose to comment over it and move on. You need to remember that the code you write is not only for compilers but for programmers too. Eventually your code will become someone else's code and you will, in turn, inherit someone else's work. If we all followed the same standards, then there would be no need to read apologies and instead we would be enjoying improving and building upon the right foundations laid before us.

Commented out code

<rant>
This is a side-issue but all to common and it's a pet hate of mine. If you're using source control, you know that any change you perform can be reverted. If you're not using source control, then your problems are much bigger than comments! You should be confident that the changes you just performed superseed the previous, commented out code and once you proved that your new code works, do yourself a favor and remove all the comments. Any left over commented code will instill doubt to anyone having to work with it and will add unnecessary noise. </rant>

The exception

There is one exception to what I wrote above regarding code comments. Thanks to my friend James james*m*south for reminding me that some automated tools (documentation tools or API Help pages etc) require XML documents in order to work. I am happy with this exception, even though I still find it annoying. Pretty dissappointing that there are no better tools out there that could work directly with code and if there are, please share with us so we don't have to endure the pain of XML documents any more.

Conclusion

Closing, I would urge you to try and adopt a positive (boy scout) attitude towards your code because this is not just your work, but your legacy. There are excellent sources that go in a lot more depth about the subject and I would recommend that start with Uncle Bob's excellent books: Clean Code and Clean Coder. Both books are full of invaluable advice that will definitely help you become aware of a lot of mistakes that we've all made at some point in our career (and possibly still make) and will help you become a better developer. I tend to return to those books from time to time as there is always something new to pick up.

What is your take on the whole "comments" subject? Let me know in the comments.