четвер, 14 липня 2011 р.

Keep the code clean

Dealing with a lot of projects that were developed in different times in my company by my former and current colleagues I usually come across a lot of code-garbage.  So far I've been able to identify and differentiate at least 7 types of code garbage (if you know more, please add in the comments).

1. Unneeded commented out pieces of code, or irrelevant textual comments.
2. Unneeded logging that was used for debugging of some issue at some point, and was not removed after the issue has been fixed.
3. Stale code which is not called from anywhere, left as a legacy after some refactoring.  Or some empty methods not actually doing any job.
4. Compiler warnings.  Not a really code pollution - but still pollutes the compiler output and you might miss some important warning.
5. Spelling/grammar errors in symbol names.
6. Methods or member functions with trivial implementation, which just calls super class implementation.
7. More than one class definition in a class file.

I have a strong opinion, that there is no legitimate reason why any of these types of code garbage should be kept in your code.  Even if you have a reason - this is just an excuse for not maintaining a good enough and consistent codebase.  Typically I hear these kinds of excuses:

1. "A commented out piece of code might be used sometime in the future, I commented it out, so that if I need it - I don't forget how I implemented something previously"
An answer to this is - in 99% of cases the commented out piece of code is just dangling there and is never used again.  Also there are version control systems - if you removed some code that you will really really need in the future - you will dig it out from history.  So really as I said - there is no reason to keep commented out code in the project.  This also has to deal with some autogenerated code (by your IDE for example) that you never use, but for some reason it is kept commented out and polluting the code.

2. "I fixed the issue, but left the debug logging in place so that I am able to see if something goes wrong" - in reality if something often goes wrong - you haven't really fixed the issue.  You can leave logging of exceptions or errors, but not additional logging of some regular control flow of the program which was used to help diagnose the issue.  So what I am saying here - when you start debugging a new issue - you spend more time reading irrelevant log messages instead of the ones dealing with the currently debugged issue.

3. For keeping the stale code in the project I can't even think of a typical excuse, so just simply don't do it.  Just delete the stale pieces of code.

4. There is no excuse for having warnings either.  Warnings - are errors.  Your code should compile clean and silently.  If anything is wrong it should stick out like a sore thumb - so you notice the warning and fix it right away.

5. When the variable or function name is spelled incorrectly - you spend some time on thinking why is it so.  What is worse - if you are a perfectionist like I am - you will have to correct the spelling error so it does not bother you again.  This will involve some refactoring or plain group replacement - and this where you lose your precious time.  So really when your mate does not care to spell the names correctly - he/she wastes your time.

6. Methods that just call super - are unneeded in most cases. If you have a correct inheritance implementation in your language - the superclass method will be called automatically - no need to override the implementation in order to just pass the execution to the same superclass method.

7. Some languages do not allow defining more than one class in a single source code file (like Java) - and for a good reason.  Are you saving the space in your FAT?  I don't see any other reasons why you should keep several class declarations/definitions in a single file.  The class is a granular entity - it should be in a different file (different compile module for such languages like C#/Java/C++/ObjC).  It also means that probably your classes are so coupled that should be used only together, then why are they separate classes?  Or if I want to use always both of them, why one of them is not aggregating another one and presenting interface to its services?  This might go deeper into semantics and not just the layout of your code.

Code pollution harms readability, maintainability and the beauty (if any) of your code.
I tend to be violent about any code-garbage I encounter.  Whether it is my own left-over or somebody else's.  In somebody else's codebase (actually if you started working on it - it is already yours as well) you tend to try being more gentle and careful.  You start to think: maybe this commented out piece of code - was meant for something, or maybe the original author wanted at some point reuse it or anything, or maybe this logging was important to the original author.  Don't even spend time on thinking that.  Don't hesitate.  Just kill that piece of garbage and that's it.  You'll do good to everyone - to yourself and your successors who will inherit this code from you.

So what I really would like to see - is the clean code.  Every unneeded piece of junk pollutes the code and makes it harder to read and modify.
The best is when you take the codebase and it
a) builds without warnings, silently
b) produces no (or very small) debug output - and if any - only in case of errors or exceptions
c) has no commented out pieces of code, and all the comments it has are relevant textual comments
d) has no stale pieces of code - if the code is there - it should be called at some point of execution
e) the function/variable names are spelled correctly not destructing your clear mind
f) contains no trivial methods that only call super
g) each class declaration/definition is in a separate file