Design by Contract - for efficient coding
Yesterday I picked up this tweet from Joel:
Getting into PHP Exceptions. Very useful :) Example - userExists() function should raise exception when no user passed, not return false!
Hang on, should it? While exceptions in programming languages are without doubt a useful tool, his example reminded me of a problematic style of coding I’ve experienced - something I consider an Antipattern. The style I have in mind is when each function starts with code validating the input parameters before the main work of the function is executed. Here’s an example:
function addUserToGroup($user, $group, $done_by) {
if (!($user instanceof User)) { throw new Exception(..); }
if (!($group instanceof Group)) { throw new Exception(..); }
if (!($done_by instanceof User)) { throw new Exception(..); }
// ok, we're (finally) good to go:
$ref = new UserGroupRef();
$ref->setUser($user);
$ref->setGroup($group);
$ref->setDoneBy($done_by);
$ref->save();
}
If you’d never dream of writing code like this, move along - nothing of interest to see below! If you’re wondering why I think this is a bad approach, though - what’s the problem here?
The code is verbose - 50% is performing “meta” tasks. This takes time to write and makes it harder to maintain and interpret in the future. (I’m a big believer in self-documenting code - key to this is keeping code short and reducing the amount of unimportant information).
The issue gets stronger if we go on to include similar checks in the implementations of UserGroupRef->setUser(), setGroup() and setDoneBy(). The complete system is now “nice and robust” but has doubled in terms of lines of code, and you can imagine how when the code is executed the same checks are called repeatedly at different levels of the call stack. If those other functions are only ever called by code that has already ensured the parameters passed will be valid, surely all the time and effort to write and maintain the additional checks isn’t strictly necessary..
Design By Contract
So how do we tell when we need to do this validation? Enter the concept of “Design by Contract” - a “technique to reduce the programming effort for large projects” (source). The basic idea is that each function comes with a “contract” that should be agreed to when calling it, in the same way that two people may have a contract when agreeing to work together in the real world.
The key elements of Design by Contract are:
- a Precondition - what must be true before the function is called - i.e. the responsibilities of the calling scope. For example, the $user parameter will contain a User object.
- a Postcondition - what will be true after the function has executed - i.e. the responsibilities of the function itself. For example. $user will be a member of $group.
- a Class Invariant - what the function guarantees to leave unchanged (for completeness - so we don’t get unexpected side-effects).
The part I want to focus on is the precondition - specifically what parameters the function will be given - values, datatypes etc. Design by Contract places the responsibility on the calling code to provide valid parameters, not on the called code to validate them. The core idea is “fail hard” - if any part of the contract is broken the entire thing is void - and there’s no guarantee what will happen(!)
That obviously sounds quite scary - I think this interpretation of DBC is a relevant approach only for internal functions within a module or self-contained system. DBC is not a good model for dealing with user input at run-time(!) - and for open-source code or APIs that will be used by other teams or released publicly the extra code to check for and report violation of the contract is probably worth the saving in confusion and support questions.
Using Design By Contract for efficient coding
So the point is to use DBC as the justification for not filling your classes and methods with code validating input parameters, resulting in lengthy code that’s harder to understand and takes longer to maintain. You just need to ensure that every member of the team writing code that uses a particular module works on the principle that it is the calling code’s responsibility to provide valid parameters, something that is often possible with small teams and bespoke modules. DBC also simplifies debugging and unit testing since the intended behaviour of each routine is clearly specified.
Back to Joel’s example - should userExists() throw an exception if passed invalid input? The answer really depends on how it will be used, and what the “contract” for function is understood to be. The contract could be written that userExists() will take responsibility for validating it’s input. However it’s often the case that the calling code will also need to know if it’s not working with valid data, which indicates that validation is best done at the earliest point in the call stack than repeatedly throughout.
It’s also important to keep in mind what real-world effect invalid input may have - for some functions it could result in critical data corruption. However the chances are that when testing for the existence of an invalid user, simply returning false will cause the system to function fine as expected. This leads us to one of my favourite concepts of coding - YAGNI ;)
Comments welcome.
