JSR 303 and AOP – A powerful combination

 
A few days ago we had an interesting discussion at university about preventive and retrospective fault detection in software systems, as described by Sommerville [1] and many other experts in the field of software engineering. While the discussion itself was a very theoretical one, over the last few days I kept thinking about a practical approach to dealing with such validation problems.
In most cases, software specifications give us useful information about the characteristics of values, which might be either instance variables (that is, the state of an object) or method parameters. Let’s consider an easy example: we want to write an algorithm that calculates the factorial of a given number n. What do we get from the specification? Well, first of all, we know that the number we put into our algorithm is an integer (let’s not discuss the limited size of integers in Java and similar languages, and just assume for a moment that it is big enough). Furthermore, we know that our output is also an integer. So some developer would start by defining the algorithm’s signature as follows:

Integer factorial(Integer n);
But is that really everything we got from the specification? No, it is not! We also know that the input Integer has to be non-negative, and that the return value of the algorithm is always positive. If we take the mathematical foundations into consideration, we could also calculate upper and lower bounds for the return value (see Stirling’s formula [2] for more information). So if we have all this information, where did it go? Well, the information should be reflected in the implementation of the algorithm or, let’s be honest, we can only hope that whoever implements the algorithm will check all these conditions. So let us have a look at a simple implementation of the algorithm.


Integer factorial(Integer n) {
    if (n <= 1)
        return 1;
    else
        return n * factorial(n - 1);
}
What we see is that the implementation of the factorial algorithm is quite short and, I would say, very easy to understand, as it is not polluted by further concerns. Unfortunately, our developer did not put in any of the checks, even though they were in the specification. So let us pretend we are a better developer and put in some checks.
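Integer factorial(Integer n) {
    // Precondition: n must be present and non-negative.
    if (n == null)
        throw new IllegalArgumentException("n must not be null");
    if (n < 0)
        throw new IllegalArgumentException("n must not be negative");
    if (n <= 1)
        return 1;
    // Multiply in long so that a result too large for an int can be detected.
    long ret = (long) n * factorial(n - 1);
    if (ret > Integer.MAX_VALUE)
        throw new ArithmeticException("result is out of bounds");
    return (int) ret;
}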
To be honest, this code does not look so nice to me. It is hard to read, hard to maintain, and in many cases it gets even more complicated. Just imagine we had passed a more complex object than a plain integer into some other function. We would have to check that the object itself is not null and that its values are valid with respect to the specification. As long as we are not able to add constraints to the interfaces instead of adding them in the code, we will always have redundant places where we validate and check our constraints. So what we actually need is a way to add constraints to the interface definition. With JSR 303 [3], also known as Java Bean Validation, the Java language gets a mechanism to do similar things. In Scala, similar constraints can be applied to method parameters and other values with the require function. So there are mechanisms to implement this constraining and validation in a more general and clean way. Our interface for the factorial function might then look something like the following (this is not JSR 303 nor any other concrete implementation).
@NotNull(for=n)
@NotLessThan(for=n,value=0)
@Constraint(for=return, InBounds())
Integer factorial(Integer n);
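To make the idea a bit more tangible, here is a minimal sketch of how such interface-level constraints could be enforced in plain Java with a dynamic proxy from java.lang.reflect. The annotations NotNull, NotLessThan and PositiveResult, as well as the Constraints helper, are made up for this sketch; they are not the JSR 303 annotations, and a real Bean Validation provider works differently in detail.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Parameter;
import java.lang.reflect.Proxy;

// Hypothetical annotations for this sketch only; they are NOT the JSR 303 ones.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.PARAMETER)
@interface NotNull {}

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.PARAMETER)
@interface NotLessThan { int value(); }

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface PositiveResult {}

// The constraints are declared once, on the interface.
interface Factorial {
    @PositiveResult
    Integer factorial(@NotNull @NotLessThan(0) Integer n);
}

// The implementation stays free of validation code.
class SimpleFactorial implements Factorial {
    @Override
    public Integer factorial(Integer n) {
        return n <= 1 ? 1 : n * factorial(n - 1);
    }
}

final class Constraints {
    // Returns a proxy that checks the annotations before and after each call.
    static <T> T checked(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            Parameter[] params = method.getParameters();
            for (int i = 0; i < params.length; i++) {
                Object arg = args[i];
                if (params[i].isAnnotationPresent(NotNull.class) && arg == null)
                    throw new IllegalArgumentException("argument " + i + " must not be null");
                NotLessThan min = params[i].getAnnotation(NotLessThan.class);
                if (min != null && arg instanceof Number
                        && ((Number) arg).longValue() < min.value())
                    throw new IllegalArgumentException(
                            "argument " + i + " must not be less than " + min.value());
            }
            Object result;
            try {
                result = method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the implementation itself threw
            }
            if (method.isAnnotationPresent(PositiveResult.class)
                    && (!(result instanceof Number) || ((Number) result).longValue() <= 0))
                throw new IllegalStateException("result must be positive, got " + result);
            return result;
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }
}
```

With this in place, Constraints.checked(Factorial.class, new SimpleFactorial()) gives back a Factorial whose calls are validated from the outside. Note two limits of the sketch: the recursive calls inside SimpleFactorial do not go through the proxy, so only the outermost call is checked, and a positive result is only a weak postcondition that will not catch every integer overflow.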
A different approach would be to not use the plain Integer type but to subclass it anonymously (which unfortunately is not possible in any language I have seen so far) and let the type itself handle the validation. This comes close to adding the constraints directly to the parameters and return values themselves, rather than adding them as meta information to the method signature.

Integer{@Constraint(InBounds())} factorial(Integer{@NotNull, @NotLessThan(0)} n);
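The closest thing I can write in today’s Java is a small named wrapper type that validates itself on construction. The following is only a sketch of that idea; the names NonNegativeInt and TypedFactorial are invented here, and the overflow check via Math.multiplyExact stands in for the InBounds() constraint from the pseudocode.

```java
// Hypothetical wrapper type: every instance is guaranteed to be non-null and >= 0.
final class NonNegativeInt {
    private final int value;

    NonNegativeInt(Integer value) {
        if (value == null)
            throw new IllegalArgumentException("value must not be null");
        if (value < 0)
            throw new IllegalArgumentException("value must not be negative: " + value);
        this.value = value;
    }

    int value() {
        return value;
    }
}

final class TypedFactorial {
    // The signature itself now says that input and output are non-negative.
    static NonNegativeInt factorial(NonNegativeInt n) {
        int result = 1;
        for (int i = 2; i <= n.value(); i++) {
            // Throws ArithmeticException instead of silently overflowing.
            result = Math.multiplyExact(result, i);
        }
        return new NonNegativeInt(result);
    }
}
```

The price is an extra type per constraint, which is exactly the boilerplate an anonymous, annotated subtype would save us.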
Just as we have seen for methods, we could also apply this to the instance variables of objects, to verify that a variable never reaches an invalid state while the system is running. This is already implemented in JSR 303 and in other languages and libraries. And with the power of aspect orientation we could trigger validation every time a variable is assigned a new value. By doing this, we could react immediately to inconsistent states in the system. How the system reacts could be implemented in any way: it might be just an entry in some log file, the throwing of an exception, the rollback of a previous action, or whatever is most appropriate for the actual situation. In the case of our factorial example, it would probably be a good solution to throw an exception whenever a constraint is violated, just as we did in the hand-written implementation, but in a more elegant and maintainable way.
You might think: isn’t all this a bit too much, just to be sure (or more sure, because you can never be completely sure) that a simple algorithm like the factorial calculation works correctly? You are right, but think about more complex programs, for example a system in which hundreds of threads manipulate thousands of values in parallel. Usually such systems reach an invalid state at some point, and in many cases you don’t even notice it at that moment; you notice it later, when your system crashes at a totally different point. Sadly, by then you will probably not be able to tell at which moment, or during which operation, the system state got corrupted in the first place. I have seen developers stepping through programs in the debugger and checking the values of all their variables, just to see whether there is any inconsistent state. Want to know why they do this, and why it actually helps them find the bugs? It is simple: they use their knowledge about the constraints on certain variables, which unfortunately are not checked anywhere in the code.
Wouldn’t it be nice to put these constraints into the code (preferably in exactly one place) and have them checked at runtime? That way your system could tell you immediately when a data inconsistency occurs. You wouldn’t have to wait until your system actually crashes and then try to find out what the reasons were. As mentioned above, there are already some languages and libraries that approach this problem, and I think in the near future we will see additional and even more powerful solutions to this kind of problem. And I am really looking forward to the day when I have a runtime environment that informs me immediately when some data gets corrupted, no matter how the corruption occurred.
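Plain Java cannot intercept field assignments without a weaver such as AspectJ, so as a rough stand-in here is a sketch of a holder that validates on every assignment and delegates the reaction to a pluggable handler. Checked and Reactions are hypothetical names for this illustration; an aspect would do the same job without changing the type of the variable.

```java
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Predicate;

// Hypothetical holder for one constrained variable.
final class Checked<T> {
    private final String name;
    private final Predicate<T> constraint;
    private final BiConsumer<String, T> onViolation;
    private T value;

    Checked(String name, T initial, Predicate<T> constraint,
            BiConsumer<String, T> onViolation) {
        this.name = name;
        this.constraint = constraint;
        this.onViolation = onViolation;
        set(initial);
    }

    T get() {
        return value;
    }

    // Every assignment is validated; an invalid candidate never replaces the old value.
    void set(T candidate) {
        if (constraint.test(candidate)) {
            value = candidate;
        } else {
            onViolation.accept(name, candidate);
        }
    }
}

// Two possible reactions: fail fast, or only record the violation.
final class Reactions {
    static <T> BiConsumer<String, T> fail() {
        return (name, v) -> {
            throw new IllegalStateException(name + " must not become " + v);
        };
    }

    static <T> BiConsumer<String, T> logTo(List<String> log) {
        return (name, v) -> log.add(name + " rejected " + v);
    }
}
```

A counter created with Reactions.fail() throws as soon as somebody tries to set it to an invalid value, while one created with Reactions.logTo(someList) keeps its old value and only records the attempt. The sketch is not thread-safe, which a real solution for a heavily parallel system would have to be.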

Some words about (non-programming!) languages

It took me quite some time to decide whether I should write my blog posts in German or in English. On the one hand, an English post can obviously be read by more people, so it serves more readers. On the other hand, I have to say that there are a lot of very interesting and good English blogs out there and, to be honest, good German blogs about software development are very rare. And I still remember the times when I was a small boy, curious about all this computer stuff and especially software development, but unable to understand all the material written in English. So adding a bit more German content (and I hope I will produce content that is at least worth reading) might also be a good idea.
To be honest, I still can’t decide which is the better choice. So please don’t be mad if you see some posts in German and some in English. Furthermore, please forgive the mistakes I make when writing in English; it is not my native tongue, but I will do my best.