Friday 23 September 2011

A Story Forgotten...


Recently I have been reading a book titled ‘My Voice Will Go with You: Teaching Tales of Milton H. Erickson’ by Milton H. Erickson and Sidney Rosen. Milton Erickson was a doctor and psychologist who practised from the 1950s through to the 1970s and is credited with rewriting many of the techniques used by the psychologists of his day. So far I have only got a couple of chapters in, but I have found that many of the stories bring a smile to my face as well as being educational.

The opening sections of the book describe some of Erickson’s beliefs. One particular belief is that a story which is forgotten is more powerful in changing behaviour than a story remembered. At first this seems highly illogical: how can you act upon the moral of a story when you cannot remember it?
Well, it occurred to me that the essence of the story could go into your mind while the details are forgotten. This would prevent analysis by the conscious mind and hence help prevent resistance to the moral of the story; the essence, however, would remain in the unconscious to be acted upon.

So Erickson was probably right: a story that is forgotten by the conscious mind is more effective at changing behaviour, which comes largely from the unconscious...

Improving Data Quality using Custom Forms in Trac

Synopsis
When using a convention for your tickets, it is easy for ticket authors to break the convention. A solution is presented to help ticket authors abide by the convention.

The Problem
If you have a convention for your ticket description, and a high error rate in the use of the convention, how do you reduce the errors introduced by ticket authors?

A Solution
Provide as much content of the ticket description as possible as a default value. This works because it acts as a mental prop for authors.

The Implementation
When using Trac there are a couple of wrinkles in providing custom forms; however, it is generally very easy. One solution is to embed a form in a wiki page which submits to the new ticket creation page. When the author enters some information, the form can supply default values and substitute the textual descriptions with values that are meaningful to the reports.
A custom form implemented as a wiki page
The result of the custom form above, where the description has been populated and the textual description of the work package (WP) has been substituted for a value relevant to the reports.
The wrinkle here is that the wiki pages pass a cookie to track the user’s session, whereas the ticketing pages expect a hidden form field, so we use a little JavaScript to copy the value from the cookie into the form field. The wiki page source that works around this is here.
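As a rough sketch of the sort of thing the wiki page can contain (the field_* names and the __FORM_TOKEN/trac_form_token pairing follow Trac's new-ticket form and form-token mechanism, but check them against your Trac version; the work package values and default description are purely illustrative):

{{{
#!html
<form action="/trac/newticket" method="post" onsubmit="copyFormToken(this)">
  <label>Work package:
    <select name="field_keywords">
      <option value="WP1">Requirements capture</option>
      <option value="WP2">Detailed design</option>
    </select>
  </label>
  <input type="hidden" name="field_description" value="Problem:&#10;Steps to reproduce:&#10;" />
  <input type="hidden" name="__FORM_TOKEN" value="" />
  <input type="submit" value="Create ticket" />
</form>
<script type="text/javascript">
  // Trac rejects the POST unless the __FORM_TOKEN field matches the trac_form_token cookie,
  // so copy the cookie value into the hidden field before the form is submitted.
  function copyFormToken(form) {
    var match = document.cookie.match(/trac_form_token=([^;]+)/);
    if (match) {
      form.elements["__FORM_TOKEN"].value = match[1];
    }
  }
</script>
}}}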

Have fun with custom forms in Trac!

Bringing Exceptional Order to Exception Chaos

Synopsis
In this blog post I attempt to:
  • start describing a reasonable exception handling strategy,
  • give examples of what needs to be documented in an exception handling policy,
  • and present a design that matches the described policy to get you started.
The exception handling policy revolves around a web application, but I aim to make it general enough that it can be used with any application.

Introduction
Why Do We Need An Exception Handling Strategy?
We need an exception handling strategy to ensure that all errors are caught and handled in an appropriate manner, which is essential to providing a good user experience. To give a counter example: how cheesed off are you when a program crashes and loses your data?
Another useful benefit is that categorising exceptions reduces the code required to handle them.
A few figures to concentrate the mind:
  • Of the five major projects I have worked on to date,
    • only one had an explicit exception policy,
    • and another two had an emergent policy based around the rules in Effective Java and the Spring framework respectively.
  • No project I have worked on has had time planned at the start to design and build an exception policy.
  • With a few exceptions, the programmers I have worked with do not follow the micro-principles laid out in Effective Java. This policy was inspired by Brett Norris, who has subsequently left the company.
Information Sources
There are already plenty of sources of advice, some of which are primary sources of information for this post.
There are plenty of contradictions in what different authors think makes good exception handling. I have based my primary sources around Effective Java as it is detailed and well reasoned, and Spring because when I have used it, I have found it to be elegant and functional.

Designing an Exception Policy
The problem that an exception policy is trying to solve is that if exceptions are not properly categorised, they tend to be handled either by the container or ad hoc by individual programmers. In line with the principle of least pain, the framework should handle as many exceptions as possible; this means that developers do not have to handle each exception themselves, and each one is handled gracefully to give a good user experience.
Things to think about when you are designing exceptions:
  • Should the exception be a checked or unchecked exception?
    • Effective Java - Item 40: Use checked exceptions for recoverable conditions and run-time exceptions for programming errors.
  • Categorization to help the framework handle the exceptions. Who are the clients? What are the recovery strategies?
    • Effective Java - Item 41: Avoid unnecessary use of checked exceptions
    • Effective Java - Item 42: Favour the use of standard exceptions
    • Effective Java - Item 43: Throw exceptions that are appropriate to the abstraction
  • What data should be stored in exceptions for use by the user and the framework?
    • Effective Java - Item 45: Include failure-capture information in detail messages
The attached exception policy creates a new exception class for each category of exception that could be used and, when implemented, chooses whether each is checked or unchecked. The policy document also includes advice on when to use each type of exception. This policy may not be perfect, but it:
  • Is practical, as it follows the basic rules laid out in Effective Java.
  • Follows Spring in categorising exceptions. The exceptions depart from Spring in that they are categorised by the behaviour they generate in the application (terminating a request, etc.) rather than by what caused them. These exceptions are designed to be used in an application, not a framework, so I think this is a reasonable choice: the programmer should have the information to make it, and it should speed up development by limiting the number of exception cases that have to be handled. This does not preclude the use of exception handlers and exception translation to get different behaviour.
  • Is simple – there are only a few rules for developers to remember.

Implementing the Exception Policy
Following the advice above, I derived the following exceptions:

Categorisation of exceptions
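As a rough sketch of one possible categorisation (RecoverableException and SoftwareException are names used later in this post; ApplicationTerminationError is a hypothetical name for the Error subclass discussed below, and the attached policy document remains the definitive version):

// Each class would live in its own file; bodies are omitted for brevity.

/** Recoverable conditions that the immediate caller is expected to handle (checked). */
class RecoverableException extends Exception {
}

/** Programming errors; unchecked and handled centrally by the framework, typically by failing the request. */
class SoftwareException extends RuntimeException {
}

/** Conditions where the only safe response is to terminate the application. */
class ApplicationTerminationError extends Error {
}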

Those of you familiar with Effective Java will point out that I am breaking the advice issued in Item 40 that you should not subclass Error. My reasoning for doing this is that if we want to terminate the application then we do not want a programmer to be able to catch the exception; even if they have code like the code below, the desired behaviour will still be exhibited.

try {
    methodWhichTerminatesApplication();
}
catch(Exception e) {
    // Try to stop the application from terminating by wrapping the error.
    throw new RuntimeException(e);
}

Clearly these errors need to be used with caution.

You can then add failure-capture information to these classes by implementing an ExceptionInformation interface and making all the implemented methods delegate to a helper class – ExceptionInformationHelper. This is required because we have to inherit from one of the subclasses of Throwable to get the exceptional behaviour from the JVM.
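As a minimal sketch of the delegation (assuming the interface exposes the getState() method used later, and that each exception offers a fluent set method; the signatures in the attached design may differ):

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

interface ExceptionInformation {
    Map<String, Object> getState();
}

// Helper holding the failure-capture state, so every exception class can delegate to it.
class ExceptionInformationHelper {
    private final Map<String, Object> state = new LinkedHashMap<String, Object>();

    void set(String name, Object value) {
        state.put(name, value);
    }

    Map<String, Object> getState() {
        return Collections.unmodifiableMap(state);
    }
}

// Each exception still extends a subclass of Throwable, implements the interface and delegates.
class SoftwareException extends RuntimeException implements ExceptionInformation {
    private final ExceptionInformationHelper information = new ExceptionInformationHelper();

    public SoftwareException set(String name, Object value) {
        information.set(name, value);
        return this;   // fluent, so the exception can be built and thrown in one statement
    }

    public Map<String, Object> getState() {
        return information.getState();
    }
}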

Failure Capture Information

This results in code like the following when throwing exceptions:

String givenName = request.getGivenName();
String lastName = request.getLastName();
PostCode postCode = request.getPostCode();
throw new SoftwareException()
        .set("givenName", givenName)
        .set("lastName", lastName)
        .set("postCode", postCode);

Alternatively you could use either the variable argument version of the set method or use a factory method to create the exception.

throw SecurityException.create(request);
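As a sketch of what such a factory method might look like (this SecurityException is the policy's own class rather than java.lang.SecurityException, and the Request accessors are illustrative assumptions):

interface Request {
    String getUser();
    String getRequestedUri();
}

class SecurityException extends RuntimeException {

    public static SecurityException create(Request request) {
        SecurityException exception = new SecurityException();
        exception.set("user", request.getUser());
        exception.set("requestedUri", request.getRequestedUri());
        return exception;
    }

    public SecurityException set(String name, Object value) {
        // delegates to an ExceptionInformationHelper, as sketched above
        return this;
    }
}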

When adapting the exception policy to your situation, you may be tempted to introduce new classes. These may be handled by the framework, but in a sufficiently different way as to justify a separate class. When tweaking the policy you should ask yourself whether an existing exception is suitable; for example, “Could a RecoverableException be used instead of a ValidationException, or should RecoverableException be a superclass of ValidationException?”

Implementing the Framework
The framework is going to be specific to your application, however in a typical web application you would need to do the following:
  • Log the exception. As your exceptions have a uniform interface to get information out of them (the getState() method), the logging portion of the framework can handle all instances of ExceptionInformation in the same way. A useful way of logging the large object graphs that these exceptions can carry in their state is to use the Apache Commons ReflectionToStringBuilder class. You have to be careful that lazy instantiation from objects, such as Hibernate proxies, does not generate another error inside the error handling code; a custom ToStringStyle class can be used to avoid this problem. (A sketch of this part of the framework follows this list.)
  • Display error screens. Typically you want a set of error screens: one for security exceptions, which gives very little information, and at least a general catch-all screen for everything else. The general catch-all screen gives some information that allows you to tie the response on screen to the message in the log; this should be a unique request ID.
  • Another helpful feature is a screen ID on every screen, so that when a user phones up to report an error the help desk can identify exactly which screen the error is coming from without having to trawl log files. This can help solve difficulties quickly.
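As promised above, here is a rough sketch of how a servlet filter could tie the logging and the error screens together (the class name, page path and choice of exceptions caught are all assumptions; ReflectionToStringBuilder is from Apache Commons Lang):

import java.io.IOException;
import java.util.Map;
import java.util.UUID;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

import org.apache.commons.lang.builder.ReflectionToStringBuilder;

public class ExceptionHandlingFilter implements Filter {

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // A unique ID that appears both on the error screen and in the log.
        String requestId = UUID.randomUUID().toString();
        try {
            chain.doFilter(request, response);
        } catch (RuntimeException e) {
            if (e instanceof ExceptionInformation) {
                logState(requestId, ((ExceptionInformation) e).getState());
            }
            request.setAttribute("requestId", requestId);
            request.getRequestDispatcher("/error/general.jsp").forward(request, response);
        }
    }

    private void logState(String requestId, Map<String, Object> state) {
        for (Map.Entry<String, Object> entry : state.entrySet()) {
            // ReflectionToStringBuilder copes with large object graphs; a custom ToStringStyle
            // can guard against lazy-loading errors from ORM proxies.
            log(requestId + " " + entry.getKey() + "=" + ReflectionToStringBuilder.toString(entry.getValue()));
        }
    }

    private void log(String message) {
        System.out.println(message);   // stand-in for your logging framework
    }

    public void init(FilterConfig config) { }

    public void destroy() { }
}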
What do you have to do to use this policy?
You will have to:
  • Implement the design, Example
  • Write and publish your policy document. Some tweaking to the policy might be necessary, such as adding security exceptions and validation exceptions. Example
  • Ensure that developers understand and use the policy. This is best done if team leaders have bought in to the policy and can police it, as it is no easy task.
What are the consequences of using this policy?
Some of the consequences are:
  • Subclassing of exceptions should become much less common, as developers have exceptions that provide most of what they need. Between the standard Java exceptions and the general categories already defined, they should be able to cope with more than ninety percent of the exceptions in a typical application. The remaining exceptions should probably subclass RecoverableException.
  • The security of the application should be easier to reason about because of the simplified design. This should lead to better security.
  • The framework provides a common place to handle exceptions and log them. This, coupled with a uniform interface to access the exception data, allows logging to be performed by a single piece of code. This means that log messages will all be in the same format and contain useful information, which increases the chance that the team maintaining the code will be able to identify the problem without calling the user back for a detailed use case, reproducing the defect in a development environment, and attaching a debugger. This makes the process much faster.
I hope this has given you some ideas for handling errors in a more streamlined manner in your own application. Have fun!

In The Red Corner Inheritance, In The Blue Corner Composition

The Problem
I’m currently working on a development project and see a couple of mistakes that seem to be repeated over and over. One of these mistakes is the overuse of inheritance where composition should be used instead.

These problems usually start with one developer writing a subclass to perform a function. Another developer then comes along and decides that another function is required in the hierarchy, and before you know it you have an explosion of subclasses. Some of the problems of hierarchies grown like this are that no one really knows what each class does, there are classes for each permutation of responsibilities, and the concepts behind the original classes are eroded to the point where they bear no resemblance to the software that is actually in the repository.

An Example
We are currently working with a component model, where each component has a set of ports. These ports are primarily a way of identifying a channel for messages to be sent from one component to another.


However, while building the test harness it became clear that it would be very useful to capture the messages being sent over this channel, so a new type of port was born: the MessageCollectingPort. The MessageCollectingPort was only used in test code, and was substituted into the component graph where it was deemed necessary to capture the messages.


As the design progressed, another developer on the team decided that certain domain functionality was most easily placed in the port as well, leading to a DomainPort, and of course people wanted to capture the messages sent through domain ports.


The First Problem
Now you can start to see some of the problems with this “minor alteration” to the hierarchy. There are three responsibilities in the hierarchy:
  • Message delivery 
  • Message capture 
  • Message manipulation
The consequence of this approach is that each additional responsibility doubles the size of the hierarchy. This means that a hierarchy of five responsibilities would have sixteen classes if all permutations were realised. Anyone who has been in the position of trying to understand this number of classes will know that it is not easy...

The Second Problem
A further problem with this approach is that if we want to decouple our logic from the hierarchy, and not use polymorphism, it causes us to write multiple conditions in our code:

if(object instanceof MessageCollectingPort)

becomes:

if(object instanceof MessageCollectingPort || object instanceof MessageCollectingDomainPort)

This means that multiple places in the code need to be updated whenever a new class is added to the hierarchy. This violates the principle of locality, which states that the changes required should be close, in the developer’s mind, to the change that caused them. This principle is core to writing maintainable software.

The Solution
Now that we have a grasp of the problems involved, we can start to look at solutions.

Ideally we want to add one new class for each new responsibility rather than two to the power of the number of responsibilities.

A powerful solution for dealing with this problem is to change from using inheritance to using composition, by using a decorator pattern.


See the decorator pattern in the ‘Gang of Four’ book for further details. This solution fulfils our ideal design criterion of one class per responsibility and means that we have a single condition to check.
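Something along the following lines illustrates the idea (a minimal sketch: IPort, DomainPort and MessageCollectingPort are the names used above, but the Message type and the send method are assumptions rather than the project's real interface):

import java.util.ArrayList;
import java.util.List;

interface Message { }

interface IPort {
    void send(Message message);
}

// Domain behaviour lives in one class...
class DomainPort implements IPort {
    public void send(Message message) {
        // domain-specific handling, then delivery
    }
}

// ...and message capture lives in another, which can wrap any IPort.
class MessageCollectingPort implements IPort {
    private final IPort delegate;
    private final List<Message> captured = new ArrayList<Message>();

    MessageCollectingPort(IPort delegate) {
        this.delegate = delegate;
    }

    public void send(Message message) {
        captured.add(message);    // capture responsibility
        delegate.send(message);   // delivery is delegated to the wrapped port
    }

    List<Message> getCapturedMessages() {
        return captured;
    }
}

In a test you simply wrap whichever port needs its messages captured, for example new MessageCollectingPort(new DomainPort()), so adding a new responsibility means adding one new wrapper class rather than doubling the hierarchy.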

It is still not ideal, as we have to loop over the IPort instances to detect capabilities, or find some other method, such as using an object to represent the capability rather than an instanceof check. It is, however, a significant improvement over the original.

The Behaviours
So what sort of thought processes lead to the behaviours that generate this? Typically the justification you get is “it was already done like this”. This indicates that the developer has not stood back and thought about the consequences of the design. This forethought can be encouraged by the team leader or design authority asking questions of the design before it starts.

In The Red Corner Inheritance, In The Blue Corner Composition, Round 2

My last blog post talked about how to use composition instead of inheritance. There is another design that takes the lesson from that post one step further. Thanks for this one go to Kevin Thomas.

The Problem
When I was working on CLS, there was a pair of products that we needed to support, NDF and OPR. CLS had a specific design requirement to be a framework for further products to be inserted into the system.

NDF was the first product to be implemented, and had two search screens, one for matched trades and another for unmatched trades. Matched trades allowed you to search against the matching partner, and the date on which the match was made, etc.

The team set about implementing the screens and we ended up with three classes:
  • SearchForm – this class was responsible for collecting the data from the UI and making a query available to the framework to search for the trades. It is the superclass of the following two classes.
  • MatchedSearchForm – The matched trades form had some extra fields for the matching partner, and match time. The object provided a customised query for these fields.
  • UnmatchedSearchForm – The unmatched trades form also had extra fields and supplied a customised query for these.

This worked quite well for the NDF release; however, when we came to the second release the implementation spawned new classes and renamed the existing classes to:
  • SearchForm
  • OPRMatchedSearchForm
  • NDFMatchedSearchForm
  • OPRUnmatchedSearchForm
  • NDFUnmatchedSearchForm

Just as in the design from the last blog post, there are multiple responsibilities in each of these objects: the NDF search, the OPR search, the matched search and the unmatched search.
This led to a similar set of problems, with the number of classes growing with each product added.

The Solution
Unlike the previous example, here the interface for storing data into the forms changes. So the solution was to use composition to break up the SearchForm into a container that contained a ProductSearchForm and a MatchStateSearchForm. These two were the start of hierarchies that had a single responsibility each. The refactored solution looked like this:


One important point to note is that the criteria generation is now split between two classes, MatchingStatus and Product, and combined by the SearchForm. We can do this using an object-oriented criteria API such as Hibernate Criteria. As it was some years ago, I have given a solution which does not quite match the real system.
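A minimal sketch of the composed design, using the form names from above rather than the diagram's MatchingStatus and Product classes, and the legacy Hibernate Criteria API (the Trade entity, the field names and the contribute method are assumptions; the real system differed):

import org.hibernate.Criteria;
import org.hibernate.Session;
import org.hibernate.criterion.Restrictions;

class Trade { }   // hypothetical entity being searched for

// Each part of the form contributes its own restrictions to a shared query.
interface CriteriaContributor {
    void contribute(Criteria criteria);
}

class ProductSearchForm implements CriteriaContributor {
    private String product;   // e.g. "NDF" or "OPR", populated from the UI

    public void contribute(Criteria criteria) {
        criteria.add(Restrictions.eq("product", product));
    }
}

class MatchStateSearchForm implements CriteriaContributor {
    private boolean matched;
    private String matchingPartner;   // only meaningful for matched searches

    public void contribute(Criteria criteria) {
        criteria.add(Restrictions.eq("matched", matched));
        if (matched && matchingPartner != null) {
            criteria.add(Restrictions.eq("matchingPartner", matchingPartner));
        }
    }
}

// The container combines the criteria from its parts instead of subclassing per product.
class SearchForm {
    private final ProductSearchForm productForm = new ProductSearchForm();
    private final MatchStateSearchForm matchStateForm = new MatchStateSearchForm();

    public Criteria buildQuery(Session session) {
        Criteria criteria = session.createCriteria(Trade.class);
        productForm.contribute(criteria);
        matchStateForm.contribute(criteria);
        return criteria;
    }
}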

So you are probably wondering: did we not make the system more complex by doing this, as we have actually increased the number of classes? We have increased the number of classes, however:
  • The number of operations per class has gone down, because each class has only a single responsibility, making each class easier to understand.
  • There are no combinations of responsibilities to understand, or generate weird interaction defects. This was something that was experienced in the real system, and is not particularly apparent from the class diagrams above.
In summary, this was an easier solution to understand and maintain.

The Behaviours
By understanding the responsibilities of each class at a high level, the designer gives themselves the big picture. Once they understand the big picture they can:
  1. Start to look for good aspects of the solution and bad smells.
  2. Once these have been identified, generate the candidate solutions.
  3. Weigh up the pros and cons of each solution to find the best one, or combine candidates to generate better candidates.

Project Reporting Metrics

I would like to describe some of the things that I have been contributing to on PROPHET that are freeing up our time, allowing us to move onto other improvements.
When I arrived on PROPHET certain decisions had already been taken, including the tool set:
  • Enterprise Architect
  • Trac
  • Java
  • Maven
  • JUnit
It has been part of my job to glue these together for the project manager and design authority. The purpose of this article is to communicate the mechanisms used on PROPHET, with the aspiration that they can be used on other projects.
The story of the metrics starts with effort and the measurement of effort, and then moves on to progress against required functionality.

Effort Estimation

The first set of tickets that we were managing were design tickets. We attempted to put estimates on these tickets and needed somewhere to store them. We wanted to keep the estimate related to the work in the same place as the ticket, so we searched for a Trac plugin that would allow us to put estimates on the tickets. The Time and Estimation plugin allows both the effort estimate and the actual time spent to be recorded.

Collecting the Data

The data collected is only of use if it can assist decision making, and to do this it needs to be of sufficient quality. The Time and Estimation plugin gives some reports to help enforce the collection of data, such as a handy developer work summary report that allows team leaders to see when developers last logged time in Trac.
Another important choice is measuring the right quantity: for example effort, which is constant, rather than an end date, which can and will move, resulting in the ticket values having to be rewritten for the new plan. This rewriting is, of course, a waste of effort if constant values are chosen in the first place.
Once collected, the data is extracted from Trac using a custom report and exported into a spreadsheet. The spreadsheet gives a view of the figures varying over time, which allows the comparison of the previous week’s progress against the current week’s estimate.
Walking through the process from the start, a wiki page is used to provide the parameters that are required to generate the report. The wiki page allows an HTML form to be embedded in the page.
wiki page to execute the custom report

The result of the custom report is CSV, which can be downloaded from Trac and loaded into Excel. The data can then be copied into a spreadsheet which contains macros and the previous weeks’ data.
custom report

Interpreting the Information

Once a couple of milestones have been delivered you can assess the trend of the estimates. Are they always underestimates? This could indicate that the team is not as experienced as the estimator expected, or that the estimator has underestimated the complexity of the system. Is there high variability in the estimates? This could indicate that the grasp of the requirements was weak when the estimates were made. The case of underestimation is fairly easy to manage once you discover it, by applying a velocity multiplier to the estimates (for example, if tickets consistently take 25% longer than estimated, multiply future estimates by 1.25). The case of a weak understanding of the requirements is more difficult to fix once the ticket is under way and the plan made. Either way, having the figures gives you some evidence to back up your gut feeling about how your project is going. Hopefully it will also give you the confidence to take action to fix any issues that have been identified.

Setting Expectations about End Dates

When dependencies have been identified and reliable estimates made, a Gantt chart can be produced showing the estimated end date of the project.
planning Gantt chart

Measurement of Progress Against Functionality

After we moved from the design phase into the implementation phase, we started to focus on the progress that we were making against the plan. The first question that we asked was how we were going to track progress: by class responsibilities, UI widgets or functional tests? Class responsibilities will vary depending upon the design and interpretation of the system, and UI widgets suffer the same issue.
Firstly, whatever metric is chosen, it should be tied back to the client’s expectations, as not delivering on their expectations is a fast way of losing your client. As the client’s expectations have been expressed in their requirements, this is a good place to start. Secondly, the metric must be measurable by the client, so it must relate to a working system. The closest candidate that I have seen is the results of automated acceptance tests. To ensure that the acceptance tests reflect the requirements, a requirements trace is performed.

Collecting the Data

Firstly we specified our convention for the test definitions in the ticket description, where we had a tests heading and an ordered list of tests. This gave each test a unique identifier made up of the ticket number and test number. A sub-list spelt out the steps required for the test, but this was not strictly required to measure progress.
Trac ticket description with tests
Next we decided how to write our test implementations, using JUnit. By tying the definition and implementation together using a convention in the test method name, we were able to extract the ticket and test definition from the JUnit report results, allowing us to map the definition to the implementation and on to the test result. The final piece was to mark the requirements with a ticket and test number in Enterprise Architect.
We had a couple of choices about how to mark our test methods with the ticket and test number: Java annotations, Javadoc tags, or a method name convention. We actually implemented a Java annotation and started writing test methods before we had started automating the solution. This caused a headache down the line, as I was unable to get the combination of the JDT compiler and Tycho to perform annotation processing. After working my way around the problem, we switched to using a method name convention as it was guaranteed to work. We had to update our methods, but all the important information had already been captured, so the task was half a day of donkey work rather than resulting in missing tests. For this sort of thing it is better to take a guess at the implementation and capture the information than to delay the process and lose the information.
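A minimal sketch of what the convention and its extraction might look like (the exact naming pattern we used on the project is not shown in this post, so the pattern below is an assumption):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test;

public class Ticket123AcceptanceTest {

    @Test
    public void ticket123_test2_matchedTradeSearchReturnsMatchingPartner() {
        // test body exercising the behaviour described in ticket 123, test 2
    }

    /** Extracts { ticketNumber, testNumber } from a method name, or returns null for untraced tests. */
    static String[] parseTestId(String methodName) {
        Pattern pattern = Pattern.compile("ticket(\\d+)_test(\\d+)_.*");
        Matcher matcher = pattern.matcher(methodName);
        return matcher.matches() ? new String[] { matcher.group(1), matcher.group(2) } : null;
    }
}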
Whenever you are setting expectations, it is important to have a plan of progress and a measurement of actual progress, so the two can be compared to check that you will deliver on the expectations you have set. By making an estimate of the number of test definitions from the available effort and inserting identifiable placeholders into the tests, we were able to defer writing the actual tests until after the process had started to be used and had proved to work. The placeholder tests were identified by having only a single letter of the alphabet as their summary, allowing the progress of writing test definitions to be tracked. This gave the Project Manager an idea of what the likely final figure for the tests would be, so he could set expectations with the client and management. The placeholder identification statistics also allowed him to spot a likely bottleneck in the process.

Automating the Report Generation

We now had a good idea of how the process was going to work and had started capturing data, so we set about automating the process. I wrote a Maven plugin that grabbed the test definitions and blended them together with the test results to produce a tracing report containing:
  • ticket
  • test definition
  • class name
  • method name
  • test result (could be pass, fail, skipped)
tracing report
The test definitions were grabbed from Trac by reading the RSS feed of a custom query. By altering the query we were able to produce reports for individual teams, or for any other criteria that the PM was interested in.
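A rough sketch of pulling the items from such a feed with the JDK's DOM API (the query URL is hypothetical, and the assumption that appending format=rss to a custom query yields the feed should be verified against your Trac version):

import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class TracQueryFeedReader {

    /** Returns the title of every item in the feed; index 0 of the raw node list is the channel title. */
    public static List<String> readItemTitles(String feedUrl) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new URL(feedUrl).openStream());
        NodeList titles = document.getElementsByTagName("title");
        List<String> result = new ArrayList<String>();
        for (int i = 1; i < titles.getLength(); i++) {
            result.add(titles.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical URL for a custom query limited to one team's tickets.
        for (String title : readItemTitles("https://example.org/trac/query?owner=team-a&format=rss")) {
            System.out.println(title);
        }
    }
}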
The output of the plugin was an HTML file, which could be imported into Excel. A similar export was taken from EA, matching up the ticket and test numbers to give a final spreadsheet that could be incorporated into our deliverables.
The same code was used to count the tests written (the estimated progress), the tests implemented, and the tests passing (the actual progress). These statistics were supplied to Logica management by the Project Manager to demonstrate progress.
statistics report
The final part of the progress reporting was to build a summarised view of the test definitions, which counted the number of tests on each ticket. The PM could then use this, along with the ticket completion dates from the Gantt chart, to work out how many tests should be complete on a given date. This was plotted as a chart of planned progress, to which was added a line of tests passing on a given date: the actual progress.

progress chart
The entirety of this process took a few weeks to implement when interspersed with other work. All the reports have been exposed via a browser interface for easy access by the Project Manager and Team Leaders, which we have done by using Hudson’s functionality to publish artifacts.
An important point to note is that the test results in the Hudson screen shot show both tests that implement test definitions and any other JUnit tests.

Summary

We have removed a very labour intensive and error prone part of the project by automating the reporting. This should lead to a better product, as we are able to see where the holes are and take action before the client has a chance to see the product. By giving ourselves this foresight we should be able to deliver on our promises and reassure the client that we are going to deliver to their expectations.

The Future

Future work revolves around the review process: speeding it up and making it easier for the reviewer. This includes ensuring that the review process has been finished, and providing a way of comparing the test definition to the test implementation.
Speeding up the process will reduce rework and improve quality by closing the gap between developers making non-optimal decisions and the review catching them.
Finally, it will help planning by detecting bottlenecks in the delivery of tickets at the review stage.

Other work includes improving the identification of dependencies and the recording of their effects, such as whether they are end-to-start dependencies or not. This extra information is stored as text, in a bulleted list of dependencies. The dependencies can then be used by the Project Manager to construct the Gantt chart.

How Applicable is this to Other Projects

Integrating this with your issue management system may be somewhat different; however, the only real integration points are the RSS feed that supplies ticket information and the custom queries. Happy metrics!

The Perception of Rushing and the Evil of the 'J' Word

The perception of those around us rushing will cause us to take short cuts. When under stress, it will even short-cut the decision of whether to take a short cut!

It is these decisions that we have failed to fully consider that often come back to haunt us, either by creating more work further down the line or by putting our colleagues’ backs up. This is what John Kotter calls a false sense of urgency in his book 'A Sense of Urgency'.

1. So give your colleagues a break and resist the urge to rush into decisions!

You can often spot when people have short-cut the consideration of the consequences of what they are telling you, because they use the word 'just'. They are hiding a massive something by making it sound small with the word 'just'.

"Can you just...?"

"I just want to...?"

The 'J' word can also be used to belittle people when describing something they do or have done. In fact, I could only think of one positive use of the word 'just', relating to justice:

"Is it just to do...?"

2. Whenever you hear yourself using the 'J' word, take a moment to ask yourself whether you really want the outcome that you are asking for, or whether there is a more effective approach.

I'll leave some of the IT consequences for another post...