The terms fault and failure are sometimes used loosely to mean the same thing but they are actually quite different. A fault is something inherent in the software – a failure is something that happens in the real world. Faults do not necessarily lead to failures and failures often occur in software that is not ‘faulty’.
The reason for this is that whether some behaviour is a failure or not depends on the judgement of the observer and their expectations of the software. For example, I recently tried to buy two day passes on the Lisbon metro, one for myself and one for my wife. They use reusable cards, so you buy 2 cards then credit each with the appropriate pass. The dialogue with the machine went as follows:
How many cards (0.5€ each): 2
How many passes (3.7€ each): 2
Total to pay: 15.8€
To put it mildly, I was surprised. I tried twice and the same thing happened. I then bought the passes one at a time and all was fine: I paid the correct fee of 8.4€.
From my perspective, this was a software failure. It meant that I had to spend longer than I should have buying these passes. On the train, I tried to think about what might have happened. My guess is that it is possible to buy more than one day pass at a time and have them all credited to the same card. So, the 2nd question should have been:
How many passes on each card?
From a testing perspective, the software was probably fine and free of defects and, if you understood the system, then you would have entered 1 pass per card.
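The arithmetic behind the two readings of that second question can be sketched as follows. This is only my guess at the machine's logic, not its actual code; the function names are made up for illustration, and prices are held in cents to avoid rounding issues.

```python
CARD_PRICE_CENTS = 50    # 0.5€ per reusable card, as quoted by the machine
PASS_PRICE_CENTS = 370   # 3.7€ per day pass, as quoted by the machine


def total_if_passes_overall(cards: int, passes: int) -> int:
    """My reading: 'passes' is the total number of passes bought."""
    return cards * CARD_PRICE_CENTS + passes * PASS_PRICE_CENTS


def total_if_passes_per_card(cards: int, passes: int) -> int:
    """The machine's apparent reading (assumed): 'passes' go on each card."""
    return cards * CARD_PRICE_CENTS + cards * passes * PASS_PRICE_CENTS


expected = total_if_passes_overall(2, 2)   # what I thought I would pay
charged = total_if_passes_per_card(2, 2)   # what the machine asked for
print(f"expected {expected / 100:.2f}€, charged {charged / 100:.2f}€")
```

With 2 cards and 2 passes, the first reading gives 840 cents (8.4€) and the second gives 1580 cents (15.8€), which matches what the machine asked me to pay. Both functions are 'correct' for their own reading of the question; they only disagree about what the question means.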
So, failures are not some absolute thing that can be tested for. They will always happen because different people will have different expectations of systems. That’s the theme of my keynote talk at the SEPGEurope 2010 conference in Porto. We need to design software to help people understand what it’s doing and help them recover from failures.