Today's software comes with extensive documentation: user guides, reference manuals, and design documents. There are on-line help systems, interactive tutorials, and friendly `introductions for dummies'. Yet the behaviour of software is often a surprise to users and designers alike. Components interact and interfere, undesirable properties emerge, and systems fail to meet their requirements.
The more spectacular consequences make the headlines: aircraft have crashed, trains have collided, people have received fatal doses of radiation, and emergency telephone services have been withdrawn. The less spectacular we face every day: time is wasted, effort is expended to no avail, important projects are scrapped, and our health is damaged by sheer frustration. All of this, and more, because software fails to live up to our expectations.
There are many explanations for this: the requirements upon a piece of software are hard to define, the ways in which a system may be used are hard to anticipate, and there is always a demand for additional functionality. Indeed, the fact that many pieces of software actually work, and work well, is some indication of the skill of those whose job it is to develop them.
One way to improve the quality of software is to change the way in which software is documented: at the design stage, during development, and after release. Existing methods of documentation offer large amounts of text, pictures, and diagrams, but these are often imprecise and ambiguous. Important information is hidden amongst irrelevant detail, and design flaws are discovered too late, making them expensive or impossible to correct.
There is an alternative. Formal methods, based upon elementary mathematics, can be used to produce precise, unambiguous documentation, in which information is structured and presented at an appropriate level of abstraction. This documentation can be used to support the design process, and as a guide to subsequent development, testing, and maintenance.
It seems likely that the use of formal methods will become standard practice in software engineering. The mathematical basis is different from that of civil or mechanical engineering, but it has the same purpose: to add precision, to aid understanding, and to reason about properties of a design. Whatever the discipline, the use of mathematics can be expensive, but it is our experience that it can actually reduce costs.
Existing applications of formal methods include: the use of probability theory in performance modelling; the use of context-free grammars in compiler design; the use of the relational calculus in database theory. The formal method described in this book has been used in the specification and design of large software systems. It is intended for the description of state and state-based properties, and includes a theory of refinement that allows mathematics to be used at every stage of program development.
CICS is one of the most successful pieces of software in the world: there are over 30~000 licences, and most of the world's top companies use it. CICS stands for Customer Information Control System, a family of transaction processing products produced by IBM UK Laboratories at Hursley Park. CICS provides data access, communications, integrity, and security services. Put simply, CICS manages information.
When we use an automated teller machine in San Francisco, an account at our local branch in Oxford is debited, even though the machine is thousands of miles away. During the busiest times, there may be many thousands of customers of the bank using the service all over the world, and we all expect to be served within a reasonable time. CICS offers a way of achieving this.
There have been regular releases of CICS since the mid-1970s. Each release has introduced additional features and extended the structure of the existing code. In the early 1980s, the complexity of the system started to become a serious problem for the company. A decision was made to re-design some of the CICS modules with the aim of making extensions easier. An important part of the proposed solution involved finding a more precise way to specify functionality.
Such precision requires the use of mathematical techniques that were, at that time, little known outside academia. A happy coincidence brought the CICS manager, Tony Kenny, and the Oxford professor, Tony Hoare, together at a conference. They hatched a plan to apply Oxford's ideas to Hursley's problems. Oxford advised on how formal methods could be used for the specification and design of new CICS modules. Hursley showed how the methods could be adapted to problems on an industrial scale.
A particular formal method, the Z notation, was used to specify the new CICS functionality. Hursley's programmers were used to writing specifications in English, and the rigorous, mathematical notation was seen as a challenge. In practice, the notation proved easy to learn and to apply, even for programmers with no previous experience of mathematics. The result was a perceived improvement in the quality and reliability of delivered code.
The first CICS product to be designed using Z was CICS/ESA version 3, announced in June 1989. In April 1992, the Queen's Award for Technological Achievement was conferred upon IBM United Kingdom Laboratories Limited and Oxford University Computing Laboratory for `the development and use of an advanced programming method that reduces development costs and significantly enhances quality and reliability': namely, Z.
The Z notation is based upon set theory and mathematical logic. The set theory used includes standard set operators, set comprehensions, Cartesian products, and power sets. The mathematical logic is a first-order predicate calculus. Together, they make up a mathematical language that is easy to learn and to apply. However, this language is only one aspect of Z.
Another aspect is the way in which the mathematics can be structured. Mathematical objects and their properties can be collected together in schemas: patterns of declaration and constraint. The schema language can be used to describe the state of a system, and the ways in which that state may change. It can also be used to describe system properties, and to reason about possible refinements of a design.
A characteristic feature of Z is the use of types. Every object in the mathematical language has a unique type, represented as a maximal set in the current specification. As well as providing a useful link to programming practice, this notion of types means that an algorithm can be written to check the type of every object in a specification; several type-checking tools exist to support the practical use of Z.
A third aspect is the use of natural language. We use mathematics to state the problem, to discover solutions, and to prove that the chosen design meets the specification. We use natural language to relate the mathematics to objects in the real world; this job is often partly achieved by the judicious naming of variables, but additional commentary is vital. A well-written specification should be perfectly obvious to the reader.
A fourth aspect is refinement. We may develop a system by constructing a model of a design, using simple mathematical data types to identify the desired behaviour. We may then refine this description by constructing another model which respects the design decisions made, and yet is closer to implementation. Where appropriate, this process of refinement can be continued until executable code is produced.
The Z notation, then, is a mathematical language with a powerful structuring mechanism. In combination with natural language, it can be used to produce formal specifications. We may reason about these specifications using the proof techniques of mathematical logic. We may also refine a specification, yielding another description that is closer to executable code.
Z is not intended for the description of non-functional properties, such as usability, performance, size, and reliability. Neither is it intended for the description of timed or concurrent behaviour. However, there are other formal methods that are well suited for these purposes. We may use these methods in combination with Z to relate state and state-change information to complementary aspects of design.
In this book, we place considerable emphasis upon proof. When we introduce the language of mathematical logic, we explain the use of a proof system. When we introduce the language of sets and relations, we explain how formal proofs may be constructed about such objects. When we introduce the language of schemas, we show how to prove that a specification is consistent, and how to prove that one specification refines another. Our intentions are two-fold: first, to show that proof adds quality to software development; second, to show that proof is a feasible part of the industrial use of formal methods.
If we reason about a specification, if we attempt to construct proofs about its properties, then we are more likely to detect problems at an early stage of system development. The process of constructing proofs can help us to understand the requirements upon a system, and can assist us in identifying any hidden assumptions. Proof at the specification stage can make a significant contribution to the quality of software.
At the design stage, a proof can show us not only that a design is correct, but also why it is correct. The additional insight that this affords can be invaluable: as requirements evolve and the design is modified, the consequences are easier to investigate. At the implementation stage, a proof can help us to ensure that a piece of code behaves according to the specification. Again, a significant contribution to quality can be made.
The construction of proofs is an essential part of writing a specification, just as proof-reading is an essential part of writing a book. A specification without proofs is untested: it may be inconsistent; it may describe properties that were not intended, or omit those that were; it may make inappropriate assumptions. The practice of proof makes for better specifications.
It seems to be part of software engineering folklore that proof is impossible on an industrial scale; however, our experience has been different. We have been involved in many large-scale applications of formal methods; some involved proof, others did not. We have seen that techniques involving proof are successful where formal methods are used with a light touch, and where proofs are conducted at an appropriate level of formality.
In many situations, a rigorous argument, or a semi-formal justification, will be sufficient to bring about the desired improvement in quality. In other, more critical situations, it may be necessary to increase the level of formality until the correctness of the design is beyond doubt. In some situations, a completely formal proof may be required. The trick of using formal methods effectively is to know when proofs are worth doing and when they are not.
An essential property of a good specification is an appropriate choice of abstraction. A good example of this is provided by the various maps of the London Underground. When the first map was published in 1908, it was faithful to the geography of the lines: all the twists and turns of the tracks and the relative distances between stations were recorded faithfully and to scale. However, the purpose of the map was to show travellers the order of stations on each line, and the various interchanges between lines; the fidelity of the map made it difficult to extract this information.
In 1933, the map was changed to a more abstract representation, called the Diagram. Here, the connectivity of stations on the network was preserved, and at last, passengers could see at a glance the route to their destination. Abstraction from superfluous detail - in this case the physical layout of the lines - was the key to the usefulness of the Diagram. Figures 1 and 2 show published versions before and after the change.
The Diagram was, and still is, a good specification of the London Underground network. It is
`It might have been expected that the advent of such an abstract, schematised image would generate a response similar to one's first sight of a Mondrian painting: respectful, awed, intrigued, maybe mystified, but surely never affectionate? Yet it was so.'
Furthermore, the Diagram may be used to predict the result of travelling on the Underground network. We might observe that if we start at Oxford Circus, travel eastbound on the Central Line and change trains at Tottenham Court Road, then take the Northern Line, we may arrive at Mornington Crescent. In mathematical terms, this property is a theorem of the system; in practical terms, it describes a possible route.
The Diagram has served its purpose well; if only every specification were as good as this. Interestingly, the first sketch of the Diagram was rejected by the Publicity Department of the Underground. They thought that the idea of a 90-degree and 45-degree schematic treatment was too `revolutionary'. The abstract notation was thought to be too strange and incomprehensible for the ordinary user of the Underground network.
From Using Z by Jim Woodcock and Jim Davies (c) 1996 Prentice Hall International
| text | slides | exercises | solutions | contents |