Abstract Interpretation
Web page maintained by P. Cousot
Last update: Aug 5, 2008
From: Abstract Interpretation
Introduction to Abstract Interpretation
The
formal verification of a program (and more generally acomputer system) consists in proving that its
semantics(describing "what the program executions actually do") satisfies its
specification (describing "what the program executions aresupposed to do").
Abstract interpretation[] formalizes the ideathat this formal proof can be done at some level of abstraction whereirrelevant details about the semantics and the specification areignored. This amounts to proving that an
abstract semanticssatisfies an
abstract specification. An example of abstractsemantics is Hoare logic while examples of abstract specifications areinvariance, partial, or total correctness. This abstracts away fromconcrete properties such as execution times.Abstractions should preferably be
sound (no conclusion derivedfrom the abstract semantics is wrong relative to the program concretesemantics and specification). Otherwise stated, a proof that theabstract semantics satisfies the abstract specification should implythat the concrete semantics also satisfies the concrete specification.Hoare logic is a sound verification method, debugging is not (sincesome executions are left out), bounded model checking is not either(since parts of some executions are left out). Unsound abstractionslead to
false negatives (the program may be claimed to becorrect/non erroneous with respect to the specification whereas it isindeed incorrect).Abstractions should also preferably be
complete (no aspect ofthe semantics relevant to the specification is left out). So if theconcrete semantics satisfies the concrete specification this should beprovable in the abstract. However program proofs are undecidable, andso automatic tools for reasoning about programs are all incomplete(for non trivial program properties such as safety, liveness, orsecurity) and must therefore fail on some programs. This can beachieved by allowing the tool not to terminate, to be unsound (e.g. debugging tools omit possible executions), or to be incomplete (e.g. static analysis tools may produce false alarms). Incompleteabstractions lead to
false positives or
false alarms(the specification is claimed to be potentially violated by someprogram executions while it is not).Potentially non-terminating and unsound program verification tools areeasy to design. Terminating and sound tools are much more difficultto design. Complete tools are impossible to design, byundecidability. However static analysis tools producing very few orno false alarms have been designed and used in industrial contexts forspecific families of properties and programs [].In all cases, abstract interpretation provides a systematicconstruction method based on the effective approximation of theconcrete semantics, which can be (partly) automated and/or formallyverified.Abstract interpretation aims at
- providing a basic coherent and conceptual theory forunderstanding in a unified framework the thousands of ideas, concepts,reasonings, methods, and tools on formal program analysis andverification [];
- guiding the correct formal design of automatic tools for programanalysis (computing an abstract semantics) and programverification (proving that an abstract semantics satisfies anabstract specification) [].
Abstract interpretation theory studies semantics (formal models ofcomputer systems), abstractions, their soundness, and completeness.The informal presentation"AbstractInterpretation in a Nutshell" aims at providing a short intuitiveintroduction to the theory. The"basicconceptsofabstractinterpretation" and an elementary"courseon abstract interpretation" can also be found on the web.
What can be formalized by abstract interpretation?
Abstract interpretation is useful in computer science to formalizereasonings involving the [sound [and complete]] approximation of thesemantics of formal systems. A non-exhaustive list of typicalexamples of application is given below.
2.1 Syntax
The analysis of properties of grammars as well as parsing are abstractinterpretations of the grammar operational semantics and itsabstractions (such as parse trees, protolanguages, or terminallanguages) [].
2.2 Semantics
The semantics of programs describes their possible runtime executionsin all possible execution environments at some level of abstraction.The hierarchy of semantics, including trace, small-step, and big-stepoperational semantics, relational semantics, denotational semantics,predicate transformers, and axiomatic semantics in their angelic,natural, and demoniac versions can be designed by abstractinterpretation [].
2.3 Proofs
Formal proofs of program correctness involve an abstraction sincespecifications always ignore some aspects of program execution whichneed not be taken into account in the proof [].
2.4 Static analysis
Static analysis is the automatic determination of abstract programproperties[], etc. This wasthe motivating application behind the introduction of the abstractinterpretation theory.
2.5 Types
Static typing and type inference [].
2.6 Model-checking
Model-checking exhaustively verifies temporal properties on afinite model of hardware or software computer systems[], .Model-checking is reputed to be terminating, sound, and complete on themodel. From an abstract interpretation point of view, relating thesystem to its model, it may be sound on the model but unsound on thesystem (e.g. the model is correct for safety properties but wrong forliveness properties), it is often incomplete (no finite model cancover the specified behaviors of the system [])and, in practice, may explode combinatorially. In all cases abstractinterpretations of the system into a model have to be considered. Alltransition models are abstract semantics but the converse is not true.
2.7 Predicate abstraction
Predicate abstraction [].
2.8 Counter-example-based refinement
Spurious counter-example-based refinement in abstract model-checkingis formalized as an abstract domain completion problem in the abstractinterpretation theory [].
2.9 Strong Preservation
The problem of modifying finite abstract model checking by minimalrefinements in order to get strong preservation for some specificationlanguage, including partition refinement algorithms, is a completionproblem in the abstract interpretation theory[].
2.10 Program transformation
Program transformations (such as
partial evaluation[].
2.11 Watermarking
Looking for program watermarks that are not subject to obfuscation, onecan think to an abstract interpretation of the program semantics[].
2.12 Information hiding
In language-based software security, the information to be hidden to anintruder can be formalized as an abstract interpretation of theprogram semantics [].
2.13 Code obfuscation
The aim of code obfuscation is to prevent malicious users fromdiscovering properties of the original source program. This goal canbe precisely modeled by abstract interpretation, where the hiding ofproperties corresponds to abstracting the semantics[].
2.14 Malware detection
An obfuscated malware is better detected by the effects of itsexecution as recognized by an abstract interpretation rather than by asyntactic signature [].
2.15 Termination
A relational abstract-interpretation-based static analysis on awell-founded abstract domain can be systematically extended to atermination analysis [].