From Machine Learning to Software Engineering
Francesco Bergadano and Daniele Gunetti
The logic programming approach to computing investigates the use of logic as a programming language and explores computational models based on controlled deduction.
The field of logic programming has seen tremendous growth in the last several years, both in depth and in scope. This growth is reflected in the number of articles, journals, theses, books, workshops, and conferences devoted to the subject. The MIT Press series in logic programming was created to accommodate this development and to nurture it. It is dedicated to the publication of high-quality textbooks, monographs, collections, and proceedings in logic programming.
Ehud Shapiro
The Weizmann Institute of Science
Rehovot, Israel
Inductive Logic Programming (ILP) has evolved from previous research in Machine Learning, Logic Programming, and Inductive Program Synthesis. Like relational Machine Learning, it deals with the induction of concepts represented in a logical form. Typically, the output of an ILP learner is a set of Horn clauses. However, ILP cannot be just a new name for relational Machine Learning. This book emphasizes this fact by giving special attention to "Programming," less to "Inductive" and "Logic." ILP techniques have the potential to support software development and maintenance. Some of these techniques, developed by the authors, are studied in full detail and their implementations are made available through anonymous ftp. Their Software Engineering applications are then discussed in the context of a relatively complex logic program. However, this book also has the important motivation of providing an up-to-date and extended survey of ILP as a research area. The basic notions, the most common induction operators, and the best-known methods and systems are described and analyzed. Compared with other existing surveys, the present book may be more complete, and includes newer notions, such as inverse implication; newer goals that are important for programming assistants, such as multiple predicate learning; and important topics that are often overlooked, such as declarative bias.
This work is the result of many years of research, and many have directly or indirectly contributed to it. We wish to thank Lorenza Saitta and Attilio Giordana, as well as Gabriele Lolli, for the early joint work that has inspired our more recent perspectives, and for their continuing support. As in all long-term research efforts, an essential contribution is represented by discussions, joint work on related topics, and collaboration in international projects. This kind of support is due to many researchers who cannot all be listed here. However, we would like to mention the Machine Learning and AI group at the University of Turin, the European researchers involved in the ESPRIT ILP project, and Ryszard Michalski and his group at George Mason University. Stan Matwin and Claire Nedellec read and commented on a first draft of the book, which resulted in a number of improvements and changes. The research described in this book was financially supported by the European Union, under contracts 6020 (ESPRIT BRA ILP) and 6156 (ESPRIT BRA DRUMS2); by involvement in the ILPNET PECO network and in the MLNET network of excellence; and by the Italian CNR, for the bilateral project on Inductive Inference (Italy-USA).
Francesco Bergadano and Daniele Gunetti
Turin, January 1996
Inductive Logic Programming is sufficiently new to require starting this book with a definition. A definition that has generated some agreement is found in [62]. It states that Inductive Logic Programming is the research area covering the intersection of Machine Learning and Logic Programming. This is about like saying that it is concerned with Induction and Logic Programming. Indeed: inductive, logic, and programming. Although the authority of this view dates back to Lapalisse, we will risk generating less agreement and propose a more informative notion. Inductive Logic Programming (called ILP in the rest of the book) has been concerned with systems and general methods that are given examples and produce programs. In fact, an ILP system may receive various kinds of information about the desired program as input, but this input always includes examples of the program's input/output behavior. The output that is produced is a logic program that behaves as expected on the given examples, or at least on a high percentage of them. Typically, the obtained programs will then be used on new examples, not given to the ILP system during the learning phase. The above may be an oversimplification, but for a computer scientist, it is just a form of program synthesis from examples. The view presented in this book emphasizes the fact that examples are absolutely not the only input to practical ILP methods. Another important source of information comes from a priori knowledge about the target program, including partially developed software components, and properties of the needed subprocedures, such as the number and the type of the arguments and of the returned values. To stress this observation, and also the fact that practical tools will have to be embedded in more complex environments, we would like to look at ILP as logic program development with the help of examples, and not just automatic programming from examples.
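The acceptance criterion mentioned above, a program that behaves as expected on the given examples or at least on a high percentage of them, can be made concrete with a small sketch. It is only an illustration under stated assumptions: Python functions stand in for logic programs, and the names `fraction_correct`, `append_candidate`, and `faulty_candidate` are hypothetical. The sketch checks candidate programs against input/output examples; it does not induce them.

```python
from typing import Callable, List, Sequence, Tuple

# An example pairs the inputs of the target program with its expected output.
Example = Tuple[Tuple[List[int], List[int]], List[int]]


def fraction_correct(program: Callable[..., List[int]],
                     examples: Sequence[Example]) -> float:
    """Return the fraction of examples on which `program` gives the expected output."""
    correct = sum(1 for inputs, expected in examples
                  if program(*inputs) == expected)
    return correct / len(examples)


def append_candidate(xs: List[int], ys: List[int]) -> List[int]:
    """Hypothetical candidate: list concatenation, the intended target here."""
    return xs + ys


def faulty_candidate(xs: List[int], ys: List[int]) -> List[int]:
    """Hypothetical faulty candidate: ignores its second argument."""
    return list(xs)


# Input/output examples of the desired behavior (appending two lists).
examples: List[Example] = [
    (([], [1]), [1]),
    (([1], [2]), [1, 2]),
    (([1, 2], []), [1, 2]),
    (([3], [4, 5]), [3, 4, 5]),
]

good_score = fraction_correct(append_candidate, examples)
bad_score = fraction_correct(faulty_candidate, examples)
```

Here `append_candidate` agrees with every example, while `faulty_candidate` agrees only with the one whose second list is empty. A learner of the kind described above would accept the first and reject the second, or accept a candidate whose score exceeds some chosen threshold when the examples may be noisy.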
This book provides an up-to-date and complete survey of ILP as a research area, and then concentrates on methods that are effective for software development.
Although definitions may be useful, and ours also serves the purpose of proposing a particular perspective, a better way to define a research area is to look at the roots of its ideas and at the history of its early developments. In this context, the definition of ILP as the intersection of Machine Learning and Logic Programming loses its tautological strength and seems to suggest that ILP has evolved from both disciplines. But this is not true. Most research and researchers in ILP came from Machine Learning alone, and some of the initial motivations are meaningful if framed in the evolution of inductive reasoning from Pattern Recognition, through initial approaches to symbolic Machine Learning, to more recent techniques for learning relational concepts.
In Pattern Recognition, one obtains the definition of a class, that is, a group of objects that are similar or should be grouped together for the purposes of some application. The definition that is learned inductively is usually numeric, or at least contains some numeric parameters. The main purpose of the training phase is the fine-tuning of these numeric parameters. A good example is found in linear discriminants. A class is a set of points in an n-dimensional space. A learned description of a class is a hyperplane in that space: what lies on one side of the hyperplane is classified as belonging to the class. The training phase determines the parameters of the hyperplane.
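As an illustration of a training phase that determines the parameters of a hyperplane, the following minimal sketch uses the classical perceptron update rule. The choice of the perceptron is our assumption, since the text does not name a particular training procedure, and the function names and the small two-dimensional data set are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def train_perceptron(points: Sequence[Point], labels: Sequence[int],
                     epochs: int = 100, rate: float = 1.0
                     ) -> Tuple[List[float], float]:
    """Fit a hyperplane w.x + b = 0 with the perceptron rule.

    Labels are +1 (in the class) or -1 (outside the class).
    Training stops early once a full pass makes no mistakes.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            predicted = 1 if activation > 0 else -1
            if predicted != y:
                # Move the hyperplane toward the correct side of x.
                w = [wi + rate * y * xi for wi, xi in zip(w, x)]
                b += rate * y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def in_class(w: Sequence[float], b: float, x: Point) -> bool:
    """A point belongs to the class if it lies on the positive side."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0


# Hypothetical, linearly separable training set in two dimensions.
points: List[Point] = [(2.0, 2.0), (3.0, 3.0), (3.0, 1.0),
                       (0.0, 0.0), (-1.0, 1.0), (0.0, -1.0)]
labels: List[int] = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(points, labels)
```

After training, `in_class(w, b, x)` classifies new points by the side of the hyperplane on which they fall. The learned description consists only of the numeric parameters `w` and `b`, which is the sense in which the training phase above amounts to tuning numeric parameters. The procedure is guaranteed to terminate with a separating hyperplane only when the training set is linearly separable.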