This introduction will provide the necessary background on Semantic Web Services and their evaluation. It will then introduce SWS evaluation goals, dimensions and criteria, and compare the existing community efforts with respect to these. This comparison makes it possible to understand the similarities and differences of these complementary efforts and the motivation behind their design.
Finally, in the last section, we will discuss lessons learned that concern all of the evaluation initiatives. In addition, we will analyze open research problems in the area and provide an outlook on future work and directions of development.
1.1 Organization of the Book
The remainder of the book is divided into four parts. Each part refers to one of the evaluation initiatives, including an introductory chapter followed by chapters provided by selected participants in the initiatives.
Part I will cover the two long-established tracks of the Semantic Service Selection (S3) Contest: the OWL-S matchmaker evaluation and the SAWSDL matchmaker evaluation. Part II will cover the new S3 Jena Geography Dataset (JGD) cross evaluation contest. Part III will cover the SWS Challenge. Finally, Part IV will cover the semantic aspects of the WS Challenge.
The introduction to each part provides an overview of the evaluation initiative and the overall results of its latest evaluation workshops. The following chapters in each part, contributed by the participants, present their approaches, solutions and lessons learned.
1.2 SWS in a Nutshell
Semantic Web Services (SWS) has been a vigorous technology research area for about a decade now. As its name indicates, the field lies at the intersection of two important trends in the evolution of the World Wide Web. The first trend is the development of Web service technologies, whose long-term promise is to make the Web a place that supports shared activities (transactions, processes, formation of virtual organizations, etc.) as well as it supports shared information []. Earlier technologies, such as Representational State Transfer (REST) protocols, were limited to a pre-established understanding of message types or shared data dictionaries.
Consequently, the second trend, the Semantic Web, is focused on the publication of more expressive metadata in a shared knowledge framework, enabling the deployment of software agents that can make intelligent use of Web resources or services []. In its essence, the Semantic Web brings knowledge representation languages and ontologies into the fabric of the Internet, providing a foundation for a variety of powerful new approaches to organizing, describing, searching for, and reasoning about both information and activities on the Web (or other networked environments). The central theme of SWS, then, is the use of richer, more declarative descriptions of the elements of dynamic distributed computation: services, processes, message-based conversations, transactions, etc. These descriptions, in turn, enable fuller, more flexible automation of service provision and use, and the construction of more powerful tools and methodologies for working with services.
Because a rich representation framework permits a more comprehensive specification of many different aspects of services, SWS can provide a solid foundation for a broad range of activities throughout the Web service lifecycle. For example, richer service descriptions can support greater automation of service discovery, selection and invocation, automated translation of message content (mediation) between heterogeneous interoperating services, automated or semi-automated approaches to service composition, and more comprehensive approaches to service monitoring and recovery from failure. Further down the road, richer semantics can help to provide fuller automation of such activities as verification, simulation, configuration, supply chain management, contracting, and negotiation of services. This applies not only to the Internet at large, but also within organizations and virtual organizations.
SWS research, as a distinct field, began in earnest in 2001. In that year, the initial release of OWL for Services (OWL-S, originally known as DAML-S) was made available [].
In the world of standards, a number of activities have reflected the strong interest in this work. Among the most visible of these is Semantic Annotations for WSDL (SAWSDL).
1.3 Evaluation in General
Evaluation has been part of science and scientific progress for a long time. In this section, we will have a brief look at evaluation in general before we focus on the much shorter history of evaluation in computer science.
1.3.1 Benefits and Aims of Evaluation
Lord Kelvin reportedly said, more than 100 years ago: "If you cannot measure it, you cannot improve it." This sentence captures one of the main motivations for evaluation in a nutshell: by defining criteria that measure how good a system is, it becomes possible to objectively identify the strengths and weaknesses of the system and to systematically find areas that need improvement. The German Evaluation Society puts it a bit more formally []:
Evaluation is the systematic investigation of an evaluand's worth or merit. Evaluands include programs, studies, products, schemes, services, organizations, policies, technologies and research projects. The results, conclusions and recommendations shall derive from comprehensive, empirical qualitative and/or quantitative data.
When looking at the evaluation of software, [] offers a useful summary of the possible goals of an evaluation: it may aim at comparing different software systems (Which one is better?), at measuring the quality of a given system (How good is it?), and/or at identifying weaknesses and areas for improvement (Why is it bad?).
Although asking these questions clearly makes sense and contributes to advancing computer science, evaluation is, in general, rather neglected in the field. While benchmarks have long been used systematically in some areas of computer science, systematic experimentation has only recently gained importance in others. This may be because computer science is a very young discipline that has not yet had much time to establish its scientific standards. Several independent studies show that, compared to other sciences, experimental papers and meaningful evaluations are less frequent in computer science []. One area of computer science where this shortcoming was recognized early on and overcome by a community effort, namely the establishment of the TREC conference, is Information Retrieval (IR). This is particularly interesting in the context of this book, as IR and Semantic Web Service discovery share a number of obvious similarities (albeit also differences) that are leveraged by some of the initiatives described in this book. Many Semantic Web Service evaluation techniques adopt and extend established IR quality measures.
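To make the borrowed IR measures concrete, the following is a minimal illustrative sketch (not taken from any of the initiatives in this book; the function name and example service identifiers are invented for illustration) of the two classic measures, precision and recall, that SWS matchmaker evaluations reuse:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for the set of retrieved items
    measured against the set of truly relevant items.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: a matchmaker returns four services for a query,
# three of which are relevant; five services are relevant in total.
p, r = precision_recall(
    ["s1", "s2", "s3", "s4"],          # retrieved by the matchmaker
    ["s1", "s2", "s3", "s5", "s6"],    # relevance judgments (gold standard)
)
print(p, r)  # 0.75 0.6
```

In a matchmaker evaluation these measures are typically computed per query over a gold standard of relevance judgments and then averaged, which is the pattern the IR community established with TREC.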