Edited by
RUSLAN MITKOV
RUSLAN MITKOV
MARTIN KAY
PART I FUNDAMENTALS
STEVEN BIRD
HARALD TROST
PATRICK HANKS
RONALD M. KAPLAN
SHALOM LAPPIN
ALLAN RAMSAY
GEOFFREY LEECH AND MARTIN WEISSER
CARLOS MARTIN-VIDE
BOB CARPENTER
PART II PROCESSES, METHODS, AND RESOURCES
ANDREI MIKHEEV
ATRO VOUTILAINEN
JOHN CARROLL
MARK STEVENSON AND YORICK WILKS
RUSLAN MITKOV
JOHN BATEMAN AND MICHAEL ZOCK
LORI LAMEL AND JEAN-LUC GAUVAIN
THIERRY DUTOIT AND YANNIS STYLIANOU
LAURI KARTTUNEN
CHRISTER SAMUELSSON
RAYMOND J. MOONEY
YUJI MATSUMOTO
LYNETTE HIRSCHMAN AND INDERJEET MANI
RICHARD I. KITTREDGE
TONY MCENERY
PIEK VOSSEN
ARAVIND K. JOSHI
PART III APPLICATIONS
JOHN HUTCHINS
HAROLD SOMERS
EVELYNE TZOUKERMANN, JUDITH L. KLAVANS, AND TOMEK STRZALKOWSKI
RALPH GRISHMAN
SANDA HARABAGIU AND DAN MOLDOVAN
EDUARD HOVY
CHRISTIAN JACQUEMIN AND DIDIER BOURIGAULT
MARTI A. HEARST
ION ANDROUTSOPOULOS AND MARIA ARETOULAKI
ELISABETH ANDRE
JOHN NERBONNE
GREGORY GREFENSTETTE AND FREDERIQUE SEGOND
Computational Linguistics is an interdisciplinary field concerned with the processing of language by computers. Since machine translation began to emerge some fifty years ago (see Martin Kay's introduction below), Computational Linguistics has grown and developed exponentially. It has expanded theoretically through the development of computational and formal models of language. In the process it has vastly increased the range and usefulness of its applications. At a time of continuing and vigorous growth the Oxford Handbook of Computational Linguistics provides a much-needed reference and guide. It aims to be of use to everyone interested in the subject, including students wanting to familiarize themselves with its key areas, researchers in other areas in search of an appropriate model or method, and those already working in the field who want to discover the latest developments in areas adjacent to their own.
The Handbook is structured in three parts which reflect a natural progression from theory to applications.
Part I introduces the fundamentals: it considers, from a computational perspective, the main areas of linguistics such as phonology, morphology, lexicography, syntax, semantics, discourse, pragmatics, and dialogue. It also looks at central issues in mathematical linguistics such as formal grammars and languages, and complexity.
Part II is devoted to the basic stages, tasks, and methods of automatic language processing and to the resources it requires. It examines text segmentation, part-of-speech tagging, parsing, word-sense disambiguation, anaphora resolution, natural language generation, speech recognition, text-to-speech synthesis, finite state technology, statistical methods, machine learning, lexical knowledge acquisition, evaluation, sublanguages, controlled languages, corpora, ontologies, and tree-adjoining grammars.
Part III describes the main real-world applications based on Computational Linguistics techniques, including machine translation, information retrieval, information extraction, question answering, text summarization, term extraction, text data mining, natural language interfaces, spoken dialogue systems, multimodal/multimedia systems, computer-aided language learning, and multilingual on-line language processing.
Those who are relatively new to Computational Linguistics may find it helpful to familiarize themselves with the preliminaries in Part I before going on to subjects in Parts II and III. Reading Chapter 4 on syntax, for example, should help the reader to understand the account of parsing in Chapter 12.
To make the book as coherent and useful as possible I encouraged the authors to adopt a consistent structure and style of presentation. I also added numerous cross-references and, with the help of the authors, compiled a glossary. The glossary will, I hope, be useful for students and others getting to know the field.
The diverse readership for whom the Handbook is intended includes university researchers, teachers, and students; researchers in industry; company directors; software engineers; computer scientists; linguists and language specialists; and translators. In sum the book is for all those who are drawn to this endlessly fascinating and rewarding field.
I thank all the contributors to the Handbook for the high quality of their input and their cooperation. I am particularly indebted to Eduard Hovy, Lauri Karttunen, John Hutchins, and Yorick Wilks for their helpful comments and to Patrick Hanks for his dedicated assistance in compiling the glossary. I thank John Davey, OUP's linguistics editor, for his help and encouragement. I acknowledge gratefully the support received from the University of Wolverhampton. Finally, I would like to express gratitude to my late mother Penka Georgieva Moldovanska for her kind words and moral support at the beginning of this challenging editorial project.
Ruslan Mitkov
March 2002