History of Computing
Series Editors
Gerard Alberts
Institute for Mathematics, University of Amsterdam, Amsterdam, The Netherlands
Jeffrey R. Yost
Charles Babbage Institute, University of Minnesota, Minneapolis, MN, USA
Advisory Editors
Jack Copeland
University of Canterbury, Christchurch, New Zealand
Ulf Hashagen
Deutsches Museum, München, Germany
Valérie Schafer
ISCC, CNRS, Paris, France
John V. Tucker
Department of Computer Science, Swansea University, Swansea, UK
Founding Editor
Martin Campbell-Kelly
Department of Computer Science, University of Warwick, Coventry, UK
The History of Computing series publishes high-quality books which address the history of computing, with an emphasis on the externalist view of this history, more accessible to a wider audience. The series examines content and history from four main quadrants: the history of relevant technologies, the history of the core science, the history of relevant business and economic developments, and the history of computing as it pertains to social history and societal developments.
Titles can span a variety of product types, including but not exclusively, themed volumes, biographies, profile books (with brief biographies of a number of key people), expansions of workshop proceedings, general readers, scholarly expositions, titles used as ancillary textbooks, revivals and new editions of previous worthy titles.
These books will appeal, varyingly, to academics and students in computer science, history, mathematics, business and technology studies. Some titles will also directly appeal to professionals and practitioners of different backgrounds.
More information about this series at http://www.springer.com/series/8442
Jacqueline Léon
Automating Linguistics
1st ed. 2021
Jacqueline Léon
Laboratoire d'Histoire des Théories Linguistiques, UMR CNRS 7597, Université de Paris, Paris, France
ISSN 2190-6831 e-ISSN 2190-684X
History of Computing
ISBN 978-3-030-70641-8 e-ISBN 978-3-030-70642-5
https://doi.org/10.1007/978-3-030-70642-5
Translation from the French language edition: Histoire de l'automatisation des sciences du langage by Jacqueline Léon, ENS Éditions 2015. Published by ENS Éditions. All Rights Reserved.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Foreword
With the latest comeback of artificial intelligence in the form of deep learning, newspapers and industry have been promising us new computer breakthroughs in linguistics and information retrieval. Building upon the automatic processing of data massively harvested from the web, we are promised, in some not too distant future, near-perfect automatic translation, expert chatbots that understand and answer human queries, or even conversations with virtual assistants. However, as some reports have brought to light, much human effort, often under precarious circumstances, is tacitly injected into the machines' deep learning, and we do not yet know how the learning curve of this enhanced technology may evolve.
But this is not the first time we have been promised perfect translation or improved human-computer interaction; rather, as history teaches us, the industry's self-advertising through the projection of a futuristic utopia is a recurrent phenomenon of our computer age. Already in the 1950s, formalisations of language were proposed that would supposedly make automatic translation possible. They turned out to perform poorly. And already in the 1960s we saw the first use of computing facilities for corpus linguistics, prefiguring later big data or digital humanities. But, tempered by the limited resources of the time, this work proceeded without the (over)ambitious hopes pinned on today's big data. This goes to show that it is now more timely than ever to go back in time and reflect upon past developments in computational linguistics. Both the successes and the limits of earlier efforts can help to historically inform us and to critically assess our current situation.
The present book is a history of how the digital computer encountered the field of linguistics in the wake of the Second World War and slowly but lastingly changed linguistics itself, creating new (sub)fields such as Automatic Translation, Natural Language Processing and Computational Linguistics. Two important turns are described in this book. The first one, which may be called the automatic turn, is the automation of language, enabled by the formalisation and mathematisation of language that took place roughly between 1949 and 1966. The second turn, the corpus turn, is the emergence of natural language processing in the 1990s, continuing and enlarging earlier research in documentation systems and corpus linguistics with the help of microcomputers.
Though efforts to formalise language and automate linguistics antedate this fateful encounter, the advent of the digital computer accelerated and heavily influenced the automation of language. It enabled, both theoretically and practically (and also financially), the use of mathematical methods in language, and, later, the systematic and automatic exploitation of large corpora in linguistics. But this encounter was also a two-way process. Linguistics also contributed to the newly developing field of computing. It motivated the development of some early programming languages, documentation systems and query languages, and, most conspicuously, provided some of the important theoretical tools for computing and programming such as indexing and parsing algorithms or the Chomsky hierarchy.
The main trigger for the encounter between linguistics and computing was the Second World War. Linguistics took part in the war effort as much as the other sciences, a fact that is often overlooked (cf. Chap. ). Apart from the more obvious connection to cryptography, linguistics was also essential for developing effective language training courses for the army and for translating foreign texts. During the Cold War that followed, it was above all the perceived need for quick translation of Russian research and intelligence into English that prompted military investment in automatic translation. Warren Weaver's 1949 report on mechanical translation set off a decade of intensive work on automatic translation.