Evangelia Adamou
A Corpus-Driven Approach to Language Contact
Language Contact and Bilingualism
Editor
Yaron Matras
Volume 12
ISBN 978-1-61451-761-0
e-ISBN (PDF) 978-1-61451-657-6
e-ISBN (EPUB) 978-1-5015-0065-7
ISSN 2190-698X
Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.
© 2016 Walter de Gruyter Inc., Boston/Berlin
www.degruyter.com
Preface
The present book offers an analysis of first-hand data from three unrelated languages: Balkan Slavic, Thrace Romani, and Ixcatec. At the basis of this work lies the collaboration with all of the speakers of these languages who generously shared their stories and accepted that their conversations be recorded. Many thanks to the Ixcatec speakers Cipriano Ramirez Guzmán, Rufina Robles, Juliana Salazar Bautista, and Pedro Salazar Gutierrez; and to Santa María Ixcatlán's General Assembly for granting us permission to work on Ixcatec. Many thanks to all the Romani and Slavic speakers from Greece who have chosen to remain anonymous due to the complexity of the political context in the country and a special thank-you to Sabiha Suleiman and to Chrysoula Adamou for their precious assistance during fieldwork in Drosero and Liti respectively.
The collection and analysis of these data were carried out as part of my research activities at the French National Centre for Scientific Research (CNRS). My research has also benefitted greatly from several externally-funded research programmes. The Balkan Slavic corpus of Nashta was created within a French-German research programme which I jointly led with Walter Breu, "Electronic database of endangered Slavic varieties in non-Slavic speaking European countries" (2010–2012), with funding from the French National Research Agency and the Deutsche Forschungsgemeinschaft (ANR-09-FASHS-025 and DFG BR 1228-4-1). Research on Thrace Romani received support from the programme "Towards a multi-level, typological and computer-assisted analysis of contact-induced language change" (2010–2014), funded by the French National Research Agency (ANR-09-JCJC-0121-01, P.I. Isabelle Léglise). Research on Ixcatec was conducted within the "Ixcatec documentation programme" (2010–2013), funded by the Endangered Languages Documentation Programmes of the Hans Rausing Foundation (MDP 0214, P.I. Denis Costaouec). The analysis of the data was then continued as part of the programme "Designing spoken corpora for crosslinguistic research" (2013–2016), funded by the French National Research Agency (ANR-12-BSH2-0011, P.I. Amina Mettouchi). I also wish to acknowledge support from the programme "Investments for the Future" funded by the French National Research Agency (ANR-10-LABX-0083).
Some of the studies reported on here have been conducted in collaboration with other scholars and have resulted in joint conference papers and publications which are cited throughout the book. Specifically, the comparison of the Balkan Slavic data with other Slavic minority languages was possible through collaboration with Walter Breu, Georges Drettas, and Lenka Scholze. The comparison of the Thrace Romani data with the Finnish Romani data is part of collaborative research with Kimmo Granqvist. Research on Romani phonetics and prosody is part of a collaboration with Amalia Arvaniti. Also, part of the Ixcatec data which was taken into consideration in this book was kindly shared with me by Denis Costaouec.
Over the years, I benefited from discussions with Walter Breu, Claudine Chamoreau, Denis Costaouec, Zygmunt Frajzyngier, Victor Friedman, Kimmo Granqvist, Isabelle Léglise, Yaron Matras, Felicity Meakins, Amina Mettouchi, Bettina Migge, Carol Myers-Scotton, Eva Schultze-Berndt, Stavros Skopeteas, Lameen Souag, and Anton Tenser.
Special thanks are due to Maïa Ponsonnet, Claudia Wegener, and Stergios Chatzikyriakidis for their careful reading of some of the chapters of this book and to Margaret Dunham for editing my English. For technical support I would like to thank Mourad Aouini, Christian Chanard, Séverine Guillaume, and Pascal Vaillant. For the statistical analyses thanks are due to Rachel Chen and François Sermier and for the maps to Jérôme Picard. Many thanks to Elif Diviioglu for advice on the analysis of the Turkish data, to Olivier Le Guen for insights on the analysis of the Ixcatec gesture and frames of reference, to Martine Toda and Yordanka Kozareva for their assistance with the Balkan Slavic corpus, to Claire Wolfarth for help with the Ixcatec corpus, and to Frida Cruz for assisting me with the non-verbal tasks in Ixcatlán.
I am particularly grateful to the editor of this series, Yaron Matras, for his precious advice and encouragement during the editing process.
Last, I wish to dedicate this book to my daughter Niki, who not only accompanied me during my fieldwork trips over the last ten years, but also actively participated in community life and assisted me with my research when possible. The reasons why she does not wish to study linguistics are of course not related in any way to these trips!
List of figures
Screen capture of a search on a file produced with Jaxe for Thrace Romani
Screen capture of an Elan-CorpA file for Ixcatec
The Slavic minority languages of the EuroSlav corpora
Screen capture of a file produced with ITE for Balkan Slavic Nashta
Interactions in the two Ixcatec-Spanish corpora
Two Ixcatec-Spanish corpora (8,807 words in total): Distribution of word-tokens with respect to language
Interactions in the Balkan Slavic Nashta corpus
Two Balkan Slavic-Greek corpora (9,235 words in total): Distribution of word-tokens with respect to language
Interactions in the Colloquial Upper Sorbian-German corpus
Interactions in the Burgenland Croatian corpus
Two Slavic-German corpora (8,012 words in total): Distribution of word-tokens with respect to language
Interactions in the Thrace Romani-Turkish-Greek corpus
The Thrace Romani-Turkish-Greek corpus (5,816 words in total): Distribution of word-tokens per language
Interactions in the Finnish Romani-Finnish corpus
The Finnish Romani-Finnish corpus (13,031 words in total): Distribution of word-tokens per language (adapted from Adamou and Granqvist 2014)
Interactions in the Molise Slavic-Italian corpus
The Molise Slavic-Italian corpus (17,279 words in total): Distribution of word-tokens per language
Distribution of word-tokens with respect to language for seven corpora
The Thrace Romani-Turkish-Greek corpus: Length of Turkish and Greek word-tokens
The Finnish Romani-Finnish corpus: Distribution of word-tokens per language in the Finnish-dominant and Romani-dominant clauses (adapted from Adamou and Granqvist 2014)
The Finnish Romani-Finnish corpus: Length of Finnish word-tokens in Romani-dominant clauses and of Romani tokens in Finnish-dominant clauses (adapted from Adamou and Granqvist 2014)
The Ixcatec-Spanish contemporary corpus: Distribution of nouns and verbs per language
The Balkan Slavic Nashta-Greek corpus: Distribution of nouns per language