Haitao Liu and Junying Liang (Eds.)
Motifs in Language and Text
Quantitative Linguistics
Editor
Reinhard Köhler
Advisory Editor
Hermann Moisl
Volume 71
ISBN 978-3-11-047496-1
e-ISBN (PDF) 978-3-11-047663-7
e-ISBN (EPUB) 978-3-11-047506-7
ISSN 0179-3616
Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.
© 2017 Walter de Gruyter GmbH, Berlin/Boston
www.degruyter.com
Editors' Foreword
Linearity is one of the main features of human languages. However, for various reasons, linguistic research from a quantitative perspective is largely paradigmatic rather than syntagmatic. The current volume focuses on linguistic motifs and emphasizes the linear organisation of linguistic units. As suggested by Köhler, a motif is defined as the longest continuous sequence of equal or increasing values representing a quantitative property of a linguistic unit.
This volume documents some recent results in this area, and it is the first book to systematically collect and present original research on linguistic motifs. It contains thirteen papers by eighteen authors altogether. The contributions cover a broad spectrum of topics, from theoretical discussions to practical applications.
The first group consists of theoretically oriented papers. André Pascal Beyer examines the persistency of higher order motifs by comparing Italian presidential speeches, the Russian Uppsala corpus and a set of DNA sequences. George K. Mikros and Ján Mačutek examine Modern Greek blogs and point out that word length distribution and text length are the two important factors influencing the properties of word length motifs. Radek Čech, Veronika Vincze and Gabriel Altmann suggest that verb valency motifs are regular language entities. Hongxin Zhang and Haitao Liu take a further step, validating valency motifs as basic language entities and also as a result of diversification processes.
The second group includes nine papers focused on practical applications. Cong Zhang investigates the words and F-motifs in six modern Chinese versions of the Gospel of Mark from 1855 to 2010, and Heng Chen and Junying Liang compare the word length motif in modern spoken and written Chinese; both studies suggest motifs as an index of language evolution. Yingqi Jing and Haitao Liu investigate the linear arrangement of dependency distance in Indo-European languages, Ruina Chen focuses on part-of-speech motifs, Yaqin Wang uses L-motifs and F-motifs, and Yu Fang compares the L-motif TTR in two translated works; these studies present motifs as an index for text classification and language typology. Jiang Yang examines the quantitative properties of polysemy motifs in Chinese and English. Wei Huang mainly investigates the rank-frequency distribution and the length distribution of word length motifs in Chinese texts. Jingqi Yan presents an explorative study of part-of-speech motifs and dependency motifs using treebanks of deaf students' writing at three learning stages, pointing out the function of motifs in language description and acquisition.
We hope that this volume will give insight into linguistic motifs across (1) different languages, (2) text types and (3) dimensions of language, and also, tentatively, into the cognitive mechanisms underlying linguistic motifs. Moreover, we hope this volume will become a reference work for related future research as well as for undergraduate and postgraduate courses in the areas of Linguistics, Natural Language Processing and Text Mining.
We would like to thank all authors for their contributions and their pleasant collaboration during the editing phases, the referees for their invaluable efforts, and Jieqiang Zhu and Wei Huang for their assistance with the editing work. Most importantly, we would like to express our thanks to Reinhard Köhler for his suggestion that we edit this volume and for his continuous help and encouragement during the editing process. We would also like to thank two other editors, Gabriel Altmann and Peter Grzybek, for their support and timely help. Finally, we would like to acknowledge the National Social Sciences Funding of China "Quantitative Linguistic Research of Modern Chinese" (No. 11&ZD188), the Fundamental Research Funds for the Central Universities (Program of "Big Data PLUS Language Universals and Cognition", Zhejiang University), and the MOE Project of the Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, which supported us during the preparation of this volume.
Haitao Liu, Junying Liang
Department of Linguistics, Zhejiang University
Hangzhou, China
André Pascal Beyer
Persistency of Higher Order Motifs
André Pascal Beyer: Department for Computational Linguistics and Digital Humanities, Trier University, Germany
Abstract: Motifs of motifs have been calculated repeatedly in earlier and recent publications. This calculation seems intuitively meaningful. However, although interesting results have been obtained from motifs of motifs, how persistent and significant they are has not yet been investigated. The following investigation is a further attempt to elucidate the meaning of motif derivation. Higher-order L-motifs were calculated from two linguistic corpora and one DNA corpus. The entropy and the Hurst-exponent were obtained for each level of L-motifs. The entropy dropped with each level, as predicted. The values of the Hurst-exponent shrink over the first few levels and then start rising again. This behavior was not expected and is still to be explained.
Keywords: Higher-order motifs, entropy, Hurst-exponent
1 Introduction
Syntactic motifs are gaining attention and becoming an increasingly interesting unit of study in quantitative linguistics. Recent volumes of this series feature studies of motifs (e.g. Köhler, 2015; Mačutek, 2015). Because the motif is a relatively new unit in linguistics, motif research is accompanied by many unexamined assumptions. The meaningfulness of calculating motifs of motifs is one of them. For instance, L-motifs of L-motifs (so-called LL-motifs) have already been investigated and seem to deliver interesting results (e.g. Milička, 2015; Köhler & Naumann, 2010).
Despite these results, the mechanisms of the unit are still unknown. No valid linguistic theory has yet been proposed that would confirm the meaningfulness of calculating motifs of motifs. This article approaches the question of whether calculating motifs of motifs is a reasonable operation. This will be done by comparing the entropy and the Hurst-exponent at consecutive steps of motif processing.
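The kind of comparison described here can be illustrated with a minimal Python sketch. It rests on two assumptions that the text so far does not fix: that each higher motif level is built from the lengths of the motifs on the level below (counted in elements), and that entropy means the Shannon entropy of the motif type frequencies. The function names and the input values are hypothetical, and the Hurst-exponent is left out.

```python
from collections import Counter
from math import log2


def l_motifs(values):
    """Split a sequence into maximal runs of equal or increasing values."""
    motifs = []
    for v in values:
        if motifs and v >= motifs[-1][-1]:
            motifs[-1].append(v)   # value continues the current motif
        else:
            motifs.append([v])     # value is smaller: start a new motif
    return [tuple(m) for m in motifs]


def higher_order(values, order):
    """Motifs of a given order (1 = L-motifs, 2 = LL-motifs, ...)."""
    motifs = l_motifs(values)
    for _ in range(order - 1):
        # Assumption: the next level is computed from motif lengths,
        # measured as the number of elements in each motif.
        motifs = l_motifs([len(m) for m in motifs])
    return motifs


def entropy(motifs):
    """Shannon entropy (in bits) of the motif type frequencies."""
    counts = Counter(motifs)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


# Hypothetical word lengths, not taken from any corpus.
lengths = [2, 5, 5, 1, 3, 4, 2, 2, 6, 1]
for order in (1, 2, 3):
    level = higher_order(lengths, order)
    print(order, level, entropy(level))
```

For this toy input the first level yields four distinct motifs and the second level only two, so the entropy falls from 2 bits to 1 bit. With so few values this says nothing about real corpora; it only shows how the levels are derived from one another.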
2 Higher Order Motifs
The scope of the motif as a unit is limited: each motif captures only a small proportion of syntactic information. The following sentence:
Honestly, they could not have answered those questions.
can be transformed into the following L-motifs (ascending, with length measured in number of characters):