The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.
Abstract
In this introduction, we first address the question of how to find the appropriate level for the comparison between language and music. Second, we argue that the appropriate level for this comparison is that of prosody, subsuming suprasegmental properties such as rhythm, meter and melody. We then provide a bibliometric analysis of recent contributions on the topics most central to this comparison. Finally, we introduce the contributions to the present volume.
Keywords: language, music, prosody, bibliometric analysis
Music is the universal language of mankind, poetry their universal pastime and delight.
Henry Wadsworth Longfellow
The famous quote by 19th-century novelist Henry Wadsworth Longfellow and its many similar reformulations underscore the insight that music and language are essential human cognitive abilities. In his influential book, Aniruddh Patel goes so far as to treat them as defining features of humanity: "Language and music define us as human."
It is obvious, then, that music and language are related to each other in several ways, displaying commonalities while also maintaining specific differentiations. As briefly mentioned above, musicologists and linguists alike attempt to ).
Prosody in language and music
In general, prosody, understood as a system of suprasegmental linguistic information such as stress, prominence, meter, rhythm, tone and intonation, is a prime candidate for looking at the relation between language and music in a principled way. This claim is based on several aspects, as elucidated in the following paragraphs.
First, prosody is concerned with length, pitch and perceived intensity. These are the perceptual correlates of duration, frequency and intensity, the physical properties of the acoustic bases of language and music, and they are therefore directly comparable across the two domains. Syllables in language, for instance, have a temporal extent, and their vocalic nucleus is characterized by fundamental frequency contours as well as specific intensities. Tones in music also have a temporal extent and bear specific tone heights. They may also differ in their intensities. Syllables and tones are emitted as sound pressure waves in the air and perceived through the same peripheral auditory system. In human singing, syllables and tones merge into single objects produced by the same tripartite human language production system: the respiratory source, the phonatory larynx and the filtering vocal tract.
Second, prosodic accounts, most prominently, ). Importantly, the model relates prosodic phrases from language to phrases from music. The hierarchical phrase structure contains heads and complements and accounts for the multi-tiered aspects of prominences on more local (individual tones or syllables) or more global levels (groups of individual tones or syllables, cadences or feet).
). Basically, these approaches assume that the processing of basic (prosodic) units in music and language is based on a successful parsing (chunking) of the continuous acoustic signal by means of the brain's oscillatory mechanisms. More precisely, it is suggested that cortical rhythms, such as the oscillation in the theta range (between 4 and 8 Hz), help to track and thereby process syllabic or tonal information by quantizing the acoustic information into packages that correspond to the respective units, i.e., syllables or tones. The exact role of cortical rhythms in segmenting and recognizing the respective acoustic input is still under discussion.
The fourth reason why prosody is particularly well-suited for approaching the relation of language and music concerns the shared neural substrates of linguistic and musical prosody. Many studies have shown that music and language share cortical and subcortical resources.
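The arithmetic behind the theta-range claim can be made concrete with a minimal sketch. It only converts the 4 to 8 Hz band named above into cycle durations and checks whether the intervals between successive unit onsets fit within one cycle. All function names and the toy onset times are hypothetical illustrations, not part of any cited model.

```python
# Illustrative sketch only: names and toy onset times are hypothetical.
THETA_BAND_HZ = (4.0, 8.0)  # theta range as given in the text


def period_ms(frequency_hz):
    """Duration of one oscillatory cycle in milliseconds."""
    return 1000.0 / frequency_hz


def theta_window_ms(band=THETA_BAND_HZ):
    """Shortest and longest cycle durations for the band, in ms."""
    low_hz, high_hz = band
    return period_ms(high_hz), period_ms(low_hz)


def units_within_theta(onsets_s, band=THETA_BAND_HZ):
    """For each interval between successive onsets (syllables or tones),
    report whether its duration fits within one theta cycle."""
    shortest, longest = theta_window_ms(band)
    intervals_ms = [(b - a) * 1000.0 for a, b in zip(onsets_s, onsets_s[1:])]
    return [shortest <= d <= longest for d in intervals_ms]


print(theta_window_ms())  # (125.0, 250.0)
# Toy onsets in seconds: two intervals of about 200 ms, one of about 500 ms.
print(units_within_theta([0.0, 0.2, 0.4, 0.9]))  # [True, True, False]
```

On this reading, a theta cycle lasts between 125 and 250 ms, so units of roughly that duration could each be captured by one cycle; the sketch says nothing about how cortical tracking is actually implemented.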
The fifth (but certainly not last) reason to focus on prosody when illustrating the relation between language and music relates to transfer effects between the two domains. This transfer can go in either direction and is best illustrated on the level of prosody. The language-to-music transfer can be exemplified by absolute pitch, the rare skill to assign arbitrary tone heights the correct (musical) label (use partially identical mechanisms such that the training of these mechanisms in one domain has direct effects for the respective other domain.
The discussion above emphasizes that it is a fruitful endeavor to use prosody for a principled comparison of language and music, an endeavor that we attempt to pursue in this book. Prosody, in very broad terms, refers to the sound structure of communicative systems and may be considered a meta-language that formalizes the way music speaks to language and vice versa. Prosody is firmly established within linguistic theory, particularly phonology, but is also applied in the musical domain. Therefore, prosody is not just a field of inquiry that shares elements or features between music and language (e.g., sound/tone durations and frequency/tone height), but may provide a common conceptual ground.
A final argument that prosody is a fruitful approach to studying the relations between language and music stems from a bibliometric analysis, which also introduces and frames the specific approaches to the prosodic link between the two domains taken by the authors in this volume.
A bibliometric analysis on prosody in music and language
Bibliometric analyses are an emerging statistical and visual technique for describing and quantifying the scientific landscape around a certain topic as well as its impact on science as a whole. The results showed documents by a total of 664 authors, 57 of whom produced single-authored documents and 607 of whom participated in multi-authored documents. A total of 11 618 references were cited in all retrieved documents matching the search terms.
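How author tallies of this kind can be derived from a set of records may be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for the analysis reported here: the function name and the toy records are invented, and the sketch assumes that every author without a single-authored document is counted as an author of multi-authored documents, so that the two groups partition the total (as 57 + 607 = 664 does).

```python
def author_counts(documents):
    """documents: a list of author-name lists, one list per document."""
    all_authors = set()
    single_authors = set()
    for authors in documents:
        all_authors.update(authors)
        if len(authors) == 1:
            single_authors.update(authors)
    # Assumed convention: authors with no single-authored document
    # count as authors of multi-authored documents.
    multi_authors = all_authors - single_authors
    return {
        "total": len(all_authors),
        "single": len(single_authors),
        "multi": len(multi_authors),
    }


# Invented toy records, for illustration only.
toy_documents = [
    ["Author A"],
    ["Author B", "Author C"],
    ["Author A", "Author D"],
    ["Author C", "Author D"],
]
counts = author_counts(toy_documents)
print(counts)  # {'total': 4, 'single': 1, 'multi': 3}
```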
A closer look at the development of the scientific landscape between 1993 and 2021 revealed that the number of citations peaked in 2017 ( for details on the clustering technique), based on citation co-occurrences. A prominent red cluster is dominated by key words on language and speech prosody, including terms such as lexical stress, meter and rhythm. A green cluster subsumes terms on language, emotion(s) and evolution. A small and distributed yellow cluster is based on song, melody and pitch, while a blue cluster shows a neuroscientific character, with emphasis on perception, auditory cortex and event-related potentials (ERPs). It is perhaps worth noting that perception and related concepts play a much more prominent role than production, the other fundamental perspective in cognition. Key words that correspond to key topics within this volume are marked by rectangles and illustrate how well the present book covers the scientific landscape defined by the three search terms language, music and prosody. It is also discernible that the green cluster is not covered in this volume. This plausibly reflects that our aim was not to provide an evolutionary prosodic account of language and music, but rather to reveal fruitful prosodic links between the two domains from a synchronic perspective.
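The general idea of grouping key words by co-occurrence can be sketched in a few lines. The example below is a deliberately crude stand-in and not the clustering technique used for the analysis above: it counts how often two key words appear in the same record and then groups key words that are linked by pairs co-occurring at least a threshold number of times. The function names, the threshold and the toy records are all hypothetical; only the key words themselves are taken from the clusters described above.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how often each unordered pair of key words shares a record."""
    counts = Counter()
    for keywords in keyword_lists:
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


def threshold_clusters(counts, min_count=2):
    """Crude stand-in for clustering: connected components of the graph
    whose edges are pairs co-occurring at least min_count times."""
    neighbours = {}
    for (a, b), n in counts.items():
        if n >= min_count:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Invented toy records, for illustration only.
toy_records = [
    ["lexical stress", "meter", "rhythm"],
    ["meter", "rhythm"],
    ["lexical stress", "rhythm"],
    ["melody", "pitch", "song"],
    ["melody", "pitch"],
    ["pitch", "rhythm"],
]
pair_counts = cooccurrence_counts(toy_records)
clusters = threshold_clusters(pair_counts, min_count=2)
print(clusters)
# [['lexical stress', 'meter', 'rhythm'], ['melody', 'pitch']]
```

In the toy data, "song" and the single link between "pitch" and "rhythm" fall below the threshold, which is why they do not join or merge the two groups; dedicated bibliometric tools use more refined normalization and clustering than this.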