Novel global study using investigators as participants finds shared acoustic relationships among the world’s languages and music

Three instruments from different musical traditions, clockwise from top left: a Japanese koto, Scottish bagpipes and an African balafon.

Release Date: May 15, 2024

“There are many ways to look at the acoustic features of singing versus speaking, but we found the same three significant features across all the cultures we examined that distinguish song from speech.”
Peter Pfordresher, PhD, professor of psychology
University at Buffalo

BUFFALO, N.Y. – A University at Buffalo psychologist is part of a global research team that has identified specific acoustic relationships that distinguish speech, song and instrumental music across cultures.

The study, published in the journal Science Advances, involved experts in ethnomusicology, music psychology, linguistics and evolutionary biology, and compared instrumental melodies along with songs, lyrics and speech in 55 languages. The findings provide an international perspective supporting ideas about how the world’s music and languages evolved into their current states.

“There are many ways to look at the acoustic features of singing versus speaking, but we found the same three significant features across all the cultures we examined that distinguish song from speech,” said Peter Pfordresher, PhD, a professor of psychology in the UB College of Arts and Sciences, and one of the 75 contributors to a unique project that involved the researchers assuming the dual roles of investigator and participant.

The three features are:

  • Singing tends to be slower than speaking across all the cultures studied.
  • People tend to produce more stable pitches when singing as opposed to speaking.
  • Overall, singing pitch is higher than spoken pitch.
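
To make those measurements concrete, here is a minimal sketch, in Python with the librosa audio library, of how the two pitch-based features might be estimated from one person’s recordings. The file names song.wav and speech.wav are hypothetical, the C2–C6 search range and the A4 = 440 Hz reference are arbitrary choices for the sketch, and the fully automatic pitch tracking shown here is only an illustration; the study itself relied on recordings annotated and checked by the investigator-participants.

    import librosa
    import numpy as np

    def pitch_features(path):
        """Estimate median pitch height and pitch instability for one recording."""
        y, sr = librosa.load(path)  # audio samples and their sample rate
        # Track the fundamental frequency (f0) with the pYIN algorithm
        f0, voiced, _ = librosa.pyin(
            y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr)
        f0 = f0[voiced]  # keep voiced frames only (unvoiced frames are NaN)
        semitones = 12 * np.log2(f0 / 440.0)  # Hz -> semitones relative to A4
        height = np.median(semitones)  # overall pitch height
        # Instability: mean frame-to-frame pitch movement (lower = steadier pitch)
        instability = np.mean(np.abs(np.diff(semitones)))
        return height, instability

    # Hypothetical recordings of one person singing, then speaking
    for label, path in [("singing", "song.wav"), ("speaking", "speech.wav")]:
        height, wobble = pitch_features(path)
        print(f"{label}: pitch {height:+.1f} semitones re A4, "
              f"instability {wobble:.2f} semitones/frame")

On the study’s findings, the singing recording should come out with the higher pitch and the lower instability value.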

The exact evolutionary pressures responsible for shaping human behaviors are difficult to identify, but the new paper provides insights regarding the shared, cross-cultural similarities and differences in language and music, both of which are found in highly diverse forms across every human culture.

Pfordresher says the leading theory, advanced by the paper’s senior author, Patrick Savage, PhD, senior research fellow at the University of Auckland, New Zealand, is that music evolved to promote social bonding.

“When people make music, and this is the case around the world, they tend to do so collectively. They synchronize and harmonize with each other,” says Pfordresher. “The features we found that distinguish music from speech fit well with that theory.”

Think of tempo as a mechanism that encourages music’s social aspects. Being in sync becomes more difficult as tempo increases; when the tempo slows, the rhythm becomes more predictable and easier to follow, and music becomes a more social enterprise.

It’s the same with pitch stability, according to Pfordresher.

“It’s much easier to match a stable pitch with someone else, to be in sync with the collective, than is the case when a pitch is wavering,” he says.

Similarly, it’s possible that the higher pitches found in singing happen as a byproduct of songs being produced at a slower rate.

“Slower production rates require a greater volume of air in the lungs,” explains Pfordresher. “Greater air pressure in the vocal system increases pitch.”

Conversational speech, in contrast, is not synchronized. Conversations generally alternate between people.

“I would speculate that conversational speech is faster than song because people want to hold on to the stage. They don’t want to provide false cues that they’ve finished, in essence handing the conversation off to another speaker,” says Pfordresher. “Pausing in a conversation or speaking slowly often indicates that it’s another person’s turn to speak.”

The study’s novel structure, with its investigators as participants, is part of the increasingly global nature of music cognition research. Savage and Yuto Ozaki, PhD, the lead author from Keio University in Japan, recruited researchers from Asia, Africa, the Americas, Europe and the Pacific, who spoke languages that included Yoruba, Mandarin, Hindi, Hebrew, Arabic, Ukrainian, Russian, Balinese, Cherokee, Kannada, Spanish, Aynu, English and dozens more.

“First, we used this structure to counteract the unfortunate tradition of extractive research in cross-cultural musical studies in which researchers from the developed world collect, or extract, data from a culture in the developing world, and use the data to promote their own success,” says Pfordresher.

The second reason has more to do with the validity of the data.

“Our analyses require annotation of syllable and note onsets in songs and speech from around the world,” says Pfordresher. “No single investigator knows all of these languages. By having each investigator participate and thus check their own annotations, we add additional validity to our study.”

Each investigator-participant chose a song of national significance from their culture. Pfordresher selected “America the Beautiful.” Savage chose “Scarborough Fair.” Ozaki sang the Japanese folk song “Ōmori Jinku.”

Participants first sang the song, then performed an instrumental version on an instrument of their choice, and then recited the lyrics. They also explained their choice of song, which served as the study’s free-form speech condition. All four conditions were recorded and then segmented.
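
Segmenting here means marking the onset time of each syllable (for the sung and spoken conditions) or each note (for the instrumental condition). Once those onset times exist, the tempo comparison in the first finding reduces to simple arithmetic; the sketch below uses made-up onset times purely for illustration, not data from the study.

    # Illustrative, made-up onset times in seconds for one participant
    sung_onsets   = [0.00, 0.55, 1.10, 1.70, 2.30, 2.95]  # syllable onsets while singing
    spoken_onsets = [0.00, 0.22, 0.41, 0.63, 0.82, 1.05]  # syllable onsets while speaking

    def rate_per_second(onsets):
        """Events per second: number of inter-onset intervals over total elapsed time."""
        return (len(onsets) - 1) / (onsets[-1] - onsets[0])

    print(f"singing:  {rate_per_second(sung_onsets):.2f} syllables/s")   # ~1.69
    print(f"speaking: {rate_per_second(spoken_onsets):.2f} syllables/s") # ~4.76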

To avoid the possibility of bias creeping into the data, Pfordresher explained, not all of the investigators were involved in generating the study’s initial set of hypotheses. All of the authors later examined the data, but only to confirm that the data from the initial hypothesis-generating group did not differ from the data contributed by the other investigators.

“We do hope to follow up this study with other research that has authors from around the world sample data from within their cultures,” says Pfordresher.

Media Contact Information

Bert Gambini
News Content Manager
Humanities, Economics, Social Sciences, Social Work, Libraries
Tel: 716-645-5334
gambini@buffalo.edu