Research News

New search engine advances analysis of ancient texts


Published January 24, 2013

Neil Coffee.

Allusions, quotes, riffs, mash-ups: For thousands of years, artists of all stripes have reveled in using implicit references to older material as a way to enrich and expand their own work.

Since antiquity, scholars also have reveled in finding them. But tracking literary allusions from text to text across languages and centuries is a daunting task. It can involve searching hundreds, even thousands of pages of text using nothing but eye, pen and intuition.

Change is afoot, however.

The revolutionary “Tesserae Project,” a digital search engine under development by literary scholars and linguists at UB since 2008, can now quickly find almost every shared allusion, even subtle ones, across all major works of Latin literature, the Greek epics (“The Iliad,” “The Odyssey” and “The Argonautica”) and several major works in English. More English and ancient Greek texts will follow.

“Allusions are important because they reflect an author’s borrowing and transformation of a prior text,” explains Neil Coffee, professor and chair of the Department of Classics and the project’s co-principal investigator. “They permit readers to reference one text while reading another, which expands and deepens their understanding and appreciation of a literary work and the author’s creative process.

“For instance,” he says, “allusions allow researchers and students to compare the love poems of Catullus (first century BCE) with the witty epigrams of Martial (first century CE) in their original language, and not just find shared references, but to analyze them for literary and linguistic meaning.”

Coffee and co-principal investigator Jean-Pierre Koenig, professor and chair of the Department of Linguistics, have taken a two-part approach to the problem.

They created a free, open Tesserae website that permits users to search for allusions, or “intertexts,” that shape one text’s meanings through the use of other texts: references to “Paradise Lost” in Wordsworth’s “Prelude,” for instance.

In addition, team members continually test and refine their definition of allusion, or intertextuality, through feedback from users, including international collaborators, and by employing the website to conduct their own literary research.

“The biggest challenge in this effort has been to define allusion,” Coffee says. “Scholars long have debated definitions of the term and haven’t agreed on one.

“Computers, of course, do not replicate the human mind. They require precise definitions of similarity,” he says, “so we began with a basic definition of allusion: pairs of words in different texts that are rare and close together.

“The Latin name ‘tesserae,’” he points out, “refers to the many small tiles that, like two-word phrases, make up a mosaic. It reflects the project’s aim of finding all allusions to get the bigger picture.”

By using results from such searches, Coffee says, Tesserae can capture about 80 percent of the allusions found through traditional methods, but much more quickly and comprehensively.

“One example of an outcome,” Coffee says, “is the demonstration that the practice of allusion in Latin epic poetry, for instance, varies substantially by author: Some later poets allude heavily to Virgil to show their mastery of the tradition, while others are more independent.”

Coffee notes that the Tesserae project already has begun to open up new ways of experiencing relationships between texts and authors, leading to new perspectives and interpretations. “Our expectation is that it also will contribute to an emerging idea of humanities computing that emphasizes not just the processing of texts, but new, intuitive and provocative encounters with literature,” he says.

Initial funding for the Tesserae Project was provided by the UB Digital Humanities Initiative and the National Endowment for the Humanities’ Office of Digital Humanities.

The team is looking for further support to enhance digital detection of different kinds of similarity—words with similar meanings, for example, not just similar forms—and to explore further what the literary device of allusion is and how it works.

“As mentioned earlier, another major goal,” says Coffee, “is to find allusions across languages.” Important to all of these efforts, he says, is the growing community of Tesserae collaborators in the U.S. and around the world.

“Among them,” Coffee says, “is our third principal investigator, Walter J. Scheirer of the University of Colorado at Colorado Springs. His expertise in statistical pattern-matching methods helps identify which language features make up allusion.”

The Tesserae team also has established a collaboration with Damien Nelis, professor at the University of Geneva in Switzerland. Tesserae will supply Nelis and his team with its most up-to-date methods of delivery, and the Geneva group will, in turn, give Tesserae feedback on how to improve its system.