Big and small data in ancient languages

by Nicholas Zair (University of Cambridge)

Back in November I gave a talk at the Society’s round table on ‘Sources of evidence for linguistic analysis’ on ‘Big and small data in ancient languages’. Here I’m going to focus on one of the case studies I considered under the heading of ‘small data’, which is based on an article that I and Katherine McDonald and I have written (more details below) about a particular document from ancient Italy known as the Tabula Bantina.


It comes from Bantia, modern day Banzi in Basilicata and is written in Oscan, a language which was spoken in Southern Italy in the second half of the first millennium BC, including in Pompeii prior to a switch to speaking Latin towards the end of that period. Since Oscan did not survive as a spoken language, we know it almost entirely from inscriptions written on non-perishable materials such as stone, metal and clay. There aren’t very many of these inscriptions: perhaps a few hundred, depending on definitions (for instance, do you include control marks consisting of a single letter?). We are lucky that Oscan is an Indo-European language, and, along with a number of other languages from ancient Italy, quite closely related to Latin, so we can make good headway with it. Nonetheless, our knowledge of Oscan and its speakers is fairly limited: it is certainly a language that comes under the heading of ‘small data’.



One of the ways scholars have addressed the problem of so-called corpus languages like Oscan, and even better-attested but still limited ones like Latin has been to combine as many relevant sources of information, from ancient historians to the insights of modern sociolinguistic theory as a way of squeezing as much information from what we have – and trying to fill in the blanks where information is lacking. This has been a huge success, but this approach can also be dangerous, especially when it comes to studying language death. Given that we know a language will die out in the end, it is very tempting to see every piece of evidence as a staging post in the process, and try to fit it into our narrative of language death. Often this provides very plausible histories, but we must remember that, while in hindsight history can look teleological, things are rarely so clear at the time.

The Tabula Bantina is a bronze tablet with a Latin law on one side and an Oscan law on the other side. It is generally agreed that the Latin text was written before the Oscan one, but the Oscan is not a translation of the Latin: the writer of the Oscan text simply used the conveniently blank side of the tablet to write the new material on. The striking things about the Oscan text are that it is written in the Latin alphabet, and there are lots of mistakes. It also strongly resembles Latin legal language. The date of this side is probably between about 100-90 BC, just before Rome’s ‘allies’, which is to say conquered peoples and cities in Italy, rose up against it in a rebellion generally known as the Social War. Continue reading “Big and small data in ancient languages”