Big and small data in ancient languages

by Nicholas Zair (University of Cambridge)

Back in November I gave a talk at the Society’s round table on ‘Sources of evidence for linguistic analysis’, entitled ‘Big and small data in ancient languages’. Here I’m going to focus on one of the case studies I considered under the heading of ‘small data’, which is based on an article that Katherine McDonald and I have written (more details below) about a particular document from ancient Italy known as the Tabula Bantina.


It comes from Bantia, modern-day Banzi in Basilicata, and is written in Oscan, a language which was spoken in Southern Italy in the second half of the first millennium BC, including in Pompeii prior to a switch to speaking Latin towards the end of that period. Since Oscan did not survive as a spoken language, we know it almost entirely from inscriptions written on non-perishable materials such as stone, metal and clay. There aren’t very many of these inscriptions: perhaps a few hundred, depending on definitions (for instance, do you include control marks consisting of a single letter?). We are lucky that Oscan is an Indo-European language and, along with a number of other languages from ancient Italy, quite closely related to Latin, so we can make good headway with it. Nonetheless, our knowledge of Oscan and its speakers is fairly limited: it is certainly a language that comes under the heading of ‘small data’.



One of the ways scholars have addressed the problem of so-called corpus languages like Oscan, and even better-attested but still limited ones like Latin, has been to combine as many relevant sources of information as possible, from ancient historians to the insights of modern sociolinguistic theory, as a way of squeezing as much information as we can from what we have, and of trying to fill in the blanks where information is lacking. This has been a huge success, but the approach can also be dangerous, especially when it comes to studying language death. Given that we know a language will die out in the end, it is very tempting to see every piece of evidence as a staging post in the process, and to try to fit it into our narrative of language death. Often this provides very plausible histories, but we must remember that, while in hindsight history can look teleological, things are rarely so clear at the time.

The Tabula Bantina is a bronze tablet with a Latin law on one side and an Oscan law on the other. It is generally agreed that the Latin text was written before the Oscan one, but the Oscan is not a translation of the Latin: the writer of the Oscan text simply used the conveniently blank side of the tablet to write the new material on. The striking things about the Oscan text are that it is written in the Latin alphabet, and that there are lots of mistakes. It also strongly resembles Latin legal language. The date of this side is probably between about 100 and 90 BC, just before Rome’s ‘allies’, which is to say conquered peoples and cities in Italy, rose up against it in a rebellion generally known as the Social War.

As a consequence of this timing, scholars have tended to see the linguistic and orthographic features of the Tabula Bantina as evidence for the sociolinguistic status of Oscan: they suppose that Oscan suddenly became salient to the inhabitants of Bantia as a marker of anti-Roman identity, hence the desire to express their official inscriptions in Oscan. It is further assumed that in practice Oscan was already moribund in Bantia, leading to mistakes in writing it, and that the Oscan alphabet, in which most of our inscriptions are written, had been forgotten: hence the use of the Roman alphabet. An example of these assumptions is this statement:

“At one level, the re-adoption of Oscan as the official language of record at Bantia in the years before the Social War suggests that the Oscan language – but not necessarily the Oscan scripts – was a powerful symbol of local identity and anti-Roman feeling… However, it should also be noted that much of the format and content of the document is very close to that of Latin municipal or colonial charters of the late second and first centuries BC, and it seems that, despite the symbolic choice of language, Roman influence on epigraphic culture in general was considerable.” (Lomas 2008:125)

While this picture is very appealing, and fits well with our overall knowledge of Roman history and what happened to Oscan, there are quite a few assumptions in it that need to be addressed. For example, we have no evidence that the people of Bantia ever used the Oscan alphabet which is found further north: our only evidence for Oscan in Bantia is the Tabula Bantina and a single other inscription from about the same date, also written in the Latin alphabet. So the idea that the people of Bantia lost their knowledge of how to write Oscan in their own alphabet before having, perforce, to adopt the Latin alphabet has no direct support. Indeed, we would argue that it is quite likely that the writer of the Tabula Bantina was familiar with the Greek alphabet, which was used to write Oscan in Lucania and Bruttium (more or less modern Basilicata and Calabria), not far from Bantia: for example, the Tabula Bantina includes the letter <z>, which was hardly used in the Latin alphabet at this time but was often used in the Greek alphabet to write [z], an allophone of /s/ in Oscan (but not Latin). Similarly, the diphthong /oi/ is spelt <oi>, which is more reminiscent of Greek practice than of Latin, which would spell it <oe> at this time.

Since copying out a long text onto bronze or stone required a lot of concentration to prevent mistakes, it is not uncommon to find even quite large numbers of mistakes in ancient inscriptions: this does not necessarily imply any lack of linguistic ability in the drafter or inscriber of the text. Instead, there is evidence that the writer was a native speaker of Oscan, since the text shows the same types of allophonic variation that are attested in other Oscan inscriptions. Thus, word-final /d/ is written with both <t> and <d> (for example pocapit ‘whatever’ /pokkapid/), which is characteristic of Oscan in the south, and there is confusion between <om> and <um> to write final /om/ (for example both dolom and dolum ‘trick’), which is found throughout the Oscan-speaking area.

Lastly, it is true that the language of the Tabula Bantina and that of Roman law are quite similar: the Oscan text uses terms borrowed from Latin such as senateis ‘of the senate’ and censtur ‘censor’, and even almost identical formulas. In a Latin law of perhaps the third century BC we find the phrase sei quis aruorsu hac faxit seiue mac[i]steratus uolet moltare, [li]cetod. ‘If anyone acts contrary to this … if a magistrate wants to impose a fine, it is allowed.’ An equivalent phrase in the Tabula Bantina is practically a word-for-word translation: suae . pis . contrud . exeic . fefacust . ionc . suae pis ./ herest . meddis . moltaum . licitud . ‘If anyone acts contrary to this, if any magistrate shall wish to fine him, it is allowed’. But this sort of similarity is also true of Oscan inscriptions from elsewhere and from significantly earlier: Latin influence on Oscan legal terminology and phraseology clearly had a long history, and there is not necessarily any particular significance to it in the case of the Tabula Bantina. Moreover, legal language is a very specific linguistic domain: it would be foolhardy to assume, for example, that English was endangered simply on the basis of frequent use of Latin in our legal language (decree nisi, adjournment sine die etc.).

To sum up: we know that Oscan would die out in the course of the first century BC, to be replaced by Latin. We can helpfully use our knowledge of history, sociolinguistics and language endangerment to think about the way this may have happened, but we must be sure to examine the evidence carefully, however meagre it may be, before slotting it into a predetermined narrative. Ultimately, Oscan may have been endangered in Bantia in the years 100–90 BC, or it may have existed in a state of stable bilingualism with Latin until the upheavals of the Social War. Use of Oscan to write the Tabula Bantina may have been a marker of anti-Roman identity, or adoption of the Roman alphabet may have been a sign of positive accommodation to Roman soft power.

Katherine’s and my article is called Changing Script in a Threatened Language: Reactions to Romanization at Bantia in the First Century BC. It will be coming out shortly in Mari Jones & Damien Mooney (eds.), Creating Orthographies for Endangered Languages. Cambridge: Cambridge University Press.

We wrote the article as part of the Greek in Italy Project, funded by the Arts and Humanities Research Council and based at Cambridge University.


Lomas, Kathryn (2008) “Script obsolescence, writing and power in pre-Roman and early Roman Italy”, in J. Baines, J. Bennet and S. Houston (eds) The Disappearance of Writing Systems: Perspectives on Literacy and Communication. London: Equinox Publishing. 109-138.
