The Origin of /ɬ/ in Southern Pinghua

by Xiaolan Cao (University of Melbourne)

In this post, I will discuss the origin of the voiceless lateral fricative /ɬ/ in Southern Pinghua, one of the two branches of Pinghua and a minority Sinitic language. Southern Pinghua is mostly spoken in Southern Guangxi in China (Qin 2000) by approximately 1.8 million native speakers (Min 2013). However, some of the dialects have experienced huge trans-generational language loss and are hence potentially endangered (Cao 2019). Most Southern Pinghua speakers identify as ethnic Han, the majority ethnic group in China, while most of the rest identify as ethnic Zhuang.

In Southern Pinghua, the voiceless lateral fricative /ɬ/ is a consonant phoneme occurring in the onset position of a syllable. The phonemic status of /ɬ/ can be established by the minimal pair in Table 1 below.[1]

Word Gloss
/ɬa33/ ‘spread’
/sa33/ ‘sprinkle’

Table 1: a minimal pair of voiceless lateral fricative /ɬ/

Commonly, /ɬ/ is not considered an internal development of Sinitic languages, primarily because it rarely occurs in present-day Sinitic languages. Within China, it is distributed in the former Baiyue area, once occupied by the ancestors of Tai-Kadai speakers (Li 2000). Besides Southern Pinghua dialects, Cantonese dialects located in Southern Guangxi and Western Guangdong also have the phoneme /ɬ/. Outside Guangxi and Guangdong, /ɬ/ can be found only in three small regions in China: in some dialects of Min in non-contiguous geographical pockets in Fujian Province, in some dialects of Hui in Anhui Province, and in some dialects of various Sinitic languages spoken on the west coast of Hainan Province (de Sousa 2015: 166-168, quoting Liu X 2006, Liu F 2007, Akitani 2008, and Meng 1981). Due to its limited distribution in present-day Sinitic languages, /ɬ/ is not reconstructed for Middle Chinese or Old Chinese in the literature; see Zhengzhang (2003), Li (1971), Baxter and Sagart (2014), and Wang (1985).

On the other hand, /ɬ/ is common in present-day dialects of Zhuang, a Tai-Kadai language mainly spoken in Guangxi (Zheng 1998). According to works by Mai (2009, 2011), Ouyang (1995), Yuan (1989), Zheng (1998), and Zhao (2015), the phoneme /ɬ/ in Sinitic languages may have developed under the influence of Zhuang loanwords through language contact. However, an opposing view has also been suggested in the Chinese language literature: because the phoneme /ɬ/ in Zhuang corresponds to *s in Proto-Tai, it is likely that Zhuang developed this phoneme under the influence of Sinitic languages, rather than the other way round (de Sousa 2015, quoting Li F 1977 and Pittayaporn 2009).

The two views on the origin of /ɬ/ in Sinitic languages have some limitations. First, the argument that /ɬ/ is not an internal development of Sinitic languages simply because of its limited distribution and absence from reconstructions for Middle Chinese or Old Chinese does not preclude the possibility that /ɬ/ developed in Southern Pinghua after the Middle Chinese period.

Further, the evidence does not indicate whether /ɬ/ is an internal development in Southern Pinghua or a phoneme developed under the influence of loanwords from Zhuang through language contact. As for its distribution in Southern Pinghua, the phoneme /ɬ/ can be found in both the Sinitic stratum and the Zhuang stratum. According to a survey by Cao (2018), in the Sinitic stratum, Chinese characters (Chinese cognates) whose Southern Pinghua pronunciations contain onset /ɬ/ were mostly recorded as having the Middle Chinese onset denoted as 心 (*s) in Qieyun, a rhyming dictionary published in 601 CE during the Sui dynasty (581–618). This correspondence exists not only in common words, such as /ɬam52/ (‘three’) and /ɬɜm52/ (‘heart’), but also in literary words, like /ɬɜw52/ (‘constellation’) and /ɬoŋ52/ (‘lofty’).

The correspondence between /ɬ/ in Southern Pinghua and onset 心 (*s) in Middle Chinese suggests that /ɬ/ is of Sinitic origin. However, the same survey records ninety-one admissible syllables that start with /ɬ/ in total, of which twenty-six cannot be associated with Chinese characters (Chinese cognates). Normally, for Southern Pinghua syllables, being identifiable with Chinese characters strongly indicates Sinitic origin. These twenty-six syllables are therefore possibly not of Sinitic origin but were introduced into the language through loanwords from other languages, such as Zhuang. Thus, the distribution of /ɬ/ in Southern Pinghua does not decisively support either /ɬ/ being an internal development or its being induced by language contact with Zhuang.

In addition to the distributional features of /ɬ/ in Southern Pinghua, the historical developments of /s/-phonemes in Southern Pinghua may also shed some light on the developments of /ɬ/. In Southern Pinghua, pronunciations of Chinese characters whose onset is /s/ correspond mostly to those recorded in Qieyun as having the onsets denoted as 审 (*ɕ), 禅 (*ʑ), and 邪 (*z). Based on the fact that these three Middle Chinese onsets did not develop into /ɬ/, we may speculate that the Middle Chinese onset 心 (*s) has some features that make it prone to sound change to /ɬ/ under certain influences, such as loanwords from Zhuang.

Finally, the geographical distribution of /ɬ/ is not as discontiguous as described in previous studies. The geographical distribution of /ɬ/ is contiguous in Southern Guangxi and Western Guangdong. These two adjacent regions in total occupy approximately 184,000 square kilometres [2] of densely populated area, which is larger than Cambodia (181,035 square kilometres) or Nepal (147,181 square kilometres). Therefore, it may not be accurate to describe the territory of /ɬ/ in Southern Guangxi and Western Guangdong as small or isolated, and /ɬ/ can be considered an areal feature for further studies in historical linguistics, areal linguistics, and linguistic typology.

Drawing on the analysis and evidence given in the discussion above, I would like to pose some questions for further investigation.

  1. Why is /ɬ/ so prevalent in Southern Pinghua and Cantonese dialects found in the area of Southern Guangxi and Western Guangdong, but not in the other areas?
  2. If language contact with Zhuang is a contributing factor to the development of /ɬ/, why does /ɬ/ occur in Southern Pinghua dialects but not most Northern Pinghua dialects, given both Pinghua branches have similar contact with Zhuang?
  3. Similarly, why do Cantonese dialects in Western Guangdong have /ɬ/ but not those in Eastern Guangdong, considering that Cantonese dialects have mostly had similar exposure to Zhuang historically?
  4. Can the peculiar distributions of /ɬ/ in Pinghua and Cantonese dialects be explained by a mere historical accident?

In sum, the two opposing views on the origin of /ɬ/ in Southern Pinghua are questionable because the evidence is inconclusive. At this stage, the origin of /ɬ/ in Southern Pinghua dialects remains unclear, and further investigations are still required.


Baxter, W.H., and Laurent Sagart. 2014. Old Chinese: A New Reconstruction. New York: Oxford University Press.

Cao, Xiaolan. 2019, ‘Documentation of Wucun Pinghua’: Endangered Language Documentation Program.

—. 2018, ‘A Survey of the Southern Pinghua Pronunciation of Chinese Characters with English Glosses and Corresponding Mandarin and Cantonese Pronunciations’: University of New England.

de Sousa, Hilário 2015, ‘Language Contact in Nanning: Nanning Pinghua and Nanning Cantonese’, in Hilary M. Chappell (ed.), Diversity in Sinitic Languages, Oxford Scholarship Online: March 2016: Oxford University Press.

Li, Fanggui. 1971, ‘上古音研究 [A Study of Old Chinese Phonology]’. Qinghua Xuebao 9,26-32.

Li, Lianjin. 2000, ‘平话的历史 [ the History of Pinghua]’. 民族语文 [ Minority languages of China] 6,24-30.

Mai, Geng. 2009, ‘从粤语的产生和发展看汉语方言形成的模式 [ a View of the Formation Pattern of Chinese Dialects from the Formation and Development of Yue]’. 方言[Fangyan] 3,219-232.

—. 2011, ‘粤语方言的音韵特征-兼谈方言分区的一些问题 [ Phonological Features of Yue and Some Issues in the Subgrouping of Chinese Dialects]’. 方言[Dialects],289-301.

Min, Guang. 2013, ‘桂南平话研究综述 [ a Literature Review of the Studies of Southern Pinghua]’. 语文学刊 [ Journal of language] 9,22-23.

Ouyang, Jueya. 1995, ‘两广粤方言与壮语的种种关系 [ the Relations between Zhuang and the Yue Dialects Spoken in Guangdong and Guangxi]’. 民族语文 [ Minority languages of China] 6,49-52.

Qin, Yuanxiong. 2000, ‘桂南平话研究 [Study in Southern Pinghua]’, unpublished: Jinan University.

Wang, Li. 1985. 汉语语音史 [the Phonological History of the Chinese Language]. Beijing: China Social Science Press.

Yuan, Jiahua. 1989. 汉语方言概要 [ Introduction to Chinese Dialects]. Beijing: 文字改革出版社 [ The press of language and character reform].

Zhao, Yuan. 2015, ‘广西粤语,平话中的边擦音/ɬ/的来源及其形成探究 [ Exploring the Origin of the Voiceless Fricative /ɬ/ in Yue and Pinghua Spoken in Guangxi]’. Journal of Guangxi Teacher’s Education University (Philosophy and Social Sciences Edition) 36,61-66.

Zheng, Zuoguang. 1998, ‘广西平话的边擦音声母ɬ及其形成 [ the Formation of Lateral Fricative /ɬ/ in Guangxi Pinghua]’, 方言与音韵研究论集, Nanning: Guangxi Jiaoyu Press, pp. 103-110.

Zhengzhang, Shangfang. 2003. 上古音系 [Phonology of Old Chinese]. Shanghai: Shanghai Jiaoyu Press.

[1] Southern Pinghua is a tone language, and the numbers in the word transcriptions indicate lexical tones.

[2] The regions of Southern Guangxi and Western Guangdong occupy approximately half of Guangxi (236,700 square kilometers) and one third of Guangdong (177,900 square kilometers). Therefore, it is estimated that these two regions altogether have around 184,000 square kilometers.

The Preterite and Perfect in Middle English

by Morgan Macleod (University of Ulster)

The Proto-Germanic tense system, consisting only of a present and a preterite, was augmented in Old English by the addition of a periphrastic perfect. This perfect had already been grammaticalized to the point where it could be used even with intransitive verbs, e.g. þin folc hæfð gesyngod ‘your people have sinned’ (Mitchell 1985: I, 289). However, it was still possible to use the preterite to express similar temporal content, e.g. Ic heold nu nigon gear[…] þines fæder gestreon ‘I (have) now held your father’s property nine years’ (ÆLS I.21.42). For many Old English authors the preterite was in fact the preferred mode of expression; previous research on a sample of Old English texts found that the new periphrastic perfect was used only in 26% (95/360) of the cases where it would have been possible semantically (see Macleod 2014). However, little previous quantitative work exists on the subsequent development of the perfect and preterite towards the modern system, in which the two categories are paradigmatically opposed and can seldom be interchanged without altering the meaning of an utterance.

A preliminary investigation of the preterite and perfect in Middle English was performed using the Helsinki Corpus (Rissanen et al. 1996). Such a corpus, small in size yet selected for balanced content, was ideal for a form of analysis involving manual review of entire textual passages. The methodology was based on that of Macleod (2014): texts from the earliest Middle English period, 1150–1250, were analysed to identify all situations for which a present perfect would be an appropriate representation, and the relevant verbs were identified either as preterites or as perfects. This research revealed an abrupt transition between Old English and Middle English; in Middle English, not including texts that represent late copies of Old English works, the periphrastic perfect was used in 94% (258/274) of cases. It is possible that the earlier stages of this transition took place within OE, where they were obscured by the relatively homogeneous nature of the textual record. In addition, some ME authors seem to show awareness of a new opposition between the preterite and the perfect, e.g. Orm 197 Þe þridde god uss hafeþþ don / Þe Laferrd Crist onn erþe, / Þurrh þatt he ȝaff hiss aȝhenn lif ‘The Lord Christ has done us the third good on earth in that He gave His own life’. Here the same situation is described with a preterite to position it within a historical narrative and with a perfect to highlight its continuing relevance, showing a clearer contrast than seems to have existed in Old English.
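The two proportions quoted above can be recomputed directly from the raw counts. The following is only a minimal sketch of that arithmetic: the counts are those reported in the text, while the dictionary labels and the function name `perfect_share` are illustrative assumptions, not part of the original study.

```python
# Counts reported in the text: (perfects, contexts where a perfect was semantically possible).
# The period labels are illustrative only.
counts = {
    "Old English (Macleod 2014)": (95, 360),
    "Early Middle English, 1150-1250": (258, 274),
}


def perfect_share(perfects: int, total: int) -> float:
    """Return the percentage of eligible contexts expressed with the periphrastic perfect."""
    return 100 * perfects / total


shares = {period: perfect_share(p, n) for period, (p, n) in counts.items()}

for period, share in shares.items():
    print(f"{period}: {share:.1f}% perfect")
```

Rounded to whole numbers, the two shares come out at the 26% and 94% cited above.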

Although the majority of Middle English examples seem to conform to the modern pattern, a small number of exceptions remain, a fact noted by previous authors such as Mustanoja (1960) and Fischer (1992). One factor involved in these exceptions may lie in the variation observed (e.g. Elsness 1997) among varieties of English in their tense preferences: constructions such as American English I already ate can be paralleled in Middle English examples such as Ich ne seh him neauer ‘I never saw Him’ (St Juliana 100.15), while examples such as mare wunder ilomp ‘greater wonders (have) happened’ (Ancrene Wisse 32.9) may show an even greater tolerance for the preterite than would be possible in present-day American English. This variation may best be interpreted as a difference not in the temporal meaning of the forms involved, but in the pragmatic presuppositions created by their use, in keeping with the approach of Portner (2003).

Some Middle English examples also involve the use of a past tense under a present-tense verb in a way that would be of marginal acceptability in Modern English. This can be seen in examples such as Brut I.384.7424, Ich þonkie mine Drihte[…] þet he swulche mildce; sent to moncunne ‘I thank my Lord that He sent such mercy to mankind’. Although much research on the sequence of tenses (e.g. Abusch 1997; Gennari 2003) has tended to focus on cases in which the matrix verb is in the past tense, it is known that sequence-of-tense phenomena are subject to cross-linguistic variation in their construction and interpretation. Examples such as the above may reflect an underlying difference between Middle English and Modern English in their sequence-of-tense rules.

This preliminary investigation has found a high degree of similarity between Middle English and Modern English in their use of the perfect even at a very early date, in sharp contrast to the patterns found in Old English texts. While the explanations proposed here may help to explain the small number of apparent counterexamples, more work is needed to substantiate these proposals. In particular, a larger data sample might provide further examples to clarify the factors influencing speakers’ choice between the perfect and the preterite, while a more general examination of the sequence of tenses found in Middle English would be essential to establish the details of the system obtaining at this period and the ways in which it might differ from Modern English. Further research in this area has the potential to illuminate many currently obscure details of the Middle English verbal system.


Abusch, Dorit, 1997. ‘Sequence of tense and temporal de re’, Linguistics and Philosophy 20, 1–40.

Elsness, Johan, 1997. The Perfect and Preterite in Contemporary and Earlier English, Berlin: de Gruyter.

Fischer, Olga, 1992. ‘Syntax’, in Norman Blake (ed.), The Cambridge History of the English Language, vol. 2, Cambridge: Cambridge University Press, 207–408.

Gennari, Silvia P., 2003. ‘Tense meanings and temporal interpretation’, Journal of Semantics 20, 35–71.

Macleod, Morgan, 2014. ‘Synchronic variation in the Old English perfect’, Transactions of the Philological Society 112, 319–343.

Mitchell, Bruce, 1985. Old English Syntax, 2 vols, Oxford: Clarendon.

Mustanoja, Tauno F., 1960. A Middle English Syntax, Helsinki: Société Néophilologique.

Portner, Paul, 2003. ‘The (temporal) semantics and (modal) pragmatics of the perfect’, Linguistics and Philosophy 26, 459–510.

Rissanen, Matti, et al. (eds.) 1996. The Helsinki Corpus of English Texts, Helsinki: University of Helsinki, electronic.

The Loss of the Latin Case System – A New Morphological Approach

by Zeprina-Jaz Ainsworth (University of Oxford)

Much work has already been done on the development of the Latin case system, which has been lost almost entirely from nouns and adjectives in Romance. Scholars such as Herman (2000) have outlined phonetic, analogical, functional, and syntactic changes which may have contributed to the opacification of certain morphological case forms. However, none of the previous analyses account for the near-total loss of the case category in Romance. For instance, as the result of regular phonological changes, the singular forms in the first declension would not have ‘fallen together’ into a single, invariant shape:

Case (singular)   Classical Latin   Sound change                        Result

Accusative        MENSAM            Loss of final -m                    *mensa
Ablative          MENSĀ             Loss of vowel length distinctions
Genitive          MENSAE            ae > [e]
Dative            MENSAE            ae > [e]

Table 1: Phonetic erosion in first declension singular case/number suffixes

Moreover, cross-linguistic comparison indicates that, despite phonological, analogical, and functional developments, languages do not necessarily always lose their case systems. Finnish, for instance, retains the fifteen case values (for nouns and adjectives) reconstructed for proto-Finnic (although the abessive, comitative, instructive and prolative are now in restricted usage), and has even begun to develop new morphological suffixes:

Proto-Finnic:     nominative, genitive, partitive, essive, translative, elative, inessive, illative, ablative, adessive, allative, abessive, comitative, instructive, prolative
Modern Finnish:   nominative, genitive, partitive, essive, translative, elative, inessive, illative, ablative, adessive, allative, (abessive, comitative, instructive, prolative), comitative2, excessive

Table 2: Case values in proto-Finnic and modern Finnish

This study is concerned with answering the question: why do we find such different developments cross-linguistically?

One major difference between these two languages is that Latin is characterized predominantly by fusional morphology, whilst Finnish exhibits an abundance of agglutinative structure. By analysing these structures from a unit-agnostic ‘abstractive’ approach (as opposed to a ‘constructive’ perspective, in which forms are considered to be ‘built’ up of sub-word parts),[1] we may best understand how they behave in significantly different ways in diachrony.

In Latin, for instance, the fully-inflected wordform and the relationship it bears to other forms in the paradigm provide the language-user with informative patterns which may be extended in the inflexion of other lexemes – there is no need to posit ‘underlying’ forms or identify sub-word morphs in order to ‘construct’ new forms. For instance, if the language-user knows a nominative singular form ending in -a, the lexeme must belong to the first declension. In the second and fourth declensions, however, even if both the nominative singular and accusative singular forms are known, there is residual ambiguity about the inflexion class to which the lexeme belongs:

Nom. sg.   PUELLA    1st declension   SERVUS   2nd/4th declension   GRADUS   2nd/4th declension
Acc. sg.   PUELLAM   1st declension   SERVUM   2nd/4th declension   GRADUM   2nd/4th declension
Gen. sg.   PUELLAE   1st declension   SERVĪ    2nd declension       GRADŪS   4th declension

Table 3: Implicational relations in a sub-set of Latin nouns

In Finnish, implicative relations provide information about inflexion class, whilst the frequent isomorphic form~function mapping exhibited by inflexional suffixes provides absolute certainty in the expression of most case functions.

Nom. sg.    ajatus ‘thought’   -Vs ~ -Vks- / -Vs ~ -VV-   vieras ‘stranger’   -Vs ~ -Vks- / -Vs ~ -VV-
Part. sg.   ajatusta           -Vs ~ -Vks- / -Vs ~ -VV-   vierasta            -Vs ~ -Vks- / -Vs ~ -VV-
Gen. sg.    ajatuksen          -Vs ~ -Vks- + [n]          vieraan             -Vs ~ -VV- + [n]

Table 4: Implicational relations and sub-word units in a sub-set of Finnish nouns

Whilst multiple forms are required in Finnish to determine the declension class to which a lexeme with a nominative singular form in -s belongs, there is certainty in many cells as to the inflexional material that will follow the lexical stem.

The abstract patterns that exist in Latin are not maximally informative; that is, there is occasionally still uncertainty about the shape of an unknown form, even given knowledge of two forms in the language (consider Table 3). In Finnish, on the other hand, there is a sub-word area of absolute certainty in most of the cells in the inflexional paradigm. In addition to implicational relations, therefore, a Finnish speaker, even without sufficient information to deduce the inflexion class of a lexeme, may utilize maximally-predictable sub-word forms to produce a form (whether or not the ‘correct’ one) which may be interpreted correctly by a hearer.[2]
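The residual uncertainty in Table 3 can be given a rough numerical form using the entropy measure mentioned in note [2]. The sketch below is a toy illustration only: it assumes a lexicon consisting of just the three nouns of Table 3, each counted once (real type frequencies would change the figures), and the names `LEXICON`, `entropy` and `class_entropy_given` are hypothetical helpers introduced here, not taken from the cited literature.

```python
from collections import Counter, defaultdict
from math import log2

# Toy lexicon (assumption): only the three nouns of Table 3, each counted once.
LEXICON = [
    {"lemma": "PUELLA", "nom_sg": "-a", "acc_sg": "-am", "gen_sg": "-ae", "cls": "1st"},
    {"lemma": "SERVUS", "nom_sg": "-us", "acc_sg": "-um", "gen_sg": "-ī", "cls": "2nd"},
    {"lemma": "GRADUS", "nom_sg": "-us", "acc_sg": "-um", "gen_sg": "-ūs", "cls": "4th"},
]


def entropy(class_counts: Counter) -> float:
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(class_counts.values())
    return sum(-(c / total) * log2(c / total) for c in class_counts.values())


def class_entropy_given(cells: list[str]) -> dict[tuple[str, ...], float]:
    """Uncertainty about inflexion class for each combination of known endings."""
    groups: dict[tuple[str, ...], Counter] = defaultdict(Counter)
    for noun in LEXICON:
        known = tuple(noun[cell] for cell in cells)
        groups[known][noun["cls"]] += 1
    return {known: entropy(counts) for known, counts in groups.items()}


nom_only = class_entropy_given(["nom_sg"])
nom_acc = class_entropy_given(["nom_sg", "acc_sg"])
nom_gen = class_entropy_given(["nom_sg", "gen_sg"])

print("nom. sg. known:          ", nom_only)
print("nom. + acc. sg. known:   ", nom_acc)
print("nom. + gen. sg. known:   ", nom_gen)
```

Under these toy assumptions, a nominative in -a leaves no uncertainty about class, a nominative in -us leaves one bit even when the accusative is also known, and adding the genitive singular removes the uncertainty altogether.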

The observations offered here accord with language-learning data. Niemi and Niemi (1987) and Laalo (2009), for instance, observe that Finnish children recognise early the direct mapping of the suffix -n and genitive singular functions; they then utilise this knowledge in the deduction of previously unencountered forms. In Latin, exemplary paradigms and principal parts have long been used to capture the inflexional variation exhibited by lexemes. The implicational relations that exist between the nominative singular and genitive singular forms of a noun, for instance, are sufficient to enable L2 learners to ‘match’ novel items to the correct inflexion class.

I suggest that understanding the way in which morphological structures are recognised and exploited by language-users may help to explain (in conjunction with, e.g., phonological or analogical developments) whether morphological case distinctions are likely to be lost or maintained. In Latin, the implicational relations, although informative, are not always maximally predictive, and became opacified through time following regular phonological developments (such as those given in Table 1). As a result of such phonetic erosion, the area of informativeness in the Latin case system has shifted from the area of suffixal variation, distinct across declension, towards the certainty associated with the invariant form of the lexeme. By contrast, the maximally-predictable sub-word elements in Finnish may be rote-learned, which provides them with diachronic stability. These units, in addition to the less informative abstract relations, offer language-users on average more information in language use than is available to a learner of Latin in the production of novel inflected forms. Consideration of the morphological structures found in a given language and the ways in which they are recognised and exploited in language use may therefore offer some additional insight into why the robust Latin case system is not found in Romance.


Blevins, J.P., 2006. ‘Word-based Morphology’. In Journal of Linguistics 42:3. 531-573.

Blevins, J.P., 2016. Word and Paradigm Morphology. Oxford: Oxford University Press.

Blevins, J.P., P. Milin, and M. Ramscar. 2017. ‘The Zipfian Paradigm Cell Filling Problem’. In F. Kiefer, J.P. Blevins, and H. Bartos (eds.). Perspectives on Morphological Structure: Data and Analyses. Leiden: Brill. 139-158.

Herman, J., 2000. Vulgar Latin. Pennsylvania: Pennsylvania State University Press.

Laalo, K., 2009. ‘Acquisition of Case and Plural in Finnish’. In U. Stephany and M. Voeikova (eds.). Development of Nominal Inflection in First Language Acquisition: a Cross-Linguistic Perspective. Berlin: Mouton de Gruyter. 49-90.

Milin, P., V. Kuperman, A. Kostić and H.R. Baayen, 2009. ‘Words and paradigms bit by bit: An information-theoretic approach to the processing of inflection and derivation’. In J.P. Blevins and J. Blevins (eds.). Analogy in Grammar: Form and Acquisition. Oxford: Oxford University Press. 214-252.

Niemi, J. and S. Niemi, 1987. ‘Acquisition of inflectional marking: A case study of Finnish’ in Nordic Journal of Linguistics 10:1. 59-89.

[1] The terms ‘abstractive’ and ‘constructive’ are from Blevins (2006).

[2] This discussion may be recast in terms of the information-theoretic notion of ‘entropy’. See, e.g., Milin et al. (2009) and Blevins (2016:171-196).

Early Career Researcher Forum

by Robin Meyer (University of Oxford; Hon. Secretary for Student Associate Members)


PhilSoc is pleased to announce the programme for this year’s Early Career Researcher Forum, to be held on 8–9 March 2019. Twenty Early Career Researchers (late-stage doctoral students and post-docs) will present their research in 20-minute talks or posters.


The ECR Forum will take place at Wolfson College, Oxford.  Next to paper and poster sessions, there will be two workshops on journal and monograph publishing (led by Prof. James Clackson, Cambridge, and Prof. Susan Fitzmaurice, Sheffield) and on grant applications (led by Prof. Aditi Lahiri, Oxford). After the conclusion of the Research Forum, Prof. Rudolf Wachter (Basel) will give a paper at an ordinary meeting of the Society.

The programme of the Forum is available here as pdf. Abstracts of all talks, brief academic biographies of the presenters, and a registration form can be found here.

On Writing « The Secret Life of Language »

by Simon Pulleyn (London)


In September 2017, I was asked by Trevor Davies, Commissioning Editor at Octopus Books, whether I would write a book about language for the general reader. Octopus already had titles such as The Secret Life of the Periodic Table and The Secret Life of Equations. Now they wanted to try linguistic science. They had some general ideas about scope, but I was offered a free hand as to the text. Octopus specialize in illustrated books. This was quite new for me. My previous experience was that pictures cost money and, as the author must pay for them, they are best avoided. But Octopus has an entire department dedicated to sourcing images; the project also had a talented artist who produced drawings tailored to my ideas. PhilSoc readers will not be slow to spot anachronisms in cartoons depicting Cicero or Babylonian scribes. But the aim of the book is to appeal to the bright general reader, not the specialist; the designers thought that the drawings would have broader appeal if they did not incorporate my niggles about period costume and furniture.

Once I had been signed up as the author, I was in the unenviable position of being expected to know everything. Sadly, I don’t but I was able to consult knowledgeable friends who dug me out of some of my ignorance. I began with an almost blank sheet of A3 paper. It contained just a series of empty rectangles called spreads: these correspond to what you see when you open the book at any given point and look at the two pages in front of you. My job was to decide, in outline at first and then in detail, what would go onto each page or spread. What were the topics to cover and how many spreads should be devoted to each? All this was against the background that the number of pages for this series is fixed at 192 and not all of those are for the author: there must be titles, picture acknowledgements, and an index.

I began with evolution, looking at the anatomical apparatus needed for speech and how this developed. I am no expert in this field and those who specialize in primate evolution will probably find things that they would say otherwise. I went on to look in detail at the constituent elements of linguistics: two spreads on phonetics, three on phonology, four on morphology, two on lexicon and three on syntax. The book then moves on to proto-languages and the problems with arranging languages into families. The book has on its cover an attractive tree diagram of the Indo-European languages. Anyone familiar with the field will know how contentious a topic this is and will either want to draw the branches in a different way, change the labels or object altogether to the notion of trees. But I hope that the text of the book makes it clear that the Enlightenment enthusiasm for genealogies, which also brought us Linnaean classification of plants and the periodic table of elements, is not taken by linguists today as the last word on the topic. The problems of areal influence are discussed in detail, particularly in respect of the Semitic languages and those of mainland Southeast Asia.

The deadline for the book was strict. Whereas those of us accustomed to academic publishing often have years in which to write a monograph, my brief from Octopus was to write 50,000 words in ten weeks. Furthermore, the text was to be delivered in three batches so that the design team could be getting on with the illustrative content for one part of the book whilst I was writing the text for the next. Because of the need to fit in illustrations, this meant that one had generally to write in units of 610, 1220 or 2440 words depending on the number of pages to be covered.

Because I wanted to give the reader the broadest immersion in the field, the book goes on to tour the world either by looking at language families or at the speech of large geographical areas. There are thus sections on the Celtic, Semitic, Turkic, and Iranian languages and others on the languages of India, the Caucasus, the Pacific and the Americas. On some days, this meant that my task was to write 610 words on the idea and reconstruction of the Indo-European family. This is a challenge in terms of choice and compression but also a wholesome discipline. Other days were much harder. It is not encouraging to wake up knowing that the business of the moment is to produce 1220 words on the languages of North and South America. Quite aside from problems of choice and compression, the greater challenge was that I knew so little about the topic and needed to educate myself before presuming to write a single word. By the end of the day, I had not written the required number of words but at least had read a great deal and mapped out the way forward.

Specialist readers will disagree over what ought to have been included, what left out and what emphasis ought to have been given to individual elements. But I hope that the general reader new to language and browsing in a high-street shop will be enthused and drawn in to our wonderful subject. If a person is motivated to start learning another language or to buy some books on linguistics (there is a select bibliography), that is a result. The cartoons are meant to allure. But that does not mean that the text is small beer. I asked my editor if I could discuss things like syllabic nuclei and sonority hierarchy. ‘Yes,’ he replied without missing a beat, ‘Of course!  Just make sure that you explain it all clearly.’ The diagrams help to do that and there is a full glossary at the back.

Simon Pulleyn’s The Secret Life of Language was published by Octopus Books on 30 August 2018 (Cassell, 192 pp, £12.99, ISBN 9781788400244).

Prepositional infinitives in Latin & Romance

by Keith Tse (Chinese University of Hong Kong)

Prepositional infinitives are an important type of clausal complementation in all Romance languages, especially the use of de-infinitive and ad-infinitive which are pan-Romance in their uses as non-finite clausal complements (Harris 1978:197-198, Vincent 1988:68-70, Ledgeway 2012a:179, cf. Meyer-Lübke 1900:426ff.). However, although Romance prepositional infinitives are widely attested across time and space, their Latin (or proto-Romance) origins are as yet unknown, since prepositional infinitives do not exist in Latin, apart from some very late and dubious examples which cannot be taken for granted (Diez 1876:201-202, Beardsley 1921:97). Nonetheless, there have been recent attempts to reconstruct proto-Romance prepositional infinitives, which are structurally equivalent to Latin prepositional gerunds/gerundives as suppletive markers of the oblique functions of the infinitive and the latter may be taken as precursors of the former (Schulte 2007:87ff).

In this post, I outline a proposal concerning the Latin origins for Romance prepositional infinitives whose diachronic formation displays striking parallels with and divergences from the famous English to-infinitive (Los 2005), a comparison of which raises new questions not only for non-finite complementation but also for mechanisms of syntactic change.

Prepositional complementation in Romance

The two most common types of prepositional complementisers in Romance are de-infinitives and ad-infinitives, which show different distributions; the former is used with all types of verbs, while the latter is restricted mainly to verbs that imply purpose and futurity (Meyer-Lübke 1900:426ff, 435ff; Beardsley 1921:97-99, 106-108, 150-151; Vincent 1988:68; 1999:7). This is illustrated in the following examples from Medieval Romance where de-infinitives are used with verbs of communication (verba declarandi), command (verba praecipiendi) and as prolative infinitives (verba prolativa), whereas ad-infinitives are only attested with the latter two (prepositional complementiser in bold):

Verba declarandi:

1a) deneg-o             de  enuia-r-les              ayuda
deny-PRET.3SG DE send-INF-PRO.3PL aid
‘… he denied that he sent them help.’ (La Primera Crónica General 679a33)

1b)   confess-a                d’   aver-lo      fa-tto
confess-PRES.3SG DE have-PRO do-PERF.PTCP
‘he confesses that he has done it…’ (Rettorica p. 108)

1c)   qui           se               dout-e               d’   estre    blasmee
‘… who fears that he is being blamed.’ (La clef d’amors 2584)

Verba prolativa:

2a)   siempre contiend-e           de val-er            a    cuitad-os
always    strive-PRES.3SG  DE protect-INF AD victim-PL
‘he always strives to protect the victims.‘ (La Estoria de Sennor Sant Millan 623)

2b)   procaccia-ndo  di  riconcili-ar-si                    co-l                     Papa
strive-GERUND DE reconcile-INF-REFL.PRO with-DEF.ART Pope
‘striving to reconcile with the Pope.’ (Cronica fiorentina, p. 104)

2c)   desirroit              a    vivre      d-u                          sien
‘… he would like to live with his.’ (Les miracles de saint Louis de Guillaume de St Pathus 5554)

Verba praecipiendi:

3a)   ell-os      ordena-uan              de pon-er
PRO-3PL order-IMPERF.3PL DE place-INF
‘… they ordered to place them.’ (La Primera Crónica General 87a47)

3b)   pora            esforç-ar  a    defend-er-se
                      force-INF AD defend-INF-REFL.PRO
‘in order to force them to defend themselves.’ (La Primera Crónica General 560b31)

3c)   ordin-arono       di  fa-r-gli                fa-re    incontinente…
order-PRET.3PL DE make-INF-PRO make-INF immediately
‘… they gave orders to have him make … immediately’ (Compagnia di S. M. del Carmine, p. 66)

3d)   era-no                 costr-ett-i …                           a    tagli-are selv-e
be.IMPERF-3PL force-PERF.PTCP-NOM.PL AD cut-INF   forest-PL
‘… they were forced… to cut forests…’ (Vegezio 2, cap. 24)

3e)   il      fust                contrei-nz            a    renoi-er     la             foy    Jhesu Crist
PRO be.PRET.3SG force-PAST.PTCP AD reject-INF DEF.ART faith Jesus Christ
‘… he was forced to reject his faith in Jesus Christ.’ (L’histoire de Barlaam et Josaphat 1.1.46)

The main difference between de and ad, therefore, is that de marks both realis and irrealis clausal complements, whereas ad only marks irrealis complements, which may be projected back to proto-Romance. In the next section, I look at some Latin attestations which bear striking similarities to these Romance examples and may be taken as their precursors.

Prepositional complementation in Latin

Both Latin de ‘about, regarding’ and ad ‘to, towards’ are lexical prepositions; there are numerous examples from pre-classical and classical times where prepositional gerunds/gerundives are construed directly with verbs which are compatible with the lexical meanings of these prepositions (Johndal 2012). In the case of de, it denotes the content of propositions and is attested with numerous types of verbs that express indirect statements (prepositions in bold):

Verba declarandi:

In this category, there are examples of verbs of saying and thinking (dicendi et putandi) that take a de-gerund/gerundive expressing the content of the proposition, which can be reanalysed as indirect statements:

4a) primum tibi                   de nostr-o                     amico
first         PRO.2SG.DAT DE our-ABL.SG.MASC friend-ABL.SG.MASC

placa-nd-o                                               aut etiam plane
appease-GERUNDIVE-ABL.SG.MASC or   even   altogether

restitue-nd-o                                         pollice-or

‘First I promise you about appeasing or even restoring our friend altogether.’ > ‘I promise you that I shall appease or even restore our friend’ (Cicero ad Atticum 1.10.2)

4b)   qui                                       de  virgine         capienda


‘who wrote about capturing the girl’ > ‘who wrote that they would capture the girl’ (Gellius Noctes Atticae 1.12)

4c)   tu                       de alter-o                              consulat-u
PRO.2SG.NOM DE another-MASC.ABL.SG consulship-MASC.ABL.SG

gere-nd-o                                        te                      dice-re-s                         cogit-are

‘you said that you were considering about running another consulship’ > ‘you said that you were considering running another consulship.’ (Cicero In Vatinium 11)

4d)   nam vell-e         se               cum eo                     conloqu-i
for    want-INF REFL.PRO with PRO.3SG-ABL converse-INF

de  parti-end-o                              regn-o

‘for he wanted to converse with him (something) about dividing the kingdom.’ > ‘for he wanted to say to him that he would divide the kingdom.’ (Nepos Dion 2)

Verba prolativa:

De-gerund/gerundive and ad-gerund/gerundive are used with certain verbs expressing the content of intention/purpose of the matrix subject:

5a)   nos… labor-amus         de aufere-nd-o                                   mal-o
we      work-PRES.1PL DE eliminate-GERUNDIVE-ABL.SG evil-ABL.SG
‘we strive about removing the evil…’ > ‘we strive to remove the evil.’ (Tertullian Adversus Hermogenem 11.3)

5b)   ego          enim te             arbitr-or…           statim  esse
PRO.1SG for     PRO.2SG think-PRES.1SG at.once be.INF

ad  Sicyon-em  oppugn-and-um               profe-ct-um
AD Sicyon-ACC attack-GERUNDIVE-ACC set.out-PERF-ACC.SG

‘for I think that you immediately set off in order to attack Sicyon’ > ‘for I think that you immediately set off to attack Sicyon’ (Cicero ad Atticum 1.13)

Verba praecipiendi:

Verbs denoting command can take both de-gerund/gerundive and ad-gerund/gerundive in expressing the content and purpose of the command respectively, which may be reanalysed as indirect commands (Panchón 2003:384-387):

6a)   cum  de muta-nd-o                                      praecip-ere-t                     homin-e
‘since he ordered about changing the man’ > ‘since he ordered to change the man.’ (Augustine Sermones 9.8)

6b)   ut          consul-es            populum           cohort-are-ntur
so.that consul-NOM.PL people-ACC.SG encourage-IMPERF.SUBJ-3PL

ad  rogation-em accipiendam

‘so that the consuls might encourage the people so as to accept the plea’ > ‘so that the consuls might encourage the people to accept the plea’ (Cicero ad Atticum 1.14)

6c)   ad restitue-nd-um                          non   compell-it
AD re-establish-GERUND-ACC.SG NEG  force-PRES.3SG
‘he does not force you so that you might re-establish it.’ > ‘he does not force you to re-establish it.’ (Augustine Epistulae 153.21)

The distribution of Romance prepositional infinitives hence seems to mirror that of Latin prepositional gerunds/gerundives: de, as the marker of theme/content, is semantically more general and hence compatible with a wider range of verbs, whereas ad, as a marker of purpose/intention, is only used with verbs that express command and purpose. These developments are strikingly similar to English to-infinitives, especially from a formal perspective, as discussed in the next section.

Prepositional phrases > prepositional infinitives

English to-infinitives are the prototypical example of non-finite complementation, and it is widely held that they were reanalysed in Old English (OE) from purposive adjuncts to clausal complements (cf. Latin ad-gerund/gerundive), which are particularly frequent with verbs of purpose and command (Los 2005: chapter 3):

7a)   tiligen we us to  gescild-enne and us to gewarnig-enne
strive   we us TO shield-DAT   and  us to guard-DAT
‘we should try to shield ourselves and guard ourselves…’ (HomS 44,158)

7b)   on hwilcum godum tihst    þu     us to gelyf-enne ?
in  which      gods     urgest thou us to believe-DAT
‘which gods do you urge us to believe in?’ (AELS (George) 148)

Furthermore, both Latin/Romance and English prepositional infinitives are the results of morphophonological erosion in the nominal paradigm: the Germanic dative ending –enne following OE to is argued to be obsolete in OE (Los 2005:3-5), and the Romance infinitive, in contrast to the Latin gerund/gerundive, likewise does not inflect for morphological case. In both cases, the nominal properties of the clausal complement are practically eliminated. This severely weakens the agreement between the preposition and its nominal complement (Roberts and Roussou 2003:105) and leads to their reanalysis as non-finite clauses. Moreover, Latin/Romance de-infinitives represent a new pathway of syntactic change: in contrast to English to-infinitives and Latin/Romance ad-infinitives, Latin/Romance de does not express purpose but is more semantically general in expressing the content of propositions. This not only yields its wider distribution in Romance but also reveals two distinct types of non-finite complementisers, one more purpose-oriented (to/ad), the other more neutral (de). Since non-finite complementisers are traditionally held to be low in the cartography of C-elements (Rizzi 1997), it may be argued that there are two functional projections in the non-finite domain (M_realis/M_irrealis), which parallels the dual complementiser system in Romance finite complementation (Ledgeway 2012b). The Latin/Romance evidence, therefore, reveals a more sophisticated C-system, especially in the non-finite domain.


The use of the Latin prepositional gerund/gerundive represents a new topic in Latin/Romance historical syntax which opens up many new avenues to the formation of Romance non-finite complementation: although prepositional infinitives, which are plentiful in Romance, are not attested in Latin, their historical structural equivalents, namely Latin prepositional gerunds/gerundives, are widely attested in examples where they are re-analysable as clausal complements. It is therefore possible to account for the pan-Romance distribution of prepositional infinitives by expanding our search and analysis to Latin prepositional gerunds/gerundives.


Beardsley, Winfred, A., 1921, Infinitive Constructions in Old Spanish, New York, Columbia University Press.

Diez, Frédéric, 1876, Grammaire des Langues Romanes, vol III. 3rd ed, Paris, Libraire-Éditeur.

Harris, Martin, 1978, The evolution of French syntax: a comparative approach, London, Longman.

Johndal, M. (2012): Non-finiteness in Latin. DPhil dissertation, University of Cambridge.

Ledgeway, A. (2012a): From Latin to Romance: Morphosyntactic Typology and Change. Oxford: Oxford University Press.

Ledgeway, A. N. (2012b): ‘La sopravvivenza del Sistema dei doppi complementatori nei dialetti meridionali’, in Del Puente, P. (ed): Atti del II Convegno internazionale di dialettologia-Progetto A.L.Ba. Rionero in Vulture: Calice, pp. 151-176.

Los, Bettelou, 2005, The Rise of the To-infinitive, Oxford, Oxford University Press.

Meyer-Lübke, Wilhelm, 1900, Grammaire des Langues Romanes. Tome Troisième: Syntaxe, Paris,  H. Welter.

Panchón, Federico, 2003, ‘Les complétives en ut’. In: Bodelot, Colette, 2003, Grammaire Fondamentale du Latin. Tome X: Les propositions complétives en latin, Louvain/Paris/Dudley, Peeters: 335-481.

Reenen, Pieter van / Schøsler, Lene, 1993, ‘Les indices d’infinitif complément d’objet en ancien français’. In: Lorenzo, Ramón (ed), Actas do XIX Congreso Internacional de Lingüística e Filoloxía Románicas, Vol V, La Coruña: 523-545.

Rizzi, L. (1997): ‘The fine structure of the left periphery’, in Haegeman, L. (ed): Elements of Grammar, Netherlands: Kluwer Academic Publishers, pp. 281-337.

Roberts, I. and Roussou, A. (2003): Syntactic Change. A Minimalist approach to grammaticalization. Cambridge: Cambridge University Press.  

Schulte, Kim, 2007, Prepositional infinitives in Romance: a usage-based approach to syntactic change, Oxford, Peter Lang.

Vincent, Nigel, 1988, ‘Latin’. In: Vincent, Nigel / Harris, Martin (eds), The Romance Languages, London, Croom Helm: 26-78.

Vincent, Nigel, 1999, ‘Non-finite complementation in Latin and Romance’, Paper presented at the Indo-European Seminar, Department of Classics, University of Cambridge, October 1999.


Semantically driven grammaticalisation: the systematic pathways of Estonian polar question particles

by Mari Aigro (University of Tartu)

Treating grammaticalisation as analogically driven shifts the explanatory power that is frequently assigned to syntactic position onto the semantic analogy between the source and the target. This case study focuses on the semantic cohesion patterns in the pathways of contemporary as well as historical Estonian polar question particles (PQPs). It will show that not only is the semantic component of function words much more relevant to grammaticalisation than is commonly thought, but also that the grammaticalisation network surrounding a functional category can in fact be semantically so uniform that one can devise a model based on a semantic map and assign it a certain degree of explanatory power regarding why certain markers become PQPs and others are much less likely to do so.

While the most frequently mentioned PQP sources are negation and disjunction markers (Heine & Kuteva 2002), a comprehensive literature review reveals altogether six source categories. In addition to disjunction and negation markers, this list also includes clause conjunction markers, embedded PQPs, conditional markers and pronominal interrogatives (König & Siemund 2007, Nordström 2010, Metslang et al. 2017). These sources appear to form a systematic set: all of the above could be classified as markers of polarity or truth values (see Payne 1985 for coordinators, Nordström 2010 for conditionals). To investigate whether this principle would hold for additional data and other newly discovered source categories, an in-depth corpus study was carried out on Estonian, a language especially rich in both neutral and biased PQPs.

Nearly 2400 polar questions using the particle strategy (inversion and zero-marking strategies are used alongside) were manually encoded in the Corpus of Old Written Estonian (17th–19th century) and the Corpus of Standard Estonian (20th century). I found six different PQPs (four biased and two neutral) used between the 17th and 21st centuries. Three of them (kas, või, ega) are still in use in Standard Modern Estonian. The source of kas is either a clause conjunction (“also”) or an embedded PQP; või most likely originates from a disjunction (“or”); and ega from a clause rejection marker (“nor”). The three historical polar question markers are eks, eps and jo/ju; while the first two originate from negation, the source of the last is an affirmative focus marker. Only three have given rise to new functional structures: eks became an affirmative polar tag question marker; kas gave rise to the disjunction marker “either”; and jo/ju, after its brief time as a PQP, became a marker of evidentiality when occurring sentence-initially (retaining the older focus reading in other positions).

Hence, the new source categories introduced by the corpus study were polarity-sensitive focus markers (for ju) and rejection markers (for ega), both of which confirm the hypothesis that polar question particles originate from non-interrogative markers, which already involve the semantic component of negation, affirmation or neutral (open) polarity. Table 1 depicts the pathways of Estonian PQPs on a semantic map, which links the two dimensions of polarity – interrogation and bias.

Table 1: Semantic map of Estonian PQPs

Markers in the neutral category are especially relevant. They leave the truth value unknown, assigning open polarity even without interrogation, and due to this share a close link with PQPs. PQPs are more frequently homophonous with disjunction markers than other particles and both of the non-biased Estonian PQPs, kas and või, originate from the neutral category. Additionally, all functional markers originating from PQPs belong in this category. However, although the fact that the map accommodates all known sources of PQPs implies causality, it can only constitute a probabilistic rather than a deterministic model.


Aigro, M. 2017. A Diachronic Study of Polar Question Particles and Their Sources. MPhil thesis, University of Cambridge.

Heine, B. & Kuteva, T. 2002. World Lexicon of Grammaticalisation, Cambridge: Cambridge University Press.

König, E. & Siemund, P. 2007. Speech Act Distinctions in Grammar. In T. Shopen, ed. Language Typology and Syntactic Description, Vol 1: Clause Structure. Cambridge: Cambridge University Press.

Metslang, H., Habicht, K. & Pajusalu, K. 2017. Where Do Polar Question Markers Come From? STUF – Language Typology and Universals 70(3).

Nordström, J. 2010. Modality and Subordinators, Amsterdam: John Benjamins Publishing Company.

Payne, J.R. 1985. Complex Phrases and Complex Sentences. In T. Shopen (ed.) Language Typology and Syntactic Description. Cambridge: Cambridge University Press.