The Origin of /ɬ/ in Southern Pinghua

by Xiaolan Cao (University of Melbourne)

In this post, I will discuss the origin of the voiceless lateral fricative /ɬ/ in Southern Pinghua, one of the two branches of Pinghua and a minority Sinitic language. Southern Pinghua is mostly spoken in Southern Guangxi in China (Qin 2000) by approximately 1.8 million native speakers (Min 2013). However, some of the dialects have experienced huge trans-generational language loss and are hence potentially endangered (Cao 2019). Most Southern Pinghua speakers identify as ethnic Han, the majority ethnic group in China, while most of the rest identify as ethnic Zhuang.

In Southern Pinghua, the voiceless lateral fricative /ɬ/ is a consonant phoneme occurring in the onset position of a syllable. Its phonemic status can be established by the minimal pair in Table 1 below.[1]

Word      Gloss
/ɬa33/    ‘spread’
/sa33/    ‘sprinkle’

Table 1: a minimal pair of voiceless lateral fricative /ɬ/
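The minimal-pair test behind Table 1 can be sketched in a few lines of Python. This is only an illustration, not the author's procedure: the function name is hypothetical, and it makes the simplifying assumption that each character of the transcription (including each tone digit) counts as one position, which happens to hold for these two words.

```python
def is_minimal_pair(form_a: str, form_b: str) -> bool:
    """Return True if two transcriptions differ in exactly one position.

    Assumption: every character is one segment or one tone digit.
    Transcriptions with digraphs or diacritics would need real segmentation.
    """
    if len(form_a) != len(form_b):
        return False
    differences = sum(1 for a, b in zip(form_a, form_b) if a != b)
    return differences == 1


# The pair from Table 1: only the onset differs, the rime and tone match.
print(is_minimal_pair("ɬa33", "sa33"))
```

Because the two words share the rime /a/ and the tone 33, the contrast in meaning can only be carried by the onset.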

Commonly, /ɬ/ is not considered an internal development of Sinitic languages, primarily because it rarely occurs in present-day Sinitic languages. Within China, it is distributed in the former Baiyue area, once occupied by the ancestors of Tai-Kadai speakers (Li 2000). Besides Southern Pinghua dialects, Cantonese dialects located in Southern Guangxi and Western Guangdong also have the phoneme /ɬ/. Outside Guangxi and Guangdong, /ɬ/ can be found only in three small regions in China: in some dialects of Min in non-contiguous geographical pockets in Fujian Province, in some dialects of Hui in Anhui Province, and in some dialects of various Sinitic languages spoken on the west coast of Hainan Province (de Sousa 2015: 166-168, quoting Liu X 2006, Liu F 2007, Akitani 2008, and Meng 1981). Due to its limited distribution in present-day Sinitic languages, /ɬ/ is not reconstructed for Middle Chinese or Old Chinese in the literature; see Zhengzhang (2003), Li (1971), Baxter and Sagart (2014), and Wang (1985).

On the other hand, /ɬ/ is common in present-day dialects of Zhuang, a Tai-Kadai language mainly spoken in Guangxi (Zheng 1998). According to works by Mai (2009, 2011), Ouyang (1995), Yuan (1989), Zheng (1998), and Zhao (2015), the phoneme /ɬ/ in Sinitic languages may have developed under the influence of Zhuang loanwords through language contact. However, the opposing view has also been suggested in the Chinese-language literature: because the phoneme /ɬ/ in Zhuang corresponds to *s in Proto-Tai, it is likely that Zhuang developed this phoneme under the influence of Sinitic languages, rather than the other way round (de Sousa 2015, quoting Li F 1977 and Pittayaporn 2009).

The two views on the origin of /ɬ/ in Sinitic languages have some limitations. First, the argument that /ɬ/ is not an internal development of Sinitic languages simply because of its limited distribution and absence from reconstructions for Middle Chinese or Old Chinese does not preclude the possibility that /ɬ/ developed in Southern Pinghua after the Middle Chinese period.

Further, the evidence does not indicate whether /ɬ/ is an internal development in Southern Pinghua or a phoneme developed under the influence of loanwords from Zhuang through language contact. As for its distribution in Southern Pinghua, the phoneme /ɬ/ can be found in both the Sinitic stratum and the Zhuang stratum. According to a survey by Cao (2018), in the Sinitic stratum, Chinese characters (Chinese cognates) whose Southern Pinghua pronunciations contain onset /ɬ/ were mostly recorded as having the Middle Chinese onset denoted as 心 (*s) in Qieyun, a rhyme dictionary published in 601 CE during the Sui dynasty (581–618). This correspondence exists not only in common words, such as /ɬam52/ (‘three’) and /ɬɜm52/ (‘heart’), but also in literary words, like /ɬɜw52/ (‘constellation’) and /ɬoŋ52/ (‘lofty’).

The correspondence between /ɬ/ in Southern Pinghua and the onset 心 (*s) in Middle Chinese suggests that /ɬ/ is of Sinitic origin. However, the same survey records ninety-one admissible syllables that start with /ɬ/, of which twenty-six cannot be associated with Chinese characters (Chinese cognates). Normally, a Southern Pinghua syllable that can be identified with a Chinese character is very likely to be of Sinitic origin. These twenty-six syllables are therefore possibly not of Sinitic origin but were introduced to the language by loanwords from other languages, such as Zhuang. Thus, the distribution of /ɬ/ in Southern Pinghua does not conclusively support either an internal development or one induced by language contact with Zhuang.

In addition to the distributional features of /ɬ/ in Southern Pinghua, the historical development of /s/ in Southern Pinghua may also shed some light on the development of /ɬ/. In Southern Pinghua, Chinese characters pronounced with onset /s/ correspond mostly to those recorded in Qieyun with the onsets 审 (*ɕ), 禅 (*ʑ), and 邪 (*z). Based on the fact that these three Middle Chinese onsets did not develop into /ɬ/, we may speculate that the Middle Chinese onset 心 (*s) has some features that make it prone to sound change to /ɬ/ under certain influences, such as loanwords from Zhuang.

Finally, the geographical distribution of /ɬ/ is not as discontiguous as described in previous studies. The geographical distribution of /ɬ/ is contiguous in Southern Guangxi and Western Guangdong. These two adjacent regions together occupy approximately 184,000 square kilometres[2] of densely populated area, which is larger than Cambodia (181,035 square kilometres) or Nepal (147,181 square kilometres). Therefore, it may not be accurate to describe the territory of /ɬ/ in Southern Guangxi and Western Guangdong as small or isolated, and /ɬ/ can be considered an areal feature for further studies in historical linguistics, areal linguistics, and linguistic typology. Drawing on the analysis and evidence given in the discussion above, I would like to pose some questions for further investigation.

  1. Why is /ɬ/ so prevalent in Southern Pinghua and Cantonese dialects found in the area of Southern Guangxi and Western Guangdong, but not in the other areas?
  2. If language contact with Zhuang is a contributing factor to the development of /ɬ/, why does /ɬ/ occur in Southern Pinghua dialects but not most Northern Pinghua dialects, given both Pinghua branches have similar contact with Zhuang?
  3. Similarly, why do Cantonese dialects in Western Guangdong have /ɬ/ but not those in Eastern Guangdong, considering that Cantonese dialects have mostly had similar exposure to Zhuang historically?
  4. Can the peculiar distributions of /ɬ/ in Pinghua and Cantonese dialects be explained by a mere historical accident?

In sum, the two opposing views on the origin of /ɬ/ in Southern Pinghua are questionable because the evidence is inconclusive. At this stage, the origin of /ɬ/ in Southern Pinghua dialects remains unclear, and further investigations are still required.


[1] Southern Pinghua is a tone language, and the numbers in the word transcriptions indicate lexical tones.

[2] The regions of Southern Guangxi and Western Guangdong occupy approximately half of Guangxi (236,700 square kilometres) and one third of Guangdong (177,900 square kilometres). Therefore, it is estimated that these two regions altogether have around 184,000 square kilometres.

The Loss of the Latin Case System – A New Morphological Approach

by Zeprina-Jaz Ainsworth (University of Oxford)

Much work has already been done on the development of the Latin case system, which has been lost almost entirely from nouns and adjectives in Romance. Scholars such as Herman (2000) have outlined phonetic, analogical, functional, and syntactic changes which may have contributed to the opacification of certain morphological case forms. However, none of the previous analyses account for the near-total loss of the case category in Romance. For instance, as the result of regular phonological changes, the singular forms in the first declension would not have ‘fallen together’ into a single, invariant shape:

Singular      Classical Latin    Sound change                         Result

Accusative    MENSAM             Loss of final -m                     *mensa
Ablative      MENSĀ              Loss of vowel length distinctions
Genitive      MENSAE             ae > [e]
Dative        MENSAE             ae > [e]

Table 1: Phonetic erosion in first declension singular case/number suffixes

Moreover, cross-linguistic comparison indicates that, despite phonological, analogical, and functional developments, languages do not necessarily always lose their case systems. Finnish, for instance, retains the fifteen case values (for nouns and adjectives) reconstructed for proto-Finnic (although the abessive, comitative, instructive and prolative are now in restricted usage), and has even begun to develop new morphological suffixes:

Proto-Finnic: nominative, genitive, partitive, essive, translative, elative, inessive, illative, ablative, adessive, allative, abessive, comitative, instructive, prolative
Modern Finnish: nominative, genitive, partitive, essive, translative, elative, inessive, illative, ablative, adessive, allative, (abessive, comitative, instructive, prolative), comitative2, excessive

Table 2: Case values in proto-Finnic and modern Finnish

This study is concerned with answering the question: why do we find such different developments cross-linguistically?

One major difference between these two languages is that Latin is characterized predominantly by fusional morphology, whilst Finnish exhibits an abundance of agglutinative structure. By analysing these structures from a unit-agnostic ‘abstractive’ approach (as opposed to a ‘constructive’ perspective, in which forms are considered to be ‘built’ up of sub-word parts),[1] we may best understand how they behave in significantly different ways in diachrony.

In Latin, for instance, the fully-inflected wordform and the relationship it bears to other forms in the paradigm provide the language-user with informative patterns which may be extended in the inflexion of other lexemes – there is no need to posit ‘underlying’ forms or identify sub-word morphs in order to ‘construct’ new forms. For instance, if the language-user knows a nominative singular form ending in -a, the lexeme must belong to the first declension. In the second and fourth declensions, however, even if both the nominative singular and accusative singular forms are known, there is residual ambiguity about the inflexion class to which the lexeme belongs:

Nom. sg.    PUELLA     1st declension    SERVUS    2nd/4th declension    GRADUS    2nd/4th declension
Acc. sg.    PUELLAM    1st declension    SERVUM    2nd/4th declension    GRADUM    2nd/4th declension
Gen. sg.    PUELLAE    1st declension    SERVĪ     2nd declension        GRADŪS    4th declension

Table 3: Implicational relations in a sub-set of Latin nouns
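The way each known form narrows down the inflexion class can be sketched as a simple filter over whole-word patterns. The sketch below is illustrative only: it encodes just the three cells and three classes shown in Table 3, the function and variable names are hypothetical, and matching on word endings is merely a convenient way to compare full wordforms against exemplary paradigms, not a claim about how speakers segment words.

```python
import unicodedata

# Word-final patterns for the three cells in Table 3 (lower-cased).
# Only the 1st, 2nd and 4th declensions from the table are covered.
PARADIGMS = {
    "1st": {"nom_sg": "a", "acc_sg": "am", "gen_sg": "ae"},
    "2nd": {"nom_sg": "us", "acc_sg": "um", "gen_sg": "ī"},
    "4th": {"nom_sg": "us", "acc_sg": "um", "gen_sg": "ūs"},
}


def nfc(text: str) -> str:
    """Normalise so that macron vowels compare reliably."""
    return unicodedata.normalize("NFC", text.lower())


def candidate_classes(known_forms: dict) -> list:
    """Return the declensions compatible with every known wordform."""
    candidates = []
    for name, patterns in PARADIGMS.items():
        if all(nfc(form).endswith(nfc(patterns[cell]))
               for cell, form in known_forms.items()):
            candidates.append(name)
    return candidates


print(candidate_classes({"nom_sg": "puella"}))
print(candidate_classes({"nom_sg": "servus", "acc_sg": "servum"}))
print(candidate_classes({"nom_sg": "servus", "acc_sg": "servum", "gen_sg": "servī"}))
```

With only the nominative and accusative singular of SERVUS, two classes remain possible; adding the genitive singular removes the residual ambiguity.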

In Finnish, implicative relations provide information about inflexion class, whilst the frequent isomorphic form~function mapping exhibited by inflexional suffixes provides absolute certainty in the expression of most case functions.

Nom. sg.     ajatus ‘thought’    -Vs ~ -Vks-/-Vs ~ -VV-    vieras ‘stranger’    -Vs ~ -Vks-/-Vs ~ -VV-
Part. sg.    ajatusta            -Vs ~ -Vks-/-Vs ~ -VV-    vierasta             -Vs ~ -Vks-/-Vs ~ -VV-
Gen. sg.     ajatuksen           -Vs ~ -Vks- + [n]         vieraan              -Vs ~ -VV- + [n]

Table 4: Implicational relations and sub-word units in a sub-set of Finnish nouns

Whilst multiple forms are required in Finnish to determine the declension class to which a lexeme with a nominative singular form in -s belongs, there is certainty in many cells as to the inflexional material that will follow the lexical stem.

The abstract patterns that exist in Latin are not maximally informative; that is, there is occasionally still uncertainty about the shape of an unknown form, even given knowledge of two forms in the language (see Table 3). In Finnish, on the other hand, there is a sub-word area of absolute certainty in most of the cells in the inflexional paradigm. In addition to implicational relations, therefore, a Finnish speaker, even without sufficient information to deduce the inflexion class of a lexeme, may utilize maximally-predictable sub-word forms to produce a form (whether or not the ‘correct’ one) which may be interpreted correctly by a hearer.[2]

The observations offered here accord with language-learning data. Niemi and Niemi (1987) and Laalo (2009), for instance, observe that Finnish children recognise early the direct mapping of the suffix -n and genitive singular functions; they then utilise this knowledge in the deduction of previously unencountered forms. In Latin, exemplary paradigms and principal parts have long been used to capture the inflexional variation exhibited by lexemes. The implicational relations that exist between the nominative singular and genitive singular forms of a noun, for instance, are sufficient to enable L2 learners to ‘match’ novel items to the correct inflexion class.

I suggest that understanding the way in which morphological structures are recognised and exploited by language-users may help to explain (in conjunction with, e.g., phonological or analogical developments) whether morphological case distinctions are likely to be lost or maintained. In Latin, the implicational relations, although informative, are not always maximally predictive, and became opacified through time following regular phonological developments (such as those given in Table 1). As a result of such phonetic erosion, the area of informativeness in the Latin case system has shifted from the area of suffixal variation, distinct across declensions, towards the certainty associated with the invariant form of the lexeme. By contrast, the maximally-predictable sub-word elements in Finnish may be rote-learned, which provides them with diachronic stability. These units, in addition to the less informative abstract relations, offer language-users on average more information in language use than is available to a learner of Latin in the production of novel inflected forms. Consideration of the morphological structures found in a given language and the ways in which they are recognised and exploited in language use may therefore offer some additional insight into why the robust Latin case system is not found in Romance.


The latest from Austronesian historical linguistics

by Laura Arnold (University of Edinburgh)

The 14th International Conference on Austronesian Linguistics was held on 17–20 July 2018, at the campus of the Université d’Antananarivo in the capital of Madagascar, the westernmost outpost of the Austronesian world. With four keynote speakers and 176 participants, the conference brought together Austronesian researchers from all over the world to share their latest research on this huge and diverse language family. The four days of talks were followed by an excursion to the UNESCO world heritage site of the Royal Hill of Ambohimanga, situated on a soaring hill above stunning landscapes and rice paddies, 24 km to the northeast of the city. Photographs of the conference by David Gil can be found here.

As ever, there were many talks that dealt with historical, comparative, and philological issues in Austronesian linguistics. The question of the origin and movement of the pre-Austronesians and the subsequent expansion of Austronesian languages throughout insular Southeast Asia was the subject of lively debate throughout the conference. In his keynote speech, Waruno Mahdi (a proponent of the proto-Austric hypothesis, which links Austro-Tai to Austroasiatic) used genetic, archaeological, and linguistic data to argue that speakers of proto-Austronesian comprised two distinct population groups. One was a subtropical group (the ‘Deutero-Malays’), descended from the rice-cultivating Austro-Tai group; and the other was an equatorial group (the ‘Proto-Malays’), who migrated from the south towards the Proto-Austronesian homeland of Taiwan when the Sunda shelf was flooded, around 7000–4000 years BP. Laurent Sagart, on the other hand, who proposes that Austronesian is a sister of Sino-Tibetan, later argued that the pre-Austronesians originated from the Yellow River valley in north China, approximately 9000–7500 years BP. This conclusion is based on agricultural archaeological evidence regarding the spread of millet domestication; the spread of the ritual ablation of upper lateral incisors; and mtDNA and Y chromosome data showing a link between Sino-Tibetan- and Austronesian-speaking populations. Regarding the dispersal of the Malayo-Polynesians, Marian Klamer emphasised that the traditional farming dispersal model of Austronesian expansion throughout Island Southeast Asia is too simplistic, and cannot account for the linguistic and archaeological diversity found throughout the area – especially for the so-called Western Malayo-Polynesian and Central Malayo-Polynesian languages, which comprise over 600 languages across the majority of Island Southeast Asia.
She reminded us that the Malayo-Polynesian expansion most likely did not occur in one fell swoop across the archipelago, but that there may have been hundreds or possibly thousands of migrations across the area; and that we need detailed, bottom-up micro-comparisons in order to work out the history of the linguistic dispersal of Malayo-Polynesian languages. This sentiment in particular appeared to strike a chord with the conference participants, and was something I heard echoed many times over coffee, lunch, and the cocktail party that closed the conference.



Another topic of interest was the linguistic inferences that can be made about the history of Malagasy from 17th-century sources. One of the keynote speakers, Narivelo Rajaonarimanana, outlined his work on the Sorabe manuscripts and texts held in the National Library of Paris, which he has been transcribing and translating. He discussed the use of Qur’anic verses in these manuscripts in healing prayers, and as talismans for protection. He also sketched out some aspects of the grammar of the volañ’onjatsy dialect, spoken by a group living around the Matataña River, which is represented in these texts. Earlier in the conference, Alexander Adelaar (see also here) presented several speculations regarding the phonology of early Malagasy, using evidence from 17th-century Sorabe texts, and a 1603 textbook and wordlist of Malagasy compiled by Frederik de Houtman. First, he concluded that *y and *w in proto-Southeast Barito (the Bornean ancestor of Malagasy) were still vocoids at the point when Sorabe, a derivative of the Arabic script, was first adapted to transcribe Malagasy. Second, he established that the contraction of like vowels in originally disyllabic roots (e.g. *fu(h)u ‘heart’ > fu, *raa ‘blood’ > ra) had not yet taken place. Third, he discussed problems with the traditional identification of the Sorabe texts with the Taimoro dialect, providing linguistic evidence to show that the oldest Sorabe texts have features in common with the Tanosy dialect. Sorabe was originally practiced in a wider area, and its identification with the Taimoro dialect and region alone is too narrow and only reflects the current state of affairs. Finally, the orthography used in the wordlist, as well as comparison with cognate forms in other languages, suggests Malagasy still had a palatal nasal ñ at the time Houtman was compiling this wordlist.

My travel to the conference was funded in part by the Philological Society. In my presentation, I looked at a split in the tone system of a dialect of Ambel, a South Halmahera-West New Guinea language spoken in West Papua, Indonesia. This split was conditioned by vowel height, such that toneless syllables with non-high vowel nuclei *e, *a, and *o developed High tone, whereas toneless syllables with high vowel nuclei *i or *u remained toneless. There are two interesting points about this split. First, tone splits conditioned by vowel quality are very rare. Second, in all other cases of tone splits or tonogenesis conditioned by vowel quality that have been described in the literature so far, high vowels are associated with High tone. The conditioning of High tone by non-high vowels, as we find in Ambel, has not previously been attested. I went on to present a possible phonetic motivation for the split. This motivation makes reference to the complementary phenomena of intrinsic F0 and intrinsic pitch. All things being equal, higher vowels (e.g. /i/, /u/) are generally produced with a higher F0 than lower vowels (e.g. /a/). However, intrinsic pitch compensates for this, in that, when the F0 is identical, hearers perceive lower vowels as being higher in pitch than higher vowels. One important exception to intrinsic F0 is at the lower end of a speaker’s pitch range (e.g. in a tonal language, Low-toned syllables), where differences in F0 are reduced or completely neutralised. Toneless vowels in Ambel are realised with low pitch. I therefore suggested that, when proto-Ambel first developed tone, and toneless syllables came to be realised with low pitch, the intrinsic F0 of these toneless vowels was neutralised; however, the intrinsic pitch that formerly compensated for intrinsic F0 differences was maintained. This meant that speakers of Ambel came to perceive the toneless non-high vowels (*e, *a, and *o) as higher in pitch than the toneless high vowels (*i and *u). 
Eventually, this perceptual difference resulted in the merger of toneless syllables with non-high vowels with other High-toned syllables. Slides from this presentation can be found here.
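The conditioning of the split described above can be written out as a small rule. This is a hedged sketch rather than the analysis presented in the talk: the function name and the use of None to stand for ‘toneless’ are assumptions made for illustration, and only the five proto-vowels mentioned in the text are covered.

```python
# Proto-vowel classes as described in the text.
NON_HIGH_VOWELS = {"e", "a", "o"}
HIGH_VOWELS = {"i", "u"}


def ambel_tone_split(vowel, tone=None):
    """Return the tone of a syllable after the split.

    tone=None stands for a toneless syllable (an illustrative convention).
    Syllables that already carry a tone are left unchanged.
    """
    if tone is not None:
        return tone
    if vowel in NON_HIGH_VOWELS:
        return "H"  # toneless + non-high vowel develops High tone
    return None     # toneless + high vowel stays toneless


for v in ["e", "a", "o", "i", "u"]:
    print(v, ambel_tone_split(v))
```

The rule makes the typological oddity easy to see: the output High tone is tied to the non-high vowels, the reverse of the association reported for other languages.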

Other talks that may be of interest to members and followers of the Philological Society are as follows (in order of presentation):

  • Owen Edwards explored the possible phonetic quality of proto-Austronesian *j. Three pieces of evidence lead him to the conclusion that the best reconstruction may be the affricate *dz. First, *dz is preserved as /dz/ in three primary branches of Austronesian, including Malayo-Polynesian; second, most of the reflexes in the present-day languages can be accounted for by making reference to natural and well-attested sound changes; and third, reconstructing *dz leads to a balanced and typologically-expected phonological inventory in proto-Austronesian.
  • Francesca Moro presented empirical data demonstrating that the morphological simplification of Alorese that has occurred since the most recent common ancestor with Lamaholot can be explained by the large number of L2 speakers of the language, which has historically been used as the lingua franca of the area.
  • Albert Davletshin looked at the diachrony of case marking in Nukeria, a Polynesian outlier – specifically, an agentive marker a, which is preposed only to singular personal and demonstrative pronouns, the question word ai ‘who?’, and personal names. He showed that the development and distribution of a can be explained by an interaction between semantics and phonology. On the semantic level, he discussed the phenomenon of differential agent marking, found elsewhere in Polynesian languages, in which highly-individuated NPs (such as pronouns, personal names, and definite NPs) are marked, whereas lower-individuated NPs are not. The distribution of a can be further explained by making reference to a phonological constraint in Nukeria which prevents the bimoraic singular pronouns and any bimoraic personal names from being realised without additional marking.
  • In a paper by Ritsuko Kikusawa (see also here), John Lowry, Paul Geraghty, Apolonia Tamata, Fumita Sano, Susuma Okamoto, and Hirofumi Teramura, results from a pilot project in Fiji combining linguistic and GIS data were discussed. In this project, the data are used to map different ‘communalects’, depending on how similar forms for a particular meaning are to Standard Fijian. This methodology can also be used to calculate the similarity of forms to a reconstructed ancestor form, and has the potential to be used in testing hypotheses with regard to historical population movements, for example where the ports of entry for a particular island may have been.
  • A paper by Juliette Huber and Antoinette Schapper looked at Austronesian borrowings into the non-Austronesian Eastern Timor languages. On the basis of sound changes in both the Austronesian and non-Austronesian languages, several layers of borrowing can be identified, indicating a complex and long-term history of contact. In addition, Austronesian borrowings from unidentified sources in the Eastern Timor languages suggest that there has been contact with a now-extinct Austronesian substrate in East Timor; and shared vocabulary throughout the languages of the area points to contact between the proto-languages of the Austronesian and non-Austronesian languages spoken today, although the source of these words is difficult to determine.
  • Kirsten Culhane and Owen Edwards presented data from the Meto dialect cluster, in which there are very diverse patterns of intervocalic consonant insertions. A diachronic perspective is necessary to understand this diversity – most of the consonants used in insertion can be easily explained by making reference to well-attested sound changes in each of the dialects. However, a structural analysis is insufficient to account for the synchronic state. Instead, a social perspective which makes reference to the distinct identity of each of the dialect communities is necessary to explain the observed differences.
  • Corinna Handschuh provided an overview of common and proprial articles in Austronesian. Various languages throughout the family have a system in which different articles are used to mark common and proper nouns: most notably in Oceanic, but also elsewhere, such as in Tagalog. The distinction has also been reconstructed to proto-Austronesian. This system is highly unusual, in that it has not so far been attested in any other language family. She thus focussed on the stability of such a typologically unusual system over such a great time depth, flagging up the similarities with nominal classification systems such as gender, which are typically stable over time.
  • Emily Gasser discussed a ‘crazy rule’ of /β/, /r/, and /k/ mutation, which is attested in the majority of the languages of the South Halmahera-West New Guinea (SHWNG) subbranch. While her presentation focussed on the synchrony of this mutation, in the question and answer session she proposed that it may be helpful in the subgrouping of SHWNG – specifically, that the mutation provides evidence for grouping the SHWNG languages spoken around Cenderawasih Bay into a single primary branch.
  • Tobias Weber discussed the typological profile of the languages of Sumatra and the Barrier Islands, investigating mostly structures mentioned in the WALS. He assumed that certain features of these languages—the larger-than-average vowel inventories, the denasalisation of consonants in Enggano and Mentawai, numeral classifiers, and clausal head-marking (indexing of arguments on the predicate)—may be explained by influence from a now-extinct pre-Austronesian substrate.
  • Peter Slomanson looked at the development of negation in the contact languages Sri Lankan Malay and Sri Lankan Portuguese. He showed that these two varieties are in some ways structurally closer to each other than they are to their co-territorial model languages, Tamil and Sinhala, yet the contact languages still differ from each other in their respective negation systems. The parallels that there are, for example in the ordering of functional markers, suggest that contact between what would become Sri Lankan Malay and Sri Lankan Portuguese may have begun in Java, before continuing in Sri Lanka.
  • Penelope Howe presented preliminary data from matched guise tests, showing that an emergent lexical tone contrast in the Central dialects of Malagasy additionally indexes social meaning. Her results suggest that the use of tone in these dialects is associated with more positive attributes (e.g. friendliness, honesty). However, when tone is absent, the speakers of these dialects are associated with more negative attributes (e.g. reticence, indifference).

For further information about any of these presentations, readers are encouraged to contact the relevant author(s).

Prepositional infinitives in Latin & Romance

by Keith Tse (Chinese University of Hong Kong)

Prepositional infinitives are an important type of clausal complementation in all Romance languages, especially the use of de-infinitive and ad-infinitive which are pan-Romance in their uses as non-finite clausal complements (Harris 1978:197-198, Vincent 1988:68-70, Ledgeway 2012a:179, cf. Meyer-Lübke 1900:426ff.). However, although Romance prepositional infinitives are widely attested across time and space, their Latin (or proto-Romance) origins are as yet unknown, since prepositional infinitives do not exist in Latin, apart from some very late and dubious examples which cannot be taken for granted (Diez 1876:201-202, Beardsley 1921:97). Nonetheless, there have been recent attempts to reconstruct proto-Romance prepositional infinitives, which are structurally equivalent to Latin prepositional gerunds/gerundives as suppletive markers of the oblique functions of the infinitive and the latter may be taken as precursors of the former (Schulte 2007:87ff).

In this post, I outline a proposal concerning the Latin origins for Romance prepositional infinitives whose diachronic formation displays striking parallels with and divergences from the famous English to-infinitive (Los 2005), a comparison of which raises new questions not only for non-finite complementation but also for mechanisms of syntactic change.

Prepositional complementation in Romance

The two most common types of prepositional complementisers in Romance are de-infinitives and ad-infinitives, which show different distributions; the former is used with all types of verbs, while the latter is restricted mainly to verbs that imply purpose and futurity (Meyer-Lübke 1900:426ff, 435ff; Beardsley 1921:97-99, 106-108, 150-151; Vincent 1988:68; 1999:7). This is illustrated in the following examples from Medieval Romance where de-infinitives are used with verbs of communication (verba declarandi), command (verba praecipiendi) and as prolative infinitives (verba prolativa), whereas ad-infinitives are only attested with the latter two (prepositional complementiser in bold):

Verba declarandi:

1a) deneg-o             de  enuia-r-les              ayuda
deny-PRET.3SG DE send-INF-PRO.3PL aid
‘… he denied that he sent them help.’ (La Primera Crónica General 679a33)

1b)   confess-a                d’   aver-lo      fa-tto
confess-PRES.3SG DE have-PRO do-PERF.PTCP
‘he confesses that he has done it…’ (Rettorica p. 108)

1c)   qui           se               dout-e               d’   estre    blasmee
‘… who fears that he is being blamed.’ (La clef d’amors 2584)

Verba prolativa:

2a)   siempre contiend-e           de val-er            a    cuitad-os
always    strive-PRES.3SG  DE protect-INF AD victim-PL
‘he always strives to protect the victims.’ (La Estoria de Sennor Sant Millan 623)

2b)   procaccia-ndo  di  riconcili-ar-si                    co-l                     Papa
strive-GERUND DE reconcile-INF-REFL.PRO with-DEF.ART Pope
‘striving to reconcile with the Pope.’ (Cronica fiorentina, p. 104)

2c)   desirroit              a    vivre      d-u                          sien
‘… he would like to live with his.’ (Les miracles de saint Louis de Guillaume de St Pathus 5554)

Verba praecipiendi:

3a)   ell-os      ordena-uan              de pon-er
PRO-3PL order-IMPERF.3PL DE place-INF
‘… they ordered to place them.’ (La Primera Crónica General 87a47)

3b)   pora            esforç-ar  a    defend-er-se
                      force-INF  AD   defend-INF-REFL.PRO
‘in order to force them to defend themselves.’ (La Primera Crónica General 560b31)

3c)   ordin-arono       di  fa-r-gli                fa-re    incontinente…
order-PRET.3PL DE make-INF-PRO make-INF incontinent
‘… they ordained him to be made to make him incontinent’ (Compagnia di S. M. del Carmine, p. 66)

3d)   era-no                 costr-ett-i …                           a    tagli-are selv-e
be.IMPERF-3PL force-PERF.PTCP-NOM.PL AD cut-INF   forest-PL
‘… they were forced… to cut forests…’ (Vegezio 2, cap. 24)

3e)   il      fust                contrei-nz            a    renoi-er     la             foy    Jhesu Crist
PRO be.PRET.3SG force-PAST.PTCP AD reject-INF DEF.ART faith Jesus Christ
‘… he was forced to reject his faith in Jesus Christ.’ (L’histoire de Barlaam et Josaphat 1.1.46)

The main difference between de and ad, therefore, is that de marks both realis and irrealis clausal complements, whereas ad marks only irrealis complements, a distribution that may be projected back to proto-Romance. In the next section, I look at some Latin attestations which bear striking similarities to these Romance examples and may be taken as their precursors.

Prepositional complementation in Latin

Both Latin de ‘about, regarding’ and ad ‘to, towards’ are lexical prepositions; there are numerous examples from pre-classical and classical times where prepositional gerunds/gerundives are construed directly with verbs that are compatible with the lexical meanings of these prepositions (Johndal 2012). De denotes the content of propositions and is attested with numerous types of verbs that express indirect statements (prepositions in bold):

Verba declarandi:

In this category, there are examples of verbs of saying and thinking (dicendi et putandi) that take a de-gerund/gerundive expressing the content of the proposition, which can be reanalysed as an indirect statement:

4a) primum tibi                   de nostr-o                     amico
first         PRO.2SG.DAT DE our-ABL.SG.MASC friend-ABL.SG.MASC

placa-nd-o                                               aut etiam plane
appease-GERUNDIVE-ABL.SG.MASC or   even   altogether

restitue-nd-o                                         pollice-or

‘First I promise you about appeasing or even restoring our friend altogether.’ > ‘I promise you that I shall appease or even restore our friend’ (Cicero ad Atticum 1.10.2)

4b)   qui                                       de  virgine         capienda


‘who wrote about capturing the girl’ > ‘who wrote that they would capture the girl’ (Gellius Noctes Atticae 1.12)

4c)   tu                       de alter-o                              consulat-u
PRO.2SG.NOM DE another-MASC.ABL.SG consulship-MASC.ABL.SG

gere-nd-o                                        te                      dice-re-s                         cogit-are

‘you said that you were considering about running another consulship’ > ‘you said that you were considering running another consulship.’ (Cicero In Vatinium 11)

4d)   nam vell-e         se               cum eo                     conloqu-i
for    want-INF REFL.PRO with PRO.3SG-ABL converse-INF

de  parti-end-o                              regn-o

‘for he wanted to converse with him (something) about dividing the kingdom.’ > ‘for he wanted to say to him that he would divide the kingdom.’ (Nepos Dion 2)

Verba prolativa:

De-gerund/gerundive and ad-gerund/gerundive are used with certain verbs expressing the content of intention/purpose of the matrix subject:

5a)   nos… labor-amus         de aufere-nd-o                                   mal-o
we      work-PRES.1PL DE eliminate-GERUNDIVE-ABL.SG evil-ABL.SG
‘we strive about removing the evil…’ > ‘we strive to remove the evil.’ (Tertullian Adversus Hermogenem 11.3)

5b)   ego          enim te             arbitr-or…           statim  esse
PRO.1SG for     PRO.2SG think-PRES.1SG at.once be.INF

ad  Sicyon-em  oppugn-and-um               profe-ct-um
AD Sicyon-ACC attack-GERUNDIVE-ACC set.out-PERF-ACC.SG

‘for I think that you immediately set off in order to attack Sicyon’ > ‘for I think that you immediately set off to attack Sicyon’ (Cicero ad Atticum 1.13)

Verba praecipiendi:

Verbs denoting command can take both de-gerund/gerundive and ad-gerund/gerundive in expressing the content and the purpose of the command respectively, which may be reanalysed as indirect commands (Panchón 2003:384-387):

6a)   cum  de muta-nd-o                                      praecip-ere-t                     homin-e
‘since he ordered about changing the man’ > ‘since he ordered to change the man.’ (Augustine Sermones 9.8)

6b)   ut          consul-es            populum           cohort-are-ntur
so.that consul-NOM.PL people-ACC.SG encourage-IMPERF.SUBJ-3PL

ad  rogation-em accipiendam

‘so that the consuls might encourage the people so as to accept the plea’ > ‘so that the consuls might encourage the people to accept the plea’ (Cicero ad Atticum 1.14)

6c)   ad restitue-nd-um                          non   compell-it
AD re-establish-GERUND-ACC.SG NEG  force-PRES.3SG
‘he does not force you so that you might re-establish it.’ > ‘he does not force you to re-establish it.’ (Augustine Epistulae 153.21)

The distribution of Romance prepositional infinitives hence seems to conform to that of Latin prepositional gerunds/gerundives: de, being the marker of theme/content, is semantically more general and hence compatible with a wider range of verbs, whereas ad, as a marker of purpose/intention, is only used with verbs that express command and purpose. These developments are strikingly similar to those of English to-infinitives, especially from a formal perspective, as discussed in the next section.

Prepositional phrases > prepositional infinitives

English to-infinitives are the prototypical example of non-finite complementation, and it is widely held that to-infinitives were reanalysed in Old English (OE) from purposive adjuncts to clausal complements (cf. Latin ad-gerund/gerundive), which are particularly frequent with verbs of purpose and command (Los 2005:chapter 3):

7a)   tiligen we us to  gescild-enne and us to gewarnig-enne
strive   we us TO shield-DAT   and  us TO guard-DAT
‘we should try to shield ourselves and guard ourselves…’ (HomS 44,158)

7b)   on hwilcum godum tihst    þu     us to gelyf-enne ?
in  which      gods     urgest thou us TO believe-DAT
‘which gods do you urge us to believe in?’ (AELS (George) 148)

Furthermore, both Latin/Romance and English prepositional infinitives are the results of morphophonological erosion in the nominal paradigm: the Germanic dative ending –enne following OE to is argued to be obsolete in OE (Los 2005:3-5), and the Romance infinitive, in contrast to the Latin gerund/gerundive, likewise does not inflect for morphological case. In both cases, the nominal properties of the clausal complement are practically eliminated, which severely weakens the agreement between the preposition and its nominal complement (Roberts and Roussou 2003:105) and leads to their reanalysis as non-finite clauses. Latin/Romance de-infinitives nonetheless represent a new pathway of syntactic change: in contrast to English to-infinitives and Latin/Romance ad-infinitives, Latin/Romance de does not express purpose but is semantically more general in expressing the content of propositions. This not only yields its wider distribution in Romance but also reveals two distinct types of non-finite complementisers, one more purpose-oriented (to/ad), the other more neutral (de). Since non-finite complementisers are traditionally held to be low in the cartography of C-elements (Rizzi 1997), it may be argued that there are two functional projections in the non-finite domain (M_realis/M_irrealis), which parallels the dual complementiser system in Romance finite complementation (Ledgeway 2012b). The Latin/Romance evidence, therefore, reveals a more sophisticated C-system, especially in the non-finite domain.


The use of the Latin prepositional gerund/gerundive represents a new topic in Latin/Romance historical syntax, one which opens up many new avenues into the formation of Romance non-finite complementation. Although prepositional infinitives, which are plentiful in Romance, are not attested in Latin, their historical structural equivalents, namely the Latin prepositional gerund/gerundive, are widely attested in examples where they are re-analysable as clausal complements. It is therefore possible to account for the pan-Romance distribution of prepositional infinitives by expanding our search and analysis to Latin prepositional gerunds/gerundives.


Mechanising historical phonology

by Patrick Sims-Williams (Aberystwyth University)

The neogrammarian approach to historical phonology involves propounding sound-change laws and explaining exceptions by means such as sub-laws, rearranging the relative chronology, and appeal to special factors such as analogy, borrowing, incomplete lexical diffusion, and sporadic phenomena like metathesis. Progress is mostly made manually, but in the second half of the 20th century some linguists looked forward to the ‘triumph of the electronic neogrammarian’. Although this hasn’t been realized yet, I’ll argue that there are opportunities to make important advances with comparatively little effort.

Syntactic microvariation in Romance – bridging synchrony and diachrony: the case of SI

by Sam Wolfe (University of Oxford)

Major syntactic differences between the medieval Romance languages and their modern counterparts have been noted for well over a century (Tobler 1875; Diez 1882; Thurneysen 1892; Meyer-Lübke 1889), with a body of more recent work highlighting important synchronic variation amongst the medieval languages (Vance, Donaldson & Steiner 2009; Wolfe 2015, forthcoming), and diachronic variation observable in texts from different stages of the medieval period (Ledgeway 2009; Labelle & Hirschbühler 2017; Galves forthcoming). In this talk, I focus on a particular aspect of the syntax of Medieval Romance: the grammar of the particle SI, which abounds across the early textual records, but eludes a satisfying analysis.

Based on a new hand-annotated corpus of seven Old French texts, I show that the numerous and frequently contradictory claims in the literature regarding SI (Marchello-Nizia 1985; Reenen & Schøsler 2000; Ledgeway 2008) can often be reconciled under an account where its formal characterisation, discourse-pragmatic value, and interaction with other areas of core clausal syntax vary markedly, both synchronically and diachronically, within the period conventionally referred to as ‘Old French’. Specifically, I sketch a grammaticalisation pathway where SI becomes progressively bleached through a process of upwards reanalysis (Roberts & Roussou 2002). This entails a change from SI (<SIC) as an adverbial encoding temporal succession, to a topic continuity marker (Fleischman 2000), and then to two distinct expletive stages, where SI acts as a last-resort mechanism to satisfy the Verb Second constraint. The core empirical observation is that there is large-scale variation between SI in 12th-century and 13th-century texts and, furthermore, small-scale variation in the syntax of SI across texts which are conventionally considered contemporaneous.

In the second part of the talk I bring in data from a range of Medieval Italo-Romance varieties, showing that SI in Sicilian, Florentine, Piedmontese and Venetian texts mirrors almost exactly the distribution of SI in 12th-century French, but does not show the distributional properties of the highly grammaticalised element found in 13th-century French.

The core intuition behind the analysis of Medieval Romance SI is that the element in question can occupy distinct positions within an articulated left periphery (on which see Rizzi 1997, Benincà & Poletto 2004 and Ledgeway 2010) during different stages of the grammaticalisation process. Furthermore, throughout its history, SI cannot be understood in isolation from ongoing changes in the Medieval Romance Verb Second property and its correlates (Wolfe 2016), but may also have a previously overlooked role in shaping a number of the morphosyntactic isoglosses observable within Romance-speaking Europe today. In particular, I suggest that differences in the syntax of Old French SI and its Old Italo-Romance counterparts may account for major contemporary Italo- vs. Gallo-Romance differences in the syntax of topicalisation, focus and the null subject property.

Overall, although SI may seem like a small and parochial area of Medieval Romance syntax, its synchronic and diachronic significance for an understanding of the evolution of Romance grammar should not be underestimated.


Fleischman, Suzanne. 2000. Methodologies and Ideologies in Historical Linguistics: On Working with Older Languages. In Susan C. Herring, Pieter Th. van Reenen & Lene Schøsler (eds.), Textual parameters in older languages. Amsterdam; Philadelphia, Pa.: John Benjamins. 33–58.

Galves, Charlotte. Forthcoming. Partial V2 in Classical Portuguese. In Theresa Biberauer, Sam Wolfe & Rebecca Woods (eds.), Rethinking Verb Second. (Rethinking Comparative Syntax). Oxford: Oxford University Press.

Labelle, Marie & Paul Hirschbühler. 2017. Leftward Stylistic Displacement in Medieval French. In Eric Mathieu & Robert Truswell (eds.), Micro-change and Macro-change in Diachronic Syntax. Oxford: Oxford University Press.

Ledgeway, Adam. 2008. Satisfying V2 in early Romance: Merge vs. Move. Journal of Linguistics 44(2).

Marchello-Nizia, Christiane. 1985. Dire le vrai: L’adverbe «si» en français médiéval: Essai de linguistique historique. (Publications Romanes et Françaises CLXVIII). Geneva: Droz.

Roberts, Ian & Anna Roussou. 2002. Syntactic change: A minimalist approach to grammaticalization. Cambridge: Cambridge University Press.

Vance, Barbara, Bryan Donaldson & B. Devan Steiner. 2009. V2 loss in Old French and Old Occitan: The role of fronted clauses. In Sonia Colina, Antxon Olarrea & Ana Maria Carvalho (eds.), Romance Linguistics 2009. Selected papers from the 39th Linguistic Symposium on Romance Languages (LSRL), Tucson, Arizona. (Current Issues in Linguistic Theory 315). Amsterdam: John Benjamins. 301–320.

Wolfe, Sam. Forthcoming. Redefining the V2 Typology: The View from Medieval Romance and Beyond. (Ed.) Christine M. Salvesen. Linguistic Variation (Special Issue: A Micro-Perspective on V2 in Germanic and Romance).

Wolfe, Sam. 2015. The Old Sardinian Condaghes. A Syntactic Study. Transactions of the Philological Society 113(2). 177–205.


What is language revitalization about? Some insights from Provence

by James Costa (Sorbonne Nouvelle / UMR LACITO (CNRS), Paris)

Should you find yourself in Provence this summer, you might wonder why some villages have bilingual signs at the entrance. Your surprise would be forgiven, since you are unlikely to have heard anything but French in most places, and likely a lot of English as you approach the Mediterranean. But if you listen more closely, observe more closely, you might come across a world that is fast vanishing, but that is still present. You might stumble upon a concert in a language that you cannot identify, or wonder why some street names don’t sound French. You might even hear people speak Occitan—for this is what it is, a language also known as Provençal, one which many locals will refer to as “Patois” (a derogatory term in France to refer to anything other than French traditionally spoken in the country).

Bilingual sign (French, Provençal)

This sort of experience might happen to you in Provence, but not only there. Across the European Union, several million people speak a language that is not the official language of the state they live in. Across Europe, there are language advocates who defend and promote the right to speak one’s language. This struggle for language rights also extends to Latin America, North America, Australia, and many other places. This, many scholars assert, is a consequence of globalization—a backlash against uniformity if you like. A way of being oneself, of finding meaning locally in a world that seems to be getting smaller. In my recent book, Revitalising Language in Provence: A Critical Approach, I argue otherwise. Those movements are not a reaction to globalization—they are, on the contrary, a way of taking part in this process, on the very terms defined by those who define what globalization is (and not on their own terms, as Leena Huss [2008, 133] asserts).

But let’s start from the beginning. This book focuses on Provence, home to what is perhaps the earliest language reclamation movement, or at least one of the earliest. Poets had already started writing texts in defense of Gascon, Provençal or Languedocien (all dialects of what most scholars of Romance linguistics view as Occitan) back in the 16th and 17th centuries. This was perhaps a consequence of an increasingly aggressive move to promote French in all administrative domains at the expense of Latin and Occitan, which had been in official use for centuries in what is now Southern France. But it was only after the Terror government of the French Revolution (after 1793) sought to eradicate the “patois” that a genuine interest in them was born in various parts of France, resulting in the south in a rediscovery of the poetry of the medieval Troubadours and in a scholarly interest in the history of Provence and Languedoc before their annexation to France. It was not, however, until the 1850s that an organized language-based movement was formed, under the aegis of poets such as Frédéric Mistral and Joseph Roumanille.

The Felibrige was the name they gave to their movement, a name whose origin remains mysterious. The Felibres sought to revive the Provençal or Occitan language (which was still almost universally spoken in all of Southern France) through poetry and literature. And indeed, Mistral published a series of long, epic poems that were hailed across Europe as monuments of literature. Mirèio is probably his best-known poem, a love story set in the Crau region of Provence and an allegory of the language revival movement. Mirèio was acclaimed in Paris as a chef d’œuvre, and was prefaced by Lamartine.

I recount parts of the history of the movement in the book but for this blog post, suffice it to say that while successful on a literary level, it never succeeded in political terms. Provençal was long banned in education, and despite a strong Occitan movement throughout the 20th century, the use of Provençal continued (and continues) to decline. But the story I tell in this book isn’t the story of the language movement. Instead, following a two-year ethnographic study in Provence, I ask why the movement was based on language at all, like so many others afterwards—but, crucially, none before, or at least none before the 1840s.