Language, learning and usage-based theory: tackling nominal and verbal morphology in Slavic

by Dagmar Divjak (University of Sheffield)

Usage-based theories of language are built on the assumption that our ability to extract and entrench the distributional patterns available in the input enables learners to build a grammar from the ground up. This circumvents the need for an innate universal grammar. But it does not tell us which patterns are relevant. And it remains customary for linguists to approach the data using linguistic categories, such as Case or Tense, Aspect and Mood, that were never intended to reflect the workings of the mind. In this talk, I will argue that it might be better to take the input as the starting point and derive categories that resemble those native speakers might derive. Models from Learning Theory can help with this. I will present two case studies that capitalize on a merger of cognitive linguistics and cognitive psychology, and aim to infuse Usage-Based linguistics with insights from Learning Theory … with a little help from computational engineering.

The first case study uses insights from Learning Theory to challenge the idea that theoretical linguistic constructs such as tense, aspect and mood (TAM) predict best how native speakers of Russian read sentences containing verbs meaning ‘to try’ in real time. Discrimination learning, as implemented in the NDL algorithm, proposes simple 3-letter usage-patterns and predicts the time it takes subjects to read and integrate these verbs into a sentence significantly better than all TAM markers combined.


Contrary to what mainstream (psycho)linguistic models assume, speakers do not (and do not need to) analyse verb forms in terms of abstract linguistic concepts such as tense, aspect and mood when they process language. Instead, they can rely on simple letter sequences that are linked directly to an experience and embed crucial information about that experience (i.e., is it over, ongoing, or coming up; was it something that they completed, or simply did for a while; was it an order). This demonstrates that honouring parsimony (naivety and simplicity) in the structures that are hypothesized to exist, and in the way in which behaviour is explained, is a powerful research stance, in particular for designing cognitively realistic accounts of language knowledge and representation.

The second case study demonstrates how biologically inspired machine learning techniques can pinpoint the essence of native speaker intuitions. Polish boasts fascinating examples of seemingly unmotivated allomorphy, and the genitive singular of masculine inanimate nouns (which can be -a or -u) is its prime example. Criteria for choice have been proposed that are semantic, morphological or phonological in nature, but most of these are unreliable, yielding conflicting predictions (Dąbrowska 2005). Furthermore, although -u occurs with at least twice as many nouns, -a is the default ending for new words entering the language. The NDL algorithm, which implements discrimination learning, predicts the choice between -a and -u better using simple sequences of 3 letters (letter triplets or trigraphs) than models running on richly annotated corpus data. In addition, it explains the unexpected preference for -a as genitive ending for new words in terms of the learnability of words taking the -a ending, their phonological predictability and their contextual (semantic) typicality.
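To make the idea of learning from letter triplets concrete, here is a minimal sketch of discrimination learning with Rescorla-Wagner updates, the learning rule on which NDL is generally based. It is an illustration only and not the model used in the case study: the function names, the `#` word-boundary marker, the learning rate, the number of passes and the six-noun toy set are all assumptions made for this example.

```python
from collections import defaultdict


def trigraphs(word):
    """Return the set of letter triplets in a word ('#' marks the word edges)."""
    padded = f"#{word}#"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def train(events, outcomes, rate=0.01, lam=1.0, epochs=50):
    """Apply Rescorla-Wagner updates over (word, outcome) learning events."""
    weights = defaultdict(float)  # (cue, outcome) -> association strength
    for _ in range(epochs):
        for word, observed in events:
            cues = trigraphs(word)
            for outcome in outcomes:
                # Summed support that the cues present currently give this outcome.
                activation = sum(weights[(cue, outcome)] for cue in cues)
                target = lam if outcome == observed else 0.0
                delta = rate * (target - activation)
                # Every cue present in the event shares the same error-driven update.
                for cue in cues:
                    weights[(cue, outcome)] += delta
    return weights


def predict(weights, word, outcomes):
    """Pick the outcome that receives the highest summed activation."""
    cues = trigraphs(word)
    return max(outcomes, key=lambda o: sum(weights.get((c, o), 0.0) for c in cues))


# Toy learning events (assumption: a hand-picked illustrative set, not corpus data).
OUTCOMES = ("-a", "-u")
TOY_EVENTS = [
    ("dom", "-u"), ("las", "-u"), ("most", "-u"),
    ("chleb", "-a"), ("nos", "-a"), ("nóż", "-a"),
]

weights = train(TOY_EVENTS, OUTCOMES)
print(predict(weights, "most", OUTCOMES))
```

The weights are adjusted only in proportion to the prediction error, so cues that reliably discriminate between the two endings gain strength while uninformative cues do not. A realistic model would be trained on corpus-derived learning events rather than a handful of nouns.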

On their own, linguists and psychologists would have approached these questions rather differently and, from within their disciplinary cages, would have arrived at answers that would necessarily have remained partial. Integrative interdisciplinarity, on the other hand, relies on a simultaneous, interspersed methodological endeavour to arrive at more encompassing answers that combine depth of analysis with breadth of explanation. It presupposes mutually complementary theories, shared testable hypotheses as well as compatibility of research methodologies. But what wins the game is a good dose of willingness to question your customary ways of doing things.

Syntactic microvariation in Romance – bridging synchrony and diachrony: the case of SI

by Sam Wolfe (University of Oxford)

Major syntactic differences between the medieval Romance languages and their modern counterparts have been noted for well over a century (Tobler 1875; Diez 1882; Thurneysen 1892; Meyer-Lübke 1889), with a body of more recent work highlighting important synchronic variation amongst the medieval languages (Vance, Donaldson & Steiner 2009; Wolfe 2015, forthcoming), and diachronic variation observable in texts from different stages of the medieval period (Ledgeway 2009; Labelle & Hirschbühler 2017; Galves forthcoming). In this talk, I focus on a particular aspect of the syntax of Medieval Romance: the grammar of the particle SI, which abounds across the early textual records, but eludes a satisfying analysis.

Based on a new hand-annotated corpus of seven Old French texts, I show that the numerous and frequently contradictory claims in the literature regarding SI (Marchello-Nizia 1985; Reenen & Schøsler 2000; Ledgeway 2008) can often be reconciled under an account where its formal characterisation, discourse-pragmatic value, and interaction with other areas of core clausal syntax vary markedly, both synchronically and diachronically, within the period conventionally referred to as ‘Old French’. Specifically, I sketch a grammaticalisation pathway where SI becomes progressively bleached through a process of upwards reanalysis (Roberts & Roussou 2002). This entails a change from SI (<SIC) as an adverbial encoding temporal succession, to topic continuity marker (Fleischman 2000), then two distinct expletive stages, where SI acts as a last-resort mechanism to satisfy the Verb Second constraint. The core empirical observation is that there is large-scale variation between SI in 12th-century and 13th-century texts and, furthermore, small-scale variation in the syntax of SI across texts which are conventionally considered contemporaneous.

In the second part of the talk I bring in data from a range of Medieval Italo-Romance varieties, showing that SI in Sicilian, Florentine, Piedmontese and Venetian texts mirrors almost exactly the distribution of SI in 12th-century French, but does not show the distributional properties of the highly grammaticalised element found in 13th-century French.

The core intuition behind the analysis of Medieval Romance SI is that the element in question can occupy distinct positions within an articulated left periphery (on which see Rizzi 1997, Benincà & Poletto 2004 and Ledgeway 2010) during different stages of the grammaticalisation process. Furthermore, throughout its history, SI cannot be understood in isolation from ongoing changes in the Medieval Romance Verb Second property and its correlates (Wolfe 2016), but may also have a previously overlooked role in shaping a number of the morphosyntactic isoglosses observable within Romance-speaking Europe today. In particular, I suggest that differences in the syntax of Old French SI and its Old Italo-Romance counterparts may account for major contemporary Italo- vs. Gallo-Romance differences in the syntax of topicalisation, focus and the null subject property.

Overall, although SI may seem like a small and parochial area of Medieval Romance syntax, its synchronic and diachronic significance for an understanding of the evolution of Romance grammar should not be underestimated.


A flexible approach to focus and the syntax-prosody interface

by Kriszta Szendröi (University College London)

This paper addresses ‘a central question for […] any theory of the syntactic prosodic constituency relation’ (Selkirk, 2011, 17): how to best characterize the notion of ‘clause’ in ALIGN/MATCH constraints related to the syntax-prosody mapping of the intonational phrase. It will be proposed that the notion of ‘clause’ should be determined in each construction by making reference to the overt position of the finite verb (or auxiliary). We show how this theory of the syntax-prosody mapping determines the typology of prosodically-driven word order variations associated with focus and topic. We will discuss data from the Bantu language, Bàsàá, and the Finno-Ugric language, Hungarian, as well as English and Italian.

AGM & The President’s Lecture: Standards, norms and prescriptivism

The Annual General Meeting of the Philological Society was held on 17 June at Selwyn College, Cambridge.

Having completed a four-year term of office, Prof. Wendy Ayres-Bennett stood down as President of the Society; she is succeeded by Prof. Aditi Lahiri FBA.

The following Members of Council have served their term on Council or wished to retire early, and did not stand for re-election: Prof. Ruth Kempson FBA (KCL); Prof. Aditi Lahiri FBA (Oxford); Dr John Penney (Oxford); Dr George Walkden (Manchester).

In their place, the following new Ordinary Members of Council have been elected: Prof. Eleanor Dickey (Reading); Dr Mary MacRobert (Oxford); Prof. Maj-Britt Mosegaard-Hansen (Manchester); Dr David Willis (Cambridge).

The 9th RH Robins Prize was awarded to Jade Jørgen Sandstedt (Edinburgh) for a paper entitled ‘Transparency and blocking in Old Norwegian height harmony’, which will be published in TPS.

The outgoing President delivered her President’s Lecture on ‘Standards, norms and prescriptivism’, an audio recording and screencast of which can be found on the Society’s YouTube channel.

Language through Deaf eyes

by Bencie Woll (University College London)

Sign languages are universally found wherever Deaf communities exist, and are the world’s only truly ‘young’ languages, unrelated to the spoken languages which surround them. This presentation will review the history of British Sign Language – the language of the British Deaf community – and recent linguistic, psycholinguistic, sociolinguistic and neurolinguistic research on the language. Research on signed language creates a new perspective on our understanding of human language generally, helping us consider the origins of human language, how language is processed in the brain, what the universal properties of human language are, and the relationship of how language is produced and perceived to the structure of language itself.

‘Counting’: quality and quantity in literary language and tools for investigating it

by Jonathan Hope (Strathclyde University, Glasgow)

The transcription of a substantial proportion of Early Modern English books by the Text Creation Partnership has placed more than 60,000 digital texts in the hands of literary and linguistic researchers. Linguists are in many cases used to dealing with large electronic corpora, but for literary scholars this is a new experience. Used to arguing from the quality, rather than the quantity, of evidence, literary scholars have a new set of norms and procedures to learn, and are faced with the exciting, or perhaps depressing, prospect that their object of study has changed.

In this talk I’ll look at some specific case studies that illustrate the potential, and the problems, of quantity-based studies – and will highlight key areas where literary scholars need to reassess their expectations of ‘evidence’, and the texts we use. A possible alternative title might be ‘Learning to live with error: gappy texts and crappy metadata’.

Ethnopoetic philology and the power of narrative for endangered voices

by Alexander King (Franklin & Marshall College)

Ethnopoetics lies at the juncture of linguistics, comparative literature, anthropology, and activist politics. It is more of an approach than a discipline, and was inspired by the realization that indigenous oral literature, or ‘orature’, was of equal literary merit to that of the ancient and modern literary languages. Ethnopoetic analysis requires close attention to the form and performance of oral narratives, looking for patterning in phonology, morphology and syntax, as well as repetitions in word use and larger units. I will present some examples of the power of stories in the lives of Koryaks, who are indigenous to Kamchatka, Russia. The material comes from a documentation project by Valentina R. Dedyk and me (funded by a grant from the Endangered Languages Documentation Project).

