Semantically driven grammaticalisation: the systematic pathways of Estonian polar question particles

by Mari Aigro (University of Tartu)

Treating grammaticalisation as analogically driven shifts the explanatory power that is frequently assigned to syntactic position onto the semantic analogy between the source and the target. This case study focuses on the semantic cohesion patterns in the pathways of contemporary as well as historical Estonian polar question particles (PQPs). It will show not only that the semantic component of function words is much more relevant to grammaticalisation than is commonly thought, but also that the grammaticalisation network surrounding a functional category can be so semantically uniform that one can devise a model based on a semantic map and assign it a certain degree of explanatory power regarding why certain markers become PQPs while others are much less likely to do so.

While the most frequently mentioned PQP sources are negation and disjunction markers (Heine & Kuteva 2002), a comprehensive literature review reveals six source categories altogether. In addition to disjunction and negation markers, the list includes clause conjunction markers, embedded PQPs, conditional markers and pronominal interrogatives (König & Siemund 2007, Nordström 2010, Metslang et al. 2017). These sources appear to form a systematic set: all of them could be classified as markers of polarity or truth values (see Payne 1985 for coordinators, Nordström 2010 for conditionals). To investigate whether this principle holds for additional data and for newly discovered source categories, an in-depth corpus study was carried out on Estonian, a language especially rich in both neutral and biased PQPs.

Nearly 2400 polar questions using the particle strategy (inversion and zero-marking strategies are used alongside) were manually encoded in the Corpus of Old Written Estonian (17th–19th century) and the Corpus of Standard Estonian (20th century). I found six different PQPs—four biased and two neutral—used between the 17th and 21st centuries. Three of them—kas, või, ega—are still in use in Standard Modern Estonian. The source of kas is either a clause conjunction (“also”) or an embedded PQP; või most likely originates from a disjunction (“or”); and ega from a clause rejection marker (“nor”). The three historical polar question markers are eks, eps and jo/ju; while the first two originate from negation, the source of the latter is an affirmative focus marker. Only three have given rise to new functional structures: eks became an affirmative polar tag question marker; kas gave rise to the disjunction marker “either”; and jo/ju, after its brief time as a PQP, became a marker of evidentiality when occurring sentence-initially (retaining the older focus reading in other positions).

Hence, the new source categories introduced by the corpus study were polarity-sensitive focus markers (for jo/ju) and rejection markers (for ega). Both confirm the hypothesis that polar question particles originate from non-interrogative markers that already involve the semantic component of negation, affirmation or neutral (open) polarity. Table 1 depicts the pathways of Estonian PQPs on a semantic map, which links the two dimensions of polarity: interrogation and bias.

Table 1: Semantic map of Estonian PQPs

Markers in the neutral category are especially relevant. They leave the truth value unknown, assigning open polarity even without interrogation, and therefore share a close link with PQPs. PQPs are more frequently homophonous with disjunction markers than other particles and both of the non-biased Estonian PQPs, kas and või, originate from the neutral category. Additionally, all functional markers originating from PQPs belong in this category. However, although the map accommodates all known sources of PQPs, which suggests a causal link, it can only constitute a probabilistic rather than a deterministic model.


Aigro, M. 2017. A Diachronic Study of Polar Question Particles and Their Sources. MPhil thesis, University of Cambridge.

Heine, B. & Kuteva, T. 2002. World Lexicon of Grammaticalization. Cambridge: Cambridge University Press.

König, E. & Siemund, P. 2007. Speech Act Distinctions in Grammar. In T. Shopen, ed. Language Typology and Syntactic Description, Vol 1: Clause Structure. Cambridge: Cambridge University Press.

Metslang, H., Habicht, K. & Pajusalu, K. 2017. Where Do Polar Question Markers Come From? STUF – Language Typology and Universals 70(3).

Nordström, J. 2010. Modality and Subordinators. Amsterdam: John Benjamins Publishing Company.

Payne, J.R. 1985. Complex Phrases and Complex Sentences. In T. Shopen, ed. Language Typology and Syntactic Description. Cambridge: Cambridge University Press.

In Memoriam Matti Rissanen

by Sylvia Adamson (University of Sheffield)

It is with great sadness that the Society has received news of the death of Matti Rissanen, Professor Emeritus of English Philology at the University of Helsinki, at the age of 80 on 24 January 2018.


A long-time member and supporter of the Philological Society, Matti Rissanen was a pioneer in English historical corpus linguistics, and the director of the project that produced the Helsinki Corpus of English Texts, which covers a thousand years of the history of English and has been used widely since its publication in 1991.

Matti Rissanen was one of the rare scholars to command the history of the English language from its early stages to the present, beginning with his PhD thesis (1967) on the Old English numeral ONE. His wide range of publications includes a number of original articles and several co-edited volumes of corpus-based research, such as Early English in the Computer Age (1993), English in Transition and Grammaticalization at Work (1997), as well as the much-cited chapter on ‘Early Modern English syntax’ in The Cambridge History of the English Language (vol. 3, 1999). He also took an active interest in early American English and was a member of the international team that re-edited the Records of the Salem Witch-Hunt (2009).

His retirement in 2001 did not mark an end to his research activities. His philological expertise made an important contribution to the publication project that resulted in a new Finnish translation of all Shakespeare’s works. One of his long-lasting research interests was the history of English connectives, on which he was working to the very last days of his life.

Active in numerous professional organizations, Matti Rissanen served as president of the Societas Linguistica Europaea and chaired the Board of the International Computer Archive of Modern and Medieval English (ICAME). He was the founder and first director of the Research Unit for Variation, Contacts and Change in English (VARIENG), an Academy of Finland Centre of Excellence from 2000 to 2011. He was also a driving force in the foundation of the Finnish Institute in London and the Language Centre of the University of Helsinki. In recognition of his achievements Matti Rissanen received many honours, including an honorary doctorate from the University of Uppsala, Sweden, and election to the Finnish Academy of Science and Letters. He was an honorary member of the Modern Language Society, the International Society of Anglo-Saxonists, and the Japan Association for English Corpus Studies.

On the personal level, Matti was supervisor to several generations of undergraduate and doctoral students in Helsinki, while providing unfailing encouragement and support to many more students and colleagues both in Finland and abroad. He will be greatly missed by his wide circle of friends.

Anyone who would like to share their memories and recollections of him is invited to do so by adding them as comments (in English or Finnish) to this VARIENG blog post.

This notice has been adapted, with permission, from the notice posted by Matti’s colleagues in Helsinki.

Language, learning and usage-based theory: tackling nominal and verbal morphology in Slavic

by Dagmar Divjak (University of Sheffield)

Usage-based theories of language are built on the assumption that the ability to extract and entrench the distributional patterns available in the input enables learners to build a grammar from the ground up. This circumvents the need for an innate universal grammar, but it does not tell us which patterns are relevant. It also remains customary for linguists to approach the data using linguistic categories, such as Case or Tense, Aspect and Mood, that were never intended to reflect the workings of the mind. In this talk, I will argue that it might be better to take the input as the starting point and derive categories that resemble those native speakers might derive. Models from Learning Theory can help with this. I will present two case studies that capitalize on a merger of cognitive linguistics and cognitive psychology, and aim to infuse Usage-Based linguistics with insights from Learning Theory … with a little help from computational engineering.

The first case study uses insights from Learning Theory to challenge the idea that theoretical linguistic constructs such as tense, aspect and mood (TAM) best predict how native speakers of Russian read sentences containing verbs meaning 'to try' in real time. Discrimination learning, as implemented in the Naive Discriminative Learning (NDL) algorithm, works with simple 3-letter usage patterns and predicts the time it takes subjects to read these verbs and integrate them into a sentence significantly better than all TAM markers combined.


Contrary to what mainstream (psycho)linguistic models assume, speakers do not (and do not need to) analyse verb forms in terms of abstract linguistic concepts such as tense, aspect and mood when they process language. Instead, they can rely on simple letter sequences that are linked directly to an experience and embed crucial information about that experience (i.e., is it over, ongoing, or coming up; was it something that they completed, or simply did for a while; was it an order). This demonstrates that honouring parsimony (naivety and simplicity) in the structures that are hypothesized to exist, and in the way in which behaviour is explained, is a powerful research stance, in particular for designing cognitively realistic accounts of language knowledge and representation.

The second case study demonstrates how biologically inspired machine learning techniques can pinpoint the essence of native speaker intuitions. Polish boasts fascinating examples of seemingly unmotivated allomorphy, and the genitive singular of masculine inanimate nouns (which can be -a or -u) is the prime example. Semantic, morphological and phonological criteria have been proposed for the choice, but most of these are unreliable and yield conflicting predictions (Dąbrowska 2005). Furthermore, although -u occurs with at least twice as many nouns, -a is the default ending for new words entering the language. The NDL algorithm, which implements discrimination learning, predicts the choice between -a and -u better using simple sequences of 3 letters (letter triplets or trigraphs) than models running on richly annotated corpus data. In addition, it explains the unexpected preference for -a as the genitive ending for new words in terms of the learnability of words taking the -a ending, their phonological predictability and their contextual (semantic) typicality.
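To make the idea of learning from letter triplets more concrete, here is a minimal sketch of error-driven (Rescorla-Wagner style) learning, the kind of rule that discrimination learning is usually described as building on. The abstract gives no implementation details, so everything in the sketch is an assumption made for illustration: the function names (trigraphs, train, predict), the learning rate, the use of nominative forms as input, and the small hand-picked noun list, which is not the study's data and is far too small to reproduce its results.

```python
from collections import defaultdict

# Hand-picked toy list (an assumption for illustration, not the study's data):
# masculine inanimate nouns paired with their genitive singular ending.
TOY_NOUNS = [
    ("dom", "u"), ("stół", "u"), ("las", "u"), ("bank", "u"), ("telefon", "u"),
    ("nos", "a"), ("chleb", "a"), ("nóż", "a"), ("komputer", "a"),
    ("młotek", "a"), ("worek", "a"),
]
OUTCOMES = ("a", "u")


def trigraphs(word):
    """Return the set of letter triplets in a word, with '#' marking word edges."""
    padded = f"#{word}#"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def train(events, outcomes, rate=0.1, epochs=50):
    """Learn cue-to-outcome weights with an error-driven update.

    For every word, each outcome's activation is the summed weight of the
    word's trigraphs; the weights of those trigraphs are then nudged by
    rate * (target - activation), where target is 1 for the observed
    ending and 0 for the other one.
    """
    weights = defaultdict(float)  # (trigraph, outcome) -> association strength
    for _ in range(epochs):
        for word, observed in events:
            cues = trigraphs(word)
            for outcome in outcomes:
                activation = sum(weights[(cue, outcome)] for cue in cues)
                target = 1.0 if outcome == observed else 0.0
                delta = rate * (target - activation)
                for cue in cues:
                    weights[(cue, outcome)] += delta
    return weights


def predict(weights, word, outcomes):
    """Pick the outcome with the highest summed trigraph activation."""
    cues = trigraphs(word)
    return max(outcomes, key=lambda o: sum(weights.get((c, o), 0.0) for c in cues))


weights = train(TOY_NOUNS, OUTCOMES)
# Unseen nouns are classified only through trigraphs shared with seen nouns.
print(predict(weights, "garnek", OUTCOMES))  # a  (shares "ek#" with młotek, worek)
print(predict(weights, "beton", OUTCOMES))   # u  (shares "on#" with telefon)
```

In this toy setting an unseen noun is classified purely through the trigraphs it shares with nouns the learner has already encountered. No grammatical annotation is involved, which is the contrast with richly annotated models that the case study draws.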

On their own, linguists and psychologists would have approached these questions rather differently and, from within their disciplinary cages, would have arrived at answers that would necessarily have remained partial. Integrative interdisciplinarity, on the other hand, relies on a simultaneous, interspersed methodological endeavour to arrive at more encompassing answers that combine depth of analysis with breadth of explanation. It presupposes mutually complementary theories, shared testable hypotheses as well as compatibility of research methodologies. But what wins the game is a good dose of willingness to question your customary ways of doing things.

A video of the talk can be found below.

This paper was read at the Philological Society meeting in London, SOAS Main Building, Room 116, on Friday, 9 February, 4.15pm.