The moment of truth: Testing the Matrix Language Frame model in English–Vietnamese bilingual speech

by Li Nguyen (University of Cambridge)

Over the last few decades, interest in code-switching has burgeoned within bilingualism research. Although definitions of the phenomenon vary, it is generally agreed in the literature that code-switching broadly refers to bilinguals’ ability to alternate effortlessly between two languages in their daily speech (Bullock & Toribio 2008:1). This ability underlies language mixing, which, as researchers have come to realise, is far from random and is instead governed by specific structural constraints (Poplack 1980; Bullock & Toribio 2009). The nature of such constraints has inspired the search for a ‘universal pattern’, prompting investigations of a number of language pairs, such as English–Spanish (Poplack 1980; Travis & Torres Cacoullos 2013; Aaron 2015), English–Welsh (Stammers & Deuchar 2012), Ukrainian–English (Budzhak-Jones & Poplack 1997), Igbo–English (Eze 1997), and Acadian French–English (Turpin 1998).

One of the most influential theoretical accounts in the code-switching literature is Myers-Scotton’s (2002) Matrix Language Frame model (MLF), which assumes an asymmetrical relationship between the two languages in bilingual discourse. According to the MLF, ‘speakers and hearers generally agree on which language the mixed sentence is “coming from”’ (Joshi 1985:190–191), and it is this language that constitutes the ‘matrix language’ (ML) of the conversation. In a code-switched clause, the MLF predicts that the ML (i) supplies closed-class system morphemes such as finite verbs or function words, and (ii) determines word order. Although the need for, and the practicality of, identifying an ML in some language pairs are debatable (Sankoff & Poplack 1981; Clyne 1987), the asymmetrical relationship between the two languages involved is borne out in many existing datasets. The asymmetry tends to be most obvious in pairs that are structurally different, and the existing evidence largely involves an Indo-European language paired with an Asian or African language (see Chan 2009:184 for an exhaustive list). The question is then: does the MLF actually generate accurate predictions in spontaneous speech?

In this project, I test the applicability of the MLF to English–Vietnamese code-switching data. This pair provides an interesting testing ground, since the two languages share a similar surface word order (SVO) despite other typological differences. In other words, at the clausal level, word order cannot be used to determine the matrix language. The focus of the study thus lies on the so-called ‘conflict sites’, points at which the word order of the participating languages differs. These conflicts involve the order of head and modifier within NPs and possessive phrases. Specifically, modifiers and possessors precede head nouns in English, but follow head nouns in Vietnamese. When bilingual speakers are presented with such a conflict, the MLF predicts that the matrix language (i.e. the language of the finite verbs or function words) should determine the word order. Furthermore, as an isolating language, Vietnamese has virtually no overt morphology. This adds an extra layer of complexity to determining the matrix language at the clausal level, which is traditionally assigned on the basis of the language of the finite verb, and so provides a further test of the MLF’s predictions when these two languages come into contact.
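The word-order prediction at conflict sites can be written down as a small check. The following is only a minimal sketch under stated assumptions: the labels "en"/"vi", the order tags, and the names NounPhrase, predicted_order and conforms_to_mlf are hypothetical illustrations, not the coding scheme used in the study, and the matrix language is assumed to have already been identified for each clause.

```python
from dataclasses import dataclass

# Hypothetical order tags for a conflict site (NP or possessive phrase).
MODIFIER_FIRST = "modifier-head"   # English pattern: modifier/possessor precedes the head noun
HEAD_FIRST = "head-modifier"       # Vietnamese pattern: modifier/possessor follows the head noun

# Word order each language would impose if it were the matrix language.
ORDER_BY_MATRIX_LANGUAGE = {"en": MODIFIER_FIRST, "vi": HEAD_FIRST}


@dataclass
class NounPhrase:
    """One coded conflict site (illustrative structure, not the study's)."""
    matrix_language: str  # "en" or "vi", assumed to be already determined
    observed_order: str   # MODIFIER_FIRST or HEAD_FIRST


def predicted_order(matrix_language: str) -> str:
    """Return the order the MLF predicts for a given matrix language."""
    if matrix_language not in ORDER_BY_MATRIX_LANGUAGE:
        raise ValueError(f"unknown matrix language: {matrix_language!r}")
    return ORDER_BY_MATRIX_LANGUAGE[matrix_language]


def conforms_to_mlf(np: NounPhrase) -> bool:
    """True if the observed order matches the matrix-language order."""
    return np.observed_order == predicted_order(np.matrix_language)


if __name__ == "__main__":
    # Made-up tokens for illustration only; these are not corpus data.
    tokens = [
        NounPhrase("vi", HEAD_FIRST),
        NounPhrase("en", MODIFIER_FIRST),
        NounPhrase("vi", MODIFIER_FIRST),
    ]
    for token in tokens:
        print(token, conforms_to_mlf(token))
```

A token counts in favour of the MLF only when the observed order matches the order of the matrix language; the third made-up token above would be a counterexample.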

Thanks to fieldwork funding support from the Philological Society, I was able to carry out my fieldwork in Canberra, Australia, where I had existing connections with the Vietnamese bilingual community. Data collection took place between June and September 2017. My principle in building the corpus was drawn from Labov’s emphasis on the vernacular, where ‘minimum attention is paid to speech’ (Labov 1984:29). This approach was chosen because the vernacular reflects the most natural, systematic form of the language acquired by the speaker ‘before any subsequent efforts at (hyper-) correction or style shifting are made’ (Poplack 1993:252). Recruited speakers were thus free to choose their own interlocutors, in an environment in which they felt most comfortable. They were asked to self-record a conversation of at least 30 minutes on their personal mobile phone. After the recording was returned, speakers were asked to fill in a questionnaire designed to gather information on extra-linguistic variables. The questionnaire consisted of 18 questions and was available in both English and Vietnamese.

The data collection process was successfully completed, resulting in a corpus of 10 hours of spontaneous speech. Results from this research should offer concrete, empirical evidence for or against the applicability of the MLF in language contact situations in which the participating languages are typologically disparate. If the MLF proves inapplicable, it is hoped that the patterns found will form the foundation of a new theoretical framework accounting for the data in question. Methodologically, the study demonstrates a systematic approach to determining the ML, especially in problematic situations where the overarching word order of the participating languages converges and one of the languages lacks overt morphology. When made publicly available, the data will also constitute the first digitised English–Vietnamese bilingual corpus, providing a valuable resource for future research on this language pair in particular, and for bilingualism research as a whole.


Aaron, J. E. (2015). Lone English-origin nouns in Spanish: The precedence of community norms. International Journal of Bilingualism 19(4), 429–480.

Budzhak-Jones, S. & Poplack, S. (1997). Two generations, two strategies: the fate of bare English-origin nouns in Ukrainian. Journal of Sociolinguistics 1(2), 225–258.

Bullock, B. & Toribio, J. (2008). The Cambridge Handbook of Linguistic Code-switching. Cambridge: Cambridge University Press.

Chan, B. (2009). Code-switching between typologically distinct languages. In B. Bullock & A. Toribio (eds.), The Cambridge Handbook of Linguistic Code-switching. Cambridge: Cambridge University Press, 182–198.

Clyne, M. (1987). Constraints on code-switching: How universal are they? Linguistics 25, 739–76.

Eze, E. (1997). Aspects of language contact: A variationist perspective on codeswitching and borrowing in Igbo-English bilingual discourse. PhD dissertation. Ottawa: University of Ottawa.

Joshi, K. (1985). Processing of sentences with intrasentential code switching. In D. R. Dowty, L. Karttunen & A. Zwicky (eds.), Natural language parsing. Cambridge: Cambridge University Press, 190–205.

Labov, W. (1984). Field methods of the project on linguistic change and variation. In J. Baugh & J. Sherzer (eds.), Language in use: Readings in sociolinguistics. Englewood Cliffs, NJ: Prentice Hall, 28–53.

Myers-Scotton, C. (2002). Contact Linguistics: Bilingual Encounters and Grammatical Outcomes. Oxford: Oxford University Press.

Poplack, S. (1980). Sometimes I’ll start a sentence in Spanish y termino en español: Toward a typology of codeswitching. Linguistics 18(7–8), 581–618.

Poplack, S. (1993). Variation theory and language contact. In D. Preston (ed.), American dialect research: An anthology celebrating the 100th anniversary of the American Dialect Society. Amsterdam: Benjamins, 251–268.

Sankoff, D. & Poplack, S. (1981). A formal grammar for code-switching. Papers in Linguistics 14(1), 3–46.

Stammers, J. & Deuchar, M. (2012). Testing the nonce borrowing hypothesis: Counter-evidence from English-origin verbs in Welsh. Bilingualism: Language and Cognition 15(3), 630–664.

Travis, C. & Torres Cacoullos, R. (2013). Making voices count: Corpus compilation in bilingual communities. Australian Journal of Linguistics 33(2), 170–194.

Turpin, D. (1998). ‘Le français, c’est le last frontier’: The status of English-origin nouns in Acadian French. International Journal of Bilingualism 2(2), 221–233.