A spoken corpus of Cameroon Pidgin English: Compilation, applications and next steps

by Melanie Green (Sussex) & Gabriel Ozón (Sheffield)

Cameroon Pidgin English (CPE) is an expanded pidgin/creole spoken in some form by an estimated 50% of Cameroon’s population of 22,000,000 (Simons & Fennig 2017). CPE is spoken primarily in the Anglophone western regions, but also in urban centres throughout Cameroon. As a predominantly spoken language, CPE has no standardised orthography but enjoys a vigorous oral tradition, not least through its presence in the broadcast media. The language is stigmatised relative to French and English, the prestige languages of Cameroon, where it also co-exists with an estimated 280 indigenous languages (Simons & Fennig 2017).

We describe the spoken corpus of CPE, a British Academy/Leverhulme-funded pilot study (Green et al. 2016, Ozón et al. 2017). The corpus consists of 30 hours of recordings made in five locations, resulting in a total of 240,000 words (80 texts of 15 minutes/3,000 words). Proportions of text types are guided by the International Corpus of English project (Nelson 1996), and the texts contain mark-up and part-of-speech tagging. The corpus files, which are freely available from the Oxford Text Archive, include sound files (*.mp3 and *.wav), raw and annotated text files, participant metadata, a field manual, a tagging manual and a spelling list.

We then briefly describe some case studies of linguistic phenomena that the pilot corpus allows us to investigate, focusing on grammatical and lexical phenomena as well as codeswitching. These studies demonstrate that while a small corpus provides a robust test-bed for the investigation of grammatical phenomena, a larger dataset is required for the full investigation of lexical and sociolinguistic phenomena. Finally, we outline our plans for a 1-million-word corpus, a project for which a funding application is in preparation.


This paper was read at the Philological Society meeting at SOAS, University of London, on Friday, 18 January 2019, 4.15pm.


References
Green, Melanie, Miriam Ayafor and Gabriel Ozón. 2016. A spoken corpus of Cameroon Pidgin English: pilot study. British Academy/Leverhulme-funded digital database (ref. SG140663).

Nelson, Gerald. 1996. The design of the corpus. In Sidney Greenbaum (ed.), Comparing English worldwide: The International Corpus of English. Oxford: Clarendon Press, 27–35.

Ozón, Gabriel, Miriam Ayafor, Melanie Green and Sarah Fitzgerald. 2017. A spoken corpus of Cameroon Pidgin English. World Englishes 36: 427–447.

Simons, Gary F. and Charles D. Fennig (eds.). 2018. Ethnologue: Languages of the World, Twenty-first edition. Dallas, Texas: SIL International.

The Faces of PhilSoc: Melanie Green


Name: Melanie Green

Position: Reader in Linguistics and English Language

Institution: University of Sussex

Role in PhilSoc: Council Member

 


About You

How did you become a linguist – was there a decisive event, or was it a gradual development?

Somewhere between doing my A-levels (in English, French and Latin) and applying for university, when I found the SOAS prospectus in the school cupboard. At that point I realised that studying language didn’t have to mean studying literature, and I applied to study Hausa at SOAS. In my final year, I took a course that focused on the linguistic description of Hausa (taught by Professor Philip Jaggar), and it was this course that led me upstairs to the Linguistics Department, where I then took my MA and PhD.

What was the topic of your doctoral thesis? Do you still believe in your conclusions?

My doctoral thesis was on focus and copular constructions in Hausa, and offered a minimalist analysis. I still believe in the descriptive conclusions, which relate to the grammaticalisation of the non-verbal copula into a focus marker, but I’m less convinced these days by formal theory. I still enjoy teaching it though, because I think it makes students think carefully (and critically) about formal similarities and differences between languages.

On what project / topic are you currently working?

Together with Gabriel Ozón at Sheffield and Miriam Ayafor at Yaoundé I, I’ve just completed a British Academy/Leverhulme-funded project to build a pilot spoken corpus of Cameroon Pidgin English. Based on this corpus, Miriam and I have co-authored a descriptive grammar of the variety, which is in press.

What directions in the future do you see your research taking?

In my dreams, typologically-framed language documentation. In reality, probably more corpus linguistics, since this seems to be what attracts funding at the moment.

How did you get involved with the Philological Society?

The PhilSoc published my first book, Focus in Hausa.


‘Personal’ Questions

Do you have a favourite language – and if so, why?

No.

Minimalism or LFG?

Minimalism.

Teaching or Research?

Both.

Do you have a linguistic pet peeve?

No.

 


Looking to the Future

Is there something that you would like to change in academia / HE?

I would like there to be more funding for language documentation. Languages are dying faster than we can describe them.

(How) Do you manage to have a reasonable work-life balance?

I do, but that only became possible in mid-career. I achieve it with careful planning, so when I’m off work, I’m really off work.

What is your prime tip for younger colleagues?

Start publishing as early as possible. 

TPS 114(3) – Abstract 2

Trade Pidgins in China: Historical and Grammatical Relationships

by Michelle Li

Sino-Western contacts began in the 16th century, when Europeans started open trade with China. Two trade pidgins, Macau Pidgin Portuguese (MPP) and Chinese Pidgin English (CPE), arose during the Canton trade period. This paper examines the historical and grammatical relationships between these two pidgins, drawing on data from 19th-century phrasebooks. It argues for a close connection between MPP and CPE with reference to three grammatical features which go beyond shared vocabulary: locative copulas, the form of personal pronouns, and prepositional complementisers. While these grammatical properties find little resemblance in the recognised source languages for CPE, parallel uses are attested in MPP, which therefore appears to provide the model for these properties in CPE.