Leuven English Old to New (LEON): Some ideas on a new corpus for longitudinal diachronic studies.

Petré, Peter

Unpublished conference/Abstract (Scientific congresses and symposiums)

2009 • Middle and Modern English Corpus Linguistics (MMECL)

Permalink
https://hdl.handle.net/2268/80137

Details

Keywords :

Corpus compilation; Old English; Middle English; Diachrony

Abstract :

Despite the explosion of diachronic corpora of English in the last few decades, not a single corpus yet covers the entire documented history of English. Although the compilation of such a corpus is generally perceived as highly attractive (Rissanen 2000: 13), corpus compilers do not seem to believe it will be created in the near future. This is regrettable, as many linguists dealing with longitudinal developments such as grammaticalization need to cover very long time spans and are forced to combine several, not necessarily compatible, corpora (e.g. Hilpert 2008, van Linden 2009). Their results are clearly less reliable than they might be if a single corpus existed. For example, Gries and Hilpert’s (2008) data show a major shift in the collocational profile of shall around 1710; however, this is precisely where one corpus they use ends and a second, rather different one begins.

So I tentatively started compiling a corpus myself, provisionally called LEON (Leuven English Old to New). The basic architecture of LEON comprises a 400,000-word corpus for each period of the Helsinki Corpus (HC) and, after 1710, for the periods 1710-1780, 1780-1850, 1850-1920, 1920-1990 and post-1990. Data available from 1250-1350, a less well represented period, serve as a template on which the other subperiods are to be based, to achieve the best possible comparability of genre and region. To make up for the lack of some genres (letters, diaries) and of social stratification, an additional, self-sufficient 600,000-word corpus is envisaged for each period after 1350. While LEON is primarily conceived as a ‘meta-corpus’, mining existing corpora, some additions are envisaged too (e.g. the unedited Statutes Rwl. B.520, dated a1325). LEON does not aim at full comparability (which would be presumptuous), but wants to optimize the usefulness of concepts like ‘equal size of subperiods’ or ‘diachronic text prototype’ (HC). Compared to the present ‘big evil’, LEON might be a ‘lesser evil’.

References

Gries, Stefan Th. & Martin Hilpert. 2008. The identification of stages in diachronic data: Variability-based neighbour clustering. Corpora 3(1): 59–81.

Hilpert, Martin. 2008. Germanic future constructions: A usage-based approach to language change. Amsterdam & Philadelphia: John Benjamins.

Los, Bettelou. 2005. The rise of the to-infinitive. Oxford: Oxford University Press.

Rissanen, Matti & Merja Kytö. 1993. General introduction. In Matti Rissanen, Merja Kytö & Minna Palander-Collin (eds.), Early English in the computer age: Explorations through the Helsinki Corpus, 1–17. Berlin: Mouton de Gruyter.

Rissanen, Matti. 2000. The world of English historical corpora: From Cædmon to computer age. Journal of English Linguistics 28: 7–20.

van Linden, An. 2009. Dynamic, deontic and evaluative adjectives and their clausal complement patterns: A synchronic-diachronic account. PhD dissertation, University of Leuven.

Research center :

Functional Linguistics Leuven (FLL)

Disciplines :

Languages & linguistics

Author, co-author :

Petré, Peter ; Université de Liège - ULiège > Département des langues et littératures modernes

Language :

English

Publication date :

09 July 2009

Event name :

Middle and Modern English Corpus Linguistics (MMECL)

Event organizer :

University of Innsbruck

Event place :

Innsbruck, Austria

Event date :

From 06 July 2009 to 09 July 2009

Audience :

International

Funders :

FWO - Fonds Wetenschappelijk Onderzoek Vlaanderen [BE]

Available on ORBi :

since 23 December 2010
