Recurrent word combinations in academic writing by native and non-native speakers of English: A lexical bundles approach

Published on Apr 1, 2012 in English for Specific Purposes 1.70
DOI: 10.1016/j.esp.2011.08.004
Annelie Ädel
Estimated H-index: 10
(Stockholm University),
Britt Erman
Estimated H-index: 4
(Stockholm University)
In order for discourse to be considered idiomatic, it needs to exhibit features like fluency and pragmatically appropriate language use. Advances in corpus linguistics make it possible to examine idiomaticity from the perspective of recurrent word combinations. One approach to capture such word combinations is by the automatic retrieval of lexical bundles. We investigated the use of English-language lexical bundles in advanced learner writing by L1 speakers of Swedish and in comparable native-speaker writing, all produced by undergraduate university students in the discipline of linguistics. The material was culled from a new corpus of university student writing, the Stockholm University Student English Corpus (SUSEC), amounting to over one million words. The investigation involved a quantitative analysis of the use of four-word lexical bundles and a qualitative analysis of the functions they serve. The results show that the native speakers have a larger number of types of lexical bundles, which are also more varied, such as unattended ‘this’ bundles, existential ‘there’ bundles, and hedging bundles. Other lexical bundles which were found to be more common and more varied in the native-speaker data involved negations. The findings are shown to be largely similar to those of the phraseological research tradition in SLA.
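The automatic retrieval of four-word lexical bundles mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple regex tokenizer and a sliding four-word window; the function name `four_word_bundles` and the `min_count` cut-off are hypothetical, and the abstract does not state which tokenization or frequency threshold the authors used. The sketch also ignores sentence and text boundaries, which a real study would likely respect.

```python
import re
from collections import Counter


def four_word_bundles(text, n=4, min_count=2):
    """Count recurrent n-word sequences (a rough stand-in for lexical bundles).

    Assumptions (not taken from the paper): lowercase regex tokenization,
    no sentence-boundary handling, and an arbitrary min_count cut-off.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n tokens across the text and count each sequence.
    counts = Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    # Keep only sequences that recur at least min_count times.
    return {gram: c for gram, c in counts.items() if c >= min_count}


sample = "on the other hand the results show that on the other hand there is"
bundles = four_word_bundles(sample)
print(bundles)
```

In this toy sample only one four-word sequence recurs, so only that sequence survives the cut-off.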
  • References (38)
Published on Jun 6, 2014
Andrew Pawley
Estimated H-index: 20
Frances Hodgetts Syder
Estimated H-index: 3
Published on Sep 1, 2010 in Applied Linguistics 3.04
Rita Simpson-Vlach
Estimated H-index: 4
(UM: University of Michigan),
Nick C. Ellis
Estimated H-index: 51
(UM: University of Michigan)
This research creates an empirically derived, pedagogically useful list of formulaic sequences for academic speech and writing, comparable with the Academic Word List (Coxhead 2000), called the Academic Formulas List (AFL). The AFL includes formulaic sequences identified as (i) frequent recurrent patterns in corpora of written and spoken language, which (ii) occur significantly more often in academic than in non-academic discourse, and (iii) inhabit a wide range of academic genres. It separately...
Published on Jun 1, 2010 in Language Learning & Technology 2.57
Yu-Hua Chen
Estimated H-index: 3
(Lancaster University),
Paul Baker
Estimated H-index: 24
(Lancaster University)
This paper adopts an automated frequency-driven approach to identify frequently-used word combinations (i.e., lexical bundles) in academic writing. Lexical bundles retrieved from one corpus of published academic texts and two corpora of student academic writing (one L1, the other L2), were investigated both quantitatively and qualitatively. Published academic writing was found to exhibit the widest range of lexical bundles whereas L2 student writin...
Published on Dec 17, 2009
Douglas Biber
Estimated H-index: 40
(NAU: Northern Arizona University)
Published on Dec 17, 2009
Bernd Heine
Estimated H-index: 25
Heiko Narrog
Estimated H-index: 11
Published on Jul 1, 2009 in English for Specific Purposes 1.70
Philip Durrant
Estimated H-index: 11
(University of Nottingham)
A number of researchers are currently attempting to create listings of important collocations for students of EAP. However, so far these attempts have (1) failed to include positionally-variable collocations, and (2) not taken sufficient account of variation across disciplines. The present paper describes the creation of one listing of positionally-variable academic collocations and evaluates the extent to which it is likely to be useful to students from across a wide range of disciplin...
Published on May 20, 2009
Britt Erman
Estimated H-index: 1
The formulaic language in focus in the present paper is collocations. The ‘intrinsic’ as opposed to ‘extrinsic’ features of collocations related to Frame Semantics and Lexical Functions are proposed to best reflect their unit-hood status. The paper primarily discusses the lexical status and identification of collocations from different theoretical frameworks, and also reports on a study examining the collocations in English essays written by native and non-native writers. The results show that t...
Philip Durrant
Estimated H-index: 11
Norbert Schmitt
Estimated H-index: 39
Usage-based models claim that first language learning is based on the frequency-based analysis of memorised phrases. It is not clear though, whether adult second language learning works in the same way. It has been claimed that non-native language lacks idiomatic formulas, suggesting that learners neglect phrases, focusing instead on orthographic words. While a number of studies challenge the claim that non-native language lacks formulaicity, these studies have two important shortcomings: they f...
Published on Jan 1, 2009
Ute Römer
Estimated H-index: 18
Ann Arbor
Estimated H-index: 1
Cited By (92)
Published on May 7, 2018 in Corpora
Duygu Candarli, Steven Jones
Estimated H-index: 32
Lexical bundles are pervasive in English academic writing; however, little scholarly attention has been paid to how quantitative and qualitative research paradigms influence the use of lexical bundles in research articles. In order to investigate this, we created two equal-size corpora of research articles in the discipline of education. Four-word lexical bundles were examined in terms of their structural characteristics and discourse functions in the quantitative and qualitative research articl...
Published on Jul 1, 2019 in Journal of English for Academic Purposes 1.73
Yu Kyoung Shin (Hallym University)
Formulaic language is widely used in academic prose and is known to be a useful measure of various aspects of language development. Prior studies have reported that L2 novice/student writers rely on formulaic language typical of conversation more than L1 academic writers do. However, these studies compared different types of academic writing, and thus do not clarify whether the L2 patterns they found are attributable to register or to characteristics of L2 writers, or both. The present ...
Published on Jun 1, 2019 in Language Learning 2.00
Alvin Cheng-Hsien Chen
Estimated H-index: 2
(NTNU: National Taiwan Normal University)
Published on Jun 1, 2019 in Journal of Second Language Writing 4.20
Sonca Vo (Iowa State University)
Second language writing research has often analyzed written discourse to provide evidence on learner language development; however, single word-based analyses have been found to be insufficient in capturing learner language development (Read & Nation, 2006). This study therefore utilized both single word-based and multi-word analyses. Specifically, it explored vocabulary distributions and lexical bundles to better understand the development of writing proficiency across three levels in ...
Published on May 1, 2019 in Journal of English for Academic Purposes 1.73
Xiaofei Lu
Estimated H-index: 10
(PSU: Pennsylvania State University),
Jinlei Deng (University of Languages and International Studies)
This study compared the use of lexical bundles in dissertation abstracts written in English by Chinese and L1 English doctoral students. Our data consisted of 13,596 and 4,755 abstracts of doctoral dissertations written by Chinese students at Tsinghua University (the Tsinghua Corpus) and L1 English students at the Massachusetts Institute of Technology (the MIT Corpus), respectively. With a combination of frequency and dispersion criteria, 165 and 134 four-word bundles were identified fr...
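A combination of frequency and dispersion criteria, as named in this abstract, might be sketched as follows. The truncated abstract does not give the actual cut-offs, so the parameters `min_per_million` and `min_texts`, the whitespace tokenizer, and the function name `select_bundles` are all assumptions made for illustration only.

```python
from collections import Counter


def select_bundles(texts, n=4, min_per_million=20.0, min_texts=2):
    """Keep n-word sequences that are both frequent and spread across texts.

    Hypothetical thresholds: the source abstract does not state the values
    used. Tokenization is plain whitespace splitting for brevity.
    """
    freq = Counter()    # total occurrences of each sequence
    spread = Counter()  # number of distinct texts containing each sequence
    total_tokens = 0
    for text in texts:
        tokens = text.lower().split()
        total_tokens += len(tokens)
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(grams)
        spread.update(set(grams))  # count each text at most once
    return sorted(
        gram
        for gram, count in freq.items()
        # Frequency criterion: normalized rate per million words.
        if count * 1_000_000 / total_tokens >= min_per_million
        # Dispersion criterion: sequence occurs in enough different texts.
        and spread[gram] >= min_texts
    )


toy_texts = [
    "the results of this study show",
    "the results of this paper",
    "in the results of this",
]
selected = select_bundles(toy_texts)
print(selected)
```

The dispersion check is what keeps a sequence repeated many times by a single writer from being counted as a bundle.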
Published on Apr 1, 2019 in RELC Journal
Liang Li
Estimated H-index: 1
(University of Waikato),
Margaret Franken
Estimated H-index: 8
(University of Waikato),
Shaoqun Wu
Estimated H-index: 5
(University of Waikato)
Lexical bundles, recurrent multiword combinations in a register, are extremely common and important discourse building blocks in academic writing. An increasing number of studies have investigated lexical bundles in academic writing in recent years, but few studies have explored L2 learners’ interpretations of their own bundle production, particularly sentence initial bundle production. Investigating the sources that have appeared to influence learners’ choices and knowledge of bundles is import...
Published on Apr 1, 2019 in English for Specific Purposes 1.70
Ying Wang
Estimated H-index: 2
(Stockholm University)
Text-oriented formulaic expressions (e.g., in addition to, on the other hand, nevertheless) are important in crafting reader-friendly prose and particularly frequent in academic discourse. By accommodating both multi-word expressions (MWEs) and single-word expressions (SWEs), the present study is among the first attempts to explore how the two types of expressions relate and contribute to the level of formulaicity in academic discourse. Through a manual examination of formulaic express...
Published on Feb 5, 2019 in Journal of Quantitative Linguistics 0.82
Yves Bestgen
Estimated H-index: 16
(UCL: Université catholique de Louvain)
Formulaic sequences in language use are often studied by means of the automatic identification of frequently recurring series of words, often referred to as ‘lexical bundles’, in corpora that contrast different registers, academic disciplines, etc. As corpora often differ in size, a critically important assumption in this field states that the use of a normalized frequency threshold, such as 20 occurrences per million words, allows for an accurate comparison of corpora of different sizes...
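The arithmetic behind a normalized frequency threshold such as 20 occurrences per million words can be written out in a short sketch. The helper names `per_million` and `raw_threshold` are hypothetical, and the corpus sizes are invented for illustration; the sketch only shows the conversion itself and makes no claim about whether the comparison it enables is accurate.

```python
import math


def per_million(raw_count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return raw_count * 1_000_000 / corpus_size


def raw_threshold(threshold_per_million, corpus_size):
    """Smallest raw count that meets a per-million threshold in a corpus."""
    return math.ceil(threshold_per_million * corpus_size / 1_000_000)


# Invented corpus sizes, for illustration only.
for size in (1_000_000, 250_000, 40_000):
    print(size, raw_threshold(20, size))
```

In a 40,000-word corpus, for instance, a single occurrence already corresponds to 25 per million, so the same normalized threshold translates into very different raw counts as corpus size changes.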