mainly the Google Plus Corpus Linguistics Archive

puzzling language teaching

How can I use corpora to improve my academic writing in English?

A quick, straightforward intro to using corpora to help with academic writing. For a more comprehensive take, see member Monika Sobejko’s post Teaching writing with the aid of COCA.

Fiction genre and learner collocational knowledge

According to Durrant (2014), the Fiction genre in both COCA and the BNC best predicts (i.e. has the highest correlation with) learner collocation knowledge. This could be useful for selecting collocations to use with lower proficiency students. For higher proficiency students, filter the previous list by mutual information (MI) score in COCA, as learners are known to be weak with the items made up of low-frequency words that MI scores pick out. Alternatively, use the KWIC function of COCA to see any idiomatic uses that higher proficiency learners may want to learn.

For example, searching for collocations of all uses of get in the Fiction section of COCA shows that highly frequent collocations include get out (of) + location and get up. These may be appropriate for low proficiency learners.

Filtering by MI score, we find gotta, the contraction of (have) got to, and get rid (of). Looking more closely with a KWIC search in COCA, we see uses of the idiom get out of hand. These may be appropriate for higher proficiency students to learn.
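To make the idea behind the MI filter concrete, here is a minimal sketch of the textbook mutual information calculation. It assumes the plain formula (observed pair frequency divided by the frequency expected by chance, on a log2 scale); corpus interfaces may adjust this, for instance for the search span, so scores from COCA need not match. The function name and all counts are hypothetical and only for illustration.

```python
import math


def mutual_information(pair_freq, node_freq, collocate_freq, corpus_size):
    """Textbook MI: log2 of observed pair frequency over chance expectation."""
    # Frequency we would expect if the two words co-occurred only by chance.
    expected = node_freq * collocate_freq / corpus_size
    return math.log2(pair_freq / expected)


# Hypothetical counts, NOT taken from COCA: a very frequent collocate
# versus a much rarer one, both paired with the same node word.
frequent = mutual_information(pair_freq=5000, node_freq=200000,
                              collocate_freq=500000, corpus_size=100000000)
rare = mutual_information(pair_freq=300, node_freq=200000,
                          collocate_freq=1000, corpus_size=100000000)
print(round(frequent, 2), round(rare, 2))
```

With these made-up counts the rarer collocate gets the higher score even though the pair occurs far less often, which is why sorting by MI tends to surface items built from low-frequency words.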

Durrant, P. (2014). Corpus frequency and second language learners’ knowledge of collocations: A meta-analysis. International Journal of Corpus Linguistics, 19, 443–477.

BYU-COCA Corpus Query – on the one hand

Here’s another challenge for you. What happens if you look for the term on the one hand by comparing the Spoken section in BYU-COCA with each of the other 4 sections (Fiction, Magazine, Newspaper, Academic)?

1. What do you predict given you know that on the one hand is often associated with explanations?

Now do the same with on the other hand.

The above query came about because I noticed that in speaking, people tended to use both halves of the pair, on the one hand and on the other hand, whereas in written online texts I noticed only the second half, on the other hand, being used.
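Because the sections differ in size, raw hit counts are hard to compare directly, so it helps to normalise them per million words. The sketch below shows one way to do that and to express one section's rate relative to another. The function names are my own, and the counts and section sizes are placeholders, not real COCA figures; substitute the numbers the interface reports.

```python
def per_million(hits, section_words):
    """Normalised frequency: hits per million words in a section."""
    return hits * 1_000_000 / section_words


def rate_ratio(hits_a, words_a, hits_b, words_b):
    """How many times more frequent the phrase is in section A than in B."""
    return per_million(hits_a, words_a) / per_million(hits_b, words_b)


# Placeholder numbers only, NOT real corpus counts.
spoken_hits, spoken_words = 400, 100_000_000
academic_hits, academic_words = 200, 100_000_000

print(per_million(spoken_hits, spoken_words))
print(rate_ratio(spoken_hits, spoken_words, academic_hits, academic_words))
```

A ratio above 1 means the phrase is relatively more frequent in the first section, and a ratio below 1 means it is relatively more frequent in the second.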

See comments below.


BYU-COCA Corpus Query – Prepositions of place

BYU Corpora Digs 1 – Rock up

Is this the longest term in English language teaching?

Check out this regex (regular expression) for searching any CLAWS7-tagged corpus for the present perfect: “It accounts for both contracted and full forms and also allows a number of intervening words”.
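The regex referred to above is not reproduced here. As a rough illustration of the same idea, the sketch below is my own simplified pattern, not the original one. It assumes the corpus is plain text in word_TAG format with one underscore per token, and it uses the CLAWS7 tags VH0 and VHZ (have, has, 've, 's as forms of have) followed by a past participle tag (VVN, VBN, VDN or VHN), allowing up to three intervening non-verb tokens. The function name and the cap of three are arbitrary choices.

```python
import re

# have/has (full or contracted), up to three non-verb tokens, past participle.
PRESENT_PERFECT = re.compile(
    r"\S+_VH[0Z]\s+"             # have_VH0, 've_VH0, has_VHZ, 's_VHZ
    r"(?:\S+_(?!V)\S+\s+){0,3}"  # intervening tokens whose tag is not a verb tag
    r"\S+_V[VBDH]N(?!\S)"        # seen_VVN, been_VBN, done_VDN, had_VHN
)


def find_present_perfect(tagged_text):
    """Return every present perfect match found in word_TAG text."""
    return PRESENT_PERFECT.findall(tagged_text)


print(find_present_perfect("I_PPIS1 have_VH0 never_RR seen_VVN it_PPH1 ._."))
print(find_present_perfect("She_PPHS1 's_VHZ been_VBN here_RL ._."))
```

Excluding verb tags from the intervening tokens keeps the pattern from jumping across another verb, at the cost of missing some longer constructions.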

See also:

CL2017 Keynote Speakers

BYU-Corpora Digs 1 – Rock up

The corpora one can access through the BYU interface range from US Soap Operas to British Parliament Speeches, from historical English in the US in the 1800s right up to yesterday’s news on the web in 20 countries round the world.

This gives interested parties a number of ways to look at language in use.

A recent story about UK politics reports on a politician using the following language:

No, I just rocked up and then waved at the CCTV.


This use of rock up seems worthy of a little attention. So my question is: what information can we get about the use of rock up using the corpora at BYU?

I’ll post my thoughts in a few weeks. Thanks for any consideration : )

BootCat Top Tip

If you use BootCat here is a command to help you separate the collected corpus into individual files using the CURRENT URL line as a separator in a regex:

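awk '/CURRENT URL/{g++} { print $0 > (g ".txt") }' corpus.txt

When copy-pasting this command into your command line, make sure the apostrophes and double quotes are straight and not curly. The parentheses around the output file name keep the command portable across awk versions.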
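If awk is not available, the same split can be sketched in Python. This is only an assumed equivalent of the command above, with function names of my own choosing. It starts a new numbered chunk at every line containing CURRENT URL. One difference is that any lines before the first marker go to chunk 0 (0.txt) rather than to a file named just .txt.

```python
def split_on_marker(lines, marker="CURRENT URL"):
    """Group lines into numbered chunks, starting a new chunk at each marker line."""
    chunks = {}
    counter = 0  # stays 0 for any lines that come before the first marker
    for line in lines:
        if marker in line:
            counter += 1
        chunks.setdefault(counter, []).append(line)
    return chunks


def write_chunks(chunks, suffix=".txt"):
    """Write each chunk to its own file: 1.txt, 2.txt, and so on."""
    for number, chunk_lines in chunks.items():
        with open(f"{number}{suffix}", "w", encoding="utf-8") as handle:
            handle.writelines(chunk_lines)


# Example use, assuming the collected corpus is saved as corpus.txt:
# with open("corpus.txt", encoding="utf-8") as source:
#     write_chunks(split_on_marker(source))
```

Keeping the grouping separate from the file writing makes it easy to check the chunks before anything is written to disk.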


BootCat custom URL: []

BootCat seeding: []