Empowering your writing with corpus tools

With the pressure to publish internationally as real as ever, many researchers find that their experience of learning English is taking on new meaning. As they settle into their role as writers for a global audience, the urge to dig out that old English grammar textbook returns with new vitality. So does the interest in learning the kind of discipline-specific, relevant, and natural vocabulary that would invite rather than discourage editors’ attention to their writing.

Using corpus tools in research writing

While getting published in English may seem like a dream come true, it requires the ability to navigate the demands and discourse expectations of the international academic community. For the novice academic writer, developing this ability means learning not only what fellow researchers around the globe are writing, but also how they are writing it in terms of language. Computer-based corpora (searchable collections of texts) have become an indispensable learning aid that does just that: they provide deliberate exposure to authentic language use in target texts.

  • relevance to a specific genre of research writing (a research article, a PhD dissertation, etc.)
  • relevance to a specific discipline(s) (computer sciences, philosophy, economics, etc.)
  • currency of texts (how recently the texts have been published)
  • choice of specific research topics
  • publication venue (e.g., the kind of journal where an article was published).
  1. They reflect the current expectations of the target audience (journal editors, academic supervisors, and other types of readers).
  2. They can help address specific language queries, such as “How frequent, if at all, is the use of ‘researches’ in international academic discourse in my field?”, “Is it OK to say ‘the research object/subject’?”, or “Does phrase X make sense?”
  • work out rules regarding the meaning and use of language items based on your observations
  • check the frequency of words and phrases used in certain discourses
  • pick up commonly used phrases that are relevant to your needs
  • compile your own wordlists (of collocations, prepositional phrases, verb patterns, etc.).
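
To make the idea of checking frequencies concrete, here is a minimal Python sketch of a whole-word frequency count. It is only an illustration under stated assumptions: the function name `phrase_frequency` and the sample text are invented for this example, and a dedicated corpus tool does far more than this.

```python
import re

def phrase_frequency(text: str, phrase: str) -> int:
    """Count case-insensitive, whole-word occurrences of a word or phrase."""
    words = phrase.lower().split()
    # \b keeps "research" from matching inside "researches" or "researchers".
    pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
    return len(re.findall(pattern, text.lower()))

# Invented sample text standing in for a real corpus.
sample = (
    "Previous research on this topic is limited. Many researchers cite two "
    "researches from the 1990s, but recent research is scarce."
)

for item in ("research", "researches"):
    print(item, phrase_frequency(sample, item))
```

On a real corpus, a large gap between the two counts would suggest which form your target community actually uses.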

Using open-access corpora

To become more familiar with corpora of academic texts, a good place to start is with readily available ones. The following corpora have been extensively used and cited in the academic literature:

  1. What verbs is “this hypothesis” typically used with?
  2. Which phrase tends to occur more frequently: “results show” or “results reveal”?
  3. What tenses is “extensively studied” used with?
  4. What verbs/verb phrases commonly follow “these findings”? Write down at least 10 different verbs/verb phrases.
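
Questions like these come down to looking at what surrounds a search phrase. As a rough sketch of the last one, the Python snippet below tallies the word that immediately follows a phrase. The function name `words_after` and the sample sentences are made up for illustration; this is not how any particular corpus interface works internally.

```python
import re
from collections import Counter

def words_after(text: str, phrase: str) -> Counter:
    """Tally the word that directly follows each occurrence of a phrase."""
    words = phrase.lower().split()
    # Capture the first run of letters after the phrase.
    pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\s+([a-z]+)"
    return Counter(re.findall(pattern, text.lower()))

# Invented sample sentences standing in for corpus search results.
sample = (
    "These findings suggest a link. These findings indicate a trend. "
    "However, these findings suggest caution."
)

followers = words_after(sample, "these findings")
for word, count in followers.most_common():
    print(word, count)
```

The list still needs reading by eye: the word after the phrase is not always a verb, so treat the tally as a starting point for your own notes.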

Creating your own corpora

Instead of using readily available corpora, many may find it more useful to compile their own corpora for personal needs (genre of research text, disciplinary focus, etc.). This can be done with the help of corpus management tools, such as AntConc (freely available), WordSmith Tools (paid), and MonoConc Pro (paid). These computer applications allow users to upload their own selections of texts, combine these into corpora, and run various kinds of analysis on texts within the self-compiled corpora. Being freely available and easy to use, AntConc (developed by Professor Laurence Anthony) seems the most popular choice.

  1. Convert all files into .txt format (the application only uses this type of file).
  2. Save the files you want to appear in the same corpus in one folder (directory) on your computer. These files can also be organized into subfolders (for example, selections of texts by discipline) to generate sub-corpora.
  3. Download the application from here and install it on your computer.
  4. Run the application and select “Open File(s)” from the “File” menu to load the files.
  5. Select the files from your folder/subfolder(s) and click “Open”.
  6. Start your searches in your own corpus! Use the “Search Term” field to enter keywords for corpus search.
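
For readers curious about what such a search does, the steps above can be approximated in a few lines of Python: read every .txt file in a folder and its subfolders, then print each hit with some context on either side (a keyword-in-context, or concordance, view). This is a simplified sketch, not AntConc's own code; the function names `load_corpus` and `kwic` and the sample sentence are assumptions made for this example.

```python
import re
from pathlib import Path

def load_corpus(folder: str) -> dict[str, str]:
    """Read every .txt file under a folder (subfolders included) into memory."""
    return {
        str(path): path.read_text(encoding="utf-8", errors="replace")
        for path in sorted(Path(folder).rglob("*.txt"))
    }

def kwic(text: str, term: str, width: int = 30) -> list[str]:
    """Return one line per hit, with `width` characters of context on each side."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", flags=re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines

# Invented sample text; with real files you would call load_corpus(<your folder>)
# and run kwic() on each text it returns.
sample = "This topic has been extensively studied.\nIt was extensively studied in the 1990s."
for line in kwic(sample, "extensively studied"):
    print(line)
```

Lining the hits up this way makes it easier to spot patterns by eye, such as which tenses tend to precede the search term.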

Further literature

If you are interested in learning more about the practical applications of corpora for research writing purposes, here are some recommended readings.

The Academic Writing Center at the Higher School of Economics, Moscow, provides writing support to everyone involved in research.
