NFDI4DS | UHH-SEMS - Publication Details

Andrew Hardie

ORCID: 0000-0002-2952-2545

About

Contact & Profiles

A5065806480

Research Areas

Natural Language Processing Techniques
Lexicography and Language Studies
Translation Studies and Practices
Geographic Information Systems Studies
Language, Metaphor, and Cognition
Linguistic Variation and Morphology
Linguistics and Terminology Studies
Topic Modeling
Linguistics, Language Diversity, and Identity
Language, Linguistics, Cultural Analysis
Swearing, Euphemism, Multilingualism
Digital Humanities and Scholarship
Gender Studies in Language
Historical Linguistics and Language Studies
Syntax, Semantics, Linguistic Variation
Text Readability and Simplification
Discourse Analysis in Language Studies
Empathy and Medical Education
Second Language Acquisition and Learning
Historical and Linguistic Studies
Language, Discourse, Communication Strategies
Authorship Attribution and Profiling
Humor Studies and Applications
Social Media and Politics
Interpreting and Communication in Healthcare

Lancaster University
2014-2024

University of Birmingham
2018

Universities UK
2016

Curtin University
2016

The Open University
2015

Solomon R. Guggenheim Museum
2014

Manchester Metropolitan University
2014

Centre National de la Recherche Scientifique
2014

Institut d'Etudes Politiques de Paris
2014

Infection et inflammation
2014

CQPweb — combining power, flexibility and usability in a corpus analysis tool

OPENALEX - Publications

Andrew Hardie

CQPweb is a new web-based corpus analysis system, intended to address the conflicting requirements for usability and power in corpus analysis software. To do this, its user interface emulates the BNCweb system. Like BNCweb, CQPweb is built on two separate query technologies: the IMS Open Corpus Workbench and the MySQL relational database. CQPweb’s main innovative feature is its flexibility; its more generalised data model makes it compatible with any corpus. The analysis options available include: concordancing; collocations; distribution tables...

10.1075/ijcl.17.3.04har article EN International Journal of Corpus Linguistics 2012-12-31

The Spoken BNC2014

Robbie Love Claire Dembry Andrew Hardie Václav Březina Tony McEnery

This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with...

10.1075/ijcl.22.3.02lov article EN cc-by International Journal of Corpus Linguistics 2017-07-27

The online use of Violence and Journey metaphors by patients with cancer, as compared with health professionals: a mixed methods study

Elena Semino Zsófia Demjén Jane Demmen Veronika Koller Sheila Payne and 2 more

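To compare the frequencies with which patients with cancer and health professionals use Violence and Journey metaphors when writing online; and to investigate the use of these metaphors by patients with cancer, in view of critiques of war-related metaphors for cancer and the adoption of the notion of the 'cancer journey' in UK policy documents. Computer-assisted quantitative and qualitative study of two data sets totalling 753 302 words. A UK-based online forum for patients with cancer (500 134 words) and a UK-based website for health professionals (253 168 words). 56 patients with cancer writing online between 2007 and 2012; and 307 health professionals writing online between 2008 and 2013. Patients and health professionals use both Violence and Journey metaphors approximately 1.5 times per 1000 words...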

10.1136/bmjspcare-2014-000785 article EN cc-by BMJ Supportive & Palliative Care 2015-03-05

A glossary of corpus linguistics

Paul Baker Andrew Hardie Tony McEnery

This alphabetic guide provides definitions and discussion of key terms used in corpus linguistics. Corpus data is being used in a growing number of English and Linguistics departments which have no record of past research with corpus data. This is the first comprehensive glossary of the many specialist terms in corpus linguistics and will be useful for corpus linguists and non-corpus linguists alike. Clearly written, by a team of experienced academics in the field, the glossary provides full coverage of both traditional and contemporary terminology. Entries are focused around the following broad groupings: * Important...

10.5860/choice.44-3603 article EN Choice Reviews Online 2007-03-01

A computer-assisted study of the use of Violence metaphors for cancer and end of life by patients, family carers and health professionals

Jane Demmen Elena Semino Zsófia Demjén Veronika Koller Andrew Hardie and 2 more

This study combines quantitative semi-automated corpus methods with manual qualitative analysis to investigate the use of Violence metaphors for cancer and end of life in a 1,500,000-word corpus of data from three stakeholder groups in healthcare: patients, family carers and healthcare professionals. Violence metaphors in general, and especially military metaphors, are conventionally used to talk about illness, particularly cancer. However, they have also been criticized for their potentially negative implications. The innovative methodology...

10.1075/ijcl.20.2.03dem article EN International Journal of Corpus Linguistics 2015-08-17

Visual GISting: bringing together corpus linguistics and Geographical Information Systems

Ian Gregory Andrew Hardie

Corpus linguistics and Geographical Information Systems (GIS) are approaches exploiting computer-based methodologies in the study of, respectively, language usage, and spatial patterns in geographical databases. We present an approach that uses corpus methods to bridge the gap between the textual content of a corpus (and, thus, the typically textual concerns of many branches of the humanities) and the geo-referenced database at the heart of a GIS. Using part-of-speech tagging to extract instances of proper nouns from a corpus, and a gazetteer to limit these to those...

10.1093/llc/fqr022 article EN Literary and Linguistic Computing 2011-05-20

Modest XML for Corpora: Not a standard, but a suggestion

Andrew Hardie

This paper argues for, and presents, a modest approach to XML encoding for use by the majority of contemporary linguists who need to engage in corpus construction. While extensive standards for corpus encoding exist - most notably, the Text Encoding Initiative’s Guidelines and the Corpus Encoding Standard based on them - these are rather heavyweight approaches, implicitly intended for major corpus-building projects, which are different from the increasingly common efforts in corpus construction undertaken by individual researchers in support of their...

10.2478/icame-2014-0004 article EN ICAME journal 2014-04-04

Corpus Linguistics and South Asian Languages: Corpus Creation and Tool Development

Paul Baker Andrew Hardie Tony McEnery Richard Zhonghua Xiao Kalina Bontcheva and 8 more

This paper describes the work carried out on the EMILLE Project (Enabling Minority Language Engineering), which was undertaken by the Universities of Lancaster and Sheffield. The primary resource developed by the project is the EMILLE Corpus, which consists of a series of monolingual corpora for fourteen South Asian languages, totalling more than 96 million words, and a parallel corpus of English and five of these languages. The EMILLE Corpus also includes an annotated component, namely, part-of-speech tagged Urdu data, together with twenty written Hindi...

10.1093/llc/19.4.509 article EN Literary and Linguistic Computing 2004-11-01

Functional variation in the Spoken BNC2014 and the potential for register analysis

Robbie Love Václav Březina Tony McEnery Abi Hawtin Andrew Hardie and 1 more

This article focuses on how register considerations informed and guided the design of the spoken component of the British National Corpus 2014 (Spoken BNC2014). It discusses why the compilers of the corpus sought to gather recordings from just one broad spoken register – ‘informal conversation’ – and how this and other design decisions afforded contributors to the corpus much freedom with regards to the selection of situational contexts for the recordings. This resulted in a high level of diversity in the corpus for situational parameters such as recording location and activity type, each of which was captured...

10.1075/rs.18013.lov article EN cc-by Register Studies 2019-09-25

The interpretation of topic models for scholarly analysis: An evaluation and critique of current practice

Mathew Gillings Andrew Hardie

Topic modelling is a method of statistical data mining of a corpus of documents, popular in the digital humanities and, increasingly, in the social sciences. A critical methodological issue is how ‘topics’ (groups of co-selected word types) can be interpreted in analytically meaningful terms. In the current literature, this is typically done by ‘eyeballing’; that is, cursory and largely unsystematic examination of the ‘top’ words in each algorithmically identified word group. We critically evaluate this approach in a dual analysis,...

10.1093/llc/fqac075 article EN Digital Scholarship in the Humanities 2022-12-22

Automatically Analyzing Large Texts in a GIS Environment: The Registrar General's Reports and Cholera in the 19th Century

Patricia Murrieta‐Flores Alistair Baron Ian Gregory Andrew Hardie Paul Rayson

The aim of this article is to present new research showcasing how Geographic Information Systems in combination with Natural Language Processing and Corpus Linguistics methods can offer innovative venues to analyze large textual collections in the Humanities, particularly in historical research. Using as examples parts of the collection of the Registrar General's Reports that contain more than 200,000 pages of descriptions, census data and vital statistics for the UK, we introduce newly developed automated tools...

10.1111/tgis.12106 article EN Transactions in GIS 2014-11-14

Empowerment and disempowerment in the Glencairn Uprising

Sheryl Prentice Andrew Hardie

The Glencairn Uprising (1653–1654) was a military rebellion by Scottish Highlanders under the leadership of William, Earl of Glencairn, against the English government of Oliver Cromwell. This paper investigates the presentation of actors and groups on both sides of the Uprising, but most especially Glencairn himself, in the contemporary London press. The theoretical framework of the analysis is Critical Discourse Analysis (modelled on the approach of van Dijk 1991); however, a corpus-based methodology, and partially-quantitative analysis, are employed. The documents...

10.1075/jhp.10.1.03pre article EN Journal of Historical Pragmatics 2009-02-02
