NFDI4DS | UHH-SEMS - Publication Details

Analysis of named entity recognition and linking for tweets

OPENALEX - Publications

Leon Derczynski Diana Maynard Giuseppe Rizzo Marieke van Erp Genevieve Gorrell and 3 more

10.1016/j.ipm.2014.10.006 article EN Information Processing & Management 2014-11-19

SemEval-2019 Task 7: RumourEval, Determining Rumour Veracity and Support for Rumours

Genevieve Gorrell Elena Kochkina Maria Liakata Ahmet Aker Arkaitz Zubiaga and 2 more

Since the first RumourEval shared task in 2017, interest in automated claim validation has greatly increased, as the danger of “fake news” has become a mainstream concern. However, support for rumour verification remains in its infancy. It is therefore important that a shared task in this area continues to provide a focus for effort, which is likely to increase. Rumour verification is characterised by the need to consider evolving conversations and news updates to reach a verdict on a rumour’s veracity. As in 2017 we provided a dataset of dubious posts and ensuing conversations in social...

10.18653/v1/s19-2147 article EN 2019-01-01

Natural language processing to extract symptoms of severe mental illness from clinical text: the Clinical Record Interactive Search Comprehensive Data Extraction (CRIS-CODE) project

Richard Jackson Rashmi Patel Nishamali Jayatilleke Anna Kolliakou Michael Ball and 4 more

Objectives: We sought to use natural language processing to develop a suite of models to capture key symptoms of severe mental illness (SMI) from clinical text, to facilitate the secondary use of healthcare data in research. Design: Development and validation of information extraction applications for ascertaining symptoms of SMI in routine health records using the Clinical Record Interactive Search (CRIS) resource; description of their distribution in a corpus of discharge summaries. Setting: Electronic health records from a large provider serving a geographic...

10.1136/bmjopen-2016-012012 article EN cc-by BMJ Open 2017-01-01

SemEHR: A general-purpose semantic search system to surface semantic data from clinical notes for tailored care, trial recruitment, and clinical research*

Honghan Wu Giulia Toti Katherine I. Morley Zina Ibrahim Amos Folarin and 10 more

Objective: Unlocking the data contained within both structured and unstructured components of electronic health records (EHRs) has the potential to provide a step change in the data available for secondary research use, generation of actionable medical insights, hospital management, and trial recruitment. To achieve this, we implemented SemEHR, an open source semantic search and analytics tool for EHRs. Methods: SemEHR implements a generic information extraction (IE) and retrieval infrastructure by identifying...

10.1093/jamia/ocx160 article EN cc-by Journal of the American Medical Informatics Association 2018-01-08

GATE Teamware: a web-based, collaborative text annotation framework

Kalina Bontcheva Hamish Cunningham Ian Roberts Angus Roberts Valentin Tablan and 2 more

This paper presents GATE Teamware, an open-source, web-based, collaborative text annotation framework. It enables users to carry out complex corpus annotation projects, involving distributed annotator teams. Different user roles are provided (annotator, manager, administrator) with customisable interface functionalities, in order to support the workflows and interactions that occur in such projects. Documents may be pre-processed automatically, so that human annotators can begin with text that has already been pre-annotated...

10.1007/s10579-013-9215-6 article EN cc-by Language Resources and Evaluation 2013-02-01

CogStack - experiences of deploying integrated information retrieval and extraction services in a large National Health Service Foundation Trust hospital

Richard Jackson Ismail E. Kartoglu Clive Stringer Genevieve Gorrell Angus Roberts and 10 more

Traditional health information systems are generally devised to support clinical data collection at the point of care. However, as the significance of the modern economy expands in scope and permeates the healthcare domain, there is an increasing urgency for organisations to offer systems that address the expectations of clinicians, researchers and the business intelligence community alike. Amongst other emergent requirements, the principal unmet need might be defined by the 3R principle (right data, right place, right time) to address deficiencies...

10.1186/s12911-018-0623-9 article EN cc-by BMC Medical Informatics and Decision Making 2018-06-25

Negative symptoms in schizophrenia: a study in a large clinical sample of patients using a novel automated method

Rashmi Patel Nishamali Jayatilleke Matthew Broadbent Chin‐Kuo Chang Nadia Foskett and 8 more

Objectives: To identify negative symptoms in the clinical records of a large sample of patients with schizophrenia using natural language processing and to assess their relationship with clinical outcomes. Design: Observational study using an anonymised electronic health record case register. Setting: South London and Maudsley NHS Trust (SLaM), a large provider of inpatient and community mental healthcare in the UK. Participants: 7678 patients with schizophrenia receiving care during 2011. Main outcome measures: Hospital admission, readmission and duration of admission. Results: 10...

10.1136/bmjopen-2015-007619 article EN cc-by BMJ Open 2015-09-01

Countering method bias in questionnaire‐based user studies

Genevieve Gorrell Nigel Ford Andrew Madden Peter Holdridge Barry Eaglestone

Purpose: This paper seeks to discuss reliability problems associated with questionnaires, commonly employed in library and information science. It aims to focus on the effects of “common method variance” (CMV), which is a form of bias, and on ways of countering these effects. Design/methodology/approach: The paper critically reviews the use of existing tools for demonstrating reliability in questionnaire‐based studies. In particular, it focuses on Cronbach's alpha, “Harman's single factor test” and Lindell and Whitney's “marker variable” approach....

10.1108/00220411111124569 article EN Journal of Documentation 2011-04-26
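
Cronbach's alpha, one of the reliability tools named in the abstract above, has a compact textbook definition: alpha = k/(k-1) * (1 - sum of item variances / variance of respondents' total scores), for k questionnaire items. The following is a minimal sketch of that standard formula, not code from the paper; the function name `cronbach_alpha` and the input layout (one list of scores per item) are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a questionnaire (illustrative helper, not from the paper).

    `items` is a list of per-item score lists; each inner list holds one
    score per respondent, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Two items answered by four respondents (made-up scores).
print(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]))
```

For these made-up scores the result is about 0.75. Two identical items give an alpha of 1, which illustrates that a high alpha shows only that items move together; it cannot by itself distinguish a reliable scale from shared method bias.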

Risk Assessment Tools and Data-Driven Approaches for Predicting and Preventing Suicidal Behavior

Sumithra Velupillai Gergö Hadlaczky Enrique Baca‐García Genevieve Gorrell Nomi Werbeloff and 6 more

Risk assessment of suicidal behavior is a time-consuming but notoriously inaccurate activity for mental health services globally. In the last 50 years a large number of tools have been designed for suicide risk assessment, and tested in a wide variety of populations, but studies show that these tools suffer from low positive predictive values. More recently, advances in research fields such as machine learning and natural language processing applied on large datasets have shown promising results for health care, and may enable an important shift...

10.3389/fpsyt.2019.00036 article EN cc-by Frontiers in Psychiatry 2019-02-13

Which politicians receive abuse? Four factors illuminated in the UK general election 2019

Genevieve Gorrell Mehmet E. Bakir Ian Roberts Mark Greenwood Kalina Bontcheva

The 2019 UK general election took place against a background of rising online hostility levels toward politicians, and concerns about the impact of this on democracy, as a record number of politicians cited the abuse they had been receiving as a reason for not standing for re-election. We present a four-factor framework for understanding who receives abuse and why. The four factors are prominence, events, engagement and personal characteristics. We collected 4.2 million tweets sent to or from candidates in the six week period...

10.1140/epjds/s13688-020-00236-9 article EN cc-by EPJ Data Science 2020-07-02

Social Media and Information Overload: Survey Results

Kalina Bontcheva Genevieve Gorrell Bridgette Wessels

A UK-based online questionnaire investigating aspects of usage of user-generated media (UGM), such as Facebook, LinkedIn and Twitter, attracted 587 participants. Results show a high degree of engagement with social networking media, and significant engagement with other media such as professional media, microblogs and blogs. Participants who experience information overload are those who engage less frequently with the media, rather than those who have fewer posts to read. Professional users show different behaviours to social users. Microbloggers complain of information overload to the greatest extent. Two...

10.48550/arxiv.1306.0813 preprint EN other-oa arXiv (Cornell University) 2013-01-01

Twits, Twats and Twaddle: Trends in Online Abuse towards UK Politicians

Genevieve Gorrell Mark Greenwood Ian Roberts Diana Maynard Kalina Bontcheva

Concerns have reached the mainstream about how social media are affecting political outcomes. One trajectory for this is the exposure of politicians to online abuse. In this paper we use 1.4 million tweets from the months before the 2015 and 2017 UK general elections to explore the abuse directed at politicians. Results show that abuse increased substantially in 2017 compared with 2015. Abusive tweets show a strong relationship with total tweets received, indicating for the most part impersonality, but a second pathway targets less prominent individuals,...

10.1609/icwsm.v12i1.15070 article EN Proceedings of the International AAAI Conference on Web and Social Media 2018-06-15

Comparing grammar-based and robust approaches to speech understanding: a case study

Sylvia Knight Genevieve Gorrell Manny Rayner David Milward Rob Koeling and 1 more

Previous work has demonstrated the success of statistical language models when enough training data is available [1], but despite that, grammar-based systems are proving the preferred choice in successful commercial systems such as HeyAnita [2], BeVocal [3] and Tellme [4], largely due to the difficulty involved in obtaining a corpus of training data. Here we trained an SLM on data obtained using a grammar-based system and compared the performance of the two systems with regards to recognition. We also parsed the output with a robust parser and compared the semantic accuracy of the two systems. The...

10.21437/eurospeech.2001-420 article EN 2001-09-03

Classifying Twitter favorites: Like, bookmark, or Thanks?

Genevieve Gorrell Kalina Bontcheva

Since its foundation in 2006, Twitter has enjoyed a meteoric rise in popularity, currently boasting over 500 million users. Its short text nature means that the service is open to a variety of different usage patterns, which have evolved rapidly in terms of user base and utilization. Prior work has categorized Twitter users, as well as studied the use of lists and re‐tweets and how these can be used to infer user profiles and interests. The focus of this article is on studying why users mark tweets as “favorites”, a functionality with poorly...

10.1002/asi.23352 article EN cc-by Journal of the Association for Information Science and Technology 2014-12-22

Statistical filtering and subcategorization frame acquisition

Anna Korhonen Genevieve Gorrell Diana McCarthy

Research into the automatic acquisition of subcategorization frames (SCFs) from corpora is starting to produce large-scale computational lexicons which include valuable frequency information. However, the accuracy of the resulting lexicons shows room for improvement. One significant source of error lies in the statistical filtering used by some researchers to remove noise from automatically acquired frames. In this paper, we compare three different approaches to filtering out spurious hypotheses. Two hypothesis tests perform poorly,...

10.3115/1117794.1117819 article EN 2000-01-01

Generalized hebbian algorithm for incremental latent semantic analysis

Genevieve Gorrell Brandyn J. Webb

The Generalized Hebbian Algorithm is shown to be equivalent to Latent Semantic Analysis, and applicable to a range of LSA-style tasks. GHA is a learning algorithm which converges on an approximation of the eigen decomposition of an unseen frequency matrix given observations presented in sequence. Use of GHA allows very large datasets to be processed.

10.21437/interspeech.2005-28 article EN Interspeech 2022 2005-09-04
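
The abstract above describes GHA as converging on an eigen decomposition from observations presented one at a time. As a rough illustration of that idea, the sketch below implements the standard Sanger's-rule form of GHA, with update ΔW = η (y xᵀ − LT[y yᵀ] W) where y = W x and LT keeps the lower triangle. It is not the paper's own variant or code. The function names, the learning rate, the fixed starting weights and the synthetic two-dimensional data are all assumptions chosen for illustration.

```python
import random


def gha_step(weights, x, eta):
    """Apply one Sanger's-rule update in place; each row of `weights` is one component."""
    y = [sum(w * v for w, v in zip(row, x)) for row in weights]
    # Compute every row's update from the current weights before applying any.
    updates = []
    for i in range(len(weights)):
        delta = []
        for j in range(len(x)):
            # Reconstruction of x[j] from components 0..i (lower-triangular term).
            recon = sum(y[k] * weights[k][j] for k in range(i + 1))
            delta.append(eta * y[i] * (x[j] - recon))
        updates.append(delta)
    for row, delta in zip(weights, updates):
        for j, d in enumerate(delta):
            row[j] += d


def train_gha(samples, n_components, eta=0.005):
    """Stream the samples once, keeping only the weight matrix in memory."""
    dim = len(samples[0])
    # Small fixed starting weights (an arbitrary, illustrative choice).
    weights = [[0.1 if i == j else 0.0 for j in range(dim)]
               for i in range(n_components)]
    for x in samples:
        gha_step(weights, x, eta)
    return weights


def make_samples(n, seed=0):
    """Synthetic 2-D data with most variance along the direction (1, 1)."""
    rng = random.Random(seed)
    r = 2 ** -0.5
    samples = []
    for _ in range(n):
        s = rng.gauss(0.0, 2.0)   # large spread along (1, 1)
        t = rng.gauss(0.0, 0.5)   # small spread along (1, -1)
        samples.append([r * (s + t), r * (s - t)])
    return samples


weights = train_gha(make_samples(20000), n_components=2)
print(weights[0])  # first row: estimate of the dominant eigenvector, up to sign
```

Because each update touches only the current observation and the weight matrix, the full data matrix never has to be held in memory, which is the property the abstract credits for making very large datasets tractable.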

Adding intelligent help to mixed-initiative spoken dialogue systems

Genevieve Gorrell Ian Lewin Manny Rayner

The rapidly expanding voice recognition industry has so far shown a preference for grammar-based language modelling, despite the better overall performance of statistical language modelling. Given that the advantages of the grammar-based approach make it unlikely to be replaced as the primary solution in the near future, it is natural to wonder whether some combination of the two approaches may prove useful. Here, we describe an implemented system that uses statistical language modelling and a decision-tree classifier to provide the user with feedback when grammar-based recognition fails....

10.21437/icslp.2002-566 article EN 4th International Conference on Spoken Language Processing (ICSLP 1996) 2002-09-16

Towards “metacognitively aware” IR systems: an initial user study

Genevieve Gorrell Barry Eaglestone Nigel Ford Peter Holdridge Andrew Madden

Purpose: The purpose of this paper is to describe: a new taxonomy of metacognitive skills designed to support the study of metacognition in the context of web searching; a data collection instrument based on the taxonomy; and the results of testing the instrument on a sample of university students and staff. Design/methodology/approach: The taxonomy is based on a review of the literature, and is extended to cover web searching. This forms the basis for the design of the data collection instrument, which is tested with 405 students and staff of Sheffield University. Findings: Subjects regard the range of metacognitive skills focused on as broadly similar. However, a number...

10.1108/00220410910952429 article EN Journal of Documentation 2009-04-24

RumourEval 2019: Determining Rumour Veracity and Support for Rumours

Genevieve Gorrell Kalina Bontcheva Leon Derczynski Elena Kochkina Maria Liakata and 1 more

This is the proposal for RumourEval-2019, which will run in early 2019 as part of that year's SemEval event. Since the first RumourEval shared task in 2017, interest in automated claim validation has greatly increased, as the dangers of "fake news" have become a mainstream concern. Yet support for rumour checking remains in its infancy. For this reason, it is important that a shared task in this area continues to provide a focus for effort, which is likely to increase. We therefore propose a continuation in which the veracity of further rumours is determined, and as previously, supportive...

10.48550/arxiv.1809.06683 preprint EN cc-by arXiv (Cornell University) 2018-01-01

ResToRinG CaPitaLiZaTion in #TweeTs

Kamel Nebhi Kalina Bontcheva Genevieve Gorrell

The rapid proliferation of microblogs such as Twitter has resulted in a vast quantity of written text becoming available that contains interesting information for NLP tasks. However, the noise level in tweets is so high that standard NLP tools perform poorly. In this paper, we present a statistical truecaser for tweets using a 3-gram language model built with truecased newswire texts and tweets. Our truecasing method shows an improvement in named entity recognition and part-of-speech tagging.

10.1145/2740908.2743039 article EN 2015-05-18

CogStack - Experiences Of Deploying Integrated Information Retrieval And Extraction Services In A Large National Health Service Foundation Trust Hospital

Richard Jackson Ismail E. Kartoglu Asha Agrawal Kenneth Lui Honghan Wu and 10 more

Traditional health information systems are generally devised to support clinical data collection at the point of care. However, as the significance of the modern economy expands in scope and permeates the healthcare domain, there is an increasing urgency for organisations to offer systems that address the expectations of clinicians, researchers and the business intelligence community alike. Amongst other emergent requirements, the principal unmet need might be defined by the 3R principle (right data, right place, right time) to address deficiencies...

10.1101/123299 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2017-04-02