- Topic Modeling
- Natural Language Processing Techniques
- Speech and dialogue systems
- Semantic Web and Ontologies
- Multimodal Machine Learning Applications
- Explainable Artificial Intelligence (XAI)
- Privacy-Preserving Technologies in Data
- Advanced Text Analysis Techniques
- Decision-Making and Behavioral Economics
- Intelligent Tutoring Systems and Adaptive Learning
- Hate Speech and Cyberbullying Detection
- Human-Automation Interaction and Safety
- Geographic Information Systems Studies
- Data Visualization and Analytics
- Video Analysis and Summarization
- Adversarial Robustness in Machine Learning
- Web Data Mining and Analysis
- Artificial Intelligence in Games
- Scientific Computing and Data Management
- Advanced Image and Video Retrieval Techniques
- Authorship Attribution and Profiling
- Spam and Phishing Detection
- Text and Document Classification Technologies
- Persona Design and Applications
- Machine Learning and Data Classification
Edinburgh Napier University
2016-2024
University of Edinburgh
2020
University of Coimbra
2017
Universitat Pompeu Fabra
2017
Thomson Reuters (United States)
2017
Bridge University
2017
University of Cambridge
2017
Heriot-Watt University
2015
David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser. Proceedings of the 13th International Conference on Natural Language Generation. 2020.
Decision-making is often dependent on uncertain data, e.g. data associated with confidence scores or probabilities. We present a comparison of different information presentations for uncertain data and, for the first time, measure their effects on human decision-making. We show that the use of Natural Language Generation (NLG) improves decision-making under uncertainty, compared to state-of-the-art graphical-based representation methods. In a task-based study with 442 adults, we found that presentations using NLG lead to 24% better decision-making on average than...
In this paper we present a snapshot of end-to-end NLG system evaluations as presented in conference and journal papers over the last ten years in order to better understand the nature and type of evaluations that have been undertaken. We find that researchers tend to favour specific evaluation methods, and that their approaches are also correlated with the publication venue. We further discuss what factors may influence the types of evaluation used for a given NLG system.
The proliferation of social media platforms changed the way people interact online. However, engagement with social media comes with a price, users’ privacy. Breaches of privacy, such as the Cambridge Analytica scandal, can reveal how data can be weaponized in political campaigns, which many times trigger hate speech and anti-immigration views. Hate speech detection is a challenging task due to the different sources that have an impact on the language used, as well as the lack of relevant annotated data. To tackle this, we collected manually...
Large scale adoption of pre-trained language models has introduced a new era of convenient knowledge transfer for a slew of natural language processing tasks. However, these models run the risk of undermining user trust, since they may enable malicious users to expose personally identifying information about the subjects in other datasets through re-identification attacks. We present an empirical investigation into the extent of the personal information that can be extracted from the representations produced by popular models, and we show a positive...
Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bernd Bohnet, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna...
We present a novel approach for automatic report generation from time-series data, in the context of student feedback generation. Our proposed methodology treats content selection as a multi-label (ML) classification problem, which takes as input time-series data and outputs a set of templates, while capturing the dependencies between selected templates. We show that this method generates output closer to the feedback that lecturers actually generated, achieving 3.5% higher accuracy and 15% higher F-score than multiple simple classifiers that keep...
Emiel van Miltenburg, Miruna Clinciu, Ondřej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson, Luou Wen. Proceedings of the 14th International Conference on Natural Language Generation. 2021.
Decision-making is often dependent on uncertain data, e.g. data associated with confidence scores or probabilities. This article presents a comparison of different information presentations for uncertain data and, for the first time, measures their effects on human decision-making, in the domain of weather forecast generation. We use a game-based setup to evaluate the systems. We show that the use of Natural Language Generation (NLG) enhances decision-making under uncertainty, compared to state-of-the-art graphical-based representation...
Predicting the success of referring expressions (RE) is vital for real-world applications such as navigation systems. Traditionally, research has focused on studying Referring Expression Generation (REG) in virtual, controlled environments. In this paper, we describe a novel study of spatial references from real scenes rather than virtual. First, we investigate how humans describe objects in open, uncontrolled scenarios and compare our findings to those reported in virtual environments. We show that REs differ...
Earlier research has shown that few studies in Natural Language Generation (NLG) evaluate their system outputs using an error analysis, despite known limitations of automatic evaluation metrics and human ratings. This position paper takes the stance that error analyses should be encouraged, and discusses several ways to do so. This paper is not just based on our shared experience as authors, but we also distributed a survey as a means of public consultation. We provide an overview of existing barriers to carry out error analyses, and propose...
Referring to landmarks has been identified to lead to improved navigation instructions. However, a previous corpus study suggests that human "wizards" also choose to refer to street names and generate user-centric instructions. In this paper, we conduct a task-based evaluation of two systems reflecting the wizards' behaviours and compare them against an improved version of landmark-based systems, which resorts to user-centric descriptions if the landmark is estimated to be invisible. We use the GRUVE virtual interactive environment for evaluation. We...
Data-to-text systems are powerful in generating reports from data automatically and thus they simplify the presentation of complex data. Rather than presenting data using visualisation techniques, data-to-text systems use natural (human) language, which is the most common way for human-human communication. In addition, data-to-text systems can adapt their output content to users' preferences, background or interests and can therefore be pleasant for users to interact with. Content selection is an important part of every data-to-text system, because it is the module...
Neural language models have contributed to state-of-the-art results in a number of downstream applications including sentiment analysis, intent classification and others. However, obtaining text representations or embeddings using these models risks encoding personally identifiable information learned from context cues that may lead to privacy leaks. To ameliorate this issue, we propose Context-Aware Private Embeddings (CAPE), a novel approach which combines differential privacy and adversarial learning to preserve...
The Covid-19 pandemic required many aspects of life to move online. This accelerated a broader trend for increasing use of ICT and AI, with implications for both the world of work and career development. This article explores the potential benefits and challenges of including AI in practice. It provides an overview of the technology, and current uses, to illustrate the ways in which it could enhance existing services, and the attendant practical and ethical challenges posed. Finally, recommendations are provided for policy and practice that will support the development...
A Natural Language Generation (NLG) system is able to generate text from nonlinguistic data, ideally personalising the content to a user’s specific needs. In some cases, however, there are multiple stakeholders with their own individual goals, needs and preferences. In this paper, we explore the feasibility of combining the preferences of two different user groups, lecturers and students, when generating summaries in the context of student feedback generation. The preferences of each group are modelled as a multivariate optimisation...
Large scale adoption of large language models has introduced a new era of convenient knowledge transfer for a slew of natural language processing tasks. However, these models also run the risk of undermining user trust by exposing unwanted information about the data subjects, which may be extracted by a malicious party, e.g. through adversarial attacks. We present an empirical investigation into the extent of the personal information encoded into pre-trained representations by a range of popular models, and we show a positive correlation between the complexity of a model,...
This paper proposes a novel task on commonsense-enhanced task-based dialogue grounded in documents and describes the Task2Dial dataset, a novel dataset of document-grounded task-based dialogues, where an Information Giver (IG) provides instructions (by consulting a document) to an Information Follower (IF), so that the latter can successfully complete the task. In this unique setting, the IF can ask clarification questions which may not be grounded in the underlying document and require commonsense knowledge to be answered. The Task2Dial dataset poses new challenges: (1) its human...
We present FeedbackGen, a system that uses a multi-adaptive approach to Natural Language Generation. With the term 'multi-adaptive', we refer to a system that is able to adapt its content to different user groups simultaneously, in our case adapting to both lecturers and students. We present a novel approach to student feedback generation, which simultaneously takes into account the preferences of lecturers and students when determining the content to be conveyed in a feedback summary. In this framework, we utilise knowledge derived from ratings on feedback summaries by extracting the most relevant...