- Misinformation and Its Impacts
- Deception Detection and Forensic Psychology
- Hate Speech and Cyberbullying Detection
- Topic Modeling
- Adversarial Robustness in Machine Learning
- Cybercrime and Law Enforcement Studies
- Psychopathy, Forensic Psychiatry, Sexual Offending
- Sentiment Analysis and Opinion Mining
- Authorship Attribution and Profiling
- Information and Cyber Security
- Computational and Text Analysis Methods
- Blockchain Technology Applications and Security
- Crime, Illicit Activities, and Governance
- Mental Health via Writing
- Terrorism, Counterterrorism, and Political Violence
- Social Media and Politics
- Natural Language Processing Techniques
- Spam and Phishing Detection
- Advanced Text Analysis Techniques
- Memory Processes and Influences
- Social and Intergroup Psychology
- Mental Health Research Topics
- Scientometrics and Bibliometrics Research
- Privacy-Preserving Technologies in Data
- Media Influence and Health
University College London
2015-2024
Tilburg University
2020-2024
University of Amsterdam
2015-2022
Okayama University
2021
East Stroudsburg University
2020
University of North Carolina Health Care
2020
University of North Carolina at Chapel Hill
2020
University of Notre Dame
2020
Japan External Trade Organization
2017-2019
American Academy of Forensic Sciences
2018
The proliferation of misleading information in everyday access media outlets such as social media feeds, news blogs, and online newspapers has made it challenging to identify trustworthy news sources, thus increasing the need for computational tools able to provide insights into the reliability of online content. In this paper, we focus on the automatic identification of fake content in online news. Our contribution is twofold. First, we introduce two novel datasets for the task of fake news detection, covering seven different news domains. We describe...
‘Deepfakes’ are computationally created entities that falsely represent reality. They can take image, video, and audio modalities, and pose a threat to many areas of systems and societies, comprising a topic of interest to various aspects of cybersecurity and cybersafety. In 2020, a workshop consulting AI experts from academia, policing, government, the private sector, and state security agencies ranked deepfakes as the most serious threat. These experts noted that since fake material can propagate through uncontrolled routes,...
Pump-and-dump schemes are fraudulent price manipulations through the spread of misinformation and have been around in economic settings since at least the 1700s. With new technologies around cryptocurrency trading, the problem has intensified to a shorter time scale and broader scope. The scientific literature on pump-and-dump schemes is scarce, and government regulation has not yet caught up, leaving cryptocurrencies particularly vulnerable to this type of market manipulation. This paper examines existing information from...
In this crowdsourced initiative, independent analysts used the same dataset to test two hypotheses regarding the effects of scientists' gender and professional status on verbosity during group meetings. Not only the analytic approach but also the operationalizations of key variables were left unconstrained and up to individual analysts. For instance, analysts could choose to operationalize status as job title, institutional ranking, citation counts, or some combination. To maximize transparency regarding the process by which choices are made, a...
The COVID-19 pandemic is having a dramatic impact on societies and economies around the world. With various measures of lockdowns and social distancing in place, it becomes important to understand emotional responses on a large scale. In this paper, we present the first ground truth dataset of emotional responses to COVID-19. We asked participants to indicate their emotions and express these in text. This resulted in the Real World Worry Dataset of 5,000 texts (2,500 short + 2,500 long texts). Our analyses suggest that emotional responses correlated with linguistic...
Recent efforts have shown that neural text processing models are vulnerable to adversarial examples, but the nature of these examples is poorly understood. In this work, we show that adversarial attacks against CNN, LSTM and Transformer-based classification models perform word substitutions that are identifiable through frequency differences between replaced words and their corresponding substitutions. Based on these findings, we propose frequency-guided word substitutions (FGWS), a simple algorithm exploiting the frequency properties of adversarial word substitutions for the detection of adversarial examples. FGWS...
Language models such as GPT-3 have caused a furore in the research community. Some studies found that GPT-3 has some creative abilities and makes mistakes that are on par with human behaviour. This paper answers a related question: Who is GPT-3? We administered two validated measurement tools to GPT-3 to assess its personality, the values it holds and its self-reported demographics. Our results show that GPT-3 scores similarly to human samples in terms of personality and, when provided with a model response memory, in terms of the values it holds. We provide the first evidence of psychological...
There is accumulating evidence that reaction times (RTs) can be used to detect recognition of critical (e.g., crime) information. A limitation of this research base is its reliance upon small samples (average n = 24), and indications of publication bias. To advance RT-based memory detection, we report the development of the first web-based memory detection test. Participants in this research (Study 1: n = 255; Study 2: n = 262) tried to hide 2 high salient (birthday, country of origin) and 2 low salient (favourite colour, favourite animal) autobiographical...
There is an increasing demand for automated verbal deception detection systems. We propose named entity recognition (NER; i.e., the automatic identification and extraction of information from text) to model three established theoretical principles: (i) truth tellers provide accounts that are richer in detail, (ii) their accounts contain more contextual references (specific persons, locations, and times), and (iii) deceivers tend to withhold potentially checkable information. We test whether NER captures these concepts...
Despite considerable concern about how human trafficking offenders may use the Internet to recruit their victims, arrange logistics or advertise services, the Internet-trafficking nexus remains unclear. This study explored the prevalence and correlates of a set of commonly-used indicators of labour trafficking in online job advertisements. Taking a case study approach, we focused on a major Lithuanian website aimed at people seeking work abroad. We examined a snapshot of job advertisements (n = 430), assessing both general...
The increased threat of right-wing extremist violence necessitates a better understanding of online extremism. Radical message boards, small-scale social media platforms, and other internet fringes have been reported to fuel hatred. The current paper examines data from the right-wing forum Stormfront between 2001 and 2015. We specifically aim to understand the development of user activity and the use of extremist language. Various time-series models depict posting frequency and the prevalence and intensity of extremist language. Individual analyses examine whether some...
Spurred by the recent rapid increase in the development and distribution of large language models (LLMs) across industry and academia, much recent work has drawn attention to safety- and security-related threats and vulnerabilities of LLMs, including in the context of potentially criminal activities. Specifically, it has been shown that LLMs can be misused for fraud, impersonation, and the generation of malware; while other authors have considered the more general problem of AI alignment. It is important that developers and practitioners alike are aware...
Large Language Models (LLMs) could be a useful tool for lawyers. However, empirical research on their effectiveness in conducting legal tasks is scant. We study securities cases involving cryptocurrencies as one of numerous contexts where AI could support the legal process, studying GPT-3.5’s legal reasoning and ChatGPT’s legal drafting capabilities. We examine whether a) GPT-3.5 can accurately determine which laws are potentially being violated from a fact pattern, and b) whether there is a difference in juror decision-making...
The Internet has already changed people's lives considerably and is likely to drastically change forensic research. We developed a web-based test to reveal concealed autobiographical information. Initial studies identified a number of conditions that affect diagnostic efficiency. By combining these moderators, this study investigated the full potential of the online ID-check. Participants (n = 101) tried to hide their identity and claimed a false identity in a reaction time-based Concealed Information Test. Half...
This paper introduces the Grievance Dictionary, a psycholinguistic dictionary that can be used to automatically understand language use in the context of grievance-fueled violence threat assessment. We describe the development of the dictionary, which was informed by suggestions from experienced threat assessment practitioners. These suggestions and subsequent human and computational word list generation resulted in a dictionary of 20,502 words annotated by 2,318 participants. The dictionary was validated by applying it to texts written by violent and non-violent...
Recently, verbal credibility assessment has been extended to the detection of deceptive intentions, the use of a model statement, and predictive modeling. The current investigation combines these 3 elements to detect deceptive intentions on a large scale. Participants read a model statement and wrote a truthful or deceptive statement about their planned weekend activities (Experiment 1). With the use of linguistic features for machine learning, more than 80% of the participants were classified correctly. Exploratory analyses suggested that liars included...
Fraud across the decentralized finance (DeFi) ecosystem is growing, with victims losing billions to DeFi scams every year. However, there is a disconnect between the reported value of these scams and associated legal prosecutions. We use open-source investigative tools to (1) investigate potential frauds involving Ethereum tokens using on-chain data and token smart contract analysis, and (2) investigate the ways proceeds from these scams were subsequently laundered. The analysis enabled us to uncover transaction-based evidence of several rug pull...