Walter Daelemans

ORCID: 0000-0002-9832-7890
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Speech and dialogue systems
  • Authorship Attribution and Profiling
  • Biomedical Text Mining and Ontologies
  • Hate Speech and Cyberbullying Detection
  • Speech Recognition and Synthesis
  • Semantic Web and Ontologies
  • Text Readability and Simplification
  • Advanced Text Analysis Techniques
  • Sentiment Analysis and Opinion Mining
  • Spam and Phishing Detection
  • Digital Communication and Language
  • Machine Learning in Healthcare
  • AI-based Problem Solving and Planning
  • Data Mining Algorithms and Applications
  • Lexicography and Language Studies
  • Second Language Acquisition and Learning
  • Algorithms and Data Compression
  • Phonetics and Phonology Research
  • Social Media and Politics
  • Reading and Literacy Development
  • Text and Document Classification Technologies
  • Neural Networks and Applications
  • Bullying, Victimization, and Aggression

University of Antwerp
2015-2024

SGH Warsaw School of Economics
2022-2023

Textron Systems (United Kingdom)
2023

University of Liège
2021

Ludwig-Maximilians-Universität München
2021

University of Sheffield
2021

University of Chicago
2021

The University of Melbourne
2021

KU Leuven
2021

Information Technology University
2021

While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of potentially harmful messages, and the information overload on the Web requires intelligent systems to identify potential risks automatically. The focus of this paper is on automatic cyberbullying detection in social media text by modelling posts written by bullies,...

10.1371/journal.pone.0203794 article EN cc-by PLoS ONE 2018-10-08

A common characteristic of communication on online social networks is that it happens via short messages, often using non-standard language variations. These characteristics make this type of text a challenging genre for natural language processing. Moreover, in these digital communities it is easy to provide a false name, age, gender and location in order to hide one's true identity, providing criminals such as pedophiles with new possibilities to groom their victims. It would therefore be useful if user profiles can...

10.1145/2065023.2065035 article EN 2011-10-28

Pattern is a package for Python 2.4+ with functionality for web mining (Google + Twitter + Wikipedia, web spider, HTML DOM parser), natural language processing (tagger/chunker, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, k-means clustering, Naive Bayes + k-NN + SVM classifiers) and network analysis (graph centrality and visualization). It is well documented and bundled with 30+ examples and 350+ unit tests. The source code is licensed under BSD and available from...

10.5555/2188385.2343710 article EN Journal of Machine Learning Research 2012-01-01

10.1023/a:1007585615670 article EN Machine Learning 1999-01-01

We examine how differences in language models, learned by different data-driven systems performing the same NLP task, can be exploited to yield a higher accuracy than the best individual system. We do this by means of experiments involving the task of morphosyntactic word class tagging, on the basis of three tagged corpora. Four well-known tagger generators (hidden Markov model, memory-based, transformation rules, and maximum entropy) are trained on the same corpus data. After comparison, their outputs are combined using several...

10.1162/089120101750300508 article EN Computational Linguistics 2001-06-01
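
The abstract above is cut off before it names the combination methods, so the following is only a minimal sketch of one generic way to combine tagger outputs: simple per-token majority voting. The function name `combine_by_voting` and the `tie_breaker` parameter are hypothetical and are not taken from the paper; ties are resolved here by preferring the tag of one designated tagger, which is an assumption made for illustration.

```python
from collections import Counter


def combine_by_voting(tagger_outputs, tie_breaker=0):
    """Combine per-token tags from several taggers by simple majority vote.

    tagger_outputs: list of tag sequences, one per tagger, all of equal length.
    tie_breaker: index of the tagger whose tag is preferred when votes are tied
                 (hypothetical choice for this sketch).
    """
    if not tagger_outputs:
        return []
    length = len(tagger_outputs[0])
    if any(len(tags) != length for tags in tagger_outputs):
        raise ValueError("all taggers must tag the same number of tokens")

    combined = []
    # zip(*...) yields, for each token position, the tags proposed by every tagger.
    for token_tags in zip(*tagger_outputs):
        counts = Counter(token_tags)
        best = max(counts.values())
        # Tags that share the highest vote count, in tagger order.
        winners = [tag for tag in token_tags if counts[tag] == best]
        preferred = token_tags[tie_breaker]
        combined.append(preferred if preferred in winners else winners[0])
    return combined


if __name__ == "__main__":
    outputs = [
        ["DT", "NN", "VB"],  # tagger 1
        ["DT", "VB", "VB"],  # tagger 2
        ["DT", "NN", "NN"],  # tagger 3
    ]
    print(combine_by_voting(outputs))
```

Voting of this kind can only help when the taggers make different errors, which is the premise the abstract states: the component systems learn different language models from the same data.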

Finding negation signals and their scope in text is an important subtask in information extraction. In this paper we present a machine learning system that finds the scope of negation signals in biomedical texts. The system combines several classifiers and works in two phases. To investigate the robustness of the approach, the system is tested on the three subcorpora of the BioScope corpus representing different text types. It achieves the best results to date for this task, with an error reduction of 32.07% compared to current state of the art results.

10.3115/1596374.1596381 article EN 2009-01-01

Applications of authorship attribution 'in the wild' [Koppel, M., Schler, J., and Argamon, S. (2010). Authorship attribution in the wild. Language Resources and Evaluation. Advanced Access published January 12, 2010:10.1007/s10579-009-9111-2], for instance in social networks, will likely involve large sets of candidate authors and only limited data per author. In this article, we present the results of a systematic study of two important parameters in supervised machine learning that significantly affect performance in computational...

10.1093/llc/fqq013 article EN Literary and Linguistic Computing 2010-08-16

Most studies in statistical or machine learning based authorship attribution focus on two or a few authors. This leads to an overestimation of the importance of the features extracted from the training data and found to be discriminating for these small sets of authors. Most studies also use sizes of training data that are unrealistic for situations in which stylometry is applied (e.g., forensics), and thereby overestimate the accuracy of their approach in these situations. A more realistic interpretation of the task is as an authorship verification problem that we approximate by pooling data from many different...

10.3115/1599081.1599146 article EN 2008-01-01

We present BioGraph, a data integration and data mining platform for the exploration and discovery of biomedical information. The platform offers prioritizations of putative disease genes, supported by functional hypotheses. We show that BioGraph can retrospectively confirm recently discovered disease genes and identify potential susceptibility genes, outperforming existing technologies, without requiring prior domain knowledge. Additionally, BioGraph allows for generic biomedical applications beyond gene discovery. BioGraph is accessible at http://www.biograph.be .

10.1186/gb-2011-12-6-r57 article EN cc-by Genome biology 2011-01-01

Simon Šuster, Walter Daelemans. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

10.18653/v1/n18-1140 article EN cc-by 2018-01-01

10.1023/a:1006506017891 article EN Artificial Intelligence Review 1997-01-01

example, where the machine side requires some standard expressions in order to execute certain actions. Language translation is one of the most complicated tasks of the human brain, which utilizes not only linguistic knowledge but also knowledge of the world, and varieties of our sophisticated senses. EBMT is one of the possible approaches to the mechanism of translation, and this book represents a considerable contribution to the field. But we have to investigate many other possibilities to approach the level of the complex functions of the human brain.

10.1162/0891201042544866 article EN Computational Linguistics 2004-11-25

Identifying hedged information in biomedical literature is an important subtask in information extraction because it would be misleading to extract speculative information as factual information. In this paper we present a machine learning system that finds the scope of hedge cues in biomedical texts. The system is based on a similar system that finds the scope of negation cues. We show that the same scope finding approach can be applied to both negation and hedging. To investigate the robustness of the approach, the system is tested on the three subcorpora of the BioScope corpus that represent different text types.

10.3115/1572364.1572369 article EN 2009-01-01

In this paper we present a machine learning system that finds the scope of negation in biomedical texts. The system consists of two memory-based engines, one that decides if the tokens in a sentence are negation signals, and another that finds the full scope of these negation signals. Our approach to negation detection differs in two main aspects from existing research on negation. First, we focus on finding the scope of negation signals, instead of determining whether a term is negated or not. Second, we apply supervised machine learning techniques, whereas most existing systems apply rule-based algorithms. As far as we know, this way of approaching the negation scope finding task is novel.

10.3115/1613715.1613805 article EN 2008-01-01