- Natural Language Processing Techniques
- Topic Modeling
- Semantic Web and Ontologies
- Speech and Dialogue Systems
- Lexicography and Language Studies
- Linguistics and Terminology Studies
- Language and cultural evolution
- Syntax, Semantics, Linguistic Variation
- Digital Humanities and Scholarship
- Text Readability and Simplification
- Advanced Text Analysis Techniques
- Second Language Acquisition and Learning
- Linguistic Studies and Language Acquisition
- Computational and Text Analysis Methods
- Authorship Attribution and Profiling
- Innovative Teaching and Learning Methods
- Service-Oriented Architecture and Web Services
- Biomedical Text Mining and Ontologies
- Second Language Learning and Teaching
- Multi-Agent Systems and Negotiation
- Sentiment Analysis and Opinion Mining
- Linguistic Variation and Morphology
- South Asian Studies and Conflicts
- Multilingual Education and Policy
- Library Science and Information Systems
University of Gothenburg
2015-2024
Uppsala University
1988-2022
Centre for Digital Humanities
2022
University of Helsinki
2022
Universidade Federal de Juiz de Fora
2018
Swedish National Bank
2012-2017
Radboud University Nijmegen
2011
Max Planck Institute for Evolutionary Anthropology
2011
Stockholm University
2002
This article surveys work on Unsupervised Learning of Morphology. We define Unsupervised Learning of Morphology as the problem of inducing a description (of some kind, even if only morpheme segmentation) of how orthographic words are built up, given only raw text data of a language. We briefly go through the history and motivation of this problem. Next, over 200 items of work are listed with a brief characterization, and the most important ideas in the field are critically discussed. We summarize the achievements so far and give pointers for future developments.
Our languages are in constant flux, driven by external factors such as cultural, societal and technological changes, as well as by only partially understood internal motivations. Words acquire new meanings and lose old senses, new words are coined or borrowed from other languages, and obsolete words slide into obscurity. Understanding the characteristics of shifts in the meaning and use of words is useful for those who work with the content of historical texts and for the interested general public, but also in and of itself. The findings from automatic lexical semantic change...
We present an experiment where natural language processing tools are used to automatically identify potential constructions in a corpus. The experiment was conducted as part of the ongoing efforts to develop a Swedish constructicon. Using an automatic method to suggest constructions has advantages not only for efficiency but also methodologically: it forces the analyst to look more objectively at the constructions actually occurring in corpora, as opposed to focusing on “interesting” constructions only. As a heuristic for identifying potential constructions, the method has proved successful, yielding...
This article describes the development of a geographical information system (GIS) at Språkbanken as part of a visualization solution to be used in an archive of historical Swedish literary texts. The research problems we are aiming to address concern orthographic and morphological variation, missing place names, and missing place name coordinates. Some of these problems are central to the development of methods and tools for the automatic analysis of historical texts at our unit. We discuss the advantages and challenges of covering large-scale spelling variation in place names from different...
We present a framework and its implementation relying on Natural Language Processing methods, which aims at the identification of exercise item candidates from corpora. The hybrid system, combining heuristics and machine learning methods, includes a number of relevant selection criteria. We focus on two fundamental aspects: linguistic complexity and the dependence of the extracted sentences on their original context. Previous work on exercise generation addressed these two criteria only to a limited extent, and a refined overall candidate sentence...
The concept of culturomics was born out of the availability of massive amounts of textual data and the interest to make sense of cultural and language phenomena over time. Thus far, however, culturomics has only made use of, and shown the great potential of, statistical methods. In this paper, we present a vision for a knowledge-based culturomics that complements traditional culturomics. We discuss the possibilities and challenges of combining knowledge-based methods with statistical methods, and address major challenges that arise due to the nature of the data; diversity of sources, changes in language over time as well as temporal dynamics...