Starkaður Barkarson

ORCID: 0009-0004-2739-1475
Research Areas
  • Natural Language Processing Techniques
  • Linguistics and language evolution
  • Linguistics and terminology studies
  • Lexicography and Language Studies
  • Basque language and culture studies
  • Topic Modeling
  • Hungarian Social, Economic and Educational Studies
  • Linguistics, Language Diversity, and Identity
  • European and International Law Studies
  • Legal Language and Interpretation
  • Hate Speech and Cyberbullying Detection
  • Digital Communication and Language
  • Linguistic research and analysis
  • Freedom of Expression and Defamation
  • Gender Studies in Language
  • Mathematics, Computing, and Information Processing
  • Computational and Text Analysis Methods
  • Educational Technology and Assessment
  • Semantic Web and Ontologies
  • Service-Oriented Architecture and Web Services
  • Biomedical Text Mining and Ontologies
  • Digital Rights Management and Security

Árni Magnússon Institute for Icelandic Studies
2022-2024

This paper presents the ParlaMint corpora containing transcriptions of the sessions of 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich metadata about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project's GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter...

10.1007/s10579-021-09574-0 article EN cc-by Language Resources and Evaluation 2022-02-02

The paper presents the results of the ParlaMint II project, which comprise comparable corpora of parliamentary debates of 29 European countries and autonomous regions, covering at least the period from 2015 to 2022 and containing over 1 billion words. The corpora are uniformly encoded, contain rich metadata about their 24 thousand speakers, and are linguistically annotated up to the level of Universal Dependencies syntax and named entities. The paper focuses on the enhancements made since the ParlaMint I project and the compilation of the corpora, including the encoding...

10.21203/rs.3.rs-4176128/v1 preprint EN cc-by Research Square 2024-04-01

In Iceland, the word of the year is chosen annually, both by the Icelandic National Broadcasting Service and the Árni Magnússon Institute for Icelandic Studies (AMI). We explore the possibility of doing the same, but for a year more than 100 years ago. We try using the same methods as AMI does in our times. This approach has various limitations, which we discuss, and raises many questions, such as how much texts from journals and periodicals reflect the actual language use of the time.

10.5617/dhnbpub.11522 article EN Digital Humanities in the Nordic and Baltic Countries Publications 2024-09-18

The paper presents the results of the ParlaMint II project, which comprise comparable corpora of parliamentary debates of 29 European countries and autonomous regions, covering at least the period from 2015 to 2022 and containing over 1 billion words. The corpora are uniformly encoded, contain rich metadata about their 24 thousand speakers, and are linguistically annotated up to the level of Universal Dependencies syntax and named entities. The paper focuses on the enhancements made since the ParlaMint I project and the compilation of the corpora, including the encoding...

10.1007/s10579-024-09798-w article EN cc-by Language Resources and Evaluation 2024-12-28