Maciej Ogrodniczuk

ORCID: 0000-0002-3467-9424
Research Areas
  • Natural Language Processing Techniques
  • Language and Culture
  • Topic Modeling
  • Semantic Web and Ontologies
  • Literature, Language, and Rhetoric Studies
  • Speech and dialogue systems
  • Linguistics and terminology studies
  • Mathematics, Computing, and Information Processing
  • Text Readability and Simplification
  • Lexicography and Language Studies
  • Linguistics, Language Diversity, and Identity
  • Biomedical Text Mining and Ontologies
  • Library Science and Information Systems
  • Advanced Text Analysis Techniques
  • European and International Law Studies
  • Digital Humanities and Scholarship
  • Linguistic research and analysis
  • Language, Metaphor, and Cognition
  • Service-Oriented Architecture and Web Services
  • Digital Rights Management and Security
  • Image Processing and 3D Reconstruction
  • Authorship Attribution and Profiling
  • Speech Recognition and Synthesis
  • Legal Language and Interpretation
  • Historical Geopolitical and Social Dynamics

Polish Academy of Sciences
2014-2024

Institute of Computer Science
2014-2024

The Institute of the Polish Language of the Polish Academy of Sciences
2022

Czech Academy of Sciences, Institute of Computer Science
2014-2019

Université de Tours
2014

University of Warsaw
2004

This paper presents the ParlaMint corpora containing transcriptions of the sessions of 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project's GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and Parlameter...

10.1007/s10579-021-09574-0 article EN cc-by Language Resources and Evaluation 2022-02-02

The paper presents Korpusomat, a web application aimed at building annotated corpora for the purpose of corpus linguistic studies. Korpusomat combines existing tools, such as a morphological analyser, tagger and search engine, and provides an easy-to-use environment for building corpora technically compatible with the National Corpus of Polish from almost any text, including texts in binary formats. In the paper we present the current state of the project, its features and functionalities, as well as some future plans and development tasks. A usage example is...

10.12921/cmst.2018.0000005 article EN Computational Methods in Science and Technology 2018-03-31

Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality. However, language barriers impacting business and cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe's specific needs, there is still an immense level of fragmentation. At the same time, AI...

10.48550/arxiv.2003.13833 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Zdeněk Žabokrtský, Miloslav Konopik, Anna Nedoluzhko, Michal Novák, Maciej Ogrodniczuk, Martin Popel, Ondrej Prazak, Jakub Sido, Daniel Zeman. Proceedings of the CRAC 2023 Shared Task on Multilingual Coreference Resolution. 2023.

10.18653/v1/2023.crac-sharedtask.1 article EN cc-by 2023-01-01

This paper presents an overview of the shared task on multilingual coreference resolution associated with the CRAC 2022 workshop. Shared task participants were supposed to develop trainable systems capable of identifying mentions and clustering them according to identity coreference. The public edition of CorefUD 1.0, which contains 13 datasets for 10 languages, was used as the source of training and evaluation data. The CoNLL score, used in previous coreference-oriented shared tasks, was the main evaluation metric. There were 8 coreference prediction systems submitted by 5...

10.48550/arxiv.2209.07841 preprint EN other-oa arXiv (Cornell University) 2022-01-01