Michal Seják

ORCID: 0009-0008-0365-898X
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Advanced Text Analysis Techniques
  • Text Readability and Simplification
  • Radiomics and Machine Learning in Medical Imaging
  • Phonocardiography and Auscultation Techniques
  • EEG and Brain-Computer Interfaces
  • ECG Monitoring and Analysis

University of West Bohemia
2021-2023

Sleep spindles, an oscillatory brain activity occurring during light non-rapid eye movement (NREM) sleep, are important for memory consolidation and cognitive functions. Accurate spindle detection is important for understanding the role of spindles in sleep state physiology and health, and for better understanding neurological disorders. However, manual spindle labeling of electroencephalography (EEG) data is time-consuming and impractical for most clinical and research settings. Intracranial EEG (iEEG) presents additional challenges for spindle identification due to...

10.1101/2025.04.13.25325696 preprint EN cc-by 2025-04-14

This paper describes the training process of the first Czech monolingual language representation models based on BERT and ALBERT architectures. We pre-train our models on more than 340K sentences, which is 50 times more than multilingual models that include Czech data. We outperform the multilingual models on 9 out of 11 datasets. In addition, we establish new state-of-the-art results on nine datasets. At the end, we discuss properties of monolingual and multilingual models based upon our results. We publish all the pre-trained and fine-tuned models freely for the research community.

10.26615/978-954-452-072-4_149 article EN 2021-01-01

This paper describes a novel dataset consisting of sentences with two different semantic similarity annotations: with and without surrounding context. The data originate from the journalistic domain in the Czech language. The final dataset contains 138,556 human annotations divided into train and test sets. In total, 485 journalism students participated in the creation process. To increase the reliability of the test set, we compute the annotation as an average of 9 individual annotation scores. We evaluate the quality of the dataset by measuring inter...

10.21203/rs.3.rs-2130964/v1 preprint EN cc-by Research Square (Research Square) 2022-10-26

This paper describes a novel dataset consisting of sentences with semantic similarity annotations. The data originate from the journalistic domain in the Czech language. We describe the process of collecting and annotating the data in detail. The dataset contains 138,556 human annotations divided into train and test sets. In total, 485 journalism students participated in the creation process. To increase the reliability of the test set, we compute the annotation as an average of 9 individual annotations. We evaluate the quality of the dataset by measuring inter and intra annotators' agreements....

10.48550/arxiv.2108.08708 preprint EN cc-by-nc-nd arXiv (Cornell University) 2021-01-01

This paper describes the training process of the first Czech monolingual language representation models based on BERT and ALBERT architectures. We pre-train our models on more than 340K sentences, which is 50 times more than multilingual models that include Czech data. We outperform the multilingual models on 9 out of 11 datasets. In addition, we establish new state-of-the-art results on nine datasets. At the end, we discuss properties of monolingual and multilingual models based upon our results. We publish all the pre-trained and fine-tuned models freely for the research community.

10.48550/arxiv.2103.13031 preprint EN cc-by-nc-nd arXiv (Cornell University) 2021-01-01

Electrocardiograms (ECGs) are commonly used by cardiologists to detect heart-related pathological conditions. Reliable collections of ECGs are crucial for precise diagnosis. However, in clinical practice, the assignment of captured ECG recordings to incorrect patients can occur inadvertently. In collaboration with a clinical and research facility which recognized this challenge and reached out to us, we present a study that addresses this issue. In this work, we propose a small and efficient neural-network based model for determining whether...

10.48550/arxiv.2306.06196 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01