Svetla Boytcheva

ORCID: 0000-0002-5542-9168
Research Areas
  • Biomedical Text Mining and Ontologies
  • Natural Language Processing Techniques
  • Topic Modeling
  • Semantic Web and Ontologies
  • Data Mining Algorithms and Applications
  • Machine Learning in Healthcare
  • Data Quality and Management
  • Educational Games and Gamification
  • AI in Cancer Detection
  • Advanced Text Analysis Techniques
  • Experimental Learning in Engineering
  • Open Education and E-Learning
  • Logic, Reasoning, and Knowledge
  • Time Series Analysis and Forecasting
  • Artificial Intelligence in Healthcare
  • Simulation Techniques and Applications
  • Educational Technology and Assessment
  • Intelligent Tutoring Systems and Adaptive Learning
  • Digital Imaging for Blood Diseases
  • Rough Sets and Fuzzy Logic
  • Text and Document Classification Technologies
  • Linguistics and Terminology Studies
  • Distributed and Parallel Computing Systems
  • Multi-Agent Systems and Negotiation
  • Virtual Reality Applications and Impacts

Bulgarian Academy of Sciences
2011-2024

Ontotext (Bulgaria)
2019-2024

Institute of Information and Communication Technologies
2011-2024

Sofia University "St. Kliment Ohridski"
2000-2024

University of Padua
2023

Sirma Group (Bulgaria)
2022

Society of Interventional Radiology
2022

Stefansson Arctic Institute
2021

American University in Bulgaria
2012-2014

University of Library Studies and Information Technologies
2009-2012

The digitalization of clinical workflows and the increasing performance of deep learning algorithms are paving the way towards new methods for tackling cancer diagnosis. However, the availability of medical specialists to annotate digitized images and free-text diagnostic reports does not scale with the need for large datasets required to train robust computer-aided diagnosis methods that can target the high variability of clinical cases and data produced. This work proposes and evaluates an approach to eliminate the need for manual annotations to train computer-aided diagnosis tools in digital...

10.1038/s41746-022-00635-4 article EN cc-by npj Digital Medicine 2022-07-22

Exa-scale volumes of medical data have been produced for decades. In most cases, the diagnosis is reported in free text, encoding medical knowledge that is still largely unexploited. In order to allow decoding the medical knowledge included in reports, we propose an unsupervised knowledge extraction system combining a rule-based expert system with pre-trained Machine Learning (ML) models, namely the Semantic Knowledge Extractor Tool (SKET). Combining rule-based techniques and pre-trained ML models provides high accuracy results for knowledge extraction. This work demonstrates the viability...

10.1016/j.jpi.2022.100139 article EN cc-by-nc-nd Journal of Pathology Informatics 2022-01-01

The process of converting non-game educational content and processes into game-like content and processes is called gamification. This article describes a gamified evaluation software for university students in Science, Technology, Engineering, the Arts and Mathematics (STEAM) courses, based on competence profiles of students and problems. Traditional learning management systems and learning tools cannot handle gamification to its full potential because of the unique requirements of gamified environments. We designed a novel assessment methodology implemented in STEAM...

10.3390/info11060316 article EN cc-by Information 2020-06-11

This paper presents the results of an ongoing research project for knowledge extraction from large corpora of clinical narratives in Bulgarian language, approximately 100 million outpatient care notes. Entities with numerical values are mined in the free text and the extracted information is stored in a structured format. Algorithms for retrospective analyses and big data analytics are applied for studying the treatment and evaluating the diabetes compensation and the control of arterial blood pressure.

10.1515/cait-2015-0055 article EN cc-by-nc-nd Cybernetics and Information Technologies 2015-11-01

Background: Studying comorbidities of disorders is important for detection and prevention. For discovering frequent patterns of diseases we can use retrospective analysis of population data, by filtering events with common properties and similar significance. Most frequent pattern mining methods do not consider contextual information about extracted patterns. Further data mining developments might enable more efficient applications in specific tasks like comorbidities identification. Methods: We propose a cascade approach...

10.1007/s13755-017-0024-y article EN cc-by Health Information Science and Systems 2017-09-28

This paper presents a transformer-based approach for symptom Named Entity Recognition (NER) in Spanish clinical texts and multilingual entity linking on the SympTEMIST dataset. For NER, we fine-tune a RoBERTa-based token-level classifier with Bidirectional Long Short-Term Memory and conditional random field layers on an augmented train set, achieving an F1 score of 0.73. Entity linking is performed via a hybrid approach with dictionaries, generating candidates from a knowledge base containing Unified Medical Language System...

10.1093/database/baae090 article EN cc-by Database 2024-01-01

This paper presents methods for shallow Information Extraction (IE) from the free text zones of hospital Patient Records (PRs) in Bulgarian language in the Patient Safety through Intelligent Procedures in medication (PSIP) project. We extract automatically information about drug names, dosage, modes and frequency and assign the corresponding ATC code to each medication event. Using various modules for rule-based text analysis, our IE components in PSIP perform a significant amount of symbolic computations. We try to address negative statements,...

10.3233/978-1-60750-740-6-119 article EN Studies in health technology and informatics 2011-01-01

10.18653/v1/2024.semeval-1.235 article EN Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024) 2024-01-01

The task of automatic diagnosis encoding into standard medical classifications and ontologies is of great importance in medicine, both to support the daily tasks of physicians in the preparation and reporting of clinical documentation, and for automatic processing of clinical reports. In this paper, we investigate the application and performance of different deep learning transformers for automatic encoding into ICD-10 of clinical texts in Bulgarian. The comparative analysis attempts to find which approach is more efficient to be used for fine-tuning of a pre-trained BERT family transformer to deal with a...

10.26615/978-954-452-072-4_162 article EN 2021-01-01

We present an approach for medical text coding with SNOMED CT. Our approach uses publicly available linked open data from terminologies and ontologies as training data for the algorithms. We claim that even small training corpora made of short text snippets can be used to train models for the given task. We propose a method based on transformers enhanced with clustering and filtering of the candidates. Further, we adopt a classical machine learning approach, support vector classification (SVC), using the transformer embeddings. The resulting approach proves to be more accurate than...

10.26615/978-954-452-092-2_057 article EN 2023-01-01

This paper reports a research effort in Information Extraction, especially in template pattern matching. Our approach uses rich domain knowledge in the football (soccer) area and logical form representation for the necessary inferences of facts and template filling. Our system FRET (Football Reports Extraction Templates) is compatible with the language-engineering environment GATE and handles its internal representations and some intermediate analysis results.

10.3115/1067737.1067744 article EN 2003-01-01

This paper presents experiments in automatic Information Extraction of medication events, diagnoses, and laboratory tests from hospital patient records, in order to increase the completeness of the description of the episode of care. Each patient record in our hospital information system contains structured data and text descriptions, including the full discharge letters. From these letters, we extract automatically information about the medication just before and in the time of hospitalization, especially for the drugs prescribed to the patient, but not delivered by the hospital pharmacy; also values...

10.3233/978-1-60750-740-6-260 article EN Studies in health technology and informatics 2011-01-01

Information Extraction (IE) from medical texts aims at the automatic recognition of entities and relations of interest. IE is based on shallow analysis and considers only sentences containing important words. Thus the IE of drugs from discharge letters can identify as ‘current’ some past or future medication events. This article presents heuristic observations that make it possible to filter the drugs that are taken by patients during the hospitalization. These heuristics are based on the default PR structure and linguistic expressions...

10.3233/978-1-60750-806-9-527 article EN Studies in health technology and informatics 2011-01-01

We describe a method which extracts Association Rules from texts in order to recognise verbalisations of risk factors. Usually some basic vocabulary about risk factors is known but medical conditions are expressed in clinical narratives with much higher variety. We propose an approach for data-driven learning of specialised medical vocabulary which, once collected, enables early alerting of potentially affected patients. The method is illustrated by experiments with clinical records of patients with Chronic Obstructive Pulmonary Disease (COPD) and comorbidity...

10.26615/978-954-452-044-1_009 article EN 2017-11-10

This paper presents an approach for prediction of results of sport events. Usually the sport forecasting approaches are based on structured data. We test the hypothesis that the results of sport events can be predicted by using natural language processing and machine learning techniques applied over interviews with the players shortly before the events. The proposed method uses deep learning contextual models, applied over unstructured textual documents. Several experiments were performed for interviews with players in individual sports like boxing, martial arts, and tennis. The results from the conducted...

10.26615/978-954-452-056-4_142 article EN 2019-10-22

This paper addresses the task of categorizing companies within industry classification schemes. The dataset consists of encyclopedic articles about companies and their economic activities. The target classification schema is built by mapping linked open data in a semi-supervised manner. Target classes are built bottom-up from DBpedia. We apply several state-of-the-art text classification techniques, based both on deep learning and classical vector-space models.

10.26615/978-954-452-056-4_134 article EN 2019-10-22

We propose a method that processes raw informal medical texts (from health forums) and formal medical texts (outpatient records) in Bulgarian language in order to extract typical word co-occurrences in the form of association rules. When mining these rules we use some context information and small terminological lexicons to generalize the extracted frequent patterns. This allows us to study expressions of terminology and to identify automatically descriptions of types of patient statuses. The paper presents association rules generated from 300,000 outpatient...

10.26615/978-954-452-049-6_019 article EN 2017-11-10

Word vectors with varying dimensionalities and produced by different algorithms have been extensively used in NLP. The corpora that the algorithms are trained on can contain either natural language text (e.g. Wikipedia or newswire articles) or artificially-generated pseudo corpora due to natural data sparseness. We exploit Lexical Chain based templates over Knowledge Graph for generating pseudo-corpora with controlled linguistic value. These corpora are then used for learning word embeddings. A number of experiments have been conducted over the following test sets:...

10.26615/978-954-452-049-6_087 article EN 2017-11-10

This paper presents an approach for the automatic association of diagnoses in Bulgarian language to ICD-10 codes. Since this task is currently performed manually by medical professionals, the ability to automate it would save time and allow doctors to focus more on patient care. The presented approach employs a fine-tuned language model (i.e. BERT) as a multi-class classification model. As there are several different types of BERT models, we conduct a number of experiments to assess the applicability of domain-specific adaptation. To train our...

10.1145/3429210.3429224 article EN 2020-11-19