- Natural Language Processing Techniques
- Topic Modeling
- Biomedical Text Mining and Ontologies
- Machine Learning in Healthcare
- Text Readability and Simplification
- Electronic Health Records Systems
- Linguistics and Terminology Studies
- Health Education and Validation
- Nursing Diagnosis and Documentation
- Data Quality and Management
- Semantic Web and Ontologies
- Artificial Intelligence in Healthcare
- Medical Coding and Health Information
- Advanced Text Analysis Techniques
- Communication and COVID-19 Impact
- Media and Communication Studies
- Mental Health via Writing
- Misinformation and Its Impacts
Pontifícia Universidade Católica do Paraná
2013-2024
Pontifical Catholic University of Puerto Rico
2019-2023
Agrupamento de Escolas Nuno Álvares
2019
Elisa Terumi Rubel Schneider, João Vitor Andrioli de Souza, Julien Knafou, Lucas Emanuel Silva e Oliveira, Jenny Copara, Yohan Bonescki Gumiel, Lucas Ferro Antunes de Oliveira, Emerson Cabrera Paraiso, Douglas Teodoro, Cláudia Maria Cabral Moro Barra. Proceedings of the 3rd Clinical Natural Language Processing Workshop. 2020.
The high volume of research focusing on extracting patient information from electronic health records (EHRs) has led to an increase in the demand for annotated corpora, which are a precious resource for both the development and the evaluation of natural language processing (NLP) algorithms. The absence of a multipurpose clinical corpus outside the scope of the English language, especially in Brazilian Portuguese, is glaring and severely impacts scientific progress in the biomedical NLP field. In this study, a semantically annotated corpus was developed using...
Automatic detection of negated content is often a prerequisite in information extraction systems in various domains. In the biomedical domain especially, this task is important because negation plays an important role. In this work, two main contributions are proposed. First, we work with languages which have been poorly addressed up to now: Brazilian Portuguese and French. Thus, we developed new corpora for these languages, manually annotated to mark negation cues and their scope. Second, we propose automatic methods based on...
Contextual word embeddings and the Transformers architecture have reached state-of-the-art results in many natural language processing (NLP) tasks and improved the adaptation of models for multiple domains. Despite the improvement in the reuse and construction of models, few resources are still developed for the Portuguese language, especially in the health domain. Furthermore, the clinical models available are not representative enough of all medical specialties. This work explores deep contextual embedding models to support clinical NLP tasks. We transferred...
Natural Language Processing and Machine Learning techniques can be used to automatically identify, extract and manipulate textual clinical data. Many of these methods are strongly dependent on annotated corpora that are very difficult to find in the clinical domain, especially for the Brazilian Portuguese language. The annotation task is expensive and time-consuming; hence, it is important to provide intelligent computational tools to facilitate this kind of work. In this paper, we propose a collaborative annotation tool that assists the user by proposing...
Objective: to reflect on the use of computational tools in the cross-mapping method between clinical terminologies. Method: reflection study. Results: the cross-mapping method consists of obtaining a list of terms through extraction and normalization; connecting those terms to a reference base by means of predefined rules; and grouping the terms into categories: exact or partial combination or, in more detail, similar term, more comprehensive term, more restricted term and non-agreeing term. Performed manually in many studies, it can be automated with Unified Medical...
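The three cross-mapping steps named in this abstract (normalization, connection to a reference base by predefined rules, grouping into categories) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `normalize` and `cross_map`, the two toy rules (identical normalized strings count as exact, any shared token counts as partial), and the sample terms are not taken from the paper, and the finer categories (similar, more comprehensive, more restricted) are not modeled.

```python
import unicodedata


def normalize(term):
    """Lowercase, strip accents and collapse whitespace (illustrative normalization)."""
    decomposed = unicodedata.normalize("NFKD", term.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.split())


def cross_map(term, reference_terms):
    """Return (category, matched reference term) under two hypothetical rules."""
    source = normalize(term)
    source_tokens = set(source.split())
    best = ("non-agreeing", None)
    for ref in reference_terms:
        target = normalize(ref)
        if source == target:
            # Rule 1 (assumed): identical normalized strings are an exact combination.
            return ("exact", ref)
        if best[1] is None and source_tokens & set(target.split()):
            # Rule 2 (assumed): the first reference term sharing a token is a partial combination.
            best = ("partial", ref)
    return best


# Toy reference base, invented for this example.
reference = ["Dor aguda", "Náusea", "Risco de queda"]
for extracted in ["NAUSEA", "dor intensa", "febre"]:
    print(extracted, "->", cross_map(extracted, reference))
```

A real mapping would also need stop-word handling, since a shared function word such as "de" would otherwise trigger a spurious partial match.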
Considering the difficulties of extracting entities from Electronic Health Records (EHR) texts in Portuguese, we explore the Conditional Random Fields (CRF) algorithm to build a Named Entity Recognition (NER) system based on a corpus of clinical Portuguese data annotated by experts. We describe the challenges and the methods to classify Abbreviations, Disorders, Procedures and Chemicals within the texts. By selecting a meaningful set of features and the parameters with the best performance, the results demonstrate that the method is promising...
In this paper, we trained a set of Portuguese clinical word embedding models of different granularities from multi-specialty and multi-institutional clinical narrative datasets. Then, we assessed their impact on a downstream biomedical NLP task, Urinary Tract Infection disease identification. Additionally, we intrinsically evaluated our main model using an adapted version of Bio-SimLex for the Portuguese language. Our empirical results showed that the larger, coarse-grained model achieved a slightly better outcome when compared with...
The emerging penetration of Health IT in Latin America (especially Brazil) has exacerbated the ever-increasing amount of Electronic Health Record (EHR) clinical free text documents. This imposes a workflow efficiency challenge on clinicians who need to synthesize such documents during typically time-constrained patient care. We propose an ontology-driven semantic search framework that effectively supports clinicians' information synthesis at the point
This study describes MappICNP, an automatic method for mapping between Brazilian Portuguese clinical narratives in free text and International Classification for Nursing Practice (ICNP) concepts. It is composed of six natural language processing rules related to term comparison. A set of 2,638 terms extracted from hospital nursing notes was mapped. MappICNP helps map 1,607 terms, 113 fewer than a manual approach. The results demonstrate its advantages in minimizing the time spent and reducing the scope of analysis...
Objective: to investigate the effectiveness of large language models (LLMs) in named entity recognition (NER) in clinical notes in Portuguese. Method: the performance of GPT-3.5, Gemini, Llama-3 and Sabiá-2 was analyzed in performing NER on 30 clinical notes to identify the entities "Signs or Symptoms", "Diseases or Syndromes" and "Negated Data". The task was evaluated by the precision, recall and F-score results of each of these LLMs. Results: one model showed superior performance, especially in sensitivity, reaching...
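The precision, recall and F-score used to evaluate each model in this abstract can be computed from sets of predicted and gold entities. The sketch below is only an illustration under stated assumptions: it uses exact matching on (text, label) pairs, the function name `precision_recall_f1` is hypothetical, and the toy entities are invented rather than taken from the study's 30 notes. The paper's own matching criteria are not given in the excerpt.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted entities against gold entities using exact (text, label) matches."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Invented toy annotations for a single note (not data from the study).
gold = {
    ("febre", "Sign or Symptom"),
    ("tosse", "Sign or Symptom"),
    ("pneumonia", "Disease or Syndrome"),
    ("dispneia", "Negated Data"),
}
predicted = {
    ("febre", "Sign or Symptom"),
    ("pneumonia", "Disease or Syndrome"),
    ("cefaleia", "Sign or Symptom"),
}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Recall here corresponds to the sensitivity mentioned in the results: the share of gold entities that the model recovered.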
Discharge summaries are an important clinical narrative as they include the continuity of care information. Identification of the data contained in their text is a difficult task due to its freeform nature and the lack of consensus on essential content. This research proposes a rule-based method to verify the presence of continuity of care information in Portuguese texts, applying Natural Language Processing (NLP) techniques, based on an annotated medical corpus. After experiments, 4 rules were defined and applied to 200 discharge summaries to identify whether or not they have the process...