Andrés Carvallo

ORCID: 0000-0002-5042-8772
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Biomedical Text Mining and Ontologies
  • Advanced Text Analysis Techniques
  • Machine Learning in Healthcare
  • Text Readability and Simplification
  • Advanced Graph Neural Networks
  • Recommender Systems and Techniques
  • Speech and Dialogue Systems
  • Tuberculosis Research and Epidemiology
  • Robotics and Sensor-Based Localization
  • Misinformation and Its Impacts
  • Empathy and Medical Education
  • Data-Driven Disease Surveillance
  • Advanced Vision and Imaging
  • Complex Network Analysis Techniques
  • Global Public Health Policies and Epidemiology
  • Journalism and Media Studies
  • Knowledge Management and Technology
  • Academic Integrity and Plagiarism
  • Online and Blended Learning
  • Multimodal Machine Learning Applications
  • Explainable Artificial Intelligence (XAI)
  • Advertising and Communication Studies
  • Digital Games and Media

Pontificia Universidad Católica de Chile
2019-2021

Millennium Institute
2021

Center for Research and Advanced Studies of the National Polytechnic Institute
2001-2002

There has been significant progress in recent years in the field of Natural Language Processing thanks to the introduction of the Transformer architecture. Current state-of-the-art models, via a large number of parameters and pre-training on massive text corpora, have shown impressive results on several downstream tasks. Many researchers have studied previous (non-Transformer) models to understand their actual behavior under different scenarios, showing that these models are taking advantage of clues or failures in the datasets and that slight...

10.48550/arxiv.2002.06261 preprint EN other-oa arXiv (Cornell University) 2020-01-01

The success of pre-trained word embeddings has motivated their use in tasks in the biomedical domain. The BERT language model has shown remarkable results on standard performance metrics in tasks such as Named Entity Recognition (NER) and Semantic Textual Similarity (STS), which has brought significant progress in the field of NLP. However, it is unclear whether these systems work seemingly well in critical domains, such as legal or medical. For that reason, in this work, we propose an adversarial evaluation scheme on two well-known datasets...

10.48550/arxiv.2004.11157 preprint EN other-oa arXiv (Cornell University) 2020-01-01

In recent years there have been considerable advances in pre-trained language models, where non-English versions have also been made available. Due to their increasing use, many lightweight versions of these models (with reduced parameters) have also been released to speed up training and inference times. However, versions of these lighter models (e.g., ALBERT, DistilBERT) for languages other than English are still scarce. In this paper we present ALBETO and DistilBETO, which are versions of ALBERT and DistilBERT pre-trained exclusively on Spanish corpora. We train several versions ranging from 5M...

10.48550/arxiv.2204.09145 preprint EN cc-by arXiv (Cornell University) 2022-01-01

COVID-19 has brought about a significant challenge to the whole of humanity, but mainly to the medical community. Clinicians must keep updated continuously about symptoms, diagnoses, and the effectiveness of emergent treatments under a never-ending flood of scientific literature. In this context, the role of evidence-based medicine (EBM) for curating the most substantial evidence to support public health and clinical practice becomes especially essential but is being challenged as never before. Artificial Intelligence can have a crucial role in this situation....

10.52591/lxai202012126 preprint EN 2020-12-12

In order to perform visual servoing tasks in a robotic system, one is confronted with the low sampling rate of standard cameras and the time delay introduced by image processing. One way to circumvent the time-delay problem is to estimate the future positions of the moving object of interest by employing prediction techniques. In this work, three prediction techniques, namely Kalman filtering and two adaptive prediction techniques employing least squares with a forgetting factor and the projection algorithm, respectively, are evaluated in terms of their prediction error. Experimental...

10.1002/acs.628 article EN International Journal of Adaptive Control and Signal Processing 2001-04-23

The success of pretrained word embeddings has motivated their use in the biomedical domain, with contextualized embeddings yielding remarkable results in several biomedical NLP tasks. However, there is a lack of research on quantifying their behavior under severe "stress" scenarios. In this work, we systematically evaluate three language models with adversarial examples -- automatically constructed tests that allow us to examine how robust the models are. We propose two types of stress scenarios focused on the named entity recognition (NER) task, one...

10.18653/v1/2021.bionlp-1.13 preprint EN cc-by 2021-01-01

While civilized users employ social media to stay informed and discuss daily occurrences, haters perceive these platforms as fertile ground for attacking groups and individuals. The prevailing approach to counter this phenomenon involves detecting such attacks by identifying toxic language. Effective platform measures aim to report haters and block their network access. In this context, employing hate speech detection methods aids in identifying such attacks amidst vast volumes of text, which are impossible for humans to analyze manually. In our...

10.48550/arxiv.2405.13011 preprint EN arXiv (Cornell University) 2024-05-13

Since the early days of Web 2.0, online communities have been growing quickly and have become an important part of life for a large number of people. In one of these communities, fanfiction.net, users can read and write stories which are adapted, recreated and modified from original famous books, TV series, and movies, among others. By following their authors, the fanfiction community creates a social network. Previous research on fanfiction has shown how features of the social network help explain the behavior of the community, so we are interested in studying...

10.48550/arxiv.1909.02886 preprint EN other-oa arXiv (Cornell University) 2019-01-01

The success of neural network embeddings has entailed a renewed interest in using knowledge graphs for a wide variety of machine learning and information retrieval tasks. In particular, recent recommendation methods based on graph embeddings have shown state-of-the-art performance. In general, these methods encode latent rating patterns and content features. Differently from previous work, in this paper, we propose to exploit embeddings extracted from graphs that combine information from ratings and aspect-based opinions expressed in textual reviews. We then adapt...

10.48550/arxiv.2107.03385 preprint EN public-domain arXiv (Cornell University) 2021-01-01

CoTranslate is a web-based platform designed to efficiently label and review translations from language experts, with the aim of creating high-quality sentence-pair corpora for training neural machine translation models. Utilizing a Django backend and a ReactJS frontend, CoTranslate fosters collaboration among language experts in translating and validating sentences. Focused on developing quality corpora, particularly for low-resource languages, CoTranslate addresses linguistic barriers and enhances translation quality. By streamlining the creation of robust...

10.2139/ssrn.4472277 preprint EN 2023-01-01

The emergence of COVID-19 has highlighted the importance of reliable information for clinical decision-making and public health policies. Evidence-based medicine (EBM) seeks to identify and evaluate scientific documents related to novel diseases, and biomedical text classification is crucial in accurately categorizing such documents. To aid in this process, we present a comprehensive dataset of COVID-19-related documents, with 18,854 labeled documents that include document type, title, abstract, and metadata such as PubMed ID,...

10.2139/ssrn.4489189 preprint EN 2023-01-01

Active learning is an algorithmic approach that strategically selects a subset of examples for labeling, with the goal of reducing the workload and required resources. Previous research has applied active learning to Neural Machine Translation (NMT) for high-resource or well-represented languages, achieving significant reductions in manual labor. In this study, we explore the application of active learning to NMT in the context of Mapudungun, a low-resource language spoken by the Mapuche community in South America. Mapudungun was chosen due to the limited number...

10.18653/v1/2023.americasnlp-1.2 article EN cc-by 2023-01-01

The COVID-19 pandemic has underlined the need for reliable information for clinical decision-making and public health policies. As such, evidence-based medicine (EBM) is essential in identifying and evaluating scientific documents pertinent to novel diseases, and the accurate classification of biomedical text is integral to this process. Given this context, we introduce a comprehensive, curated dataset composed of COVID-19-related documents. This dataset includes 20,047 labeled documents that were meticulously classified into five...

10.1016/j.dib.2023.109720 article EN cc-by Data in Brief 2023-10-24

COVID-19 has brought about a significant challenge to the whole of humanity, but with a special burden upon the medical community. Clinicians must keep updated continuously about symptoms, diagnoses, and the effectiveness of emergent treatments under a never-ending flood of scientific literature. In this context, the role of evidence-based medicine (EBM) for curating the most substantial evidence to support public health and clinical practice becomes essential but is being challenged as never before due to the high volume of research articles...

10.48550/arxiv.2012.00584 preprint EN other-oa arXiv (Cornell University) 2020-01-01