- Natural Language Processing Techniques
- Topic Modeling
- Complex Network Analysis Techniques
- Text and Document Classification Technologies
- Advanced Text Analysis Techniques
- Web Data Mining and Analysis
- Advanced Clustering Algorithms Research
- Data Management and Algorithms
- Spam and Phishing Detection
- Data Quality and Management
- Semantic Web and Ontologies
- Misinformation and Its Impacts
- Biomedical Text Mining and Ontologies
- Data Visualization and Analytics
- Recommender Systems and Techniques
- Manufacturing Process and Optimization
- Data Mining Algorithms and Applications
- Direction-of-Arrival Estimation Techniques
- AI-based Problem Solving and Planning
- Data-Driven Disease Surveillance
- Electronic Health Records Systems
- Machine Learning in Healthcare
- Authorship Attribution and Profiling
- Sentiment Analysis and Opinion Mining
- Medical Coding and Health Information
National University of Distance Education
2009-2024
Universidad Nacional de Educación
2015
Universidad Rey Juan Carlos
2014
In this work, we explore the types of triggers that spark trends on Twitter, introducing a typology with the following four types: news, ongoing events, memes, and commemoratives. While previous research has analyzed trending topics over the long term, we look at the earliest tweets that produce a trend, with the aim of categorizing trends early on. This allows us to provide a filtered subset of trends to end users. We experiment with a set of straightforward language-independent features based on the social spread of trends to categorize them using the typology. Our method provides...
Twitter summarizes the great deal of messages posted by users in the form of trending topics that reflect the top conversations being discussed at a given moment. These trending topics tend to be connected to current affairs. Different happenings can give rise to the emergence of these trending topics: for instance, a sports event broadcast on TV, or a viral meme introduced by a community of users. Detecting the type of origin can facilitate information filtering, enhance real-time data processing, and improve user experience. In this paper, we introduce...
User-generated annotations on social bookmarking sites can provide interesting and promising metadata for web document management tasks like web page classification. These user-generated annotations include diverse types of information, such as tags and comments. Nonetheless, each kind of annotation has a different nature and popularity level. In this work, we analyze and evaluate the usefulness of these annotations to classify web pages over a taxonomy like that proposed by the Open Directory Project. We compare them separately with the content-based...
Social tagging systems are becoming an interesting way to retrieve web information from previously annotated data. These sites present a tag cloud made up of the most popular tags, where neither tag grouping nor their corresponding content is considered. We present a methodology to obtain and visualize a cloud of related tags based on the use of self-organizing maps, where the relations among tags are established taking into account the textual content of the tagged documents. Each map unit can be represented by the relevant terms it contains, so that it is possible to study...
Undiagnosed and untreated human immunodeficiency virus (HIV) infection increases morbidity in the HIV-positive person and allows onward transmission of the virus. Minimizing missed opportunities for HIV diagnosis when a patient visits a healthcare facility is essential for restraining the epidemic and working toward its eventual elimination. Most state-of-the-art proposals employ machine learning (ML) methods and structured data to enhance HIV diagnoses; however, there is a dearth of recent proposals utilizing unstructured textual data from...
This paper presents an approach for Multilingual Document Clustering in comparable corpora. The algorithm is of a heuristic nature, and the only evidence it uses for clustering is the identification of cognate named entities between both sides of the comparable corpora. One of the main advantages of this approach is that it does not depend on bilingual or multilingual resources. However, it depends on the possibility of identifying cognate named entities between the languages used in the corpus. An additional advantage of the approach is that it does not need any information about the right number of clusters; the algorithm calculates it. We have tested this approach with a...
In this article, we present a new clustering algorithm for Person Name Disambiguation in web search results. The algorithm groups web results according to the individuals they refer to. The best state-of-the-art approaches require training data in order to learn thresholds for deciding when to group webpages. However, the ambiguity level of person names on the web cannot be estimated in advance, and the results of those methods strongly depend on the thresholds obtained with the training collections. We present the concept of an adaptive threshold, which avoids the need for previous supervised learning...
In this article, we present a new algorithm for clustering a bilingual collection of comparable news items in groups of specific topics. Our hypothesis is that named entities (NEs) are more informative than other features in the news when clustering fine-grained topics. The algorithm does not need as input any information related to the number of clusters, and carries out the clustering based only on information regarding the named entities shared by the news items. This proposal is evaluated using different data sets and outperforms state-of-the-art algorithms, thereby proving the plausibility of the approach....
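The idea of grouping news items by shared named entities, without fixing the number of clusters in advance, can be illustrated with a minimal sketch. This is not the algorithm from the article: it simply links any two documents that share at least `min_shared` entities and returns the connected components. The function name, the `min_shared` parameter, and the sample data are all hypothetical.

```python
from itertools import combinations


def cluster_by_shared_entities(doc_entities, min_shared=2):
    """Group documents into connected components of a 'shares entities' graph.

    doc_entities maps a document id to its set of named entities.
    The number of clusters is not an input; it falls out of the data.
    (Illustrative assumption, not the published method.)
    """
    parent = {doc: doc for doc in doc_entities}

    def find(doc):
        # Union-find lookup with path halving.
        while parent[doc] != doc:
            parent[doc] = parent[parent[doc]]
            doc = parent[doc]
        return doc

    # Link every pair of documents that shares enough named entities.
    for a, b in combinations(doc_entities, 2):
        if len(doc_entities[a] & doc_entities[b]) >= min_shared:
            parent[find(a)] = find(b)

    clusters = {}
    for doc in doc_entities:
        clusters.setdefault(find(doc), set()).add(doc)
    # Sort clusters for a deterministic result.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical bilingual sample: entity sets are assumed to be already aligned.
docs = {
    "en1": {"Obama", "Paris"},
    "es1": {"Obama", "Paris", "Madrid"},
    "en2": {"Nadal"},
}
result = cluster_by_shared_entities(docs, min_shared=2)
```

In a real bilingual setting the entities on each side would first have to be matched across languages, for example through cognate identification; the sketch assumes that step has already been done.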
Named entities (NEs) can facilitate access to multilingual knowledge sources--which have exploded in recent years--but the identification, classification, and retrieval of NEs remain challenging tasks.
Cognates are words in different languages that have similar spelling and meaning. The identification of cognates is very useful for many Natural Language Processing tasks, and also in the process of learning a second language. This paper presents a new approach to classify pairs of words into cognates/false friends or not related classes. The proposed approach uses a fuzzy system to combine complementary string similarity measures in order to improve the cognate identification task. The underlying hypothesis is that the combination of string similarity measures, by applying heuristic knowledge, can...
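To make the notion of combining complementary string similarity measures concrete, here is a minimal sketch. It does not reproduce the fuzzy system described in the abstract: it averages two common measures (a character-bigram Dice coefficient and the `difflib.SequenceMatcher` ratio from the Python standard library). The choice of measures, the plain average, and the function names are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def bigram_dice(a: str, b: str) -> float:
    """Dice coefficient over the sets of character bigrams of each word."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def sequence_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1] based on matching blocks of characters."""
    return SequenceMatcher(None, a, b).ratio()


def combined_similarity(a: str, b: str) -> float:
    """Plain average of the two measures (a stand-in for a fuzzy combination)."""
    a, b = a.lower(), b.lower()
    return (bigram_dice(a, b) + sequence_ratio(a, b)) / 2


cognate_score = combined_similarity("nation", "nacion")
unrelated_score = combined_similarity("nation", "table")
```

A pair such as "nation" / "nacion" scores higher than an unrelated pair such as "nation" / "table". Spelling similarity alone cannot separate cognates from false friends, since both look alike; that distinction needs information about meaning.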
The idea of component reuse in system design is implicit in all material and energy engineering and, in particular, in electronic engineering. This panorama is a desideratum for knowledge engineering (KE) where, on the contrary, there is a great diversity of terms without unique, clear, complete and unequivocal meaning and, as a consequence, a notorious lack of agreed-upon libraries of reusable components. To contribute to the solution of this problem, this article presents a simple relational model linking knowledge elicitation with implementation. The model is based on natural...
Nowadays, earth stations for downloading data from LEO (low earth orbit) satellites use large reflector antennas. These antennas pose a number of impairments regarding their mechanical complexity, lower flexibility, lower network efficiency and higher cost. Furthermore, they can track only one satellite at a time, so the efficiency of the earth segment is reduced. In order to improve the performance of traditional earth stations, alternative antenna technologies shall be considered. A possibility explored in this contribution makes use of antenna arrays with...
The evaluation of clustering results is one of the most important issues in cluster analysis, a core task for effective information access. There are two types of measures for evaluating the quality of clustering results: internal and external. External validity measures evaluate how well the clustering results match prior knowledge about the data, whereas internal measures do not need external information, dealing only with the information within the data. In this regard, the main drawback of external measures is that they are not applicable in real-world situations. In this paper we present an experimental study to determine whether it...
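The difference between the two families of measures can be shown with a minimal sketch. Purity stands in for an external measure, since it needs gold labels, and the within-cluster sum of squared errors stands in for an internal one, since it uses only the data. Both are standard textbook measures chosen here for illustration; the abstract does not say which measures the study compares.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, gold_labels):
    """External measure: fraction of items that carry their cluster's majority gold label."""
    members = defaultdict(list)
    for cid, gold in zip(cluster_ids, gold_labels):
        members[cid].append(gold)
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(gold_labels)


def within_cluster_sse(cluster_ids, points):
    """Internal measure: sum of squared distances from each point to its cluster centroid."""
    members = defaultdict(list)
    for cid, point in zip(cluster_ids, points):
        members[cid].append(point)
    total = 0.0
    for pts in members.values():
        dim = len(pts[0])
        centroid = [sum(p[d] for p in pts) / len(pts) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in pts for d in range(dim))
    return total


# Hypothetical toy data: four items assigned to two clusters.
assignments = [0, 0, 1, 1]
gold = ["a", "a", "b", "a"]                 # only available when a gold standard exists
points = [(0.0,), (2.0,), (10.0,), (10.0,)]  # always available

external_score = purity(assignments, gold)               # 0.75
internal_score = within_cluster_sse(assignments, points)  # 2.0
```

The external score cannot be computed without `gold`, which is why external measures are hard to apply to unlabeled, real-world collections.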
Named Entity Recognition (NER) is an important task used to extract relevant information from biomedical texts. Recently, pre-trained language models have made great progress in this task, particularly in the English language. However, the performance of pre-trained models in the Spanish biomedical domain has not been evaluated in an experimentation framework designed specifically for the task. We present an approach for named entity recognition in Spanish medical texts that makes use of pre-trained models from the Spanish biomedical domain. We also use data augmentation techniques to improve the identification of less frequent...
The implementation of the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10) has caused coding to become increasingly complex and slow for health professionals. ICD-10 presents a very different architecture from the previous version, ICD-9, defining a new structure and thousands of additional codes. Some Spanish institutions try to incorporate coding-assistance software. However, the currently proposed systems do not offer the functionality of directly suggesting codes from free text in hospital...