Marc Franco-Salvador

ORCID: 0000-0001-7946-6601
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Text Readability and Simplification
  • Authorship Attribution and Profiling
  • Text and Document Classification Technologies
  • Sentiment Analysis and Opinion Mining
  • Spam and Phishing Detection
  • Speech and dialogue systems
  • Hate Speech and Cyberbullying Detection
  • COVID-19 diagnosis using AI
  • Misinformation and Its Impacts
  • Digital Mental Health Interventions
  • Online Learning and Analytics
  • Mental Health via Writing
  • Domain Adaptation and Few-Shot Learning
  • Spanish Linguistics and Language Studies
  • Oil and Gas Production Techniques
  • COVID-19 and Mental Health
  • Linguistics, Language Diversity, and Identity
  • Video Analysis and Summarization
  • Academic integrity and plagiarism
  • Web Application Security Vulnerabilities
  • Imbalanced Data Classification Techniques
  • Translation Studies and Practices
  • Pneumonia and Respiratory Infections

NortonLifeLock (United States)
2023

Universitat Politècnica de València
2012-2023

GfK (Germany)
2017-2018

Polytechnic University of Puerto Rico
2017

In this work we describe the system built for the three English subtasks of SemEval 2016 Task 3 by the Department of Computer Science of the University of Houston (UH) and the Pattern Recognition and Human Language Technology (PRHLT) research center - Universitat Politècnica de València: UH-PRHLT. Our system represents instances using both lexical and semantic-based similarity measures between text pairs. Our semantic features include the use of distributed representations of words, knowledge graphs generated with the BabelNet multilingual...

10.18653/v1/s16-1126 article EN cc-by Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016) 2016-01-01

We study the problem of building text classifiers with little or no training data, commonly known as zero and few-shot text classification. In recent years, an approach based on neural textual entailment models has been found to give strong results on a diverse range of tasks. In this work, we show that with proper pre-training, Siamese Networks that embed texts and labels offer a competitive alternative. These models allow for a large reduction in inference cost: constant in the number of labels rather than linear. Furthermore, we introduce label...

10.18653/v1/2022.acl-long.584 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01
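
The inference-cost argument in the abstract above can be illustrated with a minimal sketch. It is not the paper's model: the `embed` function here is a hypothetical stand-in (character trigram counts) for a trained Siamese encoder, and the label names are invented for illustration. The point it shows is structural: label vectors are computed once, each text needs a single encoder call, and classification reduces to a nearest-label lookup by cosine similarity.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Hypothetical stand-in for a trained encoder: character n-gram counts."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, label_vectors):
    """Encode the text once, then compare against precomputed label vectors."""
    text_vector = embed(text)  # a single encoder call, whatever the label count
    return max(label_vectors, key=lambda lab: cosine(text_vector, label_vectors[lab]))


# Illustrative label set (an assumption, not from the paper).
labels = ["sports", "politics", "cooking"]
# Label embeddings are computed once and reused for every input text.
label_vectors = {lab: embed(lab) for lab in labels}

prediction = classify("sports news and sports results", label_vectors)
```

An entailment-style classifier would instead run the model once per (text, label) pair, which is where the linear cost mentioned in the abstract comes from.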

Current approaches to cross-language document retrieval and categorization are based on discriminative methods which represent documents in a low-dimensional vector space. In this paper we propose a shift from the supervised to the knowledge-based paradigm and provide a similarity measure which draws on BabelNet, a large multilingual knowledge resource. Our experiments show state-of-the-art results in cross-lingual categorization.

10.3115/v1/e14-1044 article EN cc-by 2014-01-01

The polarity classification task aims at automatically identifying whether a subjective text is positive or negative. When the target domain is different from those where the model was trained, we refer to a cross-domain setting. That setting usually implies the use of a domain adaptation method. In this work, we study the single and cross-domain polarity classification tasks from the string kernels perspective. Contrary to classical domain adaptation methods, which employ texts from both domains to detect pivot features, we do not use the target domain for training. Our approach detects the lexical peculiarities that...

10.18653/v1/e17-2089 article EN cc-by 2017-01-01
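
For readers unfamiliar with string kernels, the following is a minimal sketch of two common character p-gram kernels (spectrum and intersection) with the usual normalization. It is a generic illustration, not the configuration used in the paper: the function names, the choice of kernels, and the default p-gram length `p=3` are assumptions made here.

```python
import math
from collections import Counter


def pgrams(text, p):
    """Count every contiguous character p-gram in the text."""
    return Counter(text[i:i + p] for i in range(len(text) - p + 1))


def spectrum_kernel(s, t, p=3):
    """Sum over shared p-grams of the product of their counts."""
    a, b = pgrams(s, p), pgrams(t, p)
    return sum(count * b[gram] for gram, count in a.items() if gram in b)


def intersection_kernel(s, t, p=3):
    """Sum over shared p-grams of the smaller of the two counts."""
    a, b = pgrams(s, p), pgrams(t, p)
    return sum(min(count, b[gram]) for gram, count in a.items() if gram in b)


def normalized(kernel, s, t, p=3):
    """Scale a kernel value so that k(s, s) == 1 for any non-trivial s."""
    denom = math.sqrt(kernel(s, s, p) * kernel(t, t, p))
    return kernel(s, t, p) / denom if denom else 0.0
```

Because the kernel operates directly on character sequences, it needs no tokenization or feature selection; a kernel matrix built this way can be passed to any kernel-based classifier.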

In this paper, we present our participation in the EmoContext shared task on detecting emotions in English textual conversations between a human and a chatbot. We propose four neural systems and combine them to further improve the results. We show that our ensemble can successfully distinguish three emotions (SAD, HAPPY, and ANGRY) and separate them from the rest (OTHERS) in a highly-imbalanced scenario. Our best system achieved a 0.77 F1-score and was ranked fourth out of 165 submissions.

10.18653/v1/s19-2057 article EN cc-by 2019-01-01

Sanja Štajner, Marc Franco-Salvador, Simone Paolo Ponzetto, Paolo Rosso, Heiner Stuckenschmidt. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2017.

10.18653/v1/p17-2016 article EN cc-by 2017-01-01

Background: The current COVID-19 pandemic is associated with extensive individual and societal challenges, including challenges to both physical and mental health. To date, the development of mental health problems such as depressive symptoms accompanying population-based federal distancing measures is largely unknown, and opportunities for rapid, effective, and valid monitoring are currently a relevant matter of investigation. Objective: In this study, we aim to investigate, first, the temporal progression of depressive symptoms during the COVID-19 pandemic and,...

10.2196/27140 article EN cc-by JMIR Mental Health 2021-06-18

10.1016/j.ijhcs.2018.01.006 article EN International Journal of Human-Computer Studies 2018-01-31

This paper presents the overview of the AuTexTification shared task as part of the IberLEF 2023 Workshop in Iberian Languages Evaluation Forum, within the framework of the SEPLN conference. AuTexTification consists of two subtasks: for Subtask 1, participants had to determine whether a text is human-authored or has been generated by a large language model. For Subtask 2, participants had to attribute a machine-generated text to one of six different text generation models. Our dataset contains more than 160,000 texts across two languages (English and Spanish) and five domains (tweets,...

10.48550/arxiv.2309.11285 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01

The objective of Native Language Identification is to determine the native language of the author of a text that he or she wrote in another language. By contrast, Language Variety Identification aims at classifying texts representing different varieties of a single language. We postulate that both tasks may be reduced to a single objective, which is to identify the language variety of the text. We design a general approach that combines string kernels and word embeddings, which capture different characteristics of texts. The results of our experiments show that the approach achieves excellent results on both tasks, without any task-specific adaptations.

10.1016/j.procs.2017.08.068 article EN Procedia Computer Science 2017-01-01

Paraphrase plagiarism identification represents a very complex task given that plagiarized texts are intentionally modified through several rewording techniques. Accordingly, this paper introduces two new measures for evaluating the relatedness of two texts: a semantically-informed similarity measure and a semantically-informed edit distance. Both measures are able to extract semantic information from either an external resource or a distributed representation of words, resulting in informative features for training a supervised classifier...

10.3233/jifs-169483 article EN Journal of Intelligent & Fuzzy Systems 2018-05-18