NFDI4DS | UHH-SEMS - Publication Details

Marc Franco-Salvador

ORCID: 0000-0001-7946-6601

About

Contact & Profiles

A5001237595

Research Areas

Topic Modeling
Natural Language Processing Techniques
Text Readability and Simplification
Authorship Attribution and Profiling
Text and Document Classification Technologies
Sentiment Analysis and Opinion Mining
Spam and Phishing Detection
Speech and dialogue systems
Hate Speech and Cyberbullying Detection
COVID-19 diagnosis using AI
Misinformation and Its Impacts
Digital Mental Health Interventions
Online Learning and Analytics
Mental Health via Writing
Domain Adaptation and Few-Shot Learning
Spanish Linguistics and Language Studies
Oil and Gas Production Techniques
COVID-19 and Mental Health
Linguistics, Language Diversity, and Identity
Video Analysis and Summarization
Academic integrity and plagiarism
Web Application Security Vulnerabilities
Imbalanced Data Classification Techniques
Translation Studies and Practices
Pneumonia and Respiratory Infections

NortonLifeLock (United States)
2023

Universitat Politècnica de València
2012-2023

GfK (Germany)
2017-2018

Polytechnic University of Puerto Rico
2017

A systematic study of knowledge graph analysis for cross-language plagiarism detection

OPENALEX - Publications

Marc Franco-Salvador Paolo Rosso Manuel Montes-y-Gómez

10.1016/j.ipm.2015.12.004 article EN Information Processing & Management 2016-01-15

Cross-language plagiarism detection over continuous-space- and knowledge graph-based representations of language

Marc Franco-Salvador Parth Gupta Paolo Rosso Rafael E. Banchs

10.1016/j.knosys.2016.08.004 article EN Knowledge-Based Systems 2016-08-06

UH-PRHLT at SemEval-2016 Task 3: Combining Lexical and Semantic-based Features for Community Question Answering

Marc Franco-Salvador Sudipta Kar Thamar Solorio Paolo Rosso

In this work we describe the system built for the three English subtasks of SemEval 2016 Task 3 by the Department of Computer Science of the University of Houston (UH) and the Pattern Recognition and Human Language Technology (PRHLT) research center of the Universitat Politècnica de València: UH-PRHLT. Our system represents instances using both lexical and semantic-based similarity measures between text pairs. Our semantic features include the use of distributed representations of words, knowledge graphs generated with the BabelNet multilingual...

10.18653/v1/s16-1126 article EN cc-by Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016) 2016-01-01

Few-Shot Learning with Siamese Networks and Label Tuning

Thomas Müller Guillermo Pérez-Torró Marc Franco-Salvador

We study the problem of building text classifiers with little or no training data, commonly known as zero and few-shot classification. In recent years, an approach based on neural textual entailment models has been found to give strong results on a diverse range of tasks. In this work, we show that with proper pre-training, Siamese Networks that embed texts and labels offer a competitive alternative. These models allow for a large reduction in inference cost: constant in the number of labels rather than linear. Furthermore, we introduce label...

10.18653/v1/2022.acl-long.584 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01
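
As a rough illustration of the idea in this abstract, the sketch below embeds texts and labels with the same encoder and picks the closest label. The encoder here is a toy character-trigram counter chosen only so the example runs without dependencies; it is an assumption and not the pre-trained Siamese network from the paper. The function names (`embed`, `build_label_index`, `classify`) are hypothetical. The point it tries to show is that labels are embedded once, so each new text needs a single encoder pass regardless of how many labels exist.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Toy stand-in encoder (assumption): counts of character n-grams."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[gram] for gram, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_label_index(labels):
    """Embed every label once, ahead of time."""
    return {label: embed(label) for label in labels}


def classify(text, label_index):
    """One encoder pass for the text, then a cheap similarity per label."""
    vec = embed(text)
    return max(label_index, key=lambda label: cosine(vec, label_index[label]))


label_index = build_label_index(["sports", "politics", "technology"])
prediction = classify("local sports club wins the cup", label_index)
print(prediction)
```

An entailment-style classifier would instead run the model once per (text, label) pair, which is where the linear cost mentioned in the abstract comes from.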

A Knowledge-based Representation for Cross-Language Document Retrieval and Categorization

Marc Franco-Salvador Paolo Rosso Roberto Navigli

Current approaches to cross-language document retrieval and categorization are based on discriminative methods which represent documents in a low-dimensional vector space. In this paper we propose a shift from the supervised to the knowledge-based paradigm and provide a similarity measure which draws on BabelNet, a large multilingual knowledge resource. Our experiments show state-of-the-art results in cross-lingual document retrieval and categorization.

10.3115/v1/e14-1044 article EN cc-by 2014-01-01

A resource-light method for cross-lingual semantic textual similarity

Goran Glavaš Marc Franco-Salvador Simone Paolo Ponzetto Paolo Rosso

10.1016/j.knosys.2017.11.041 article EN Knowledge-Based Systems 2017-12-02

Cross-domain polarity classification using a knowledge-enhanced meta-classifier

Marc Franco-Salvador Fermín L. Cruz José A. Troyano Paolo Rosso

10.1016/j.knosys.2015.05.020 article EN Knowledge-Based Systems 2015-05-25

Single and Cross-domain Polarity Classification using String Kernels

Rosa M. Giménez-Pérez Marc Franco-Salvador Paolo Rosso

The polarity classification task aims at automatically identifying whether a subjective text is positive or negative. When the target domain is different from those where the model was trained, we refer to a cross-domain setting. That setting usually implies the use of a domain adaptation method. In this work, we study the single and cross-domain polarity classification tasks from the string kernels perspective. Contrary to classical domain adaptation methods, which employ texts from both domains to detect pivot features, we do not use the target domain for training. Our approach detects the lexical peculiarities that...

10.18653/v1/e17-2089 article EN cc-by 2017-01-01
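
For readers unfamiliar with string kernels, the sketch below shows one common variant, a character n-gram spectrum kernel, which scores two texts by the n-grams they share. The abstract does not say which kernel the authors used, so the choice of the spectrum kernel, the n-gram range, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(text, n):
    """Count the character n-grams of a fixed length n."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def spectrum_kernel(a, b, n_min=2, n_max=4):
    """Sum, over n-gram lengths, of the products of shared n-gram counts."""
    total = 0
    for n in range(n_min, n_max + 1):
        counts_a, counts_b = ngram_counts(a, n), ngram_counts(b, n)
        total += sum(c * counts_b[gram] for gram, c in counts_a.items())
    return total


def normalized_kernel(a, b, **kwargs):
    """Scale the kernel to [0, 1] so that text length matters less."""
    k_ab = spectrum_kernel(a, b, **kwargs)
    denom = math.sqrt(spectrum_kernel(a, a, **kwargs) * spectrum_kernel(b, b, **kwargs))
    return k_ab / denom if denom else 0.0


print(normalized_kernel("a great movie", "a great book"))
print(normalized_kernel("a great movie", "terrible service"))
```

A kernel matrix built this way over a training set can be passed to any kernel-based classifier; no pivot features or target-domain texts are needed to compute it.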

SymantoResearch at SemEval-2019 Task 3: Combined Neural Models for Emotion Classification in Human-Chatbot Conversations

Angelo Basile Marc Franco-Salvador Neha Pawar Sanja Štajner Mara Chinea Ríos and 1 more

In this paper, we present our participation in the EmoContext shared task on detecting emotions in English textual conversations between a human and a chatbot. We propose four neural systems and combine them to further improve the results. We show that the ensemble can successfully distinguish three emotions (SAD, HAPPY, ANGRY) and separate them from the rest (OTHERS) in a highly-imbalanced scenario. Our best system achieved a 0.77 F1-score and was ranked fourth out of 165 submissions.

10.18653/v1/s19-2057 article EN cc-by 2019-01-01

Sentence Alignment Methods for Improving Text Simplification Systems

Sanja Štajner Marc Franco-Salvador Simone Paolo Ponzetto Paolo Rosso Heiner Stuckenschmidt

Sanja Štajner, Marc Franco-Salvador, Simone Paolo Ponzetto, Paolo Rosso, Heiner Stuckenschmidt. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2017.

10.18653/v1/p17-2016 article EN cc-by 2017-01-01

Indications of Depressive Symptoms During the COVID-19 Pandemic in Germany: Comparison of National Survey and Twitter Data

Caroline Cohrdes Seren Yenikent Jiawen Wu Bilal Ghanem Marc Franco-Salvador and 1 more

Background: The current COVID-19 pandemic is associated with extensive individual and societal challenges, including challenges to both physical and mental health. To date, the development of mental health problems such as depressive symptoms accompanying population-based federal distancing measures is largely unknown, and opportunities for rapid, effective, and valid monitoring are currently a relevant matter of investigation. Objective: In this study, we aim to investigate, first, the temporal progression of depressive symptoms during the COVID-19 pandemic and,...

10.2196/27140 article EN cc-by JMIR Mental Health 2021-06-18

Multilingual phrase sampling for text entry evaluations

Marc Franco-Salvador Luis A. Leiva

10.1016/j.ijhcs.2018.01.006 article EN International Journal of Human-Computer Studies 2018-01-31

Overview of AuTexTification at IberLEF 2023: Detection and Attribution of Machine-Generated Text in Multiple Domains

Areg Mikael Sarvazyan José Ángel González Marc Franco-Salvador Francisco Rangel Berta Chulvi and 1 more

This paper presents the overview of the AuTexTification shared task as part of the IberLEF 2023 Workshop in Iberian Languages Evaluation Forum, within the framework of the SEPLN conference. AuTexTification consists of two subtasks: for Subtask 1, participants had to determine whether a text is human-authored or has been generated by a large language model. For Subtask 2, participants had to attribute a machine-generated text to one of six different text generation models. Our dataset contains more than 160,000 texts across two languages (English and Spanish) and five domains (tweets,...

10.48550/arxiv.2309.11285 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01

Bridging the Native Language and Language Variety Identification Tasks

Marc Franco-Salvador Greg Kondrak Paolo Rosso

The objective of Native Language Identification is to determine the native language of the author of a text that he or she wrote in another language. By contrast, Language Variety Identification aims at classifying texts representing different varieties of a single language. We postulate that both tasks may be reduced to a single objective, which is to identify the language variety of a text. We design a general approach that combines string kernels and word embeddings, which capture characteristics of texts. The results of our experiments show that the approach achieves excellent results on both tasks, without any task-specific adaptations.

10.1016/j.procs.2017.08.068 article EN Procedia Computer Science 2017-01-01

Semantically-informed distance and similarity measures for paraphrase plagiarism identification

Miguel Á. Álvarez‐Carmona Marc Franco-Salvador Esaú Villatoro-Tello Manuel Montes-y-Gómez Paolo Rosso and 1 more

Paraphrase plagiarism identification represents a very complex task given that plagiarized texts are intentionally modified through several rewording techniques. Accordingly, this paper introduces two new measures for evaluating the relatedness of texts: a semantically-informed similarity measure and a semantically-informed edit distance. Both measures are able to extract semantic information from either an external resource or a distributed representation of words, resulting in informative features for training a supervised classifier...

10.3233/jifs-169483 article EN Journal of Intelligent & Fuzzy Systems 2018-05-18
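
One plausible reading of a semantically-informed edit distance is a word-level Levenshtein distance in which replacing a word with a close synonym costs less than replacing it with an unrelated word. The sketch below follows that reading. The cost scheme (substitution cost of one minus similarity), the `toy_similarity` function, and its tiny synonym table are hypothetical stand-ins for the external resource or word embeddings mentioned in the abstract, and may differ from the authors' formulation.

```python
def semantic_edit_distance(a, b, sim):
    """Word-level edit distance; substitutions cost 1 - sim(word_a, word_b)."""
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # delete all words of a
    for j in range(1, m + 1):
        d[0][j] = float(j)  # insert all words of b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = 1.0 - sim(a[i - 1], b[j - 1])
            d[i][j] = min(
                d[i - 1][j] + 1.0,               # deletion
                d[i][j - 1] + 1.0,               # insertion
                d[i - 1][j - 1] + substitution,  # (semantic) substitution
            )
    return d[n][m]


# Hypothetical similarity table standing in for a real lexical resource.
TOY_SIMILARITY = {frozenset(("car", "automobile")): 0.9}


def toy_similarity(w1, w2):
    """Return 1.0 for identical words, a table value for known pairs, else 0.0."""
    if w1 == w2:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((w1, w2)), 0.0)


original = "the car stopped".split()
paraphrase = "the automobile stopped".split()
print(semantic_edit_distance(original, paraphrase, toy_similarity))
```

With a similarity function that returns 0.0 for every pair of distinct words, the measure reduces to the ordinary word-level Levenshtein distance.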
