NFDI4DS | UHH-SEMS - Publication Details

Parth Gupta

ORCID: 0000-0003-0232-3412

About

Contact & Profiles

A5102919542

Research Areas

Natural Language Processing Techniques
Topic Modeling
Text Readability and Simplification
Authorship Attribution and Profiling
Text and Document Classification Technologies
Web Data Mining and Analysis
Information Retrieval and Search Behavior
Speech Recognition and Synthesis
Academic integrity and plagiarism
Recommender Systems and Techniques
Data Stream Mining Techniques
Algorithms and Data Compression
Advanced Text Analysis Techniques
Bayesian Modeling and Causal Inference
Mobile Crowdsensing and Crowdsourcing
Spanish Linguistics and Language Studies
Computational and Text Analysis Methods
Imbalanced Data Classification Techniques
Image Retrieval and Classification Techniques
Handwritten Text Recognition Techniques
Advanced Bandit Algorithms Research
Semantic Web and Ontologies
Adversarial Robustness in Machine Learning
Scientific Research and Technology
Image Enhancement Techniques

Amazon (United States)
2020-2024

Guru Gobind Singh Indraprastha University
2023

Search
2020-2023

Indian Institute of Technology Bombay
2023

Amazon (Germany)
2020

Indian Institute of Technology Roorkee
2020

Amity University
2020

Universitat Politècnica de València
2011-2017

Institute for Infocomm Research
2017

International Institute of Information Technology, Hyderabad
2014

Query expansion for mixed-script information retrieval

OPENALEX - Publications

Parth Gupta, Kalika Bali, Rafael E. Banchs, Monojit Choudhury, Paolo Rosso

For many languages that use non-Roman indigenous scripts (e.g., Arabic, Greek and Indic languages), one can often find a large amount of user-generated transliterated content on the Web in Roman script. Such content creates a monolingual or multilingual space with more than one script, which we refer to as the Mixed-Script space. IR in the mixed-script space is challenging because queries written in either the native or the Roman script need to be matched to documents written in both scripts. Moreover, transliterated content features extensive spelling variations. In this paper,...

10.1145/2600428.2609622 article EN 2014-07-03

Methods for cross-language plagiarism detection

Alberto Barrón‐Cedeño, Parth Gupta, Paolo Rosso

10.1016/j.knosys.2013.06.018 article EN Knowledge-Based Systems 2013-07-04

Cross-language plagiarism detection over continuous-space- and knowledge graph-based representations of language

Marc Franco-Salvador, Parth Gupta, Paolo Rosso, Rafael E. Banchs

10.1016/j.knosys.2016.08.004 article EN Knowledge-Based Systems 2016-08-06

Continuous space models for CLIR

Parth Gupta, Rafael E. Banchs, Paolo Rosso

10.1016/j.ipm.2016.11.002 article EN Information Processing & Management 2016-12-07

Can Clicks Be Both Labels and Features?

Tao Yang, Chen Luo, Hanqing Lu, Parth Gupta, Bing Yin, and 1 more

Using implicit feedback collected from user clicks as training labels for learning-to-rank algorithms is a well-developed paradigm that has been extensively studied and used in modern IR systems. Using clicks as ranking features, on the other hand, has not been fully explored in the existing literature. Despite its potential for improving short-term system performance, whether the incorporation of click features is beneficial to systems in the long term is still questionable. The two most important problems are (1) the explicit bias introduced by noisy...

10.1145/3477495.3531948 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

A Hybrid Multi-focus Image Fusion Technique using SWT and PCA

Tushar Tyagi, Parth Gupta, Prabhishek Singh

This paper presents a new hybrid, parallel-processing image fusion technique for multi-focus images. Two different methods are used, Stationary Wavelet Transform (SWT) and Principal Component Analysis (PCA), which are applied to the input images in parallel. Both methods are applied to the same dataset. Although the proposed method is computationally somewhat slower than the methods it is compared against, it still shows better results. The fused images obtained from SWT and PCA are then fused again. The result of the proposed technique is compared with other...

10.1109/confluence47617.2020.9057960 article EN 2022 12th International Conference on Cloud Computing, Data Science & Engineering (Confluence) 2020-01-01

Mitigating Exploitation Bias in Learning to Rank with an Uncertainty-aware Empirical Bayes Approach

Tao Yang, Cuize Han, Chen Luo, Parth Gupta, Jeff M. Phillips, and 1 more

Ranking is at the core of many artificial intelligence (AI) applications, including search engines and recommender systems. Modern ranking systems are often constructed with learning-to-rank (LTR) models built from user behavior signals. While previous studies have demonstrated the effectiveness of using such signals (e.g., clicks) as both features and labels for LTR algorithms, we argue that existing algorithms that indiscriminately treat behavior and non-behavior signals in the input could lead to suboptimal performance in practice....

10.1145/3589334.3645487 article EN Proceedings of the ACM Web Conference 2022 2024-05-08

PAN@FIRE

Parth Gupta, Paul Clough, Paolo Rosso, Mark Stevenson, Rafael E. Banchs

The automatic alignment of documents in a quasi-comparable corpus is an important research problem for resource-poor cross-language technologies. News stories form one of the most prolific and abundant language resources. The PAN@FIRE task, Cross-Language !ndian News Story Search (CL!NSS), aimed to address the news story linking task across English and Hindi. We present an overview of the track with results and analysis.

10.1145/2701336.2701639 article EN 2013-12-04

Mapping farmer vulnerability to target interventions for climate-resilient agriculture: science in practice

Pooja Prasad, Parth Gupta, Hemant Belsare, Chirag M. Mahendra, Manasi Bhopale, and 2 more

Farmers in dryland regions are highly vulnerable to rainfall variability. This vulnerability is unequal, as it is mediated by biophysical and social factors. Implementing policies for climate resilience requires identifying the farmers who are most vulnerable to extreme events such as dry spells. We develop a novel approach for conceptualizing dry spell vulnerability at the farm scale in terms of monsoon crop water deficit. Using inputs of weather, terrain, soil properties, land-use-land-cover, and cadastral maps, our tool models an...

10.2166/wp.2023.036 article EN cc-by Water Policy 2023-07-19

Squeezing bottlenecks: Exploring the limits of autoencoder semantic representation capabilities

Parth Gupta, Rafael E. Banchs, Paolo Rosso

10.1016/j.neucom.2015.06.091 article EN Neurocomputing 2015-11-10

Treating Cold Start in Product Search by Priors

Parth Gupta, Tommaso Dreossi, Jan Bakus, Yu-Hsiang Lin, Vamsi Salaka

New products on e-commerce platforms suffer from cold start, both in recommendation and in search. In this study, we present experiments to deal with cold start in search by predicting priors for behavioral features in a learning-to-rank setup. The offline results show that our technique generates priors which closely track the posterior values. The online A/B test on 140MM queries shows that the treatment improves new product impressions and increases customer engagement, pointing to their relevance and quality.

10.1145/3366424.3382705 article EN Companion Proceedings of The Web Conference 2018 2020-04-20

A deep source-context feature for lexical selection in statistical machine translation

Parth Gupta, Marta R. Costa‐jussà, Paolo Rosso, Rafael E. Banchs

10.1016/j.patrec.2016.02.014 article EN Pattern Recognition Letters 2016-03-12

Cross-language Plagiarism Detection Using BabelNet’s Statistical Dictionary

Marc Franco-Salvador, Parth Gupta, Paolo Rosso

In recent years there have been important advances in the field of automatic plagiarism detection. One of them is cross-language plagiarism detection, which deals with detecting plagiarism between documents in different languages. Most existing approaches to this task use statistical dictionaries to handle the translations of words in the documents. A statistical dictionary provides, for a given word, a list of possible translations with their respective probabilities. The aim of this work is to analyze the performance of the semantic network...

10.13053/cys-16-4-1439 article ES Computación y Sistemas 2012-12-14

Seasonal Relevance in E-Commerce Search

Haode Yang, Parth Gupta, Roberto F. Galán, Dan Bu, Dongmei Jia

Seasonality is an important dimension of relevance in e-commerce search. For example, the query "jacket" has a different set of relevant documents in winter than in summer. For an optimal user experience, search engines should incorporate seasonality in product search. In this paper, we formally introduce the concept of seasonal relevance, define it, and quantify it using data from a major store. In our analyses, we find that 39% of queries are highly seasonally relevant to time and would benefit from handling seasonality in ranking. We propose LogSR and VelSR features to capture...

10.1145/3459637.3481951 article EN 2021-10-26
