Alireza Salemi

ORCID: 0009-0006-1937-2615
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Domain Adaptation and Few-Shot Learning
  • Advanced Image and Video Retrieval Techniques
  • Recommender Systems and Techniques
  • Multimodal Machine Learning Applications
  • Text Readability and Simplification
  • Speech and dialogue systems
  • Software Engineering Research
  • Web Data Mining and Analysis
  • Advanced Graph Neural Networks
  • Algorithms and Data Compression
  • Image Retrieval and Classification Techniques
  • Speech Recognition and Synthesis
  • Data Quality and Management
  • Information Retrieval and Search Behavior
  • Hate Speech and Cyberbullying Detection
  • Machine Learning and Data Classification
  • Machine Learning and Algorithms
  • Privacy-Preserving Technologies in Data
  • Handwritten Text Recognition Techniques

Amherst College
2025

University of Massachusetts Amherst
2023-2025

University of Tehran
2021-2023

Publications

10.1145/3626772.3657957 article EN Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024-07-10

10.1145/3626772.3657783 article EN Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024-07-10

Knowledge-Intensive Visual Question Answering (KI-VQA) refers to answering a question about an image whose answer does not lie in the image. This paper presents a new pipeline for KI-VQA tasks, consisting of a retriever and a reader. First, we introduce DEDR, a symmetric dual encoding dense retrieval framework in which documents and queries are encoded into a shared embedding space using uni-modal (textual) and multi-modal encoders. We introduce an iterative knowledge distillation approach that bridges the gap between...

10.1145/3539618.3591629 article EN Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023-07-18
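
A minimal sketch of the symmetric dual encoding idea in the DEDR entry above: queries and documents are mapped into one shared embedding space and scored by inner product. The numpy projection matrices stand in for the paper's uni-modal and multi-modal encoders; everything here is an illustrative assumption, not the paper's code.

import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # shared embedding dimension

# Toy "encoders": in a symmetric design, both queries and documents
# land in the SAME space, regardless of which encoder produced them.
W_text = rng.normal(size=(300, DIM))   # stand-in uni-modal (textual) encoder
W_multi = rng.normal(size=(300, DIM))  # stand-in multi-modal encoder

def encode(features: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Project raw features into the shared space and L2-normalize."""
    z = features @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

# One multi-modal query (e.g., image+question features) and a text corpus.
query = encode(rng.normal(size=(1, 300)), W_multi)
docs = encode(rng.normal(size=(1000, 300)), W_text)

# Symmetric scoring: inner product in the shared embedding space.
scores = (query @ docs.T).ravel()
print("top-5 document ids:", np.argsort(-scores)[:5])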

This paper presents ICAT, an evaluation framework for measuring coverage of diverse factual information in long-form text generation. ICAT breaks down a long output into a list of atomic claims and not only verifies each claim through retrieval from a (reliable) knowledge source, but also computes the alignment between the atomic claims and various aspects expected to be presented in the output. We study three implementations of this framework, each with a different assumption on the availability of aspects and the alignment method. By adopting data from the diversification task...

10.48550/arxiv.2501.03545 preprint EN arXiv (Cornell University) 2025-01-07
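
A hedged sketch of the ICAT-style pipeline described above: decompose a long output into atomic claims, verify each claim against a knowledge source, and compute how many expected aspects the supported claims cover. All helper functions here are hypothetical stand-ins (a real system would use an LLM for decomposition and a retriever for verification).

from dataclasses import dataclass

@dataclass
class Claim:
    text: str
    aspect: str  # which expected aspect the claim addresses

def extract_claims(output: str) -> list[Claim]:
    # Stand-in decomposition: one claim per sentence, aspect = first word.
    return [Claim(s.strip(), aspect=s.split()[0].lower())
            for s in output.split(".") if s.strip()]

def verify(claim: Claim, knowledge: set[str]) -> bool:
    # Stand-in for retrieval-based verification against a reliable source.
    return claim.text.lower() in knowledge

def icat_style_score(output: str, expected_aspects: set[str],
                     knowledge: set[str]) -> float:
    claims = extract_claims(output)
    supported = [c for c in claims if verify(c, knowledge)]
    covered = {c.aspect for c in supported} & expected_aspects
    return len(covered) / max(len(expected_aspects), 1)

knowledge = {"water boils at 100 c at sea level"}
print(icat_style_score("Water boils at 100 C at sea level. Cats bark",
                       expected_aspects={"water", "cats"},
                       knowledge=knowledge))  # -> 0.5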

Personalized text generation requires a unique ability of large language models (LLMs) to learn from context that they often do not encounter during their standard training. One way to encourage LLMs to better use personalized context for generating outputs that align with the user's expectations is to instruct them to reason over the user's past preferences, background knowledge, or writing style. To achieve this, we propose Reasoning-Enhanced Self-Training for Personalized Text Generation (REST-PG), a framework that trains LLMs to reason over personal data during response...

10.48550/arxiv.2501.04167 preprint EN arXiv (Cornell University) 2025-01-07
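
A speculative sketch of a REST-PG-style self-training iteration under the description above: generate reasoning over the user's data, sample responses conditioned on that reasoning, keep only high-reward outputs, and fine-tune on them. The llm_generate, reward, and llm_finetune hooks are hypothetical stand-ins.

def llm_generate(prompt: str) -> str:
    return "stub response"  # stand-in for an LLM call

def reward(output: str, expected: str) -> float:
    # Stand-in reward: token overlap with the expected output.
    o, e = set(output.split()), set(expected.split())
    return len(o & e) / max(len(e), 1)

def llm_finetune(examples: list[tuple[str, str]]) -> None:
    print(f"fine-tuning on {len(examples)} self-generated examples")

def rest_pg_iteration(tasks, threshold=0.5, samples=4):
    kept = []
    for profile, prompt, expected in tasks:
        # First reason over the user's preferences and writing style ...
        reasoning = llm_generate(
            f"Summarize this user's preferences and style:\n{profile}")
        for _ in range(samples):
            # ... then generate a personalized response conditioned on it.
            out = llm_generate(f"{reasoning}\n\nTask: {prompt}")
            if reward(out, expected) >= threshold:  # keep high-reward outputs
                kept.append((f"{profile}\n{prompt}", f"{reasoning}\n{out}"))
    llm_finetune(kept)  # train on the retained reasoning+response pairs

rest_pg_iteration([("likes concise emails", "Write a reply", "stub response")])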

Evaluating personalized text generated by large language models (LLMs) is challenging, as only the LLM user, i.e., the prompt author, can reliably assess the output, but re-engaging the same individuals across studies is infeasible. This paper addresses the challenge of evaluating personalized text generation by introducing ExPerT, an explainable reference-based evaluation framework. ExPerT leverages an LLM to extract atomic aspects and their evidence from the reference texts, match the aspects, and evaluate the alignment based on content and writing style...

10.48550/arxiv.2501.14956 preprint EN arXiv (Cornell University) 2025-01-24

10.1145/3626772.3657733 article EN Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024-07-10

Long-text generation is seemingly ubiquitous in real-world applications of large language models, such as generating an email or writing a review. Despite the fundamental importance and prevalence of long-text generation in many practical applications, existing work on personalized generation has focused on very short text. To overcome these limitations, we study the problem of personalized long-text generation, that is, generating long text that is personalized for a specific user while being practically useful for the vast majority of real-world applications that naturally require longer text. In this work, we demonstrate the importance of user-specific...

10.48550/arxiv.2407.11016 preprint EN arXiv (Cornell University) 2024-06-26

This paper studies a category of visual question answering tasks in which accessing external knowledge is necessary for answering the questions. This category is called outside-knowledge visual question answering (OK-VQA). A major step in developing OK-VQA systems is to retrieve relevant documents for the given multi-modal query. The current state-of-the-art asymmetric dense retrieval model for this task uses an architecture with a multi-modal query encoder and a uni-modal document encoder. Such an architecture requires a large amount of training data for effective performance. We propose an automatic...

10.1145/3578337.3605137 article EN Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval 2023-08-09

In the field of language modeling, models augmented with retrieval components have emerged as a promising solution to address several challenges faced in the natural language processing (NLP) field, including knowledge grounding, interpretability, and scalability. Despite the primary focus on NLP, we posit that the paradigm of retrieval-enhancement can be extended to a broader spectrum of machine learning (ML) problems, such as computer vision, time series prediction, and computational biology. Therefore, this work introduces a formal...

10.48550/arxiv.2407.12982 preprint EN arXiv (Cornell University) 2024-07-17

This paper highlights the importance of personalization in large language models and introduces the LaMP benchmark -- a novel benchmark for training and evaluating language models for producing personalized outputs. LaMP offers a comprehensive evaluation framework with diverse language tasks and multiple entries for each user profile. It consists of seven personalized tasks, spanning three text classification and four text generation tasks. We additionally propose two retrieval augmentation approaches that retrieve personal items from each user profile for personalizing language model outputs. To this aim, ...

10.48550/arxiv.2304.11406 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01
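
A minimal sketch of the retrieval-augmentation idea in the LaMP entry above: score the user's profile entries against the task input, then prepend the top-k entries to the prompt. The term-overlap scorer is a simple stand-in for BM25 or a dense retriever; the data is illustrative.

def score(query: str, entry: str) -> float:
    # Stand-in relevance: fraction of query terms appearing in the entry.
    q, e = set(query.lower().split()), set(entry.lower().split())
    return len(q & e) / (len(q) + 1e-9)

def personalize_prompt(task_input: str, profile: list[str], k: int = 2) -> str:
    top = sorted(profile, key=lambda e: score(task_input, e), reverse=True)[:k]
    context = "\n".join(f"- {e}" for e in top)
    return f"User history:\n{context}\n\nTask: {task_input}"

profile = ["reviewed a sci-fi novel: loved the world-building",
           "rated a cookbook 2 stars: recipes too vague",
           "reviewed a space opera: praised the pacing"]
print(personalize_prompt("Write a review title for this sci-fi book", profile))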

This paper studies retrieval-augmented approaches for personalizing large language models (LLMs), which potentially have a substantial impact on various applications and domains. We propose the first attempt to optimize the retrieval models that deliver a limited number of personal documents to large language models for the purpose of personalized generation. We develop two optimization algorithms that solicit feedback from the downstream personalized generation tasks for retrieval optimization -- one based on reinforcement learning whose reward function is defined using any...

10.48550/arxiv.2404.05970 preprint EN arXiv (Cornell University) 2024-04-08
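
A toy sketch loosely in the spirit of the reinforcement-learning variant above: a softmax retrieval policy over personal documents is nudged (REINFORCE-style) toward documents whose inclusion yields higher downstream generation reward. The reward function is a stand-in; the real one would evaluate the personalized output.

import numpy as np

rng = np.random.default_rng(0)
n_docs, lr = 5, 0.5
logits = np.zeros(n_docs)  # retrieval policy over personal documents

def downstream_reward(doc_id: int) -> float:
    # Stand-in: in practice, this comes from scoring the generation
    # produced with the retrieved document in the prompt.
    return 1.0 if doc_id == 3 else 0.1

for step in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax policy
    doc = rng.choice(n_docs, p=probs)              # sample one document
    r = downstream_reward(doc)
    grad = -probs
    grad[doc] += 1.0                # d log pi(doc) / d logits = onehot - probs
    logits += lr * r * grad         # policy-gradient update

print("learned retrieval distribution:", np.round(probs, 2))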

Evaluating retrieval-augmented generation (RAG) presents challenges, particularly for retrieval models within these systems. Traditional end-to-end evaluation methods are computationally expensive. Furthermore, evaluation of the retrieval model's performance based on query-document relevance labels shows a small correlation with the RAG system's downstream performance. We propose a novel evaluation approach, eRAG, where each document in the retrieval list is individually utilized by the large language model within the RAG system. The output generated for each document is then...

10.48550/arxiv.2404.13781 preprint EN arXiv (Cornell University) 2024-04-21
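
A hedged sketch of the eRAG idea described above: feed each retrieved document to the LLM individually, score the resulting output against the downstream label, and treat those scores as document-level relevance for standard ranking metrics. llm_answer and output_quality are hypothetical stand-ins for the RAG system's generator and its evaluation metric.

def llm_answer(question: str, document: str) -> str:
    return document  # stub: pretend the model answers from the document

def output_quality(answer: str, gold: str) -> float:
    return 1.0 if gold.lower() in answer.lower() else 0.0  # stand-in metric

def erag_labels(question: str, gold: str, ranked_docs: list[str]) -> list[float]:
    # One LLM call per document; each score becomes that document's label.
    return [output_quality(llm_answer(question, d), gold) for d in ranked_docs]

docs = ["Paris is the capital of France.", "Berlin is in Germany."]
labels = erag_labels("What is the capital of France?", "Paris", docs)
# Any standard ranking metric can now consume `labels`, e.g. precision@1:
print("precision@1:", labels[0])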

This paper introduces uRAG -- a framework with a unified retrieval engine that serves multiple downstream retrieval-augmented generation (RAG) systems. Each RAG system consumes the retrieval results for a unique purpose, such as open-domain question answering, fact verification, entity linking, and relation extraction. We introduce a generic training guideline that standardizes the communication between the search engine and the downstream RAG systems that engage in optimizing the retrieval model. This lays the groundwork for us to build a large-scale experimentation ecosystem...

10.48550/arxiv.2405.00175 preprint EN arXiv (Cornell University) 2024-04-30

This paper investigates the design of a unified search engine to serve multiple retrieval-augmented generation (RAG) agents, each with a distinct task, backbone large language model (LLM), and retrieval-augmentation strategy. We introduce an iterative approach in which the search engine generates retrieval results for these RAG agents and gathers feedback on the quality of the retrieved documents during an offline phase. This feedback is then used to iteratively optimize the search engine using a novel expectation-maximization algorithm, with the goal of maximizing each agent's...

10.48550/arxiv.2410.09942 preprint EN arXiv (Cornell University) 2024-10-13
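
A speculative sketch of the offline loop described above, with an EM-flavored alternation: the engine retrieves for every agent (E-step stand-in: agents score document usefulness for their own tasks), then the retriever is re-trained on those inferred labels (M-step stand-in). All functions are hypothetical placeholders.

def retrieve(query: str, k: int = 5) -> list[str]:
    return [f"doc{i}" for i in range(k)]  # stub search engine

def agent_feedback(agent: str, query: str, docs: list[str]) -> list[float]:
    # E-step stand-in: each agent judges how useful each document was
    # for its own downstream task (QA, fact checking, ...).
    return [1.0 if i == 0 else 0.0 for i, _ in enumerate(docs)]

def retrain_retriever(examples: list[tuple[str, str, float]]) -> None:
    # M-step stand-in: fit the retriever to the inferred labels.
    print(f"retraining on {len(examples)} (query, doc, score) triples")

def offline_iteration(agents: list[str], queries: list[str]) -> None:
    examples = []
    for q in queries:
        docs = retrieve(q)
        for agent in agents:
            for doc, s in zip(docs, agent_feedback(agent, q, docs)):
                examples.append((q, doc, s))
    retrain_retriever(examples)  # repeat until convergence

offline_iteration(["qa_agent", "fact_check_agent"], ["what is RAG?"])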

Privacy-preserving methods for personalizing large language models (LLMs) are relatively under-explored. There are two schools of thought on this topic: (1) generating personalized outputs by personalizing the input prompt through retrieval augmentation from the user's personal information (RAG-based methods), and (2) parameter-efficient fine-tuning of LLMs per user that considers efficiency and space limitations (PEFT-based methods). This paper presents the first systematic comparison between the two approaches on a wide range...

10.48550/arxiv.2409.09510 preprint EN arXiv (Cornell University) 2024-09-14
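
An illustrative contrast of the two personalization families compared above. Both functions are hedged stand-ins, not the paper's implementation; the loader and adapter path are hypothetical names.

def personalize_rag(llm, prompt: str, user_docs: list[str]) -> str:
    # (1) RAG-based: retrieve from the user's own documents at inference
    # time and augment the prompt; the base model stays shared and frozen.
    context = "\n".join(user_docs[:3])  # stand-in retriever: take top items
    return llm(f"{context}\n\n{prompt}")

def personalize_peft(base_llm_loader, user_adapter_path: str, prompt: str) -> str:
    # (2) PEFT-based: load a small per-user adapter (e.g., LoRA weights)
    # on top of the frozen base model; no user text enters the prompt.
    llm = base_llm_loader(adapter=user_adapter_path)
    return llm(prompt)

demo_llm = lambda prompt: f"[generated from {len(prompt)} chars of prompt]"
print(personalize_rag(demo_llm, "Write a bio.", ["doc a", "doc b"]))
print(personalize_peft(lambda adapter=None: demo_llm, "user123.lora", "Write a bio."))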

Retrieval-enhanced machine learning (REML) refers to the use of information retrieval methods to support reasoning and inference in machine learning tasks. Although relatively recent, these approaches can substantially improve model performance. This includes improved generalization, knowledge grounding, scalability, freshness, attribution, interpretability, and on-device learning. To date, despite being influenced by work in the information retrieval community, REML research has predominantly been presented in natural language processing (NLP)...

10.1145/3673791.3698439 article EN cc-by-nd Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region 2024-12-08

An evolving solution to address hallucination and enhance accuracy in large language models (LLMs) is Retrieval-Augmented Generation (RAG), which involves augmenting LLMs with information retrieved from an external knowledge source, such as the web. This paper profiles several RAG execution pipelines and demystifies the complex interplay between their retrieval and generation phases. We demonstrate that while exact retrieval schemes are expensive, they can reduce inference time compared to approximate retrieval variants...

10.48550/arxiv.2412.15246 preprint EN arXiv (Cornell University) 2024-12-14
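
A small sketch of the exact-vs-approximate retrieval trade-off profiled above, using FAISS (pip install faiss-cpu). The index types, dimensions, and corpus size are illustrative choices, not the paper's configuration; only query latency is compared here.

import time
import numpy as np
import faiss

d, n = 128, 20_000
xb = np.random.default_rng(0).random((n, d)).astype("float32")
xq = xb[:16]  # a few queries

exact = faiss.IndexFlatL2(d)         # exhaustive (exact) search
exact.add(xb)

approx = faiss.IndexHNSWFlat(d, 32)  # HNSW graph index (approximate)
approx.add(xb)

for name, index in [("exact", exact), ("approx", approx)]:
    t0 = time.perf_counter()
    index.search(xq, 10)             # returns (distances, ids)
    print(f"{name}: {time.perf_counter() - t0:.4f}s")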

Detecting which parts of a sentence contribute to that sentence's toxicity, rather than providing a sentence-level verdict of hatefulness, would increase the interpretability of models and allow human moderators to better understand the outputs of the system. This paper presents our team's (UTNLP) methodology and results in the SemEval-2021 shared task 5 on toxic spans detection. We test multiple models and contextual embeddings and report the best setting out of all. The experiments start with keyword-based models and are followed by attention-based,...

10.18653/v1/2021.semeval-1.136 article EN cc-by Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021) 2021-01-01
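
A tiny sketch of the keyword-based baseline the abstract above mentions for toxic spans detection: return the character offsets of lexicon matches, in the span format used by SemEval-2021 Task 5. The lexicon is a placeholder assumption.

import re

TOXIC_LEXICON = {"idiot", "stupid"}  # stand-in word list

def toxic_spans(sentence: str) -> list[int]:
    """Return character offsets judged toxic (one offset per character)."""
    offsets = []
    for m in re.finditer(r"\w+", sentence):
        if m.group().lower() in TOXIC_LEXICON:
            offsets.extend(range(m.start(), m.end()))
    return offsets

print(toxic_spans("You are a stupid person"))  # offsets of "stupid"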

Abstractive text summarization is one of the areas influenced by the emergence of pre-trained language models. Current pre-training works in abstractive summarization give more points to the summaries with more words in common with the main text and pay less attention to the semantic similarity between the generated sentences and the original document. We propose ARMAN, a Transformer-based encoder-decoder model with three novel objectives to address this issue. In ARMAN, salient sentences from a document are selected according to a modified semantic score to be masked and form a pseudo summary. To summarize...

10.18653/v1/2021.emnlp-main.741 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021-01-01
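
A simplified sketch of the ARMAN-style pre-training example construction described above: pick salient sentences, mask them in the document, and use them as the pseudo-summary target. The length-based salience here is a naive stand-in for the paper's modified semantic score.

def make_pretraining_pair(document: str, n_salient: int = 1):
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    # Stand-in salience: longest sentences; ARMAN uses a modified semantic score.
    salient = set(sorted(sentences, key=len, reverse=True)[:n_salient])
    masked_doc = ". ".join("[MASK]" if s in salient else s for s in sentences)
    pseudo_summary = ". ".join(s for s in sentences if s in salient)
    return masked_doc, pseudo_summary  # (encoder input, decoder target)

doc = "The cat sat. A storm flooded the entire coastal town overnight. It rained."
print(make_pretraining_pair(doc))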

Abstractive text summarization is one of the areas influenced by the emergence of pre-trained language models. Current pre-training works in abstractive summarization give more points to the summaries with more words in common with the main text and pay less attention to the semantic similarity between the generated sentences and the original document. We propose ARMAN, a Transformer-based encoder-decoder model with three novel objectives to address this issue. In ARMAN, salient sentences from a document are selected according to a modified semantic score to be masked and form a pseudo summary. To summarize...

10.48550/arxiv.2109.04098 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Multilingual pre-training significantly improves many multilingual NLP tasks, including machine translation. Most existing methods are based on some variants of masked language modeling and text-denoising objectives on monolingual data. Pre-training on monolingual data ignores the availability of parallel data in many language pairs. Also, some other works integrate the available human-generated parallel translation data in their pre-training. This kind of data is definitely helpful, but it is limited even in high-resource language pairs. This paper introduces a novel semi-supervised method, SPDG, that...

10.48550/arxiv.2304.01282 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Knowledge-Intensive Visual Question Answering (KI-VQA) refers to answering a question about an image whose answer does not lie in the image. This paper presents a new pipeline for KI-VQA tasks, consisting of a retriever and a reader. First, we introduce DEDR, a symmetric dual encoding dense retrieval framework in which documents and queries are encoded into a shared embedding space using uni-modal (textual) and multi-modal encoders. We introduce an iterative knowledge distillation approach that bridges the gap between...

10.48550/arxiv.2304.13649 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01