- Topic Modeling
- Natural Language Processing Techniques
- Advanced Text Analysis Techniques
- Domain Adaptation and Few-Shot Learning
- Advanced Image and Video Retrieval Techniques
- Recommender Systems and Techniques
- Multimodal Machine Learning Applications
- Text Readability and Simplification
- Speech and Dialogue Systems
- Software Engineering Research
- Web Data Mining and Analysis
- Advanced Graph Neural Networks
- Algorithms and Data Compression
- Image Retrieval and Classification Techniques
- Speech Recognition and Synthesis
- Data Quality and Management
- Information Retrieval and Search Behavior
- Hate Speech and Cyberbullying Detection
- Machine Learning and Data Classification
- Machine Learning and Algorithms
- Privacy-Preserving Technologies in Data
- Handwritten Text Recognition Techniques
Amherst College, 2025
University of Massachusetts Amherst, 2023-2025
University of Tehran, 2021-2023
Knowledge-Intensive Visual Question Answering (KI-VQA) refers to answering a question about an image whose answer does not lie in the image. This paper presents a new pipeline for KI-VQA tasks, consisting of a retriever and a reader. First, we introduce DEDR, a symmetric dual encoding dense retrieval framework in which documents and queries are encoded into a shared embedding space using uni-modal (textual) and multi-modal encoders. We introduce an iterative knowledge distillation approach that bridges the gap between...
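As an illustration of the shared-space idea (not the DEDR implementation itself), the sketch below scores documents against a multi-modal query with plain dot products; `encode_text` and `encode_query` are hypothetical placeholder encoders.

```python
import numpy as np

def encode_text(text: str, dim: int = 8) -> np.ndarray:
    """Hypothetical uni-modal (textual) encoder; any model that maps
    text into the shared space would fill this role."""
    seed = sum(ord(c) for c in text) % (2**32)
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

def encode_query(image_feat: np.ndarray, question: str) -> np.ndarray:
    """Hypothetical multi-modal encoder: fuses image features with the
    question text into the same shared embedding space."""
    v = encode_text(question, dim=image_feat.shape[0]) + image_feat
    return v / np.linalg.norm(v)

# Because documents and queries share one embedding space, retrieval is
# a plain similarity search over document vectors.
docs = ["the eiffel tower is in paris", "zebras are native to africa"]
doc_matrix = np.stack([encode_text(d) for d in docs])

image_feat = np.ones(8) / np.sqrt(8)      # stand-in for real image features
query_vec = encode_query(image_feat, "where is this tower located?")
scores = doc_matrix @ query_vec           # dot-product relevance scores
print(docs[int(np.argmax(scores))])
```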
This paper presents ICAT, an evaluation framework for measuring the coverage of diverse factual information in long-form text generation. ICAT breaks down a long output text into a list of atomic claims and not only verifies each claim through retrieval from a (reliable) knowledge source, but also computes the alignment between the atomic factual claims and various aspects expected to be presented in the output. We study three implementations of this framework, each with a different assumption on the availability of aspects and the alignment method. By adopting data from the diversification task...
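A minimal sketch of the ICAT pipeline shape, assuming the simplest possible stand-ins: in the actual framework, claim splitting and verification are done by an LLM and a real retriever; here sentences play the role of atomic claims and word overlap plays the role of retrieval-based verification.

```python
def split_into_atomic_claims(output: str) -> list[str]:
    # Stand-in: treat each sentence as one atomic claim.
    return [s.strip() for s in output.split(".") if s.strip()]

def verify_claim(claim: str, knowledge_source: list[str]) -> bool:
    # Stand-in verifier: a claim counts as supported if a knowledge
    # snippet shares enough words with it (a real system retrieves
    # evidence and checks entailment).
    claim_words = set(claim.lower().split())
    return any(len(claim_words & set(k.lower().split())) >= 3
               for k in knowledge_source)

def coverage(output: str, expected_aspects: list[str],
             knowledge_source: list[str]) -> float:
    claims = [c for c in split_into_atomic_claims(output)
              if verify_claim(c, knowledge_source)]
    # Alignment stand-in: an aspect is covered if some verified claim mentions it.
    covered = {a for a in expected_aspects
               if any(a.lower() in c.lower() for c in claims)}
    return len(covered) / len(expected_aspects)

knowledge = ["The Eiffel Tower is located in Paris France",
             "The Eiffel Tower was completed in 1889"]
out = ("The Eiffel Tower is in Paris. It was completed in 1889. "
       "It is made of cheese.")
print(coverage(out, ["paris", "1889", "height"], knowledge))  # ~0.67: 'height' never covered
```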
Personalized text generation requires a unique ability of large language models (LLMs) to learn from context that they often do not encounter during their standard training. One way to encourage LLMs to better use personalized context for generating outputs that align with the user's expectations is to instruct them to reason over the user's past preferences, background knowledge, or writing style. To achieve this, we propose Reasoning-Enhanced Self-Training for Personalized Text Generation (REST-PG), a framework that trains LLMs to reason over personal data during response...
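At a very high level, one round of such self-training might look like the sketch below; `llm_generate`, `reward`, and `fine_tune` are hypothetical stand-ins, and the actual framework's sampling and filtering details differ.

```python
def self_training_round(llm_generate, reward, fine_tune,
                        prompts, profiles, threshold: float = 0.5):
    training_examples = []
    for prompt, profile in zip(prompts, profiles):
        # Ask the model to first reason over the user's preferences and
        # style, then produce the personalized response.
        output = llm_generate(
            f"Profile:\n{profile}\n\nReason about the user's preferences "
            f"and style, then respond to: {prompt}")
        if reward(prompt, profile, output) >= threshold:
            training_examples.append((prompt, profile, output))
    # Fine-tune on the model's own high-reward, reasoning-augmented outputs.
    return fine_tune(training_examples)

# Mock components so the sketch runs end to end.
def mock_llm(prompt: str) -> str:
    return "Reasoning: the user prefers brevity. Response: Short reply."

def mock_reward(prompt, profile, output) -> float:
    return 1.0 if "Response:" in output else 0.0

def mock_fine_tune(examples):
    print(f"fine-tuning on {len(examples)} self-generated examples")

self_training_round(mock_llm, mock_reward, mock_fine_tune,
                    ["write a greeting"], ["likes short messages"])
```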
Evaluating personalized text generated by large language models (LLMs) is challenging, since only the LLM user, i.e., the prompt author, can reliably assess the output, but re-engaging the same individuals across studies is infeasible. This paper addresses the challenge of evaluating personalized text generation by introducing ExPerT, an explainable reference-based evaluation framework. ExPerT leverages an LLM to extract atomic aspects and their evidence from the reference and generated texts, match the aspects, and evaluate their alignment based on content and writing style...
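In ExPerT the aspect and evidence extraction is itself LLM-based; the sketch below takes extracted aspects as given so it can focus on the matching-and-aggregation step. All names, evidence strings, and the overlap-based alignment score are illustrative stand-ins.

```python
ref_aspects = [("tone", "formal, concise"), ("topic", "quarterly sales growth")]
gen_aspects = [("tone", "casual, chatty"), ("topic", "quarterly sales growth")]

def aspect_alignment(ref_ev: str, gen_ev: str) -> float:
    # Stand-in for an LLM judgment of content/style alignment.
    r, g = set(ref_ev.split(", ")), set(gen_ev.split(", "))
    return len(r & g) / len(r | g)

def expert_score(ref: list, gen: list) -> float:
    gen_by_name = dict(gen)
    scores = [aspect_alignment(ev, gen_by_name[name])
              for name, ev in ref if name in gen_by_name]
    return sum(scores) / len(ref)  # average alignment over reference aspects

print(expert_score(ref_aspects, gen_aspects))  # 0.5: topic matches, tone does not
```

Because the score decomposes over named aspects, each number can be traced back to the aspect and evidence that produced it, which is what makes the evaluation explainable.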
Long-text generation is seemingly ubiquitous in real-world applications of large language models, such as generating an email or writing a review. Despite the fundamental importance and prevalence of long-text generation in many practical applications, existing work on personalized generation has focused on very short text. To overcome these limitations, we study the problem of personalized long-text generation, that is, generating long-form text tailored to a specific user while being practically useful for the vast majority of real-world applications that naturally require longer text. In this work, we demonstrate the importance of user-specific...
This paper studies a category of visual question answering tasks in which accessing external knowledge is necessary for answering the questions. This category is called outside-knowledge visual question answering (OK-VQA). A major step in developing OK-VQA systems is to retrieve relevant documents for the given multi-modal query. Current state-of-the-art asymmetric dense retrieval models for this task use an architecture with a multi-modal query encoder and a uni-modal document encoder. Such an architecture requires a large amount of training data for effective performance. We propose an automatic...
In the field of language modeling, models augmented with retrieval components have emerged as a promising solution to address several challenges faced in the natural language processing (NLP) field, including knowledge grounding, interpretability, and scalability. Despite the primary focus on NLP, we posit that the paradigm of retrieval-enhancement can be extended to a broader spectrum of machine learning (ML) problems, such as computer vision, time series prediction, and computational biology. Therefore, this work introduces a formal...
This paper highlights the importance of personalization in large language models and introduces the LaMP benchmark -- a novel benchmark for training and evaluating language models for producing personalized outputs. LaMP offers a comprehensive evaluation framework with diverse language tasks and multiple entries for each user profile. It consists of seven personalized tasks, spanning three text classification and four text generation tasks. We additionally propose two retrieval augmentation approaches that retrieve personal items from each user profile for personalizing language model outputs. To this aim,...
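To make the retrieval-augmentation idea concrete, here is a minimal sketch: retrieve the profile entries most relevant to the task input and prepend them to the prompt. The term-overlap retriever and prompt template below are illustrative stand-ins, not LaMP's actual retrievers or templates.

```python
def retrieve_from_profile(profile: list[str], query: str, k: int = 2) -> list[str]:
    # Stand-in term-overlap retriever over the user's profile entries.
    q = set(query.lower().split())
    ranked = sorted(profile,
                    key=lambda e: len(q & set(e.lower().split())),
                    reverse=True)
    return ranked[:k]

def personalized_prompt(profile: list[str], task_input: str) -> str:
    items = retrieve_from_profile(profile, task_input)
    context = "\n".join(f"- {it}" for it in items)
    return f"User history:\n{context}\n\nTask: {task_input}"

profile = ["reviewed a sci-fi novel: loved the world-building",
           "rated a cookbook 2 stars: recipes too vague",
           "reviewed a thriller: pacing felt slow"]
print(personalized_prompt(profile, "write a review for a new sci-fi book"))
```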
This paper studies retrieval-augmented approaches for personalizing large language models (LLMs), which potentially have a substantial impact on various applications and domains. We propose the first attempt to optimize the retrieval models that deliver a limited number of personal documents to LLMs for the purpose of personalized generation. We develop two optimization algorithms that solicit feedback from the downstream personalized generation tasks for retrieval optimization -- one based on reinforcement learning whose reward function is defined using any...
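The reinforcement-learning variant lends itself to a toy illustration. The sketch below is a generic REINFORCE loop over a linear document scorer with a hard-coded stand-in reward; it is not the paper's algorithm, where the reward comes from an actual downstream generation metric.

```python
import numpy as np

rng = np.random.default_rng(0)

doc_feats = rng.standard_normal((5, 4))   # 5 candidate personal documents
w = np.zeros(4)                            # retrieval policy parameters
best_doc = 3                               # pretend doc 3 helps generation most

def downstream_reward(doc_id: int) -> float:
    # Stand-in for any metric computed on the personalized generation.
    return 1.0 if doc_id == best_doc else 0.0

lr = 0.5
for step in range(200):
    logits = doc_feats @ w
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    doc_id = rng.choice(len(probs), p=probs)  # sample a document to feed the LLM
    r = downstream_reward(doc_id)
    # REINFORCE: push probability mass toward documents that earn reward.
    grad = doc_feats[doc_id] - probs @ doc_feats
    w += lr * r * grad

print(np.argmax(doc_feats @ w))  # with high probability prints 3
```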
Evaluating retrieval-augmented generation (RAG) presents challenges, particularly for the retrieval models within these systems. Traditional end-to-end evaluation methods are computationally expensive. Furthermore, evaluation of the retrieval model's performance based on query-document relevance labels shows a small correlation with the RAG system's downstream performance. We propose a novel approach, eRAG, where each document in the retrieval list is individually utilized by the large language model within the RAG system. The output generated for each document is then...
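A minimal sketch of this per-document idea, assuming hypothetical `llm` and exact-match scoring functions: each retrieved document is fed to the LLM on its own, and its output is scored against the task's ground truth, yielding document-level labels.

```python
def erag_scores(question: str, retrieved_docs: list[str],
                gold_answer: str, llm) -> list[float]:
    scores = []
    for doc in retrieved_docs:
        answer = llm(f"Context: {doc}\nQuestion: {question}")
        scores.append(1.0 if answer.strip() == gold_answer else 0.0)  # exact match
    return scores  # document-level labels, usable like relevance judgments

# Mock LLM that can only answer from its context.
def mock_llm(prompt: str) -> str:
    return "Paris" if "Paris" in prompt else "unknown"

docs = ["The Eiffel Tower is in Paris.", "Zebras live in Africa."]
print(erag_scores("Where is the Eiffel Tower?", docs, "Paris", mock_llm))
# [1.0, 0.0] -> per-document scores that can be aggregated with standard
# ranking metrics to evaluate the retriever.
```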
This paper introduces uRAG -- a framework with a unified retrieval engine that serves multiple downstream retrieval-augmented generation (RAG) systems. Each RAG system consumes the retrieval results for a unique purpose, such as open-domain question answering, fact verification, entity linking, and relation extraction. We introduce a generic training guideline that standardizes the communication between the search engine and the downstream RAG systems that engage in optimizing the retrieval model. This lays the groundwork for us to build a large-scale experimentation ecosystem...
This paper investigates the design of a unified search engine to serve multiple retrieval-augmented generation (RAG) agents, each with a distinct task, backbone large language model (LLM), and retrieval-augmentation strategy. We introduce an iterative approach in which the search engine generates retrieval results for these RAG agents and gathers feedback on the quality of the retrieved documents during an offline phase. This feedback is then used to iteratively optimize the search engine using a novel expectation-maximization algorithm, with the goal of maximizing each agent's...
Privacy-preserving methods for personalizing large language models (LLMs) are relatively under-explored. There are two schools of thought on this topic: (1) generating personalized outputs by personalizing the input prompt through retrieval augmentation from the user's personal information (RAG-based methods), and (2) parameter-efficient fine-tuning of LLMs per user that considers efficiency and space limitations (PEFT-based methods). This paper presents the first systematic comparison between the two approaches on a wide range...
Retrieval-enhanced machine learning (REML) refers to the use of information retrieval methods to support reasoning and inference in machine learning tasks. Although relatively recent, these approaches can substantially improve model performance. This includes improved generalization, knowledge grounding, scalability, freshness, attribution, interpretability, and on-device learning. To date, despite being influenced by work from the information retrieval community, REML research has predominantly been presented in natural language processing (NLP)...
An evolving solution to address hallucination and enhance accuracy in large language models (LLMs) is Retrieval-Augmented Generation (RAG), which involves augmenting LLMs with information retrieved from an external knowledge source, such as the web. This paper profiles several RAG execution pipelines and demystifies the complex interplay between their retrieval and generation phases. We demonstrate that while exact retrieval schemes are expensive, they can reduce inference time compared to approximate retrieval variants...
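To make the exact-versus-approximate distinction concrete, here is a minimal numpy comparison: brute-force scoring of every document versus an IVF-style scheme that probes only one cluster. Both index layouts are illustrative stand-ins for production approximate-nearest-neighbor libraries.

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = rng.standard_normal((10_000, 64)).astype(np.float32)
query = rng.standard_normal(64).astype(np.float32)

# Exact retrieval: score every document (accurate but costs O(N * d)).
exact_top = np.argmax(corpus @ query)

# Approximate retrieval (IVF-style sketch): pre-assign documents to a few
# centroids offline, then search only the closest cluster at query time.
centroids = corpus[rng.choice(len(corpus), 16, replace=False)]
assignment = np.argmax(corpus @ centroids.T, axis=1)
probe = np.argmax(centroids @ query)
candidates = np.where(assignment == probe)[0]
if candidates.size == 0:                       # guard for empty clusters
    candidates = np.arange(len(corpus))
approx_top = candidates[np.argmax(corpus[candidates] @ query)]

print(exact_top, approx_top)  # may disagree: the approximate search trades
                              # some accuracy for scanning ~1/16 of the corpus
```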
Detecting which parts of a sentence contribute to that sentence's toxicity -- rather than providing a sentence-level verdict of hatefulness -- would increase the interpretability of models and allow human moderators to better understand the outputs of the system. This paper presents our team's (UTNLP) methodology and results in the SemEval-2021 shared task 5 on toxic spans detection. We test multiple models and contextual embeddings and report the best setting out of all. The experiments start with keyword-based models and are followed by attention-based,...
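A minimal keyword-based baseline in the spirit of the task's span format: mark the character offsets of lexicon words as toxic spans. The tiny lexicon here is an illustrative stand-in, not the one used in the paper.

```python
import re

TOXIC_LEXICON = {"idiot", "stupid", "moron"}

def toxic_spans(sentence: str) -> list[int]:
    spans = []
    for match in re.finditer(r"\w+", sentence):
        if match.group().lower() in TOXIC_LEXICON:
            spans.extend(range(match.start(), match.end()))
    return spans  # character offsets, the format used by SemEval-2021 Task 5

print(toxic_spans("you are a stupid person"))  # offsets 10..15
```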
Abstractive text summarization is one of the areas influenced by the emergence of pre-trained language models. Current pre-training works in abstractive summarization give more points to summaries with more words in common with the main text and pay less attention to the semantic similarity between the generated sentences and the original document. We propose ARMAN, a Transformer-based encoder-decoder model with three novel pre-training objectives to address this issue. In ARMAN, salient sentences from a document are selected according to a modified semantic score to be masked and form a pseudo summary. To summarize...
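A rough sketch of the pseudo-summary construction step, using a simple word-overlap salience score as a stand-in for ARMAN's modified semantic score:

```python
def build_pseudo_summary(document: str, n_salient: int = 1):
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    doc_words = set(document.lower().split())
    # Rank sentences by overlap with the whole document (stand-in for
    # the modified semantic salience score used by ARMAN).
    scored = sorted(sentences,
                    key=lambda s: len(set(s.lower().split()) & doc_words),
                    reverse=True)
    salient = set(scored[:n_salient])
    masked_doc = ". ".join("[MASK]" if s in salient else s for s in sentences)
    pseudo_summary = ". ".join(s for s in sentences if s in salient)
    # The model is pre-trained to generate the pseudo summary from the
    # masked document, mimicking the downstream summarization task.
    return masked_doc, pseudo_summary

doc = ("The company reported record profits this quarter. "
       "Analysts were surprised by the growth. "
       "The CEO credited the new product line")
masked, summary = build_pseudo_summary(doc)
print(masked)
print(summary)
```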
Multilingual pre-training significantly improves many multilingual NLP tasks, including machine translation. Most existing methods are based on some variants of masked language modeling and text-denoising objectives on monolingual data. Pre-training on monolingual data ignores the availability of parallel data in many language pairs. Also, some other works integrate the available human-generated parallel translation data into their pre-training. This kind of parallel data is definitely helpful, but it is limited even in high-resource language pairs. This paper introduces a novel semi-supervised method, SPDG, that...