Helia Hashemi

ORCID: 0000-0001-7258-7849
Research Areas
  • Topic Modeling
  • Expert finding and Q&A systems
  • Multimodal Machine Learning Applications
  • Information Retrieval and Search Behavior
  • Natural Language Processing Techniques
  • Mobile Crowdsensing and Crowdsourcing
  • FinTech, Crowdfunding, Digital Finance
  • Domain Adaptation and Few-Shot Learning
  • Web Data Mining and Analysis
  • Speech and dialogue systems
  • Radio, Podcasts, and Digital Media
  • Recommender Systems and Techniques
  • Caching and Content Delivery
  • Advanced Data Storage Technologies

University of Massachusetts Amherst
2019-2023

Asking clarifying questions in response to ambiguous or faceted queries has been recognized as a useful technique for various information retrieval systems, especially conversational search systems with limited bandwidth interfaces. Analyzing and generating clarifying questions have been studied recently, but the accurate utilization of user responses to clarifying questions has been relatively less explored. In this paper, we enrich the representations learned by Transformer networks using a novel attention mechanism from external sources that weights...

10.1145/3397271.3401061 article EN 2020-07-25

Podcasts are spoken documents across a wide range of genres and styles, with growing listenership across the world and a rapidly lowering barrier to entry for both listeners and creators. The great strides in search and recommendation in research and industry have yet to see impact in the podcast space, where recommendations are still largely driven by word of mouth. In this perspective paper, we highlight the many differences between podcasts and other media, and discuss our perspective on challenges and future research directions in the domain of podcast information access.

10.1145/3404835.3462805 article EN Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2021-07-11

In information retrieval (IR), domain adaptation is the process of adapting a model to a new domain whose data distribution is different from the source domain. Existing methods in this area focus on unsupervised domain adaptation, where they have access to the target document collection, or supervised (often few-shot) domain adaptation, where they additionally have access to (limited) labeled data in the target domain. There also exists research on improving the zero-shot performance of models with no adaptation. This paper introduces a new category of domain adaptation in IR that is as-yet unexplored. Here, similar to the zero-shot setting, we assume the model does not...

10.1145/3578337.3605127 article EN 2023-08-09

Estimating the quality of a result list, often referred to as query performance prediction (QPP), is a challenging and important task in information retrieval. It can be used as feedback to users, search engines, and system administrators. Although predicting the performance of retrieval models has been extensively studied for the ad-hoc retrieval task, the effectiveness of performance prediction methods for question answering (QA) systems is relatively unstudied. The short length of answers, the dominance of neural models in QA, and the re-ranking nature of most QA systems make performance prediction for QA a unique, important, and technically...

10.1145/3341981.3344249 article EN 2019-09-26

Representation learning has always played an important role in information retrieval (IR) systems. Most retrieval models, including recent neural approaches, use representations to calculate similarities between queries and documents to find relevant information from a corpus. Recent models use large-scale pre-trained language models for query representation. The typical use of these models, however, has a major limitation in that they generate only a single representation for a query, which may have multiple intents or facets. The focus of this paper is to address...

10.1145/3459637.3482445 article EN 2021-10-26

This paper introduces a framework for the automated evaluation of natural language texts. A manually constructed rubric describes how to assess multiple dimensions of interest. To evaluate a text, a large language model (LLM) is prompted with each rubric question and produces a distribution over potential responses. The LLM predictions often fail to agree well with human judges; indeed, the humans do not fully agree with one another. However, the multiple LLM distributions can be combined to predict each human judge's annotations on all...

10.18653/v1/2024.acl-long.745 preprint EN 2024-01-01

Learning multiple intent representations for queries has potential applications in facet generation, document ranking, search result diversification, and search explanation. The state-of-the-art model for this task assumes that there is a sequence of intent representations. In this paper, we argue that the model should not be penalized as long as it generates an accurate and complete set of intent representations. Based on this intuition, we propose a stochastic permutation invariant approach for optimizing such networks. We extrinsically evaluate the proposed approach on facet generation...

10.1145/3511808.3557666 article EN Proceedings of the 31st ACM International Conference on Information & Knowledge Management 2022-10-15
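
The permutation-invariance idea in the abstract above can be illustrated with a small sketch. This is not the paper's stochastic method: it assumes a squared Euclidean distance between vectors and an exhaustive search over all pairings, which is only practical for very small sets. The names squared_distance and set_loss are hypothetical and chosen for illustration only.

```python
from itertools import permutations


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def set_loss(predicted, target):
    """Smallest total distance over all one-to-one pairings.

    Assumption for this sketch: both sets have the same size, and the
    exhaustive search over permutations stands in for any smarter
    (or stochastic) optimization strategy.
    """
    if len(predicted) != len(target):
        raise ValueError("this sketch assumes equally sized sets")
    return min(
        sum(squared_distance(p, t) for p, t in zip(predicted, perm))
        for perm in permutations(target)
    )


# The same three vectors, generated in a different order.
target = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
predicted = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

# An order-sensitive loss penalizes the reordering...
sequence_loss = sum(squared_distance(p, t) for p, t in zip(predicted, target))

# ...while the set loss does not, because some pairing matches exactly.
print(sequence_loss, set_loss(predicted, target))
```

Because the set loss only asks whether some pairing of predictions to targets is close, a model that produces the right representations in a different order is not penalized, which is the intuition the abstract describes.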

Over recent years, podcasts have emerged as a novel medium for sharing and broadcasting information over the Internet. Audio streaming platforms originally designed for music content, such as Amazon Music, Pandora, and Spotify, have reported rapid growth, with millions of users consuming podcasts every day. With podcasts emerging as a new medium for consuming information, the need to develop information access systems that enable efficient and effective discovery from a heterogeneous collection of podcasts is more important than ever. However, information access in such domains still remains understudied....

10.1145/3447548.3467188 article EN 2021-08-13

Considering the widespread use of mobile and voice search, answer passage retrieval for non-factoid questions plays a critical role in modern information retrieval systems. Despite the importance of the task, the community still feels the significant lack of large-scale non-factoid question answering collections with real questions and comprehensive relevance judgments. In this paper, we develop and release a collection of 2,626 open-domain non-factoid questions from a diverse set of categories. The dataset, called ANTIQUE, contains 34,011 manual relevance annotations. The questions were asked by users...

10.48550/arxiv.1905.08957 preprint EN other-oa arXiv (Cornell University) 2019-01-01

The rise in popularity of mobile and voice search has led to a shift in focus from document retrieval to short answer passage retrieval for non-factoid questions. Some of the questions have multiple answers, and the aim is to retrieve a set of relevant answer passages that covers all these alternatives. Compared to documents, answers are more specific and typically form more defined types or groups. Grouping answer passages based on strong similarity measures may provide a means of identifying these types. Typically, kNN clustering in combination with term-based...

10.1145/3471158.3472249 article EN 2021-07-11

Podcasts are spoken documents across a wide range of genres and styles, with growing listenership across the world and a rapidly lowering barrier to entry for both listeners and creators. The great strides in search and recommendation in research and industry have yet to see impact in the podcast space, where recommendations are still largely driven by word of mouth. In this perspective paper, we highlight the many differences between podcasts and other media, and discuss our perspective on challenges and future research directions in the domain of podcast information access.

10.48550/arxiv.2106.09227 preprint EN other-oa arXiv (Cornell University) 2021-01-01

In information retrieval (IR), domain adaptation is the process of adapting a model to a new domain whose data distribution is different from the source domain. Existing methods in this area focus on unsupervised domain adaptation, where they have access to the target document collection, or supervised (often few-shot) domain adaptation, where they additionally have access to (limited) labeled data in the target domain. There also exists research on improving the zero-shot performance of models with no adaptation. This paper introduces a new category of domain adaptation in IR that is as-yet unexplored. Here, similar to the zero-shot setting, we assume the model does not...

10.48550/arxiv.2307.02740 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Asking clarifying questions in response to ambiguous or faceted queries has been recognized as a useful technique for various information retrieval systems, especially conversational search systems with limited bandwidth interfaces. Analyzing and generating clarifying questions have been studied recently, but the accurate utilization of user responses to clarifying questions has been relatively less explored. In this paper, we enrich the representations learned by Transformer networks using a novel attention mechanism from external sources that weights...

10.48550/arxiv.2006.07548 preprint EN other-oa arXiv (Cornell University) 2020-01-01