NFDI4DS | UHH-SEMS - Publication Details

Rishabh Joshi

ORCID: 0000-0003-2513-1536

Publications

Citations

Views

---

Saved

---

About

Contact & Profiles

A5060923104

Research Areas

Topic Modeling
Natural Language Processing Techniques
Misinformation and Its Impacts
Speech and dialogue systems
Biomedical Text Mining and Ontologies
Sentiment Analysis and Opinion Mining
Semantic Web and Ontologies
Digital Communication and Language
Language, Discourse, Communication Strategies
Language, Metaphor, and Cognition
Photoacoustic and Ultrasonic Imaging
Image and Signal Denoising Methods
Advanced Text Analysis Techniques
Masonry and Concrete Structural Analysis
Authorship Attribution and Profiling
Multimodal Machine Learning Applications
Structural Response to Dynamic Loads
Optical Coherence Tomography Applications
Fuzzy Logic and Control Systems
Advanced Image Fusion Techniques
Land Use and Ecosystem Services
Digital Games and Media
Advanced Graph Neural Networks
3D Surveying and Cultural Heritage
Multi-Agent Systems and Negotiation

Shri Ramswaroop Memorial University
2019-2024

Carnegie Mellon University
2018-2023

Pacific Northwest National Laboratory
2023

Northwestern University
2019

Birla Institute of Technology and Science, Pilani
2018

Indian Institute of Science Bangalore
2018

RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information

OPENALEX - Publications

Shikhar Vashishth Rishabh Joshi Sai Suman Prayaga Chiranjib Bhattacharyya Partha Talukdar

Distantly-supervised Relation Extraction (RE) methods train an extractor by automatically aligning relation instances in a Knowledge Base (KB) with unstructured text. In addition to instances, KBs often contain other relevant side information, such as aliases of relations (e.g., founded and co-founded are for the founderOfCompany). RE models usually ignore readily available information. this paper, we propose RESIDE, distantly-supervised neural extraction method which utilizes additional...

10.18653/v1/d18-1157 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Calibrating Sequence likelihood Improves Conditional Language Generation

OPENALEX - Publications

Yao Zhao Misha Khalman Rishabh Joshi Shashi Narayan Mohammad Saleh and 1 more

Conditional language models are predominantly trained with maximum likelihood estimation (MLE), giving probability mass to sparsely observed target sequences. While MLE assign high plausible sequences given the context, model probabilities often do not accurately rank-order generated by quality. This has been empirically in beam search decoding as output quality degrading large sizes, and strategies benefiting from heuristics such length normalization repetition-blocking. In this work, we...

10.48550/arxiv.2210.00045 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Improving broad-coverage medical entity linking with semantic type prediction and large-scale datasets

OPENALEX - Publications

Shikhar Vashishth Denis Newman-Griffis Rishabh Joshi Ritam Dutt Carolyn Penstein Rosé

Biomedical natural language processing tools are increasingly being applied for broad-coverage information extraction-extracting medical of all types in a scientific document or clinical note. In such settings, linking mentions concepts to standardized vocabularies requires choosing the best candidate from large inventories covering dozens types. This study presents novel semantic type prediction module biomedical NLP pipelines and two automatically-constructed, large-scale datasets with...

10.1016/j.jbi.2021.103880 article EN cc-by Journal of Biomedical Informatics 2021-08-11

Analysing the Extent of Misinformation in Cancer Related Tweets

OPENALEX - Publications

Rakesh Bal Sayan Sinha Swastika Dutta Rishabh Joshi Sayan Ghosh and 1 more

Twitter has become one of the most sought after places to discuss a wide variety topics, including medically relevant issues such as cancer. This helps spread awareness regarding various causes, cures and prevention methods However, no proper analysis been performed, which discusses validity claims. In this work, we aim tackle misinformation in platforms. We collect present dataset tweets talk specifically about cancer propose an attention-based deep learning model for automated detection...

10.1609/icwsm.v14i1.7359 article EN Proceedings of the International AAAI Conference on Web and Social Media 2020-05-26

DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues

OPENALEX - Publications

Rishabh Joshi Vidhisha Balachandran Shikhar Vashishth Alan W. Black Yulia Tsvetkov

To successfully negotiate a deal, it is not enough to communicate fluently: pragmatic planning of persuasive negotiation strategies essential. While modern dialogue agents excel at generating fluent sentences, they still lack grounding and cannot reason strategically. We present DialoGraph, system that incorporates in using graph neural networks. DialoGraph explicitly dependencies between sequences enable improved interpretable prediction next optimal strategies, given the context. Our...

10.48550/arxiv.2106.00920 preprint EN other-oa arXiv (Cornell University) 2021-01-01

SLiC-HF: Sequence Likelihood Calibration with Human Feedback

OPENALEX - Publications

Yao Zhao Rishabh Joshi Tianqi Liu Misha Khalman Mohammad Saleh and 1 more

Learning from human feedback has been shown to be effective at aligning language models with preferences. Past work often relied on Reinforcement Human Feedback (RLHF), which optimizes the model using reward scores assigned a trained preference data. In this we show how recently introduced Sequence Likelihood Calibration (SLiC), can also used effectively learn preferences (SLiC-HF). Furthermore, demonstrate done data collected for different model, similar off-policy, offline RL Automatic and...

10.48550/arxiv.2305.10425 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Statistical Rejection Sampling Improves Preference Optimization

OPENALEX - Publications

Tianqi Liu Yao Zhao Rishabh Joshi Misha Khalman Mohammad Saleh and 2 more

Improving the alignment of language models with human preferences remains an active research challenge. Previous approaches have primarily utilized Reinforcement Learning from Human Feedback (RLHF) via online RL methods such as Proximal Policy Optimization (PPO). Recently, offline Sequence Likelihood Calibration (SLiC) and Direct Preference (DPO) emerged attractive alternatives, offering improvements in stability scalability while maintaining competitive performance. SLiC refines its loss...

10.48550/arxiv.2309.06657 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Building Math Agents with Multi-Turn Iterative Preference Learning

OPENALEX - Publications

Wei Xiong Chengshuai Shi Jiaming Shen Aviv Rosenberg Zhen Qin and 8 more

Recent studies have shown that large language models' (LLMs) mathematical problem-solving capabilities can be enhanced by integrating external tools, such as code interpreters, and employing multi-turn Chain-of-Thought (CoT) reasoning. While current methods focus on synthetic data generation Supervised Fine-Tuning (SFT), this paper the complementary direct preference learning approach to further improve model performance. However, existing algorithms are originally designed for single-turn...

10.48550/arxiv.2409.02392 preprint EN arXiv (Cornell University) 2024-09-03

ResPer: Computationally Modelling Resisting Strategies in Persuasive Conversations

OPENALEX - Publications

Ritam Dutt Sayan Sinha Rishabh Joshi Surya Shekhar Chakraborty Meredith Riggs and 3 more

Ritam Dutt, Sayan Sinha, Rishabh Joshi, Surya Shekhar Chakraborty, Meredith Riggs, Xinru Yan, Haogang Bao, Carolyn Rose. Proceedings of the 16th Conference European Chapter Association for Computational Linguistics: Main Volume. 2021.

10.18653/v1/2021.eacl-main.7 article EN cc-by 2021-01-01

Effect of seismic load on behaviour of RCC, composite and light steel building analysed using ETABS software: A comparative study

OPENALEX - Publications

Vaibhav Srivastava Rishabh Joshi Kamal Kumar Rifat Reşatoğlu Mohd Zain and 1 more

10.1016/j.matpr.2023.03.564 article EN Materials Today Proceedings 2023-04-01

Unsupervised Keyphrase Extraction via Interpretable Neural Networks

OPENALEX - Publications

Rishabh Joshi Vidhisha Balachandran Emily Saldanha Maria Glenski Svitlana Volkova and 1 more

Keyphrase extraction aims at automatically extracting a list of "important" phrases representing the key concepts in document. Prior approaches for unsupervised keyphrase resorted to heuristic notions phrase importance via embedding clustering or graph centrality, requiring extensive domain expertise. Our work presents simple alternative approach which defines keyphrases as document that are salient predicting topic To this end, we propose INSPECT—an uses self-explaining models identifying...

10.18653/v1/2023.findings-eacl.82 article EN cc-by 2023-01-01

A Team Based Player Versus Player Recommender Systems Framework For Player Improvement

OPENALEX - Publications

Rishabh Joshi Varun Gupta Xinyue Li Yue Cui Ziwen Wang and 6 more

Modern Massively Multi-player Online Games (MMOGs) have grown to become extremely complex in terms of the usable resources games, resulting an increase amount data collected by tracking in-game activities players. This has opened door for researchers come up with novel methods utilize this improve and personalize user experience. In paper, a but flexible framework towards building team based recommender system player-versus-player (PvP) content such MMOGs is presented, applied case study...

10.1145/3290688.3290750 article EN Proceedings of the Australasian Computer Science Week Multiconference 2019-01-14

LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification

OPENALEX - Publications

Sopan Khosla Rishabh Joshi Ritam Dutt Alan W. Black Yulia Tsvetkov

In this paper we describe our submission for the task of Propaganda Span Identification in news articles. We introduce a BERT-BiLSTM based span-level propaganda classification model that identifies which token spans within sentence are indicative propaganda. The ”multi-granular” incorporates linguistic knowledge at various levels text granularity, including word, and document level syntactic, semantic pragmatic affect features, significantly improve performance, compared to its...

10.18653/v1/2020.semeval-1.230 article EN cc-by 2020-01-01

AMUSED: A Multi-Stream Vector Representation Method for Use in Natural Dialogue

OPENALEX - Publications

Gaurav Kumar Rishabh Joshi Jaspreet Singh Promod Yenigalla

The problem of building a coherent and non-monotonous conversational agent with proper discourse coverage is still an area open research. Current architectures only take care semantic contextual information for given query fail to completely account syntactic external knowledge which are crucial generating responses in chit-chat system. To overcome this problem, we propose end multi-stream deep learning architecture learns unified embeddings query-response pairs by leveraging from memory...

10.48550/arxiv.1912.10160 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Keeping Up Appearances: Computational Modeling of Face Acts in Persuasion Oriented Discussions

OPENALEX - Publications

Ritam Dutt Rishabh Joshi Carolyn Penstein Rosé

The notion of face refers to the public self-image an individual that emerges both from individual’s own actions as well interaction with others. Modeling and understanding its state changes throughout a conversation is critical study maintenance basic human needs in through interaction. Grounded politeness theory Brown Levinson (1978), we propose generalized framework for modeling acts persuasion conversations, resulting reliable coding manual, annotated corpus, computational models....

10.18653/v1/2020.emnlp-main.605 article EN cc-by 2020-01-01

LiPO: Listwise Preference Optimization through Learning-to-Rank

OPENALEX - Publications

Tianqi Liu Zhen Qin Junru Wu Jiaming Shen Misha Khalman and 7 more

Aligning language models (LMs) with curated human feedback is critical to control their behaviors in real-world applications. Several recent policy optimization methods, such as DPO and SLiC, serve promising alternatives the traditional Reinforcement Learning from Human Feedback (RLHF) approach. In practice, often comes a format of ranked list over multiple responses amortize cost reading prompt. Multiple can also be by reward or AI feedback. There lacks study on directly fitting upon...

10.48550/arxiv.2402.01878 preprint EN arXiv (Cornell University) 2024-02-02

Human Alignment of Large Language Models through Online Preference Optimisation

OPENALEX - Publications

Daniele Calandriello Daniel Guo Rémi Munos Mark Rowland Yunhao Tang and 8 more

Ensuring alignment of language models' outputs with human preferences is critical to guarantee a useful, safe, and pleasant user experience. Thus, has been extensively studied recently several methods such as Reinforcement Learning from Human Feedback (RLHF), Direct Policy Optimisation (DPO) Sequence Likelihood Calibration (SLiC) have emerged. In this paper, our contribution two-fold. First, we show the equivalence between two recent methods, namely Identity (IPO) Nash Mirror Descent...

10.48550/arxiv.2403.08635 preprint EN arXiv (Cornell University) 2024-03-13

Offline Regularised Reinforcement Learning for Large Language Models Alignment

OPENALEX - Publications

Pierre Harvey Richemond Yunhao Tang Daniel Guo Daniele Calandriello Mohammad Gheshlaghi Azar and 13 more

The dominant framework for alignment of large language models (LLM), whether through reinforcement learning from human feedback or direct preference optimisation, is to learn data. This involves building datasets where each element a quadruplet composed prompt, two independent responses (completions the prompt) and between responses, yielding preferred dis-preferred response. Such data typically scarce expensive collect. On other hand, \emph{single-trajectory} triplet response naturally more...

10.48550/arxiv.2405.19107 preprint EN arXiv (Cornell University) 2024-05-29

Preference Optimization as Probabilistic Inference

OPENALEX - Publications

Abbas Abdolmaleki Bilal Piot Bobak Shahriari Jost Tobias Springenberg Tim Hertweck and 7 more

Existing preference optimization methods are mainly designed for directly learning from human feedback with the assumption that paired examples (preferred vs. dis-preferred) available. In contrast, we propose a method can leverage unpaired preferred or dis-preferred examples, and works even when only one type of (positive negative) is This flexibility allows us to apply it in scenarios varying forms models, including training generative language models based on as well policies sequential...

10.48550/arxiv.2410.04166 preprint EN arXiv (Cornell University) 2024-10-05

RRM: Robust Reward Model Training Mitigates Reward Hacking

OPENALEX - Publications

Tianqi Liu Wei Xiong Jie Ren Lichang Chen Junru Wu and 13 more

Reward models (RMs) play a pivotal role in aligning large language (LLMs) with human preferences. However, traditional RM training, which relies on response pairs tied to specific prompts, struggles disentangle prompt-driven preferences from prompt-independent artifacts, such as length and format. In this work, we expose fundamental limitation of current training methods, where RMs fail effectively distinguish between contextual signals irrelevant artifacts when determining To address this,...

10.48550/arxiv.2409.13156 preprint EN arXiv (Cornell University) 2024-09-19

Coming Soon ...