Shreyas Chaudhari

ORCID: 0000-0002-8826-2253
Research Areas
  • Reinforcement Learning in Robotics
  • Advanced Bandit Algorithms Research
  • Optimization and Search Problems
  • Domain Adaptation and Few-Shot Learning
  • Wireless Communication Security Techniques
  • Energy Harvesting in Wireless Networks
  • Machine Learning and Algorithms
  • Sparse and Compressive Sensing Techniques
  • Advanced Neural Network Applications
  • Viral Infectious Diseases and Gene Expression in Insects
  • Advanced Wireless Communication Technologies
  • Evolutionary Algorithms and Applications
  • Neural Networks and Applications
  • Stochastic Gradient Optimization Techniques
  • Radio Astronomy Observations and Technology
  • Healthcare Operations and Scheduling Optimization
  • Neural Dynamics and Brain Function
  • Data Quality and Management
  • Graph Theory and Applications
  • Financial Distress and Bankruptcy Prediction
  • Recommender Systems and Techniques
  • Neural Networks and Reservoir Computing
  • Machine Learning in Healthcare
  • Data Stream Mining Techniques
  • Blood Donation and Transfusion Practices

Carnegie Mellon University
2018-2025

University of Massachusetts Amherst
2024

Uber AI (United States)
2021

Indian Institute of Technology Madras
2017-2018

State-of-the-art large language models (LLMs) have become indispensable tools for various tasks. However, training LLMs to serve as effective assistants for humans requires careful consideration. A promising approach is reinforcement learning from human feedback (RLHF), which leverages human feedback to update the model in accordance with human preferences and mitigate issues like toxicity and hallucinations. Yet, an understanding of RLHF is largely entangled with the initial design choices that popularized the method and current research...

10.48550/arxiv.2404.08555 preprint EN arXiv (Cornell University) 2024-04-12

10.1109/icassp49660.2025.10890126 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

We consider a multi-armed bandit framework where the rewards obtained by pulling different arms are correlated. We develop a unified approach to leverage these reward correlations and present fundamental generalizations of classic bandit algorithms to the correlated setting. We present a unified proof technique to analyze the proposed algorithms. Rigorous analysis of C-UCB (the correlated version of Upper-confidence-bound) reveals that the algorithm ends up pulling certain sub-optimal arms, termed as non-competitive, only O(1) times, as opposed to the O(log T) pulls required...

10.1109/tit.2021.3081508 article EN IEEE Transactions on Information Theory 2021-05-18
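
For orientation, the classic baseline that the abstract above compares against can be sketched as follows. This is a minimal illustration of standard UCB1 only, not the C-UCB algorithm from the paper; the function name `ucb1`, the `pull` callback, and the exploration constant are hypothetical choices made for this sketch.

```python
import math


def ucb1(pull, n_arms, horizon):
    """Run standard UCB1 (not C-UCB) for `horizon` rounds.

    `pull(arm)` is an assumed callback returning a reward in [0, 1].
    Returns per-arm pull counts and empirical mean rewards.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull every arm once so each confidence bound is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                range(n_arms),
                key=lambda k: means[k] + math.sqrt(2.0 * math.log(t) / counts[k]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the empirical mean.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means
```

Under UCB1 each sub-optimal arm is pulled on the order of log T times; the abstract's claim is that the correlated variant reduces this to O(1) for arms it terms non-competitive.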

A blood bank can be defined as a bank or storage place where blood is collected, preserved, and used whenever needed or demanded. Everyone is aware that the traditional blood bank management system includes paperwork. Its way of working is not efficient enough at the time of emergency situations. The main aim of creating a cloud-based blood bank system is to make blood available to people on time, even in emergency situations. With the help of this project, the user is able to view information about every entity related to the blood bank, i.e. hospitals, donors, the location of another blood bank, etc. The security factor is maintained properly....

10.1109/icscet.2018.8537351 article EN 2018 International Conference on Smart City and Emerging Technology (ICSCET) 2018-01-01

We propose a simple, practical, and intuitive approach for domain adaptation in reinforcement learning. Our approach stems from the idea that the agent's experience in the source domain should look similar to its experience in the target domain. Building off of a probabilistic view of RL, we formally show that we can achieve this goal by compensating for the difference in dynamics by modifying the reward function. This modified reward function is simple to estimate by learning auxiliary classifiers that distinguish source-domain transitions from target-domain transitions. Intuitively,...

10.48550/arxiv.2006.13916 preprint EN other-oa arXiv (Cornell University) 2020-01-01
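
One way to picture the classifier-based reward modification described in the abstract above is the sketch below. It assumes two already-trained binary classifiers, one conditioned on (s, a, s') and one on (s, a), each outputting the probability that a transition came from the target domain; the function name `reward_correction` and both probability arguments are hypothetical, and the exact form used in the paper may differ.

```python
import math


def reward_correction(p_target_sas, p_target_sa):
    """Illustrative reward correction from two classifier outputs.

    p_target_sas: assumed classifier probability that (s, a, s') is from the target domain.
    p_target_sa:  assumed classifier probability that (s, a) is from the target domain.
    Both must lie strictly between 0 and 1.

    Returns the difference of the two log-odds, which (under these assumptions)
    estimates log p_target(s'|s,a) - log p_source(s'|s,a).
    """
    logit_sas = math.log(p_target_sas) - math.log(1.0 - p_target_sas)
    logit_sa = math.log(p_target_sa) - math.log(1.0 - p_target_sa)
    return logit_sas - logit_sa
```

In this sketch the correction is added to the source-domain reward: a transition that looks unlikely under the target dynamics receives a negative correction, and one the classifiers cannot tell apart receives zero.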

Graph Convolutional Neural Networks (graph CNNs) adapt the traditional CNN architecture for use on graphs, replacing convolution layers with graph convolution layers. Although similar in architecture, graph CNNs are used for geometric deep learning whereas conventional CNNs are used for grid-based data, such as audio or images, with seemingly no direct relationship between the two classes of neural networks. This paper shows that under certain conditions traditional CNNs can be used with graph data as a good approximation to graph CNNs, avoiding the need for graph CNNs. We show this by using an...

10.1109/icassp48485.2024.10446093 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

Persona agents, which are LLM agents that act according to an assigned persona, have demonstrated impressive contextual response capabilities across various applications. These persona agents offer significant enhancements across diverse sectors, such as education, healthcare, and entertainment, where model developers can align agent responses to different user requirements, thereby broadening the scope of agent applications. However, evaluating persona agent performance is incredibly challenging due to the complexity of assessing persona adherence in...

10.48550/arxiv.2407.18416 preprint EN arXiv (Cornell University) 2024-07-25

10.1109/tsp.2024.3496692 article EN IEEE Transactions on Signal Processing 2024-01-01

Direction of arrival (DoA) estimation is a well studied problem with several significant applications in radar, sonar, wireless communications, and audio signal processing. A majority of conventional algorithms for DoA estimation require prior knowledge of the number of transmitters and/or a sufficient number of measurements for estimating the received signal covariance matrix. When these requirements are not satisfied, the performance of such algorithms degrades considerably. Recently, some deep learning-based approaches to direction of arrival estimation have been proposed....

10.1109/ieeeconf56349.2022.10052106 article EN 2022 56th Asilomar Conference on Signals, Systems, and Computers 2022-10-31

We consider a finite-armed structured bandit problem in which mean rewards of different arms are known functions of a common hidden parameter θ*. Since we do not place any restrictions on these functions, the problem setting subsumes several previously studied frameworks that assume linear or invertible reward functions. We propose a novel approach to gradually estimate θ* and use it, together with the mean reward functions, to substantially reduce exploration of sub-optimal arms. This enables us to fundamentally generalize classical algorithm...

10.1109/jsait.2020.3041246 article EN publisher-specific-oa IEEE Journal on Selected Areas in Information Theory 2020-11-01

Unsupervised time series clustering is a challenging problem with diverse industrial applications such as anomaly detection, bio-wearables, etc. These applications typically involve small, low-power devices on the edge that collect and process real-time sensory signals. State-of-the-art time-series clustering methods perform some form of loss minimization that is extremely computationally intensive from the perspective of edge devices. In this work, we propose a neuromorphic approach to unsupervised time series clustering based on Temporal Neural Networks...

10.1109/icassp39728.2021.9414882 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

Next-generation cosmic microwave background (CMB) surveys are expected to provide valuable information about the primordial universe by creating maps of the mass along the line of sight. Traditional tools for creating these lensing convergence maps include the quadratic estimator and the maximum likelihood based iterative estimator. Here, we apply a generative adversarial network (GAN) to reconstruct the lensing convergence field. We compare our results with a previous deep learning approach -- Residual-UNet -- and discuss the pros and cons of each. In the process, we use...

10.48550/arxiv.2205.07368 preprint EN cc-by arXiv (Cornell University) 2022-01-01

We consider a finite-armed structured bandit problem in which mean rewards of different arms are known functions of a common hidden parameter $\theta^*$. Since we do not place any restrictions on these functions, the problem setting subsumes several previously studied frameworks that assume linear or invertible reward functions. We propose a novel approach to gradually estimate $\theta^*$ and use it, together with the mean reward functions, to substantially reduce exploration of sub-optimal arms. This enables us to fundamentally generalize classic...

10.48550/arxiv.1810.08164 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Recommendation strategies are typically evaluated by using previously logged data, employing off-policy evaluation methods to estimate their expected performance. However, for strategies that present users with slates of multiple items, the resulting combinatorial action space renders many of these methods impractical. Prior work has developed estimators that leverage the structure in slates to estimate the expected performance, but the estimation of the entire performance distribution remains elusive. Estimating the complete distribution allows for a more comprehensive evaluation of recommendation...

10.1609/aaai.v38i8.28667 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

In this paper, we introduce a fresh perspective on the challenges of credit assignment and policy evaluation. First, we delve into the nuances of eligibility traces and explore instances where their updates may result in unexpected credit assignment to preceding states. From this investigation emerges the concept of a novel value function, which we refer to as the bidirectional value function. Unlike traditional state value functions, bidirectional value functions account for both future expected returns (rewards anticipated from the current state onward) and past...

10.1609/aaai.v38i11.29115 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Directly parameterizing and learning gradients of functions has widespread significance, with specific applications in optimization, generative modeling, and optimal transport. This paper introduces gradient networks (GradNets): novel neural network architectures that parameterize gradients of various function classes. GradNets exhibit specialized architectural constraints that ensure correspondence to gradient functions. We provide a comprehensive GradNet design framework that includes methods for transforming GradNets into monotone...

10.48550/arxiv.2404.07361 preprint EN arXiv (Cornell University) 2024-04-10

Peer-to-peer learning is an increasingly popular framework that enables beyond-5G distributed edge devices to collaboratively train deep neural networks in a privacy-preserving manner without the aid of a central server. Neural network training algorithms for emerging environments, e.g., smart cities, have many design considerations that are difficult to tune in deployment settings -- such as neural network architectures and hyperparameters. This presents a critical need for characterizing the training dynamics of the optimization algorithms used in highly...

10.48550/arxiv.2409.15267 preprint EN arXiv (Cornell University) 2024-09-23

Evaluating policies using off-policy data is crucial for applying reinforcement learning to real-world problems such as healthcare and autonomous driving. Previous methods for off-policy evaluation (OPE) generally suffer from high variance or irreducible bias, leading to unacceptably high prediction errors. In this work, we introduce STAR, a framework for OPE that encompasses a broad range of estimators -- which include existing OPE methods as special cases -- that achieve lower mean squared prediction errors. STAR leverages state abstraction to distill complex,...

10.48550/arxiv.2410.02172 preprint EN arXiv (Cornell University) 2024-10-02

While much effort has been devoted to deriving and analyzing effective convex formulations of signal processing problems, the gradients of convex functions also have critical applications ranging from gradient-based optimization to optimal transport. Recent works have explored data-driven methods for learning convex objective functions, but learning their monotone gradients is seldom studied. In this work, we propose C-MGN and M-MGN, two monotone gradient neural network architectures for directly learning the gradients of convex functions. We show that, compared to state of the art methods,...

10.1109/icassp49357.2023.10097266 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

We consider a finite-armed structured bandit problem in which mean rewards of different arms are known functions of a common hidden parameter θ*. This setting subsumes several previously studied frameworks that assume linear or invertible reward functions. We propose a novel approach to gradually estimate the hidden parameter and use it, together with the mean reward functions, to substantially reduce exploration of sub-optimal arms. This enables us...

10.1109/icassp39728.2021.9413628 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

An energy-limited source trying to transmit multiple packets to a destination with possibly different sizes is considered. With limited energy, the source cannot potentially transmit all bits of all packets. In addition, there is a delay cost associated with each packet. Thus, the source has to choose how many bits to transmit for each packet, and the order in which to transmit these bits, to minimize the cost of distortion (introduced by transmitting a lower number of bits) and queueing plus transmission delay, across all packets. Assuming an exponential metric for distortion loss and linear delay cost, we show that the optimal...

10.1109/ncc.2018.8600172 article EN 2018-02-01

While much effort has been devoted to deriving and analyzing effective convex formulations of signal processing problems, the gradients of convex functions also have critical applications ranging from gradient-based optimization to optimal transport. Recent works have explored data-driven methods for learning convex objective functions, but learning their monotone gradients is seldom studied. In this work, we propose C-MGN and M-MGN, two monotone gradient neural network architectures for directly learning the gradients of convex functions. We show that, compared to state of the art methods,...

10.48550/arxiv.2301.10862 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01