Yash Sharma

ORCID: 0009-0006-4206-3159
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Adversarial Robustness in Machine Learning
  • Multimodal Machine Learning Applications
  • AI in cancer detection
  • Radiomics and Machine Learning in Medical Imaging
  • Quantum Computing Algorithms and Architecture
  • Quantum Information and Cryptography
  • Internet Traffic Analysis and Secure E-voting
  • Speech Recognition and Synthesis
  • Natural Language Processing Techniques
  • Neural Networks and Applications
  • Anomaly Detection Techniques and Applications
  • Blind Source Separation Techniques
  • 3D Shape Modeling and Analysis
  • Machine Learning and Algorithms
  • Spatial Cognition and Navigation
  • Advanced Wireless Communication Technologies
  • Imbalanced Data Classification Techniques
  • Hate Speech and Cyberbullying Detection
  • Heart Rate Variability and Autonomic Control
  • Gaze Tracking and Assistive Technology
  • Time Series Analysis and Forecasting
  • IoT and Edge/Fog Computing
  • Mobile and Web Applications
  • Advanced Vision and Imaging

Vellore Institute of Technology University
2024

Rutgers Sexual and Reproductive Health and Rights
2022-2024

Rutgers, The State University of New Jersey
2023

Bennett University
2023

Terna Dental College and Hospital
2023

Devi Ahilya Vishwavidyalaya
2022

National Institute of Technology Raipur
2018

Self-supervised representation learning has shown remarkable success in a number of domains. A common practice is to perform data augmentation via hand-crafted transformations intended to leave the semantics of the data invariant. We seek to understand the empirical success of this approach from a theoretical perspective. We formulate the augmentation process as a latent variable model by postulating a partition of the latent representation into a content component, which is assumed invariant to augmentation, and a style component, which is allowed to change. Unlike prior work on disentanglement and independent...

10.48550/arxiv.2106.04619 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Deep neural networks are vulnerable to adversarial examples, even in the black-box setting, where the attacker is restricted solely to query access. Existing approaches to generating adversarial examples typically require a significant number of queries, either for training a substitute network or for performing gradient estimation. We introduce GenAttack, a gradient-free optimization technique that uses genetic algorithms for synthesizing adversarial examples in the black-box setting. Our experiments on different datasets (MNIST, CIFAR-10, and ImageNet) show...

10.48550/arxiv.1805.11090 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Contrastive learning has recently seen tremendous success in self-supervised learning. So far, however, it is largely unclear why the learned representations generalize so effectively to a large variety of downstream tasks. We here prove that feedforward models trained with objectives belonging to the commonly used InfoNCE family learn to implicitly invert the underlying generative model of the observed data. While the proofs make certain statistical assumptions about the generative model, we observe empirically that our findings hold...

10.48550/arxiv.2102.08850 preprint EN other-oa arXiv (Cornell University) 2021-01-01

In a digitally inclined world, email is an essential means of communication for individuals. However, individuals who are visually impaired or physically disabled face significant problems accessing and managing their emails at the workplace. This paper explores an advanced voice-based email system, mainly aimed at improving accessibility for visually and physically impaired individuals. The system uses the power of Google Gmail along with imposing technologies like Speech-to-Text (STT), Text-to-Speech (TTS), biometric authentication,...

10.63345/ijrmeet.org.v13.i4.1341 article EN 2025-04-01

Language instructions and demonstrations are two natural ways for users to teach robots personalized tasks. Recent progress in Large Language Models (LLMs) has shown impressive performance in translating language instructions into code for robotic tasks. However, translating demonstrations into task code continues to be a challenge due to the length and complexity of both demonstrations and code, making learning a direct mapping intractable. This paper presents Demo2Code, a novel framework that generates robot task code from demonstrations via an extended chain-of-thought and defines a common latent specification to connect the two....

10.48550/arxiv.2305.16744 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

The electrical signals of the brain were first discovered by the English scientist Richard Caton in 1875. Since then, technology has evolved in all its spheres to advance the developments in this field. Nowadays, with the help of a well-equipped Brain-Computer Interface (BCI), a channel can be established between the human brain and immobile body parts. Around the 1920s, the study of brain activity began, including observing the patterns of these signals in different frequency ranges. Furthermore, in the context of electrophysiological monitoring of the brain's...

10.2139/ssrn.3166225 article EN SSRN Electronic Journal 2018-01-01

Perceiving the world in terms of objects and tracking them through time is a crucial prerequisite for reasoning and scene understanding. Recently, several methods have been proposed for unsupervised learning of object-centric representations. However, since these models were evaluated on different downstream tasks, it remains unclear how they compare in terms of basic perceptual abilities such as detection, figure-ground segmentation, and tracking of objects. To close this gap, we design a benchmark with four data sets of varying...

10.48550/arxiv.2006.07034 preprint EN other-oa arXiv (Cornell University) 2020-01-01

This work introduces a novel principle we call disentanglement via mechanism sparsity regularization, which can be applied when the latent factors of interest depend sparsely on past latent factors and/or observed auxiliary variables. We propose a representation learning method that induces disentanglement by simultaneously learning the latent factors and the sparse causal graphical model that relates them. We develop a rigorous identifiability theory, building on recent nonlinear independent component analysis (ICA) results, that formalizes this principle and shows how the latent variables...

10.48550/arxiv.2107.10098 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Hematoxylin and Eosin (H&E) stained Whole Slide Images (WSIs) are utilized for biopsy visualization-based diagnostic and prognostic assessment of diseases. Variation in the H&E staining process across different lab sites can lead to significant variations in image appearance. These variations introduce an undesirable bias when the slides are examined by pathologists or used for training deep learning models. To reduce this bias, slides need to be translated to a common domain of stain appearance before analysis. We propose...

10.48550/arxiv.1909.01963 preprint EN other-oa arXiv (Cornell University) 2019-01-01

An important and difficult task in code-switched speech recognition is to recognize the language, as many words in the two languages can sound similar, especially in some accents. We focus on improving the performance of end-to-end Automatic Speech Recognition models by conditioning transformer layers on the language ID of the character output in a per-layer supervised manner. To this end, we propose methods for introducing language-specific parameters and explainability in the multi-head attention mechanism, and implement a Temporal Loss that...

10.48550/arxiv.2403.08011 preprint EN arXiv (Cornell University) 2024-03-12

Motivated by quantum network applications over classical channels, we initiate the study of $n$-party resource states from which LOCC protocols can create EPR-pairs between any $k$ disjoint pairs of parties. We give constructions of such states where $k$ is not too far from the optimal $n/2$ while the individual parties need to hold only a constant number of qubits. In the special case when each party holds only one qubit, we describe a family of $n$-qubit states with $k$ proportional to $\log n$ based on Reed-Muller codes, as well as small numerically...

10.48550/arxiv.2211.06497 preprint EN cc-by arXiv (Cornell University) 2022-01-01

10.2139/ssrn.4817064 preprint EN 2024-01-01

Motivated by quantum network applications over classical channels, we initiate the study of $n$-party resource states from which LOCC protocols can create EPR-pairs between any $k$ disjoint pairs of parties. We give constructions of such states where $k$ is not too far from the optimal $n/2$...

10.22331/q-2024-05-14-1348 article EN cc-by Quantum 2024-05-14

While generative AI (GenAI) offers countless possibilities for creative and productive tasks, artificially generated media can be misused for fraud, manipulation, scams, misinformation campaigns, and more. To mitigate the risks associated with maliciously generated media, forensic classifiers are employed to identify AI-generated content. However, current forensic classifiers are often not evaluated in practically relevant scenarios, such as the presence of an attacker or when real-world artifacts like social media degradations affect images....

10.48550/arxiv.2410.01574 preprint EN arXiv (Cornell University) 2024-10-02

Carefully crafted, often imperceptible, adversarial perturbations have been shown to cause state-of-the-art models to yield extremely inaccurate outputs, rendering them unsuitable for safety-critical application domains. In addition, recent work has shown that constraining the attack space to a low frequency regime is particularly effective. Yet, it remains unclear whether this is due to generally constraining the attack search space or specifically removing high frequency components from consideration. By systematically controlling the frequency components of...

10.48550/arxiv.1903.00073 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Artificial Intelligence has core branches like Machine Learning, which takes in data, searches for patterns, then improves itself using the data and displays the outcome. To start a good day, taking care of our health is really important. In a few village areas, it is quite hard to find a consultation with a doctor whenever needed in an emergency. The proposal here is to build an intelligent conversational healthcare chat-bot, using NLP as a part of artificial intelligence, that can diagnose and provide the required details about a particular disease asked by...

10.55041/ijsrem12864 article EN INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 2022-05-07

While self-supervised learning has enabled effective representation learning in the absence of labels, for vision, video remains a relatively untapped source of supervision. To address this, we propose Pixel-level Correspondence (PiCo), a method for dense contrastive learning from video. By tracking points with optical flow, we obtain a correspondence map which can be used to match local features at different points in time. We validate PiCo on standard benchmarks, outperforming self-supervised baselines on multiple dense prediction tasks, without...

10.48550/arxiv.2207.03866 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Personal computers and computer networks have become increasingly vulnerable to numerous types of attacks since the introduction of the Internet. Information has evolved into a valuable asset that must be protected from cyber-attacks. Privacy may be violated and vital data may be destroyed as a result of an attack. Typically, attacks are brought on by the failure to put security rules in place and to use easily available security tools. Firewalls, Intrusion Detection Systems, and Intrusion Prevention Systems are some of the solutions accessible. Each tool comes with its own...

10.48175/ijarsct-9492 article EN International Journal of Advanced Research in Science Communication and Technology 2023-04-26

The fifth-generation (5G) wireless system is anticipated to bring a paradigm shift in the way we communicate, work, and live. It offers unequaled speed, low latency, massive connectivity, and high reliability, which will enable a range of new applications and services such as autonomous driving, remote surgery, smart homes, and Industry 4.0. Still, the deployment of 5G also poses significant technical, economic, and social challenges, such as the need for infrastructure investment, spectrum allocation, security...

10.48175/ijarsct-9630 article EN International Journal of Advanced Research in Science Communication and Technology 2023-04-30

Learning structured representations of the visual world in terms of objects promises to significantly improve the generalization abilities of current machine learning models. While recent efforts to this end have shown promising empirical progress, a theoretical account of when unsupervised object-centric representation learning is possible is still lacking. Consequently, understanding the reasons for the success of existing object-centric methods as well as designing new theoretically grounded methods remains challenging. In the present work, we analyze when object-centric representations can...

10.48550/arxiv.2305.14229 preprint EN other-oa arXiv (Cornell University) 2023-01-01

To understand the effects of COVID-19 preventive measures such as social distancing and mask-wearing on wayfinding, we carried out a virtual reality (VR) study. Participants traversed a VR room, moving around an obstacle – either a person (an “agent”) or an inanimate object. We varied whether the participant was wearing a mask, whether the agent was wearing a mask, and whether the context was “safe” or “unsafe” in terms of potential contagion. Participants’ navigational choices, we found, were strongly influenced by the safety of the environment, but not so much by the mask. We also saw...

10.31234/osf.io/8chjd preprint EN 2023-02-13

Crowdfunding is a popular method for raising funds for projects, businesses, and social causes. However, traditional crowdfunding platforms are often centralized, opaque, and inefficient. In recent years, blockchain technology has become a potentially effective remedy to these problems. By leveraging the benefits of blockchain, such as transparency, security, and efficiency, we can create a new crowdfunding paradigm that is decentralized, transparent, and efficient. In this paper, we explore the use of blockchain in crowdfunding and describe a prototype...

10.56726/irjmets36855 article EN International Research Journal of Modernization in Engineering Technology and Science 2023-05-19

The proliferation of Automated Teller Machines (ATMs) in the banking sector has raised the stakes of identifying the most suitable locations for these machines, given their impact on the profitability and satisfaction of bank clients. This study presents a comprehensive examination of the variables that influence ATM placement, exploring the significance of an optimal location and the challenges associated with this task. Our investigation pivots on market-centric variables, such as proximity to eateries and fuel stations, that are ubiquitously available...

10.1109/elexcom58812.2023.10370336 article EN 2023-08-26

Building Automatic Speech Recognition (ASR) systems for code-switched speech has recently gained renewed attention due to the widespread use of speech technologies in multilingual communities worldwide. End-to-end ASR systems are a natural modeling choice due to their ease of use and superior performance in monolingual settings. However, it is well known that end-to-end systems require large amounts of labeled speech. In this work, we investigate improving code-switched ASR in low resource settings via data augmentation using text-to-speech (TTS)...

10.48550/arxiv.2010.05549 preprint EN other-oa arXiv (Cornell University) 2020-01-01