Naman Jain

ORCID: 0009-0004-4262-0555
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Malware Detection Techniques
  • Software Engineering Research
  • Human Pose and Action Recognition
  • Software Testing and Debugging Techniques
  • Data Quality and Management
  • Blockchain Technology Applications and Security
  • Model-Driven Software Engineering Techniques
  • AI in Service Interactions
  • Semantic Web and Ontologies
  • Anomaly Detection Techniques and Applications
  • Handwritten Text Recognition Techniques
  • Machine Learning and Data Classification
  • Data Stream Mining Techniques
  • Security and Verification in Computing
  • Infection Control and Ventilation
  • Adversarial Robustness in Machine Learning
  • COVID-19 and Mental Health
  • Hand Gesture Recognition Systems
  • ICT in Developing Communities
  • Hate Speech and Cyberbullying Detection
  • Parallel Computing and Optimization Techniques
  • AI in Cancer Detection
  • Multimodal Machine Learning Applications

Institute of Engineering
2024

ABES Engineering College
2024

Pandit Bhagwat Dayal Sharma Post Graduate Institute of Medical Sciences
2024

Delhi Technological University
2020-2024

LNM Institute of Information Technology
2024

Amity University
2018-2023

Indian Institute of Technology Roorkee
2023

Galgotias University
2022-2023

Sathyabama Institute of Science and Technology
2023

Centre for Development of Advanced Computing
2023

Large pre-trained language models such as GPT-3 [10], Codex [11], and Google's language model [7] are now capable of generating code from natural language specifications of programmer intent. We view these developments with a mixture of optimism and caution. On the optimistic side, such large language models have the potential to improve productivity by providing an automated AI pair programmer for every programmer in the world. On the cautionary side, since these large language models do not understand program semantics, they offer no guarantees about the quality of the suggested code. In this paper, we present...

10.1145/3510003.3510203 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Pretraining Large Language Models (LLMs) on large corpora of textual data is now a standard paradigm. When using these LLMs for many downstream applications, it is common to additionally bake in new knowledge (e.g., time-critical news, or private domain knowledge) into the pretrained model either through RAG-based prompting or fine-tuning. However, the optimal methodology for the model to gain such new knowledge remains an open question. In this paper, we present Retrieval Augmented FineTuning (RAFT), a training recipe that improves...

10.48550/arxiv.2403.10131 preprint EN arXiv (Cornell University) 2024-03-15

Evaluating in-the-wild coding capabilities of large language models (LLMs) is a challenging endeavor with no clear solution. We introduce Copilot Arena, a platform to collect user preferences for code generation through native integration into a developer's working environment. Copilot Arena comprises a novel interface for comparing pairs of model outputs, a sampling strategy optimized to reduce latency, and a prompting scheme to enable code completion functionality. Copilot Arena has served over 4.5 million suggestions from 10 models and collected...

10.48550/arxiv.2502.09328 preprint EN arXiv (Cornell University) 2025-02-13

Large Language Models (LLMs) applied to code-related applications have emerged as a prominent field, attracting significant interest from both academia and industry. However, as new and improved LLMs are developed, existing evaluation benchmarks (e.g., HumanEval, MBPP) are no longer sufficient for assessing their capabilities. In this work, we propose LiveCodeBench, a comprehensive and contamination-free evaluation of LLMs for code, which continuously collects new problems over time from contests across three competition platforms,...

10.48550/arxiv.2403.07974 preprint EN arXiv (Cornell University) 2024-03-12

Single object detection has been performed using the concepts of convolution layers. A neural network consists of several different layers, such as the input layer, at least one hidden layer, and an output layer. The dataset used for single object detection is an on-road vehicle dataset. This dataset consists of three classes of images, which are Heavy, Auto and Light. The dataset has images of varying illuminations. The performance metrics are calculated for the day dataset, the evening dataset and the night dataset. Multiple object detection has been performed using the You Only Look Once (YOLOv3) algorithm. This approach encompasses a single deep convolution neural network dividing the input into a cell grid and each...

10.1109/i-smac47947.2019.9032502 article EN 2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) 2019-12-01

Large language models (LLMs) are incredibly powerful at comprehending and generating data in the form of text, but are brittle and error-prone. There has been an advent of toolkits and recipes centered around so-called prompt engineering, the process of asking an LLM to do something via a series of prompts. However, for LLM-powered data processing workflows, in particular, optimizing for quality, while keeping cost bounded, is a tedious, manual process. We put forth a vision for declarative prompt engineering. We view LLMs like crowd workers...

10.48550/arxiv.2308.03854 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01

We evaluate named entity representations of BERT-based NLP models by investigating their robustness to replacements from the same typed class in the input. We highlight that on several tasks, while such perturbations are natural, state of the art trained models are surprisingly brittle. The brittleness continues even with the recent entity-aware BERT models. We also try to discern the cause of this non-robustness, considering factors such as tokenization and frequency of occurrence. Then we provide a simple method that ensembles predictions...

10.18653/v1/2020.repl4nlp-1.24 article EN cc-by 2020-01-01

Aim: The aim of this study was to investigate the utility of serum resistin levels as a prognostic indicator for mortality in neonates diagnosed with sepsis. Methodology: This one-year prospective study at Pandit Bhagwat Dayal Sharma Post Graduate Institute of Medical Sciences (PGIMS), Rohtak, India, included 151 neonates categorized into two groups based on blood culture results: group 1 (n=86) with culture-negative, probable sepsis and group 2 (n=65) with culture-positive, proven sepsis. Blood samples were obtained pre-treatment...

10.7759/cureus.55289 article EN Cureus 2024-02-29

India is an agro-based economy, and proper information about agricultural practices is the key to optimal agricultural growth and output. In order to answer the queries of the farmer, we have built a chatbot based on the dataset from Kisan Call Center. This system is robust enough to answer queries related to weather, market rates, plant protection and government schemes. The system is available 24*7, can be accessed through any electronic device, and the information is delivered with ease of understanding. The system is based on a sentence embedding model which gives an accuracy of 56%. After eliminating synonyms...

10.35543/osf.io/3qp98 preprint EN 2019-06-11

India loses 35% of the annual crop yield due to plant diseases. Early detection of plant diseases remains difficult due to the lack of lab infrastructure and expertise. In this paper, we explore the possibility of computer vision approaches for scalable and early plant disease detection. The lack of availability of a sufficiently large-scale non-lab data set remains a major challenge for enabling vision based plant disease detection. Against this background, we present PlantDoc: a dataset for visual plant disease detection. Our dataset contains 2,598 data points in total across 13 plant species and up to 17 classes of diseases, involving approximately...

10.1145/3371158.3371196 preprint EN 2020-01-05

The novel Coronavirus has been an unexpected catastrophe which no one could have foreseen. It has emerged as a pandemic and raised questions about the health infrastructure and facilities available, as it is required to get a large number of people tested. RT-PCR is the standard diagnostic test being used. But there are several issues with RT-PCR testing, like relying only on the RT-PCR approach when it is getting difficult for most countries to procure a large amount of kits, and also, in some cases, false positive results. All these factors definitely...

10.1109/iciccs51141.2021.9432134 article EN 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS) 2021-05-06

This paper describes a CRF based token level language identification system entry to the Language Identification in Code-Switched (CS) Data task of CodeSwitch 2014. Our system hinges on using conditional posterior probabilities for the individual codes (words) in code-switched data to solve this task. We also experiment with other linguistically motivated language specific as well as generic features to train the CRF based sequence labeling algorithm, achieving reasonable results.

10.3115/v1/w14-3910 article EN cc-by 2014-01-01

Large pre-trained language models such as GPT-3, Codex, and Google's language model are now capable of generating code from natural language specifications of programmer intent. We view these developments with a mixture of optimism and caution. On the optimistic side, such large language models have the potential to improve productivity by providing an automated AI pair programmer for every programmer in the world. On the cautionary side, since these large language models do not understand program semantics, they offer no guarantees about the quality of the suggested code. In this paper, we present an approach to augment...

10.48550/arxiv.2112.02969 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Automated software engineering has been greatly empowered by the recent advances in Large Language Models (LLMs) for programming. While current benchmarks have shown that LLMs can perform various software engineering tasks like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks. Solving challenging and practical programming tasks requires the capability of utilizing diverse function calls as tools to efficiently implement functionalities like data analysis and web development. In...

10.48550/arxiv.2406.15877 preprint EN arXiv (Cornell University) 2024-06-22

Cross Site Scripting (XSS) and clickjacking have been ranked among the top web application threats in recent times. This paper introduces XBuster - our client-side defence against XSS, implemented as an extension to the Mozilla Firefox browser. XBuster splits each HTTP request parameter into HTML and JavaScript contexts and stores them separately. It searches for both contexts in the HTTP response and handles each context type differently. XBuster defends against all XSS attack vectors including partial script injection, attribute injection and HTML injection....

10.1109/iccnc.2016.7440629 article EN 2016 International Conference on Computing, Networking and Communications (ICNC) 2016-02-01

The potential therapeutic role of Fenugreek saponin against Alzheimer's disease: Evaluation of apoptotic and acetylcholinesterase inhibitory activities
Wagdy K. B. Khalil, Hanaa M. Roshdy, Salwa Kassem

10.7324/japs.2021.11s107 article EN Journal of Applied Pharmaceutical Science 2021-03-05

This paper provides a comprehensive and exhaustive study of adversarial attacks on human pose estimation models and the evaluation of their robustness. Besides highlighting the important differences between well-studied classification and human pose-estimation systems w.r.t. adversarial attacks, we also provide deep insights into the design choices of pose-estimation systems to shape future work. We benchmark the robustness of several 2D single person pose-estimation architectures trained on multiple datasets, MPII and COCO. In doing so, we also explore the problem of attacking non-classification...

10.48550/arxiv.1908.06401 preprint EN cc-by arXiv (Cornell University) 2019-01-01