NFDI4DS | UHH-SEMS - Publication Details

Naman Jain

ORCID: 0009-0004-4262-0555

About

Contact & Profiles

A5101640007

Research Areas

Topic Modeling
Natural Language Processing Techniques
Advanced Malware Detection Techniques
Software Engineering Research
Human Pose and Action Recognition
Software Testing and Debugging Techniques
Data Quality and Management
Blockchain Technology Applications and Security
Model-Driven Software Engineering Techniques
AI in Service Interactions
Semantic Web and Ontologies
Anomaly Detection Techniques and Applications
Handwritten Text Recognition Techniques
Machine Learning and Data Classification
Data Stream Mining Techniques
Security and Verification in Computing
Infection Control and Ventilation
Adversarial Robustness in Machine Learning
COVID-19 and Mental Health
Hand Gesture Recognition Systems
ICT in Developing Communities
Hate Speech and Cyberbullying Detection
Parallel Computing and Optimization Techniques
AI in cancer detection
Multimodal Machine Learning Applications

Institute of Engineering
2024

ABES Engineering College
2024

Pandit Bhagwat Dayal Sharma Post Graduate Institute of Medical Sciences
2024

Delhi Technological University
2020-2024

LNM Institute of Information Technology
2024

Amity University
2018-2023

Indian Institute of Technology Roorkee
2023

Galgotias University
2022-2023

Sathyabama Institute of Science and Technology
2023

Centre for Development of Advanced Computing
2023

Jigsaw

OPENALEX - Publications

Naman Jain Skanda Vaidyanath Arun Iyer Nagarajan Natarajan Suresh Parthasarathy and 2 more

Large pre-trained language models such as GPT-3 [10], Codex [11], and Google's language model [7] are now capable of generating code from natural language specifications of programmer intent. We view these developments with a mixture of optimism and caution. On the optimistic side, such large language models have the potential to improve productivity by providing an automated AI pair programmer for every programmer in the world. On the cautionary side, since these large language models do not understand program semantics, they offer no guarantees about the quality of the suggested code. In this paper, we present...

10.1145/3510003.3510203 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

RAFT: Adapting Language Model to Domain Specific RAG

OPENALEX - Publications

Tianjun Zhang Shishir G. Patil Naman Jain Sheng Shen Matei Zaharia and 2 more

Pretraining Large Language Models (LLMs) on large corpora of textual data is now a standard paradigm. When using these LLMs for many downstream applications, it is common to additionally bake in new knowledge (e.g., time-critical news, or private domain knowledge) into the pretrained model either through RAG-based prompting or fine-tuning. However, the optimal methodology for the model to gain such new knowledge remains an open question. In this paper, we present Retrieval Augmented FineTuning (RAFT), a training recipe that improves...

10.48550/arxiv.2403.10131 preprint EN arXiv (Cornell University) 2024-03-15

Phosphate removal from urban stormwater runoff using Canna lily and Cyperus alternifolius-based bioretention system

OPENALEX - Publications

Naman Jain Shivani Yadav Sonam Taneja Sanak Ray A. K. Haritash and 1 more

10.1007/s40899-024-01076-5 article EN Sustainable Water Resources Management 2024-03-02

Copilot Arena: A Platform for Code LLM Evaluation in the Wild

OPENALEX - Publications

Wayne Chi Valerie Chen Anastasios N. Angelopoulos Wei-Lin Chiang Anuj Mittal and 5 more

Evaluating in-the-wild coding capabilities of large language models (LLMs) is a challenging endeavor with no clear solution. We introduce Copilot Arena, a platform to collect user preferences for code generation through native integration into a developer's working environment. Copilot Arena comprises a novel interface for comparing pairs of model outputs, a sampling strategy optimized to reduce latency, and a prompting scheme to enable code completion functionality. Copilot Arena has served over 4.5 million suggestions from 10 models and collected...

10.48550/arxiv.2502.09328 preprint EN arXiv (Cornell University) 2025-02-13

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

OPENALEX - Publications

Naman Jain King Han Alex Gu Wen-Ding Li Fanjia Yan and 5 more

Large Language Models (LLMs) applied to code-related applications have emerged as a prominent field, attracting significant interest from both academia and industry. However, as new and improved LLMs are developed, existing evaluation benchmarks (e.g., HumanEval, MBPP) are no longer sufficient for assessing their capabilities. In this work, we propose LiveCodeBench, a comprehensive and contamination-free evaluation of LLMs for code, which continuously collects new problems over time from contests across three competition platforms,...

10.48550/arxiv.2403.07974 preprint EN arXiv (Cornell University) 2024-03-12

Performance Analysis of Object Detection and Tracking Algorithms for Traffic Surveillance Applications using Neural Networks

OPENALEX - Publications

Naman Jain Shreesha Yerragolla Tanuja Guha Mohana

Single object detection has been performed using the concepts of convolution layers. A neural network consists of several different layers, such as the input layer, at least one hidden layer, and an output layer. The dataset used for single object detection is an on-road vehicle dataset. This dataset consists of three classes of images, which are Heavy, Auto and Light, under varying illuminations. The performance metrics are calculated for the day dataset, the evening dataset and the night dataset. Multiple object detection has been performed using the You Only Look Once (YOLOv3) algorithm. The approach encompasses a deep neural network dividing the input into a cell grid and each...

10.1109/i-smac47947.2019.9032502 article EN 2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) 2019-12-01

Revisiting Prompt Engineering via Declarative Crowdsourcing

OPENALEX - Publications

Aditya Parameswaran Shreya Shankar Parth Asawa Naman Jain Yujie Wang

Large language models (LLMs) are incredibly powerful at comprehending and generating data in the form of text, but are brittle and error-prone. There has been an advent of toolkits and recipes centered around so-called prompt engineering, the process of asking an LLM to do something via a series of prompts. However, for LLM-powered data processing workflows, in particular, optimizing for quality, while keeping cost bounded, is a tedious, manual process. We put forth a vision for declarative prompt engineering. We view LLMs like crowd workers...

10.48550/arxiv.2308.03854 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01

What’s in a Name? Are BERT Named Entity Representations just as Good for any other Name?

OPENALEX - Publications

Sriram Balasubramanian Naman Jain Gaurav Jindal Abhijeet Awasthi Sunita Sarawagi

We evaluate named entity representations of BERT-based NLP models by investigating their robustness to replacements from the same typed class in the input. We highlight that on several tasks, while such perturbations are natural, state-of-the-art trained models are surprisingly brittle. The brittleness continues even with the recent entity-aware BERT models. We also try to discern the cause of this non-robustness, considering factors such as tokenization and frequency of occurrence. Then we provide a simple method that ensembles predictions...

10.18653/v1/2020.repl4nlp-1.24 article EN cc-by 2020-01-01

Serum Resistin as a Potential Mortality Predictor in Neonatal Sepsis

OPENALEX - Publications

Rashika Jain Rohan Acharya Kumud Kapil N. Bhalla Dinkar Yadav and 2 more

Aim: The aim of this study was to investigate the utility of serum resistin levels as a prognostic indicator for mortality in neonates diagnosed with sepsis. Methodology: This one-year prospective study at Pandit Bhagwat Dayal Sharma Post Graduate Institute of Medical Sciences (PGIMS), Rohtak, India, included 151 neonates categorized into two groups based on blood culture results: group 1 (n=86), those with culture-negative, probable sepsis, and group 2 (n=65), those with culture-positive, proven sepsis. Blood samples were obtained pre-treatment...

10.7759/cureus.55289 article EN Cureus 2024-02-29

AgriBot: Agriculture-Specific Question Answer System

OPENALEX - Publications

Naman Jain Pranjali Jain Pratik Kayal Jayakrishna Sahit Soham Pachpande and 2 more

India is an agro-based economy, and proper information about agricultural practices is the key to optimal agricultural growth and output. In order to answer the queries of the farmer, we have built an agricultural chatbot based on the dataset from Kisan Call Center. This system is robust enough to answer queries related to weather, market rates, plant protection and government schemes. The system is available 24*7, can be accessed through any electronic device, and the information is delivered with ease of understanding. The system is based on a sentence embedding model which gives an accuracy of 56%. After eliminating synonyms...

10.35543/osf.io/3qp98 preprint EN 2019-06-11

PlantDoc

OPENALEX - Publications

Davinder Singh Naman Jain Pranjali Jain Pratik Kayal Sudhakar Kumawat and 1 more

India loses 35% of the annual crop yield due to plant diseases. Early detection of plant diseases remains difficult due to the lack of lab infrastructure and expertise. In this paper, we explore the possibility of computer vision approaches for scalable and early plant disease detection. The lack of availability of a sufficiently large-scale non-lab data set remains a major challenge for enabling vision-based plant disease detection. Against this background, we present PlantDoc: a dataset for visual plant disease detection. Our dataset contains 2,598 data points in total across 13 plant species and up to 17 classes of diseases, involving approximately...

10.1145/3371158.3371196 preprint EN 2020-01-05

COVID-19 Detection using Convolutional Neural Network Architectures based upon Chest X-rays Images

OPENALEX - Publications

Ram Murti Rawat Shivam Garg Naman Jain Gagan Gupta

The novel Coronavirus has really been the unexpected catastrophe, which no one could have even thought of. It has emerged as a pandemic and raised questions on the health infrastructure and facilities available, and it is required to get a large number of people tested. RT-PCR is the standard diagnostic test being used. But there are several issues with the testing, like relying only on the RT-PCR approach when it is getting difficult for most countries to procure a large amount of kits; there are also some cases of false positive results. All these factors definitely...

10.1109/iciccs51141.2021.9432134 article EN 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS) 2021-05-06

Language Identification in Code-Switching Scenario

OPENALEX - Publications

Naman Jain Riyaz Ahmad Bhat

This paper describes a CRF-based token-level language identification system entry to the Language Identification in Code-Switched (CS) Data task of CodeSwitch 2014. Our system hinges on using conditional posterior probabilities for the individual codes (words) in code-switched data to solve the language identification task. We also experiment with other linguistically motivated language-specific as well as generic features to train the CRF-based sequence labeling algorithm, achieving reasonable results.

10.3115/v1/w14-3910 article EN cc-by 2014-01-01

Jigsaw: Large Language Models meet Program Synthesis

OPENALEX - Publications

Naman Jain Skanda Vaidyanath Arun Iyer Nagarajan Natarajan Suresh Parthasarathy and 2 more

Large pre-trained language models such as GPT-3, Codex, and Google's language model are now capable of generating code from natural language specifications of programmer intent. We view these developments with a mixture of optimism and caution. On the optimistic side, such large language models have the potential to improve productivity by providing an automated AI pair programmer for every programmer in the world. On the cautionary side, since these large language models do not understand program semantics, they offer no guarantees about the quality of the suggested code. In this paper, we present an approach to augment...

10.48550/arxiv.2112.02969 preprint EN cc-by arXiv (Cornell University) 2021-01-01

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

OPENALEX - Publications

Terry Yue Zhuo Minh Chien Vu Jenny Chim Han Hu Wenhao Yu and 28 more

Automated software engineering has been greatly empowered by the recent advances in Large Language Models (LLMs) for programming. While current benchmarks have shown that LLMs can perform various software engineering tasks like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks. Solving challenging and practical programming tasks requires the capability of utilizing diverse function calls as tools to efficiently implement functionalities like data analysis and web development. In...

10.48550/arxiv.2406.15877 preprint EN arXiv (Cornell University) 2024-06-22

The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?

OPENALEX - Publications

Alex Gu Wen-Ding Li Naman Jain Theo Olausson Celine Lee and 2 more

10.18653/v1/2024.findings-acl.7 article EN Findings of the Association for Computational Linguistics: ACL 2024 2024-01-01

Two for the price of one: A combined browser defense against XSS and clickjacking

OPENALEX - Publications

Kanpata Sudhakara Rao Naman Jain Nikhil Limaje Abhilash Gupta Mridul Jain and 1 more

Cross Site Scripting (XSS) and clickjacking have been ranked among the top web application threats in recent times. This paper introduces XBuster, our client-side defence against XSS, implemented as an extension to the Mozilla Firefox browser. XBuster splits each HTTP request parameter into HTML and JavaScript contexts and stores them separately. It searches for both contexts in the HTTP response and handles each context type differently. XBuster defends against all XSS attack vectors including partial script injection, attribute injection and HTML injection....

10.1109/iccnc.2016.7440629 article EN 2016 International Conference on Computing, Networking and Communications (ICNC) 2016-02-01

Improving reliability and reducing cost of task execution on preemptible VM instances using machine learning approach

OPENALEX - Publications

Ashish Kumar Mishra Dharmendra Kumar Yadav Yogesh Kumar Naman Jain

10.1007/s11227-018-2717-7 article EN The Journal of Supercomputing 2018-12-08

Effect of abiotic factors on bacoside A content, acetylcholinesterase inhibitory and antioxidant activities of Bacopa monnieri (L.) Wettst

OPENALEX - Publications

Varinder Singh Naman Jain Richa Shri

The potential therapeutic role of Fenugreek saponin against Alzheimer's disease: Evaluation of apoptotic and acetylcholinesterase inhibitory activities. Wagdy K. B. Khalil, Hanaa M. Roshdy, Salwa Kassem

10.7324/japs.2021.11s107 article EN Journal of Applied Pharmaceutical Science 2021-03-05

On the Robustness of Human Pose Estimation

OPENALEX - Publications

Sahil Shah Naman Jain Abhishek Sharma Arjun Jain

This paper provides a comprehensive and exhaustive study of adversarial attacks on human pose estimation models and the evaluation of their robustness. Besides highlighting the important differences between well-studied classification and human pose-estimation systems w.r.t. adversarial attacks, we also provide deep insights into the design choices of pose-estimation systems to shape future work. We benchmark the robustness of several 2D single person pose-estimation architectures trained on multiple datasets, MPII and COCO. In doing so, we also explore the problem of attacking non-classification...

10.48550/arxiv.1908.06401 preprint EN cc-by arXiv (Cornell University) 2019-01-01
