NFDI4DS | UHH-SEMS - Publication Details

Chetan Bansal

ORCID: 0000-0003-0102-8139

About

Contact & Profiles

A5101967802

Research Areas

Software System Performance and Reliability
Software Engineering Research
Cloud Computing and Resource Management
Data Quality and Management
Anomaly Detection Techniques and Applications
Topic Modeling
Network Security and Intrusion Detection
Software Testing and Debugging Techniques
Software Engineering Techniques and Practices
Web Data Mining and Analysis
IoT and Edge/Fog Computing
Advanced Malware Detection Techniques
Cloud Data Security Solutions
Distributed and Parallel Computing Systems
Spam and Phishing Detection
Big Data and Business Intelligence
Information Retrieval and Search Behavior
Natural Language Processing Techniques
Security and Verification in Computing
Caching and Content Delivery
Web Application Security Vulnerabilities
Data Stream Mining Techniques
Advanced Computational Techniques and Applications
Open Source Software Innovations
Semantic Web and Ontologies

Microsoft (United States)
2012-2025

Google (United States)
2025

Seattle University
2025

University of Washington
2025

University of California, Los Angeles
2025

University of Illinois Urbana-Champaign
2025

Menlo School
2025

Microsoft (Germany)
2024

University of Chinese Academy of Sciences
2024

Microsoft Research (United Kingdom)
2019-2024

Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models

OPENALEX - Publications

Toufique Ahmed Supriyo Ghosh Chetan Bansal Thomas Zimmermann Xuchao Zhang and 1 more

Incident management for cloud services is a complex process involving several steps and has a huge impact on both service health and developer productivity. On-call engineers require a significant amount of domain knowledge and manual effort for root causing and mitigation of production incidents. Recent advances in artificial intelligence have resulted in state-of-the-art large language models like GPT-3.x (both GPT-3.0 and GPT-3.5), which have been used to solve a variety of problems ranging from question answering to text...

10.1109/icse48619.2023.00149 article EN 2023-05-01

Discovering Concrete Attacks on Website Authorization by Formal Analysis

OPENALEX - Publications

Chetan Bansal Karthikeyan Bhargavan Sergio Maffeis

Social sign-on and social sharing are becoming an ever more popular feature of web applications. This success is largely due to the APIs and support offered by prominent networks, such as Facebook, Twitter, and Google, on the basis of new open standards such as the OAuth 2.0 authorization protocol. A formal analysis of these protocols must account for malicious websites and common application vulnerabilities, such as cross-site request forgery and redirectors. We model several configurations of the protocol in the applied pi-calculus and verify them...

10.1109/csf.2012.27 preprint EN 2012-06-01

Discovering concrete attacks on website authorization by formal analysis

OPENALEX - Publications

Chetan Bansal Karthikeyan Bhargavan Antoine Delignat-Lavaud Sergio Maffeis

Social sign-on and social sharing are becoming an ever more popular feature of web applications. This success is largely due to the APIs and support offered by prominent networks, such as Facebook, Twitter, and Google, on the basis of new open standards such as the OAuth 2.0 authorization protocol. A formal analysis of these protocols must account for malicious websites and common application vulnerabilities, such as cross-site request forgery and redirectors. We model several configurations of the protocol in the applied pi-calculus and verify them...

10.3233/jcs-140503 article EN Journal of Computer Security 2014-04-23

Automated Root Causing of Cloud Incidents using In-Context Learning with GPT-4

OPENALEX - Publications

Xuchao Zhang Supriyo Ghosh Chetan Bansal Rujia Wang Minghua Ma and 2 more

10.1145/3663529.3663846 article EN 2024-07-10

Designing Cloud Servers for Lower Carbon

OPENALEX - Publications

Jaylen Wang Daniel S. Berger Fiodar Kazhamiaka Celine Irvene Chaojie Zhang and 8 more

10.1109/isca59077.2024.00041 article EN 2024-06-29

WhoDo: automating reviewer suggestions at scale

OPENALEX - Publications

Sumit Asthana Rahul Kumar Ranjita Bhagwan Christian Bird Chetan Bansal and 3 more

Today's software development is distributed and involves continuous changes for new features, and yet their development cycle has to be fast and agile. An important component of enabling this agility is selecting the right reviewers for every code-change - the smallest unit of the development cycle. Modern tool-based code review is proven to be an effective way to achieve appropriate review of software changes. However, the selection of reviewers in these systems is at best manual. As software development teams scale, this poses a challenge of selecting the right reviewers, which in turn determines software quality over time. While previous work...

10.1145/3338906.3340449 article EN 2019-08-09

Predicting pull request completion time: a case study on large scale cloud services

OPENALEX - Publications

Chandra Maddila Chetan Bansal Nachiappan Nagappan

Effort estimation models have been long studied in software engineering research. They help organizations and individuals plan and track the progress of their projects and individual tasks to plan delivery milestones better. Towards this end, there is a large body of work that has been done on effort estimation for projects but little at an individual checkin (Pull Request) level. In this paper we present a methodology that provides effort estimates for developer check-ins, which are displayed to developers to help them track work items. Given the cloud development infrastructure pervasive in companies, it...

10.1145/3338906.3340457 article EN 2019-08-09

How to fight production incidents?

OPENALEX - Publications

Supriyo Ghosh Manish Shetty Chetan Bansal Suman Nath

Production incidents in today's large-scale cloud services can be extremely expensive in terms of customer impacts and engineering resources required to mitigate them. Despite continuous reliability efforts, cloud services still experience severe incidents due to various root-causes. Worse, many of these incidents last for a long period as existing techniques and practices fail to quickly detect and mitigate them. To better understand the problems, we carefully study hundreds of recent high severity incidents and their postmortems in Microsoft-Teams, a distributed cloud based service...

10.1145/3542929.3563482 article EN 2022-11-07

Exploring LLM-Based Agents for Root Cause Analysis

OPENALEX - Publications

Devjeet Roy Xuchao Zhang Rashi Bhave Chetan Bansal Pedro Las-Casas and 2 more

10.1145/3663529.3663841 article EN 2024-07-10

DeCaf

OPENALEX - Publications

Chetan Bansal Sundararajan Renganathan Ashima Asudani Olivier Midy Mathru Janakiraman

Large scale cloud services use Key Performance Indicators (KPIs) for tracking and monitoring performance. They usually have Service Level Objectives (SLOs) baked into the customer agreements which are tied to these KPIs. Dependency failures, code bugs, infrastructure failures, and other problems can cause performance regressions. It is critical to minimize the time and manual effort in diagnosing and triaging such issues to reduce customer impact. The volume of logs and the mixed type of attributes (categorical, continuous) makes diagnosis...

10.1145/3377813.3381353 preprint EN 2020-06-27

X-Lifecycle Learning for Cloud Incident Management using LLMs

OPENALEX - Publications

Drishti Goel Fiza Husain Aditya Singh Supriyo Ghosh Anjaly Parayil and 3 more

10.1145/3663529.3663861 article EN 2024-07-10

Nudge: Accelerating Overdue Pull Requests toward Completion

OPENALEX - Publications

Chandra Maddila Sai Surya Upadrasta Chetan Bansal Nachiappan Nagappan Georgios Gousios and 1 more

Pull requests are a key part of the collaborative software development and code review process today. However, pull requests can also slow down the software development process when the reviewer(s) or the author do not actively engage with the pull request. In this work, we design an end-to-end service, Nudge, for accelerating overdue pull requests towards completion by reminding the author or the reviewer(s) to engage with their overdue pull requests. First, we use models based on effort estimation and machine learning to predict the completion time for a given pull request. Second, we use activity detection to filter out pull requests that may be overdue, but for which sufficient action...

10.1145/3544791 article EN ACM Transactions on Software Engineering and Methodology 2022-06-25

Towards Cloud Efficiency with Large-scale Workload Characterization

OPENALEX - Publications

Anjaly Parayil Jue Zhang Xiaoting Qin Íñigo Goiri Lexiang Huang and 2 more

Cloud providers introduce features (e.g., Spot VMs, Harvest VMs, and Burstable VMs) and optimizations (e.g., oversubscription, auto-scaling, power harvesting, and overclocking) to improve efficiency and reliability. To effectively utilize these features, it's crucial to understand the characteristics of workloads running in the cloud. However, workload characteristics can be complex and depend on multiple signals, making manual characterization difficult and unscalable. In this study, we conduct the first large-scale examination of first-party workloads at...

10.48550/arxiv.2405.07250 preprint EN arXiv (Cornell University) 2024-05-12

MonitorAssistant: Simplifying Cloud Service Monitoring via Large Language Models

OPENALEX - Publications

Zhaoyang Yu Minghua Ma Chaoyun Zhang Si Qin Yu Kang and 7 more

10.1145/3663529.3663826 article EN 2024-07-10

Coach: Exploiting Temporal Patterns for All-Resource Oversubscription in Cloud Platforms

OPENALEX - Publications

Benjamin Reidys Pantea Zardoshti Íñigo Goiri Celine Irvene Daniel S. Berger and 14 more

Cloud platforms remain underutilized despite multiple proposals to improve their utilization (e.g., disaggregation, harvesting, and oversubscription). Our characterization of the resource utilization of virtual machines (VMs) in Azure reveals that, while CPU is the main resource, we need to provide a solution to manage all resources holistically. We also observe that many VMs exhibit complementary temporal patterns, which can be leveraged for oversubscription of resources. Based on these insights, we propose Coach: a system...

10.1145/3669940.3707226 preprint EN 2025-02-03

Towards Efficient Large Multimodal Model Serving

OPENALEX - Publications

H. Qiu A. K. Biswas Zihan Zhao Jayashree Mohan Alind Khare and 7 more

Recent advances in generative AI have led to large multi-modal models (LMMs) capable of simultaneously processing inputs of various modalities such as text, images, video, and audio. While these models demonstrate impressive capabilities, efficiently serving them in production environments poses significant challenges due to their complex architectures and heterogeneous resource requirements. We present the first comprehensive systems analysis of two prominent LMM architectures, decoder-only and cross-attention, on...

10.48550/arxiv.2502.00937 preprint EN arXiv (Cornell University) 2025-02-02

Verifiable Format Control for Large Language Model Generations

OPENALEX - Publications

Zhaoyang Wang Jinqi Jiang Huichi Zhou Wenhao Zheng Xuchao Zhang and 2 more

Recent Large Language Models (LLMs) have demonstrated satisfying general instruction following ability. However, small LLMs with about 7B parameters still struggle with fine-grained format following (e.g., JSON format), which seriously hinders the advancement of their applications. Most existing methods focus on benchmarking general instruction following while overlooking how to improve the specific format following ability of small LLMs. Besides, these methods often rely on evaluations based on advanced LLMs (e.g., GPT-4), which can introduce intrinsic bias and be costly due to API calls. In this...

10.48550/arxiv.2502.04498 preprint EN arXiv (Cornell University) 2025-02-06

Intent-based System Design and Operation

OPENALEX - Publications

Vaastav Anand Yichen Li Alok Gautam Kumbhare Celine Irvene Chetan Bansal and 4 more

Cloud systems are the backbone of today's computing industry. Yet, these systems remain complicated to design, build, operate, and improve. All these tasks require significant manual effort by both the developers and operators of these systems. To reduce this manual burden, in this paper we set forth a vision for achieving holistic automation, intent-based system design and operation. We propose intent as a new abstraction within the context of system design and operation. Intent encodes the functional and operational requirements of the system at a high level, which can be used to automate...

10.48550/arxiv.2502.05984 preprint EN arXiv (Cornell University) 2025-02-09

Serving Models, Fast and Slow: Optimizing Heterogeneous LLM Inferencing Workloads at Scale

OPENALEX - Publications

Shashwat Jaiswal Kunal Jain Yogesh Simmhan Anjaly Parayil Ankur Mallick and 7 more

Large Language Model (LLM) inference workloads handled by global cloud providers can include both latency-sensitive and insensitive tasks, creating a diverse range of Service Level Agreement (SLA) requirements. Managing these mixed workloads is challenging due to the complexity of the inference stack, which includes multiple LLMs, hardware configurations, and geographic distributions. Current optimization strategies often silo the tasks to ensure that SLAs are met for latency-sensitive tasks, but this leads to significant under-utilization of expensive GPU...

10.48550/arxiv.2502.14617 preprint EN arXiv (Cornell University) 2025-02-20

Performance Aware LLM Load Balancer for Mixed Workloads

OPENALEX - Publications

Kunal Jain Anjaly Parayil Ankur Mallick Esha Choukse Xiaoting Qin and 8 more

10.1145/3721146.3721947 article EN cc-by 2025-03-30

Towards Workload-aware Cloud Efficiency: A Large-scale Empirical Study of Cloud Workload Characteristics

OPENALEX - Publications

Anjaly Parayil Jue Zhang Xiaoting Qin Íñigo Goiri Lexiang Huang and 2 more

10.1145/3676151.3722008 article EN 2025-05-03

Enabling Sustainable Cloud Computing with Low-Carbon Server Design

OPENALEX - Publications

Jaylen Wang Daniel S. Berger Fiodar Kazhamiaka Celine Irvene Chaojie Zhang and 8 more

10.1109/mm.2025.3572955 article EN IEEE Micro 2025-01-01

Neural Knowledge Extraction From Cloud Service Incidents

OPENALEX - Publications

Manish Shetty Chetan Bansal Sumit Kumar Nikitha Rao Nachiappan Nagappan and 1 more

The move from boxed products to services and the widespread adoption of cloud computing has had a huge impact on the software development life cycle and DevOps processes. Particularly, incident management has become critical for developing and operating large-scale services. Prior work has heavily focused on the challenges with incident triaging and de-duplication. In this work, we address the fundamental problem of structured knowledge extraction from service incidents. We have built SoftNER, a framework for unsupervised knowledge extraction, and frame the problem as a Named-Entity...

10.1109/icse-seip52600.2021.00031 article EN 2021-05-01
