NFDI4DS | UHH-SEMS - Publication Details

Chanchal K. Roy

ORCID: 0000-0003-0519-6164

About

Contact & Profiles

A5102756770

Research Areas

Software Engineering Research
Software Reliability and Analysis Research
Software Testing and Debugging Techniques
Advanced Malware Detection Techniques
Software System Performance and Reliability
Web Data Mining and Analysis
Software Engineering Techniques and Practices
Topic Modeling
Open Source Software Innovations
Advanced Software Engineering Methodologies
Scientific Computing and Data Management
Wikis in Education and Collaboration
Distributed and Parallel Computing Systems
Expert finding and Q&A systems
Web Application Security Vulnerabilities
Mobile Crowdsensing and Crowdsourcing
Cloud Computing and Resource Management
Research Data Management Practices
Data Visualization and Analytics
Artificial Intelligence in Healthcare and Education
Data Mining Algorithms and Applications
Natural Language Processing Techniques
Advanced Text Analysis Techniques
Data Stream Mining Techniques
Evolutionary Algorithms and Applications

University of Saskatchewan
2016-2025

McGill University
2020-2021

Polytechnique Montréal
2019-2021

Bangladesh Agricultural University
2021

Chattogram Veterinary and Animal Sciences University
2020

University of Waterloo
2013-2019

Università della Svizzera italiana
2019

Florida State University
2019

Drew University
2019

Wayne State University
2019

Comparison and evaluation of code clone detection techniques and tools: A qualitative approach

OPENALEX - Publications

Chanchal K. Roy James R. Cordy Rainer Koschke

10.1016/j.scico.2009.02.007 article EN publisher-specific-oa Science of Computer Programming 2009-03-11

NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization

OPENALEX - Publications

Chanchal K. Roy James R. Cordy

This paper examines the effectiveness of a new language-specific, parser-based but lightweight clone detection approach. Exploiting a novel application of a source transformation system, the method accurately finds near-miss clones using an efficient text line comparison technique. The system assists in three ways. First, using agile parsing, it provides user-specified flexible pretty-printing to remove noise, standardize formatting and break program statements into parts such that potential changes can be...

10.1109/icpc.2008.41 article EN 2008-06-01

SourcererCC

OPENALEX - Publications

Hitesh Sajnani Vaibhav Saini Jeffrey Svajlenko Chanchal K. Roy Cristina Videira Lopes

Despite a decade of active research, there is a marked lack of clone detectors that scale to very large repositories of source code, in particular for detecting near-miss clones where significant editing activities may take place in the cloned code. We present SourcererCC, a token-based detector that targets three clone types, and exploits an index to achieve scalability to inter-project repositories using a standard workstation. SourcererCC uses an optimized inverted-index to quickly query the potential clones of a given code block. Filtering heuristics...

10.1145/2884781.2884877 preprint EN Proceedings of the 38th International Conference on Software Engineering 2016-05-13

Towards a Big Data Curated Benchmark of Inter-project Code Clones

OPENALEX - Publications

Jeffrey Svajlenko Judith F. Islam Iman Keivanloo Chanchal K. Roy Mohammad Mamun Mia

Recently, new applications of code clone detection and search have emerged that rely upon clones detected across thousands of software systems. Big data algorithms have been proposed as an embedded part of these applications. However, there exists no previous benchmark for evaluating the recall and precision of these emerging techniques. In this paper, we present a Big Data benchmark that consists of known true and false positive clones in an inter-project Java repository. The benchmark was built by mining and then manually checking clones of ten common functionalities....

10.1109/icsme.2014.77 article EN 2014-09-01

Answering questions about unanswered questions of Stack Overflow

OPENALEX - Publications

Muhammad Asaduzzaman Ahmed Shah Mashiyat Chanchal K. Roy Kevin A. Schneider

Community-based question answering services accumulate large volumes of knowledge through the voluntary services of people across the globe. Stack Overflow is an example of such a service that targets developers and software engineers. In general, questions in Stack Overflow are answered in a very short time. However, we found that the number of unanswered questions has increased significantly in the past two years. Understanding why questions remain unanswered can help information seekers improve the quality of their questions, increase their chances of getting answers, and better decide when to...

10.1109/msr.2013.6624015 article EN 2013-05-01

The NiCad Clone Detector

OPENALEX - Publications

James R. Cordy Chanchal K. Roy

The NiCad Clone Detector is a scalable, flexible clone detection tool designed to implement the NiCad (Automated Detection of Near-Miss Intentional Clones) hybrid method in a convenient, easy-to-use command-line tool that can easily be embedded in IDEs and other environments. It takes as input a source directory or directories to be checked for clones and a configuration file specifying the normalization and filtering to be done, and provides output results in both XML form for easy analysis and HTML form for convenient browsing. NiCad handles a range of languages...

10.1109/icpc.2011.26 article EN 2011-06-01

A Mutation/Injection-Based Automatic Framework for Evaluating Code Clone Detection Tools

OPENALEX - Publications

Chanchal K. Roy James R. Cordy

In recent years many methods and tools for software clone detection have been proposed. While some work has been done on assessing and comparing the performance of these tools, very little empirical evaluation has been done. In particular, accuracy measures such as precision and recall have only been roughly estimated, due both to problems in creating a validated benchmark against which tools can be compared, and to the manual effort required to hand check large numbers of candidate clones. In this paper we propose an automated method for empirically...

10.1109/icstw.2009.18 article EN International Conference on Software Testing, Verification and Validation Workshops 2009-01-01

Evaluating clone detection tools with BigCloneBench

OPENALEX - Publications

Jeffrey Svajlenko Chanchal K. Roy

Many clone detection tools have been proposed in the literature. However, our knowledge of their performance in real software systems is limited, particularly their recall. In this paper, we use a big data benchmark, BigCloneBench, to evaluate the recall of ten tools. BigCloneBench is a collection of eight million validated clones within IJaDataset-2.0, a repository containing 25,000 open-source Java systems. It contains both intra-project and inter-project clones of the four primary types. We benchmark per type across the entire range...

10.1109/icsm.2015.7332459 article EN 2015-09-01

RACK: Automatic API Recommendation Using Crowdsourced Knowledge

OPENALEX - Publications

Mohammad Masudur Rahman Chanchal K. Roy David Lo

Traditional code search engines often do not perform well with natural language queries since they mostly apply keyword matching. These engines thus need carefully designed queries containing information about programming APIs for search. Unfortunately, existing studies suggest that preparing an effective query is both challenging and time consuming for the developers. In this paper, we propose a novel API recommendation technique -- RACK -- that recommends a list of relevant APIs by exploiting keyword-API associations from...

10.1109/saner.2016.80 preprint EN 2016-03-01

An Empirical Study of Function Clones in Open Source Software

OPENALEX - Publications

Chanchal K. Roy James R. Cordy

The new hybrid clone detection tool NICAD combines the strengths and overcomes the limitations of both text-based and AST-based techniques to yield highly accurate identification of cloned code in software systems. In this paper, we present a first empirical study of function clones in open source software using NICAD. We examine more than 15 C and Java systems, including the entire Linux Kernel and Apache httpd, and analyze their use of clones in several different dimensions, including language, size, location and density by proportion of functions. We manually...

10.1109/wcre.2008.54 article EN 2008-10-01

Mining duplicate questions in stack overflow

OPENALEX - Publications

Muhammad Ahasanuzzaman Muhammad Asaduzzaman Chanchal K. Roy Kevin A. Schneider

Stack Overflow is a popular question answering site that is focused on programming problems. Despite efforts to prevent asking questions that have already been answered, the site contains duplicate questions. This may cause developers to unnecessarily wait for a question to be answered when it has already been asked and answered. The site currently depends on its moderators and users with high reputation to manually mark those questions as duplicates, which not only results in delayed responses but also requires additional efforts. In this paper, we first...

10.1145/2901739.2901770 article EN 2016-05-14

Evaluating Modern Clone Detection Tools

OPENALEX - Publications

Jeffrey Svajlenko Chanchal K. Roy

Many clone detection tools and techniques have been introduced in the literature, and these tools have been used to manage clones and study their effects on software maintenance and evolution. However, the performance of modern tools is not well known, especially recall. In this paper, we evaluate and compare the recall of eleven tools using four benchmark frameworks, including: (1) Bellon's Framework, (2) our modification of Bellon's Framework to improve the accuracy of its matching metrics, (3) Murakami et al.'s extension of Bellon's Framework, which adds type 3 gap awareness to the framework,...

10.1109/icsme.2014.54 article EN 2014-09-01

CCAligner

OPENALEX - Publications

Pengcheng Wang Jeffrey Svajlenko Yanzhao Wu Yun Xu Chanchal K. Roy

Copying code and then pasting it with a large number of edits is a common activity in software development, and the pasted code is a kind of complicated Type-3 clone. Due to the large number of edits, we consider such a clone a large-gap clone. Large-gap clones can reflect the extension of code, such as change and improvement. The existing state-of-the-art detectors suffer from several limitations in detecting large-gap clones. In this paper, we propose a tool, CCAligner, using a code window that considers e edit distance for matching to detect large-gap clones. In our approach, a novel e-mismatch index is designed...

10.1145/3180155.3180179 article EN Proceedings of the 40th International Conference on Software Engineering 2018-05-27

The vision of software clone management: Past, present, and future (Keynote paper)

OPENALEX - Publications

Chanchal K. Roy Minhaz F. Zibran Rainer Koschke

Duplicated code or clones are a kind of code smell that have both positive and negative impacts on the development and maintenance of software systems. Software clone research in the past mostly focused on the detection and analysis of clones, while research in recent years extends to the whole spectrum of clone management. In the last decade, three surveys appeared in the literature, which cover the detection, analysis, and evolutionary characteristics of clones. This paper presents a comprehensive survey on the state of the art in clone management, with in-depth investigation of clone management...

10.1109/csmr-wcre.2014.6747168 preprint EN 2014-02-01

Improving IR-based bug localization with context-aware query reformulation

OPENALEX - Publications

Mohammad Masudur Rahman Chanchal K. Roy

Recent findings suggest that Information Retrieval (IR)-based bug localization techniques do not perform well if the bug report lacks rich structured information (e.g., relevant program entity names). Conversely, excessive structured information (e.g., stack traces) in the bug report might not always help the automated localization either. In this paper, we propose a novel technique--BLIZZARD--that automatically localizes buggy entities from project source using appropriate query reformulation and effective retrieval. In particular, our technique determines whether...

10.1145/3236024.3236065 article EN 2018-10-26

CLCDSA: Cross Language Code Clone Detection using Syntactical Features and API Documentation

OPENALEX - Publications

Kawser Wazed Nafi Tonny Shekha Kar Banani Roy Chanchal K. Roy Kevin A. Schneider

Software clones are detrimental to software maintenance and evolution, and as a result many clone detectors have been proposed. These tools target clone detection in applications written in a single programming language. However, an application may be written in different languages for different platforms to improve the application's platform compatibility and adoption by users of different platforms. Cross language clones (CLCs) introduce additional challenges when maintaining multi-platform applications and would likely go undetected using existing tools. In this...

10.1109/ase.2019.00099 article EN 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2019-11-01

A systematic literature review on source code similarity measurement and clone detection: Techniques, applications, and challenges

OPENALEX - Publications

Morteza Zakeri‐Nasrabadi Saeed Parsa Mohammad Ramezani Chanchal K. Roy Masoud Ekhtiarzadeh

10.1016/j.jss.2023.111796 article EN Journal of Systems and Software 2023-07-05

C-SHAP: A Hybrid Method for Fast and Efficient Interpretability

OPENALEX - Publications

Golshid Ranjbaran Diego Reforgiato Recupero Chanchal K. Roy Kevin A. Schneider

Model interpretability is essential in machine learning, particularly for applications in critical fields like healthcare, where understanding model decisions is paramount. While SHAP (SHapley Additive exPlanations) has proven to be a robust tool for explaining machine learning predictions, its high computational cost limits its practicality for real-time use. To address this, we introduce C-SHAP (Clustering-Boosted SHAP), a hybrid method that combines SHAP with K-means clustering to reduce execution times significantly while...

10.3390/app15020672 article EN cc-by Applied Sciences 2025-01-11

Detection and analysis of near-miss software clones

OPENALEX - Publications

Chanchal K. Roy

Software clones are considered harmful in software maintenance and evolution. However, despite a decade of active research, there is a marked lack of work in the detection and analysis of near-miss clones, those where minor to extensive modifications have been made to the copied fragments. In this thesis, we advance the state-of-the-art in clone detection and analysis in several ways. First, we develop a hybrid clone detection method. Second, we address the vagueness of clone definition by proposing a metamodel of clone types. Third, we conduct a scenario-based comparison and evaluation of all currently...

10.1109/icsm.2009.5306301 article EN 2009-09-01

An insight into the pull requests of GitHub

OPENALEX - Publications

Mohammad Masudur Rahman Chanchal K. Roy

Given the increasing number of unsuccessful pull requests in GitHub projects, insights into the success and failure of these requests are essential for developers. In this paper, we provide a comparative study between successful and unsuccessful pull requests made to 78 base projects by 20,142 developers from 103,192 forked projects. In the study, we analyze pull request discussion texts, project specific information (e.g., domain, maturity), and developer specific information (e.g., experience) in order to report useful insights, and use them to contrast successful and unsuccessful pull requests. We believe our study will help...

10.1145/2597073.2597121 preprint EN 2014-05-20

CoRReCT

OPENALEX - Publications

Mohammad Masudur Rahman Chanchal K. Roy Jason A. Collins

Peer code review locates common coding rule violations and simple logical errors in the early phases of software development, and thus reduces overall cost. However, in GitHub, identifying an appropriate reviewer for a pull request is a non-trivial task given that reliable information for reviewer identification is often not readily available. In this paper, we propose a code reviewer recommendation technique that considers not only the relevant cross-project work history (e.g., external library experience) but also the experience of a developer in certain...

10.1145/2889160.2889244 preprint EN 2016-05-14

Predicting Usefulness of Code Review Comments Using Textual Features and Developer Experience

OPENALEX - Publications

Mohammad Masudur Rahman Chanchal K. Roy Raula Gaikovina Kula

Although peer code review is widely adopted in both commercial and open source development, existing studies suggest that such reviews often contain a significant amount of non-useful comments. Unfortunately, to date, no tools or techniques exist that can provide automatic support in improving those non-useful comments. In this paper, we first report a comparative study between useful and non-useful comments where we contrast them using their textual characteristics and reviewers' experience. Then, based on the findings from the study, we develop...

10.1109/msr.2017.17 article EN 2017-05-01
