Chanchal K. Roy

ORCID: 0000-0003-0519-6164
Research Areas
  • Software Engineering Research
  • Software Reliability and Analysis Research
  • Software Testing and Debugging Techniques
  • Advanced Malware Detection Techniques
  • Software System Performance and Reliability
  • Web Data Mining and Analysis
  • Software Engineering Techniques and Practices
  • Topic Modeling
  • Open Source Software Innovations
  • Advanced Software Engineering Methodologies
  • Scientific Computing and Data Management
  • Wikis in Education and Collaboration
  • Distributed and Parallel Computing Systems
  • Expert finding and Q&A systems
  • Web Application Security Vulnerabilities
  • Mobile Crowdsensing and Crowdsourcing
  • Cloud Computing and Resource Management
  • Research Data Management Practices
  • Data Visualization and Analytics
  • Artificial Intelligence in Healthcare and Education
  • Data Mining Algorithms and Applications
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Data Stream Mining Techniques
  • Evolutionary Algorithms and Applications

University of Saskatchewan
2016-2025

McGill University
2020-2021

Polytechnique Montréal
2019-2021

Bangladesh Agricultural University
2021

Chattogram Veterinary and Animal Sciences University
2020

University of Waterloo
2013-2019

Università della Svizzera italiana
2019

Florida State University
2019

Drew University
2019

Wayne State University
2019

This paper examines the effectiveness of a new language-specific, parser-based but lightweight clone detection approach. Exploiting a novel application of a source transformation system, the method accurately finds near-miss clones using an efficient text line comparison technique. The system assists in three ways. First, using agile parsing, it provides user-specified flexible pretty-printing to remove noise, standardize formatting and break program statements into parts such that potential changes can be...

10.1109/icpc.2008.41 article EN 2008-06-01

Despite a decade of active research, there is a marked lack of clone detectors that scale to very large repositories of source code, in particular for detecting near-miss clones where significant editing activities may take place in the cloned code. We present SourcererCC, a token-based clone detector that targets three clone types and exploits an index to achieve scalability to large inter-project repositories using a standard workstation. SourcererCC uses an optimized inverted index to quickly query the potential clones of a given code block. Filtering heuristics...

10.1145/2884781.2884877 preprint EN Proceedings of the 38th International Conference on Software Engineering 2016-05-13

Recently, new applications of code clone detection and search have emerged that rely upon clones detected across thousands of software systems. Big data clone detection and search algorithms have been proposed as an embedded part of these applications. However, there exists no previous benchmark for evaluating the recall and precision of these emerging techniques. In this paper, we present a Big Data clone benchmark that consists of known true and false positive clones in a Big Data inter-project Java repository. The benchmark was built by mining and then manually checking clones of ten common functionalities....

10.1109/icsme.2014.77 article EN 2014-09-01

Community-based question answering services accumulate large volumes of knowledge through the voluntary services of people across the globe. Stack Overflow is an example of such a service that targets developers and software engineers. In general, questions in Stack Overflow are answered in a very short time. However, we found that the number of unanswered questions has increased significantly in the past two years. Understanding why questions remain unanswered can help information seekers improve the quality of their questions, increase their chances of getting answers, and better decide when to...

10.1109/msr.2013.6624015 article EN 2013-05-01

The NiCad Clone Detector is a scalable, flexible clone detection tool designed to implement the NiCad (Automated Detection of Near-Miss Intentional Clones) hybrid clone detection method in a convenient, easy-to-use command-line tool that can easily be embedded in IDEs and other environments. It takes as input a source directory or directories to be checked for clones and a configuration file specifying the normalization and filtering to be done, and provides output results in both XML form for easy analysis and HTML form for convenient browsing. NiCad handles a range of languages...

10.1109/icpc.2011.26 article EN 2011-06-01

In recent years many methods and tools for software clone detection have been proposed. While some work has been done on assessing and comparing the performance of these tools, very little empirical evaluation has been done. In particular, accuracy measures such as precision and recall have only been roughly estimated, due both to problems in creating a validated clone benchmark against which tools can be compared, and to the manual effort required to hand check large numbers of candidate clones. In this paper we propose an automated method for empirically...

10.1109/icstw.2009.18 article EN International Conference on Software Testing, Verification and Validation Workshops 2009-01-01

Many clone detection tools have been proposed in the literature. However, our knowledge of their performance in real software systems is limited, particularly their recall. In this paper, we use our big data clone benchmark, BigCloneBench, to evaluate the recall of ten clone detection tools. BigCloneBench is a collection of eight million validated clones within IJaDataset-2.0, a big data software repository containing 25,000 open-source Java systems. BigCloneBench contains both intra-project and inter-project clones of the four primary clone types. We use this benchmark to evaluate the recall of the tools per clone type and across the entire range...

10.1109/icsm.2015.7332459 article EN 2015-09-01

Traditional code search engines often do not perform well with natural language queries since they mostly apply keyword matching. These engines thus need carefully designed queries containing information about programming APIs for code search. Unfortunately, existing studies suggest that preparing an effective query is both challenging and time consuming for the developers. In this paper, we propose a novel API recommendation technique, RACK, that recommends a list of relevant APIs by exploiting keyword-API associations from...

10.1109/saner.2016.80 preprint EN 2016-03-01

The new hybrid clone detection tool NICAD combines the strengths and overcomes the limitations of both text-based and AST-based clone detection techniques to yield highly accurate identification of cloned code in software systems. In this paper, we present a first empirical study of function clones in open source software using NICAD. We examine more than 15 open source C and Java systems, including the entire Linux Kernel and Apache httpd, and analyze their use of cloned code in several different dimensions, including language, clone size, clone location and clone density by proportion of cloned functions. We manually...

10.1109/wcre.2008.54 article EN 2008-10-01

Stack Overflow is a popular question answering site that is focused on programming problems. Despite efforts to prevent asking questions that have already been answered, the site contains duplicate questions. This may cause developers to unnecessarily wait for a question to be answered when it has already been asked and answered. The site currently depends on its moderators and users with high reputation to manually mark those questions as duplicates, which not only results in delayed responses but also requires additional efforts. In this paper, we first...

10.1145/2901739.2901770 article EN 2016-05-14

Many clone detection tools and techniques have been introduced in the literature, and these tools have been used to manage clones and study their effects on software maintenance and evolution. However, the performance of these modern tools is not well known, especially recall. In this paper, we evaluate and compare the recall of eleven modern clone detection tools using four benchmark frameworks, including: (1) Bellon's Framework, (2) our modification to Bellon's Framework to improve the accuracy of its clone matching metrics, (3) Murakami et al.'s extension of Bellon's Framework, which adds type 3 gap awareness to the framework,...

10.1109/icsme.2014.54 article EN 2014-09-01

Copying code and then pasting it with a large number of edits is a common activity in software development, and the pasted code is a kind of complicated Type-3 clone. Due to the large number of edits, we consider the clone a large-gap clone. A large-gap clone can reflect the extension of code, such as change and improvement. The existing state-of-the-art clone detectors suffer from several limitations in detecting large-gap clones. In this paper, we propose a tool, CCAligner, using a code window that considers e edit distance for matching to detect large-gap clones. In our approach, a novel e-mismatch index is designed...

10.1145/3180155.3180179 article EN Proceedings of the 40th International Conference on Software Engineering 2018-05-27

Duplicated code or code clones are a kind of code smell that have both positive and negative impacts on the development and maintenance of software systems. Software clone research in the past mostly focused on the detection and analysis of code clones, while research in recent years extends to the whole spectrum of clone management. In the last decade, three surveys appeared in the literature, which cover the detection, analysis, and evolutionary characteristics of code clones. This paper presents a comprehensive survey on the state of the art in clone management, with in-depth investigation of clone management...

10.1109/csmr-wcre.2014.6747168 preprint EN 2014-02-01

Recent findings suggest that Information Retrieval (IR)-based bug localization techniques do not perform well if the bug report lacks rich structured information (e.g., relevant program entity names). Conversely, excessive structured information (e.g., stack traces) in the bug report might not always help the automated localization either. In this paper, we propose a novel technique, BLIZZARD, that automatically localizes buggy entities from project source using appropriate query reformulation and effective information retrieval. In particular, our technique determines whether...

10.1145/3236024.3236065 article EN 2018-10-26

Software clones are detrimental to software maintenance and evolution, and as a result many clone detectors have been proposed. These tools target clone detection in software applications written in a single programming language. However, a software application may be written in different languages for different platforms to improve the application's platform compatibility and adoption by users of different platforms. Cross language clones (CLCs) introduce additional challenges when maintaining multi-platform applications and would likely go undetected using existing tools. In this...

10.1109/ase.2019.00099 article EN 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2019-11-01

Model interpretability is essential in machine learning, particularly for applications in critical fields like healthcare, where understanding model decisions is paramount. While SHAP (SHapley Additive exPlanations) has proven to be a robust tool for explaining machine learning predictions, its high computational cost limits its practicality for real-time use. To address this, we introduce C-SHAP (Clustering-Boosted SHAP), a hybrid method that combines SHAP with K-means clustering to reduce execution times significantly while...

10.3390/app15020672 article EN cc-by Applied Sciences 2025-01-11

Software clones are considered harmful in software maintenance and evolution. However, despite a decade of active research, there is a marked lack of work in the detection and analysis of near-miss software clones, those where minor to extensive modifications have been made to the copied fragments. In this thesis, we advance the state-of-the-art in clone detection research in several ways. First, we develop a hybrid clone detection method. Second, we address the vagueness in clone definition by proposing a metamodel of clone types. Third, we conduct a scenario-based comparison and evaluation of all of the currently...

10.1109/icsm.2009.5306301 article EN 2009-09-01

Given the increasing number of unsuccessful pull requests in GitHub projects, insights into the success and failure of these requests are essential for developers. In this paper, we provide a comparative study between successful and unsuccessful pull requests made to 78 GitHub base projects by 20,142 developers from 103,192 forked projects. In the study, we analyze pull request discussion texts, project specific information (e.g., domain, maturity), and developer specific information (e.g., experience) in order to report useful insights, and use them to contrast successful and unsuccessful pull requests. We believe our study will help...

10.1145/2597073.2597121 preprint EN 2014-05-20

Peer code review locates common coding rule violations and simple logical errors in the early phases of software development, and thus reduces overall cost. However, in GitHub, identifying an appropriate code reviewer for a pull request is a non-trivial task given that reliable information for reviewer identification is often not readily available. In this paper, we propose a code reviewer recommendation technique that considers not only the relevant cross-project work history (e.g., external library experience) but also the experience of a developer in certain...

10.1145/2889160.2889244 preprint EN 2016-05-14

Although peer code review is widely adopted in both commercial and open source development, existing studies suggest that such code reviews often contain a significant amount of non-useful review comments. Unfortunately, to date, no tools or techniques exist that can provide automatic support in improving those non-useful comments. In this paper, we first report a comparative study between useful and non-useful review comments where we contrast them using their textual characteristics and reviewers' experience. Then, based on the findings from the study, we develop...

10.1109/msr.2017.17 article EN 2017-05-01