- Software Engineering Research
- Advanced Malware Detection Techniques
- Software Testing and Debugging Techniques
- Service-Oriented Architecture and Web Services
- Semantic Web and Ontologies
- Network Security and Intrusion Detection
- Software System Performance and Reliability
- Software Engineering Techniques and Practices
- Anomaly Detection Techniques and Applications
- Digital and Cyber Forensics
- Software Reliability and Analysis Research
- Advanced Software Engineering Methodologies
- Business Process Modeling and Analysis
- Adversarial Robustness in Machine Learning
- Canadian Identity and History
- Topic Modeling
- Natural Language Processing Techniques
- Safety Systems Engineering in Autonomy
- Manufacturing Process and Optimization
- Medical Image Segmentation Techniques
- Web Data Mining and Analysis
- Anatomy and Medical Technology
- Indigenous Studies and Ecology
- Web Applications and Data Management
- Information and Cyber Security
Defence Research and Development Canada
2013-2025
Université de Sherbrooke
2024
Bishop's University
2024
McGill University
1996-2020
Université du Québec à Montréal
2017-2020
Montreal Neurological Institute and Hospital
2005
University of British Columbia
1996
Reverse engineering is a manually intensive but necessary technique for understanding the inner workings of new malware, finding vulnerabilities in existing systems, and detecting patent infringements in released software. An assembly clone search engine facilitates the work of reverse engineers by identifying those duplicated or known parts. However, it is challenging to design a robust clone search engine, since there exist various compiler optimization options and code obfuscation techniques that make logically similar...
To gain an in-depth understanding of the behaviour of a malware, reverse engineers have to disassemble it, analyze the resulting assembly code, and then archive the commented code in a malware repository for future reference. In this paper, we developed a clone detection system called BinClone to identify code clone fragments from a collection of malware binaries, with the following major contributions. First, we introduce two deterministic clone detection methods with the goals of improving the recall rate and facilitating malware analysis. Second, our methods allow malware analysts to discover both exact...
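The exact-matching idea behind deterministic assembly clone detection can be sketched in a few lines: normalize each instruction's operands, then hash sliding windows of normalized instructions so that regions sharing a hash become clone candidates. The register list, window size, and normalization rules below are illustrative assumptions, not BinClone's actual parameters.

```python
import hashlib

def normalize(line):
    """Replace operands with generic tokens so register renaming and
    constant changes do not break exact matching."""
    mnemonic, _, operands = line.strip().partition(" ")
    tokens = []
    for op in operands.replace(",", " ").split():
        if op.startswith("0x") or op.isdigit():
            tokens.append("CONST")
        elif op in {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}:
            tokens.append("REG")
        else:
            tokens.append("MEM")
    return " ".join([mnemonic] + tokens)

def region_hashes(asm_lines, window=4):
    """Hash every sliding window of normalized instructions; two code
    regions sharing a hash are exact-clone candidates."""
    norm = [normalize(l) for l in asm_lines]
    return {
        hashlib.md5("\n".join(norm[i:i + window]).encode()).hexdigest()
        for i in range(len(norm) - window + 1)
    }

# Two fragments differing only in registers and constants:
a = ["mov eax, 1", "add eax, ebx", "push eax", "call 0x401000", "ret"]
b = ["mov ecx, 5", "add ecx, edx", "push ecx", "call 0x402000", "ret"]
shared = region_hashes(a) & region_hashes(b)  # non-empty: clone candidates
```

Because normalization maps both fragments to the same token sequences, every window hash matches, which is the sense in which such deterministic methods trade precision for recall.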
Assembly code analysis is one of the critical processes for detecting and proving software plagiarism and patent infringements when the source code is unavailable. It is also a common practice to discover exploits and vulnerabilities in existing software. However, it is a manually intensive and time-consuming process even for experienced reverse engineers. An effective and efficient assembly clone search engine can greatly reduce the effort of this process, since it can identify cloned parts that have been previously analyzed. The problem...
Finding lines of code similar to a given code fragment across large knowledge bases in a fraction of a second is a new branch of code clone research, also known as real-time clone search. Among the requirements the search has to meet are scalability, short response time, scalable incremental corpus updates, and support for type-1, type-2, and type-3 clones. We conducted a set of empirical studies on open source code to gain insight about its characteristics, and used these results to design and optimize a multi-level indexing approach using hash table-based...
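A minimal hash table-based index of the kind described above might look as follows. The `CloneIndex` class is a sketch under the assumption that candidate matching happens line-by-line, with contiguity verified in a later stage; it is not the paper's multi-level design.

```python
from collections import defaultdict

class CloneIndex:
    """Map each source line to the (file, line) positions where it
    occurs, so a query fragment is matched in time proportional to
    its own length rather than the corpus size."""

    def __init__(self):
        self.table = defaultdict(list)

    def add_file(self, name, lines):
        # Incremental corpus update: new files are indexed without
        # rebuilding the existing table.
        for i, line in enumerate(lines):
            self.table[line.strip()].append((name, i))

    def query(self, fragment):
        """Return files containing every line of the fragment
        (type-1 candidates); contiguity is checked later."""
        hits = None
        for line in fragment:
            files = {name for name, _ in self.table.get(line.strip(), [])}
            hits = files if hits is None else hits & files
            if not hits:
                return set()
        return hits

idx = CloneIndex()
idx.add_file("a.c", ["int x = 0;", "x++;", "return x;"])
idx.add_file("b.c", ["int y = 0;", "x++;", "return x;"])
res = idx.query(["x++;", "return x;"])  # both files are candidates
```

Normalizing identifiers before insertion (as in the assembly example above) would extend the same lookup to type-2 clones.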
Software vulnerabilities have been posing tremendous reliability threats to the general public as well as to critical infrastructures, and there are many studies aiming to detect and mitigate software defects at the binary level. Most of the standard practices leverage both static and dynamic analysis, which have several drawbacks, such as a heavy manual workload and high complexity. Existing deep learning-based solutions not only struggle to capture the complex relationships among different variables from raw binary code but also lack explainability...
This study investigates the acceptability of different artificial intelligence (AI) applications in education from a multi-stakeholder perspective, including students, teachers, and parents. Acknowledging the transformative potential of AI in education, it addresses concerns related to data privacy, agency, transparency, explainability, and the ethical deployment of AI. Through a vignette methodology, participants were presented with four scenarios in which the AI's explainability and privacy were manipulated. After each...
Sequence diagrams can be valuable aids to software understanding. However, they can be extremely large and hard to understand in spite of modern tool support. Consequently, providing the right set of features is important if tools are to help rather than hinder the user. This paper surveys research and commercial sequence diagram tools to determine whether they provide support for program comprehension. Although there has been significant effort in developing these tools, many of them have not been evaluated with human subjects. To begin to address this gap,...
Finding similar code is important for software engineering, defense of intellectual property, and security. One of the increasingly common ways adversaries use to defeat detection is obfuscation, such as transformation and scattering of the code they wish to hide amongst long code sequences. Moving code fragments far enough apart poses a specific challenge to solutions with localized features (e.g., n-grams) or attention mechanisms, as the parts are distributed beyond the local context window. We introduce a neural network solution for pattern...
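The locality limitation of n-gram features can be demonstrated directly: scattering the same instructions amongst filler destroys nearly all shared n-grams, even though the underlying code is unchanged. The token sequences below are made up for illustration.

```python
def ngrams(tokens, n=3):
    """All contiguous n-token windows of a sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def jaccard(a, b):
    """Set similarity in [0, 1]."""
    inter = len(a & b)
    return inter / (len(a) + len(b) - inter) if (a or b) else 1.0

target = "push load add store ret".split()
# Clone kept contiguous: high n-gram overlap with the target.
contiguous = "nop push load add store ret nop".split()
# Same instructions scattered amongst filler: no shared 3-grams at all,
# because every window now contains at least one filler token.
scattered = "push nop nop load nop nop add nop nop store nop nop ret".split()

s1 = jaccard(ngrams(target), ngrams(contiguous))
s2 = jaccard(ngrams(target), ngrams(scattered))
```

The scattered variant scores zero with localized features even though a human (or a model with a long-range view) would still recognize the hidden pattern.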
Linked Data is designed to support interoperability and sharing of open datasets by allowing on-the-fly inter-linking of data using the basic layers of the Semantic Web and the HTTP protocol. In our research, we focus on providing a Uniform Resource Locator (URL) generation schema supporting an ontological representation for facts extracted from source code ecosystems. As a result, we created the Source ECOsystem (SECOLD) framework that adheres to the Linked Data publication standard. The framework provides not only facts that are usable by both humans and machines for browsing...
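As a rough sketch of such a URL generation schema, each source code entity gets a stable, dereferenceable URL that can then appear in RDF-style triples. The base URL and ontology predicate below are hypothetical placeholders, not the actual SECOLD vocabulary.

```python
from urllib.parse import quote

BASE = "http://example.org/secold"  # hypothetical base URL, not SECOLD's

def entity_url(artifact, version, kind, name):
    """Build a stable URL for a source code entity (class, method, ...)
    so it can be inter-linked as Linked Data."""
    return "/".join([BASE, quote(artifact), quote(version), kind, quote(name)])

def triple(subject, predicate, obj):
    """Serialize one fact in N-Triples-style syntax."""
    return f"<{subject}> <{predicate}> <{obj}> ."

cls = entity_url("httpd", "2.4.58", "class", "ap_core")
meth = entity_url("httpd", "2.4.58", "method", "ap_core.run")
fact = triple(meth, "http://example.org/secold/ontology#definedIn", cls)
```

Versioning the URL keeps facts stable as the ecosystem evolves, which is what makes incremental inter-linking across datasets practical.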
Real-time code clone search is an emerging family of clone detection research that aims at finding clone pairs matching an input fragment in a fraction of a second. For these techniques to meet actual real-world requirements, they have to be scalable and provide short response times. Our work presents a hybrid approach using source code pattern indexing, information retrieval clustering, and Semantic Web reasoning to respectively achieve short response time, handle false positives, and support automated grouping/querying.
Nowadays, software development and maintenance are highly distributed processes that involve a multitude of supporting tools and resources. Knowledge relevant for a particular task is typically dispersed over a wide range of artifacts in different representational formats and at different abstraction levels, resulting in isolated 'information silos'. An increasing number of task-specific tools aim to support developers, but this often results in additional challenges, as not every project member can be familiar with every tool and its...
The process of evaluating, classifying, and assigning bugs to programmers is a difficult and time-consuming task which greatly depends on the quality of the bug report itself. It has been shown that the quality of reports originating from bug trackers or ticketing systems can vary significantly. In this research, we apply information retrieval (IR) and natural language processing (NLP) techniques for mining bug repositories. We focus particularly on measuring the quality of the free-form descriptions submitted as part of the bug reports used by open source trackers....
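Shallow, computable proxies are one common way to score free-form descriptions. A minimal feature extractor along these lines could be the following; the specific features and regular expressions are illustrative assumptions, not the paper's metric.

```python
import re

def report_features(description):
    """Shallow IR/NLP features often used as proxies for the quality
    of a free-form bug description."""
    words = description.split()
    return {
        "length": len(words),                       # very short reports are suspect
        "has_stack_trace": bool(
            re.search(r"\bat \w+(\.\w+)+\(", description)),  # Java-style frame
        "has_steps": bool(
            re.search(r"steps to reproduce", description, re.I)),
        "has_code": "```" in description or ";" in description,
    }

good = ("Steps to reproduce: open the file dialog, press Cancel. "
        "Exception at org.app.Dialog.close(Dialog.java:42)")
poor = "it crashes sometimes"
```

Features like these can then feed a classifier that routes high-quality reports directly to triage and flags low-quality ones for clarification.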
Software development and maintenance are highly distributed processes that involve a multitude of supporting tools and resources. Knowledge relevant to these resources is typically dispersed over a wide range of artifacts, representation formats, and abstraction levels. In order to stay competitive, organizations are often required to assess and provide evidence that their software meets the expected requirements. In our research, we focus on assessing non-functional quality requirements, specifically evolvability, through...
In this research, we present a novel approach that allows existing state-of-the-art clone detection tools to scale to very large datasets. A key benefit of our approach is improved scalability, achieved using standard hardware and without modifying the original implementations of the subject tools. We use a hybrid approach comprising shuffling, repetition, and random subset generation over the dataset. As part of the experimental evaluation, we applied shuffling and randomization on two subject tools. Our experience shows that it is possible to run a classical clone detection tool on a large dataset with standard hardware,...
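The shuffling-and-subsets idea can be sketched as follows: repeatedly shuffle the corpus and cut each shuffle into fixed-size subsets small enough for an unmodified detector to process on ordinary hardware. The subset size and repetition count below are arbitrary illustrative values, not those from the evaluation.

```python
import random

def random_subsets(files, subset_size, repetitions, seed=0):
    """Shuffle the dataset several times and cut each shuffle into
    fixed-size subsets; repetition makes it likely that any given pair
    of files lands in the same subset at least once, so an unmodified
    clone detector can be run subset-by-subset."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(repetitions):
        order = files[:]           # each repetition gets a fresh shuffle
        rng.shuffle(order)
        subsets.extend(order[i:i + subset_size]
                       for i in range(0, len(order), subset_size))
    return subsets

corpus = [f"file{i}.c" for i in range(10)]
parts = random_subsets(corpus, subset_size=4, repetitions=3)
# Each subset fits the tool's capacity; clone reports from all runs
# are then merged and deduplicated.
```

The trade-off is probabilistic completeness: more repetitions raise the chance that every cross-file clone pair is co-located in some subset, at the cost of more detector runs.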