- Software Testing and Debugging Techniques
- Software Engineering Research
- Advanced Malware Detection Techniques
- Software Reliability and Analysis Research
- Natural Language Processing Techniques
- Adversarial Robustness in Machine Learning
- Machine Learning and Data Classification
- Text Readability and Simplification
- Viral Infectious Diseases and Gene Expression in Insects
- Software Engineering Techniques and Practices
- Scientific Computing and Data Management
- Formal Methods in Verification
- Advanced Data Storage Technologies
- Software System Performance and Reliability
- Speech and Dialogue Systems
- Privacy-Preserving Technologies in Data
- Distributed Systems and Fault Tolerance
- Topic Modeling
Polytechnique Montréal
2023-2024
Jack Miller Center
2022
Automatic program synthesis is a long-lasting dream in software engineering. Recently, a promising Deep Learning (DL) based solution, called Copilot, has been proposed by OpenAI and Microsoft as an industrial product. Although some studies evaluate the correctness of Copilot's solutions and report its issues, more empirical evaluations are necessary to understand how developers can benefit from it effectively. In this paper, we study its capabilities on two different programming tasks: (i) generating (and...
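Evaluations of code-generation tools like Copilot commonly report functional correctness with the pass@k metric; assuming such a protocol (the abstract does not specify one), a minimal sketch of the standard unbiased estimator:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn from n generations, of which c pass the tests,
    is correct.  pass@k = 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # every size-k sample must contain a correct solution
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 generations per problem, 3 of them correct.
print(round(pass_at_k(10, 3, 1), 3))  # pass@1 equals c/n = 0.3
```

For k = 1 the estimator reduces to the fraction of correct generations; larger k rewards models that succeed at least occasionally across samples.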
Testing Deep Learning (DL) systems is a complex task as they do not behave like traditional systems would, notably because of their stochastic nature. Nonetheless, being able to adapt existing testing techniques such as Mutation Testing (MT) to DL settings would greatly improve their potential verifiability. While some efforts have been made to extend MT to the Supervised Learning (SL) paradigm, little work has gone into extending it to Reinforcement Learning (RL), which is also an important component of the ecosystem but behaves very differently from SL. This...
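To illustrate the core idea of Mutation Testing that the abstract builds on (this is a generic, hypothetical example, not the paper's DL/RL framework): mutants are small variants of a program, and a mutant is "killed" if at least one test distinguishes it from the original.

```python
# Toy mutation-testing sketch: two hand-made mutants of a function.
def original(a, b):
    return a + b

def mutant_sub(a, b):   # arithmetic-operator mutation: + becomes -
    return a - b

def mutant_swap(a, b):  # argument-swap mutation (equivalent for +)
    return b + a

tests = [(1, 2), (0, 0), (-3, 3)]

def killed(mutant):
    # A mutant is killed if any test observes a behavioural difference.
    return any(mutant(a, b) != original(a, b) for a, b in tests)

mutants = [mutant_sub, mutant_swap]
score = sum(killed(m) for m in mutants) / len(mutants)
print(f"mutation score: {score:.2f}")  # mutant_swap is equivalent, never killed
```

The mutation score (killed / total mutants) gauges test-suite strength; the difficulty the abstract points at is that a stochastic DL or RL system can produce differing outputs even without a mutation, so "killed" must be redefined statistically.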
Software systems are increasingly relying on deep learning components, due to their remarkable capability of identifying complex data patterns and powering intelligent behaviour. A core enabler of this change in software development is the availability of easy-to-use libraries. Libraries like PyTorch and TensorFlow empower a large variety of systems, offering a multitude of algorithms and configuration options, applicable to numerous domains and systems. However, bugs in those popular libraries may also have dire...
Code auditing ensures that the developed code adheres to standards, regulations, and copyright protection by verifying that it does not contain code from protected sources. The recent advent of Large Language Models (LLMs) as coding assistants in the software development process poses new challenges for code auditing. The dataset for training these models is mainly collected from publicly available sources. This raises the issue of intellectual property infringement, as developers' codes are already included in the dataset. Therefore, using LLMs...
Machine learning models trained on code and related artifacts offer valuable support for software maintenance but suffer from interpretability issues due to their complex internal variables. These concerns are particularly significant in safety-critical applications where the models' decision-making processes must be reliable. The specific features and representations learned by these models remain unclear, adding hesitancy to adopting them widely. To address these challenges, we introduce DeepCodeProbe, a...
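A common way to ask "what has the model learned?" is a probing classifier: a simple probe trained on frozen representations succeeds only if the probed property is encoded in them. The sketch below is purely illustrative (synthetic embeddings, a perceptron probe, and an invented property "snippet contains a loop"), not DeepCodeProbe's actual method.

```python
import random

random.seed(0)
DIM = 8

def fake_embedding(has_loop: bool):
    """Stand-in for a code model's hidden state; dimension 0 weakly
    encodes the probed property."""
    v = [random.gauss(0.0, 1.0) for _ in range(DIM)]
    v[0] += 3.0 if has_loop else -3.0
    return v

data = [(fake_embedding(label), label) for label in [True, False] * 50]

# Perceptron probe: representations stay frozen, only probe weights learn.
w, bias = [0.0] * DIM, 0.0
for _ in range(25):
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x)) + bias > 0
        if pred != y:
            sign = 1.0 if y else -1.0
            w = [wi + sign * xi for wi, xi in zip(w, x)]
            bias += sign

acc = sum(
    ((sum(wi * xi for wi, xi in zip(w, x)) + bias > 0) == y) for x, y in data
) / len(data)
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy indicates the property is (linearly) decodable from the representation; near-chance accuracy suggests it was never learned.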
One of the critical phases in software development is testing. Testing helps with identifying potential bugs and reducing maintenance costs. The goal of automated test generation tools is to ease the creation of tests by suggesting efficient bug-revealing tests. Recently, researchers have leveraged Large Language Models (LLMs) of code to generate unit tests. While the coverage of the generated tests was usually assessed, the literature has acknowledged that coverage is weakly correlated with the efficiency of bug detection. To improve over this limitation, in this paper, we...
AI-powered code generation models have been developing rapidly, allowing developers to expedite code writing and thus improve their productivity. These models are trained on large corpora of code (primarily sourced from public repositories), which may contain bugs and vulnerabilities. Several concerns have been raised about the security of the code generated by these models. Recent studies investigated security issues in tools such as GitHub Copilot and Amazon CodeWhisperer, revealing several security weaknesses in those tools. As these tools evolve, it is expected that they will...