Vahid Majdinasab

ORCID: 0000-0003-4411-0810
Research Areas
  • Software Testing and Debugging Techniques
  • Software Engineering Research
  • Advanced Malware Detection Techniques
  • Software Reliability and Analysis Research
  • Natural Language Processing Techniques
  • Adversarial Robustness in Machine Learning
  • Machine Learning and Data Classification
  • Text Readability and Simplification
  • Viral Infectious Diseases and Gene Expression in Insects
  • Software Engineering Techniques and Practices
  • Scientific Computing and Data Management
  • Formal Methods in Verification
  • Advanced Data Storage Technologies
  • Software System Performance and Reliability
  • Speech and dialogue systems
  • Privacy-Preserving Technologies in Data
  • Distributed systems and fault tolerance
  • Topic Modeling

Polytechnique Montréal
2023-2024

Jack Miller Center
2022

Automatic program synthesis is a long-lasting dream in software engineering. Recently, a promising Deep Learning (DL) based solution, called Copilot, has been proposed by OpenAI and Microsoft as an industrial product. Although some studies evaluate the correctness of Copilot's solutions and report its issues, more empirical evaluations are necessary to understand how developers can benefit from it effectively. In this paper, we study the capabilities of Copilot in two different programming tasks: (i) generating (and...

10.48550/arxiv.2206.15331 preprint EN other-oa arXiv (Cornell University) 2022-01-01

10.1109/saner60148.2024.00051 article EN 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2024-03-12

Testing Deep Learning (DL) systems is a complex task as they do not behave like traditional systems would, notably because of their stochastic nature. Nonetheless, being able to adapt existing testing techniques such as Mutation Testing (MT) to DL settings would greatly improve their potential verifiability. While some efforts have been made to extend MT to the Supervised Learning (SL) paradigm, little work has gone into extending it to Reinforcement Learning (RL), which is also an important component of the DL ecosystem but behaves very differently from SL. This...

10.1109/icst57152.2023.00026 article EN 2023-04-01

Software systems are increasingly relying on deep learning components, due to their remarkable capability of identifying complex data patterns and powering intelligent behaviour. A core enabler of this change in software development is the availability of easy-to-use DL libraries. Libraries like PyTorch and TensorFlow empower a large variety of intelligent systems, offering a multitude of algorithms and configuration options, applicable to numerous domains of systems. However, bugs in those popular libraries also may have dire...

10.1109/icsme58846.2023.00031 article EN 2023-10-01

Code auditing ensures that the developed code adheres to standards, regulations, and copyright protection by verifying that it does not contain code from protected sources. The recent advent of Large Language Models (LLMs) as coding assistants in the software development process poses new challenges for code auditing. The dataset for training these models is mainly collected from publicly available sources. This raises the issue of intellectual property infringement as developers' codes are already included in the dataset. Therefore, auditing code developed using LLMs...

10.48550/arxiv.2402.09299 preprint EN arXiv (Cornell University) 2024-02-14
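A crude intuition for the code-auditing problem above can be sketched as follows (a hypothetical, deliberately naive approach, far simpler than the paper's actual method): fingerprint a protected corpus with token n-grams and flag generated code that overlaps heavily with it.

```python
# Naive code-inclusion check via token n-gram fingerprints (illustrative
# sketch only; real auditing must handle renaming, reformatting, etc.).

def ngrams(code, n=5):
    """Set of n-grams over whitespace tokens."""
    toks = code.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def overlap(candidate, corpus, n=5):
    """Fraction of the candidate's n-grams found in the protected corpus."""
    cand = ngrams(candidate, n)
    return len(cand & ngrams(corpus, n)) / max(len(cand), 1)

protected = "def add ( a , b ) : return a + b"
generated = "x = 1 def add ( a , b ) : return a + b print ( x )"
unrelated = "for i in range ( 10 ) : print ( i * i )"

print(overlap(generated, protected))  # high ratio -> flag for audit
print(overlap(unrelated, protected))  # 0.0
```

Exact-match fingerprints miss trivially rewritten copies, which is precisely why detecting inclusion in LLM training data is a hard research problem.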

Machine learning models trained on code and related artifacts offer valuable support for software maintenance but suffer from interpretability issues due to their complex internal variables. These concerns are particularly significant in safety-critical applications where the models' decision-making processes must be reliable. The specific features and representations learned by these models remain unclear, adding hesitancy to adopting them widely. To address these challenges, we introduce DeepCodeProbe, a...

10.48550/arxiv.2407.08890 preprint EN arXiv (Cornell University) 2024-07-11
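The probing idea behind work like the above can be illustrated in a minimal sketch (synthetic data and a plain logistic-regression probe; this is an assumption-laden toy, not DeepCodeProbe's actual setup): train a small classifier on frozen embeddings, and if it predicts a property of the input well, that property is plausibly encoded in the representation.

```python
# Minimal representation-probing sketch on synthetic "embeddings".
import numpy as np

rng = np.random.default_rng(0)

# 200 fake embedding vectors of dim 16; the binary label (e.g. "snippet
# contains a loop") is linearly encoded in dimension 0 plus small noise.
X = rng.normal(size=(200, 16))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)

# Logistic-regression probe trained by plain gradient descent on log-loss.
w = np.zeros(16)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))    # predicted probabilities
    w -= 0.1 * X.T @ (p - y) / len(y)   # gradient step

accuracy = np.mean((X @ w > 0) == (y == 1))
print(f"probe accuracy: {accuracy:.2f}")  # high accuracy -> decodable property
```

Keeping the probe simple matters: a high-capacity probe could fit arbitrary labels, so its accuracy would say little about what the embedding itself encodes.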

Code auditing ensures that the developed code adheres to standards, regulations, and copyright protection by verifying that it does not contain code from protected sources. The recent advent of Large Language Models (LLMs) as coding assistants in the software development process poses new challenges for code auditing. The dataset for training these models is mainly collected from publicly available sources. This raises the issue of intellectual property infringement as developers' codes are already included in the dataset. Therefore, auditing code developed using LLMs...

10.1145/3702980 article EN ACM Transactions on Software Engineering and Methodology 2024-11-02

One of the critical phases in software development is testing. Testing helps with identifying potential bugs and reducing maintenance costs. The goal of automated test generation tools is to ease the development of tests by suggesting efficient, bug-revealing tests. Recently, researchers have leveraged Large Language Models (LLMs) of code to generate unit tests. While the coverage of the generated tests was usually assessed, the literature has acknowledged that coverage is weakly correlated with the efficiency of tests in bug detection. To improve over this limitation, in this paper, we...

10.48550/arxiv.2308.16557 preprint EN other-oa arXiv (Cornell University) 2023-01-01
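The weak coverage/bug-detection correlation noted in the abstract above can be shown with a toy example (hypothetical code, not from the paper): both tests below achieve full branch coverage of the function, yet only one of them exposes the seeded bug.

```python
# Full coverage does not imply bug detection: a seeded boundary bug
# survives a 100%-coverage test and is caught only by a targeted one.

def absolute(x):
    # Seeded bug: the threshold should be 0, not 1.
    if x >= 1:
        return x
    return -x

def coverage_test():
    """Executes both branches (full branch coverage) yet still passes."""
    return absolute(2) == 2 and absolute(-3) == 3

def bug_revealing_test():
    """Targets the faulty boundary region and exposes the bug."""
    return absolute(0.5) == 0.5

print(coverage_test())       # True: the bug goes undetected
print(bug_revealing_test())  # False: the bug is revealed
```

This is why mutation score, rather than coverage alone, is often proposed as the fitness signal when improving generated test suites.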

Testing Deep Learning (DL) systems is a complex task as they do not behave like traditional systems would, notably because of their stochastic nature. Nonetheless, being able to adapt existing testing techniques such as Mutation Testing (MT) to DL settings would greatly improve their potential verifiability. While some efforts have been made to extend MT to the Supervised Learning (SL) paradigm, little work has gone into extending it to Reinforcement Learning (RL), which is also an important component of the DL ecosystem but behaves very differently from SL. This...

10.48550/arxiv.2301.05651 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Software systems are increasingly relying on deep learning components, due to their remarkable capability of identifying complex data patterns and powering intelligent behaviour. A core enabler of this change in software development is the availability of easy-to-use DL libraries. Libraries like PyTorch and TensorFlow empower a large variety of intelligent systems, offering a multitude of algorithms and configuration options, applicable to numerous domains of systems. However, bugs in those popular libraries also may have dire...

10.48550/arxiv.2307.13777 preprint EN other-oa arXiv (Cornell University) 2023-01-01

AI-powered code generation models have been developing rapidly, allowing developers to expedite the coding process and thus improve their productivity. These models are trained on large corpora of code (primarily sourced from public repositories), which may contain bugs and vulnerabilities. Several concerns have been raised about the security of the code generated by these models. Recent studies have investigated security issues in tools such as GitHub Copilot and Amazon CodeWhisperer, revealing several weaknesses in these tools. As these tools evolve, it is expected that they will...

10.48550/arxiv.2311.11177 preprint EN cc-by arXiv (Cornell University) 2023-01-01