Van-Thuan Pham

ORCID: 0000-0002-9871-3695
Research Areas
  • Software Testing and Debugging Techniques
  • Advanced Malware Detection Techniques
  • Software Reliability and Analysis Research
  • Software Engineering Research
  • Software System Performance and Reliability
  • Web Application Security Vulnerabilities
  • Security and Verification in Computing
  • Network Packet Processing and Optimization
  • Software-Defined Networks and 5G
  • Information and Cyber Security
  • Cryptography and Data Security
  • Real-Time Systems Scheduling
  • Formal Methods in Verification
  • Computability, Logic, AI Algorithms
  • Explainable Artificial Intelligence (XAI)
  • Advanced Software Engineering Methodologies
  • Parallel Computing and Optimization Techniques
  • Blockchain Technology Applications and Security
  • Advanced Steganography and Watermarking Techniques
  • Adversarial Robustness in Machine Learning
  • Web Data Mining and Analysis
  • Natural Language Processing Techniques
  • Teaching and Learning Programming
  • VLSI and Analog Circuit Testing
  • Ethics and Social Impacts of AI

The University of Melbourne
2021-2025

Monash University
2018-2020

Australian Regenerative Medicine Institute
2020

National University of Singapore
2013-2017

Existing Greybox Fuzzers (GF) cannot be effectively directed, for instance, towards problematic changes or patches, towards critical system calls or dangerous locations, or towards functions in the stack-trace of a reported vulnerability that we wish to reproduce. In this paper, we introduce Directed Greybox Fuzzing (DGF), which generates inputs with the objective of reaching a given set of target program locations efficiently. We develop and evaluate a simulated annealing-based power schedule that gradually assigns more energy to seeds that are...

10.1145/3133956.3134020 article EN Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security 2017-10-27

Coverage-based Greybox Fuzzing (CGF) is a random testing approach that requires no program analysis. A new test is generated by slightly mutating a seed input. If the test exercises a new and interesting path, it is added to the set of seeds; otherwise, it is discarded. We observe that most tests exercise the same few "high-frequency" paths and develop strategies to explore significantly more paths with the same number of tests by gravitating towards low-frequency paths. We explain the challenges and opportunities of CGF using a Markov chain model which specifies the probability...

10.1145/2976749.2978428 article EN Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security 2016-10-24
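
The loop described in the abstract above (mutate a seed, keep the result only if it exercises a new path) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `toy_target`, `mutate`, and `cgf` are hypothetical names, the target program is a stand-in that reports which branches it took, and seeds are picked uniformly at random. The strategies for favouring low-frequency paths mentioned in the abstract are not modelled here.

```python
import random


def toy_target(data: bytes) -> frozenset:
    """Hypothetical program under test: returns the set of branch ids it exercised."""
    path = set()
    if len(data) > 0 and data[0] == ord("F"):
        path.add("b1")
        if len(data) > 1 and data[1] == ord("U"):
            path.add("b2")
            if len(data) > 2 and data[2] == ord("Z"):
                path.add("b3")
    return frozenset(path)


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Slightly mutate a seed: overwrite one random byte, or append a random byte."""
    buf = bytearray(seed)
    if buf and rng.random() < 0.8:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def cgf(initial_seeds, target, iterations, rng):
    """Coverage-based greybox fuzzing loop (assumed simplification: uniform seed choice)."""
    seeds = list(initial_seeds)
    seen_paths = {target(s) for s in seeds}
    for _ in range(iterations):
        candidate = mutate(rng.choice(seeds), rng)
        path = target(candidate)
        if path not in seen_paths:
            # New path: the test is "interesting", so it joins the seed set.
            seen_paths.add(path)
            seeds.append(candidate)
        # Otherwise the test is discarded.
    return seeds, seen_paths


if __name__ == "__main__":
    seeds, seen = cgf([b"AAA"], toy_target, 20000, random.Random(0))
    print(len(seeds), "seeds retained;", len(seen), "distinct paths seen")
```

Because a test is retained only when its path is new, the seed set never holds two inputs with the same path, which is the property the assertions in a quick check would rely on.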

Coverage-based Greybox Fuzzing (CGF) is a random testing approach that requires no program analysis. A new test is generated by slightly mutating a seed input. If the test exercises a new and interesting path, it is added to the set of seeds; otherwise, it is discarded. We observe that most tests exercise the same few "high-frequency" paths and develop strategies to explore significantly more paths with the same number of tests by gravitating towards low-frequency paths. We explain the challenges and opportunities of CGF using a Markov chain model which specifies the probability...

10.1109/tse.2017.2785841 article EN IEEE Transactions on Software Engineering 2017-12-21

Server fuzzing is difficult. Unlike simple command-line tools, servers feature a massive state space that can be traversed effectively only with well-defined sequences of input messages. Valid sequences are specified in a protocol. In this paper, we present AFLNET, the first greybox fuzzer for protocol implementations. Unlike existing protocol fuzzers, AFLNET takes a mutational approach and uses state-feedback to guide the fuzzing process. AFLNET is seeded with a corpus of recorded message exchanges between the server and an actual client. No protocol specification or...

10.1109/icst46399.2020.00062 article EN 2020-08-05

Many real-world programs take highly structured and very complex inputs. The automated testing of such programs is non-trivial. If the test input does not adhere to a specific file format, the program returns a parser error. For symbolic execution-based whitebox fuzzing, the corresponding error handling code becomes a significant time sink. Too much time is spent in the parser exploring too many paths leading to trivial parser errors. Naturally, the time is better spent exploring the functional part of the program, where failure with valid input exposes deep and real bugs in the program. In this paper, we...

10.1145/2970276.2970316 article EN 2016-08-25

Coverage-based greybox fuzzing (CGF) is one of the most successful approaches for automated vulnerability detection. Given a seed file (as a sequence of bits), a CGF randomly flips, deletes or copies some bits to generate new files. CGF iteratively constructs (and fuzzes) a seed corpus by retaining those generated files which enhance coverage. However, random bitflips are unlikely to produce valid files (or valid chunks in files) for applications processing complex file formats. In this work, we introduce smart greybox fuzzing (SGF), which leverages...

10.1109/tse.2019.2941681 article EN IEEE Transactions on Software Engineering 2019-09-17

We present a new benchmark (ProFuzzBench) for stateful fuzzing of network protocols. The benchmark includes a suite of representative open-source network servers for popular protocols, and tools to automate experimentation. We discuss challenges and potential directions for future research based on this benchmark.

10.1145/3460319.3469077 article EN 2021-07-08

10.1109/tse.2025.3535925 article EN cc-by IEEE Transactions on Software Engineering 2025-01-01

APIs often transmit far more data to client applications than they need, and in the context of web applications, often do so over public channels. This issue, termed Excessive Data Exposure (EDE), was OWASP's third most significant API vulnerability of 2019. However, there are few automated tools, either in research or industry, to effectively find and remediate such issues. This is unsurprising as the problem lacks an explicit test oracle: the vulnerability does not manifest through abnormal program behaviours (e.g., program crashes...

10.1145/3597503.3608133 article EN 2024-02-06

We introduce LEARN2FIX, the first human-in-the-loop, semi-automatic program repair technique for when no bug oracle, except for the user who is reporting the bug, is available. Our approach negotiates with the user the condition under which the bug is observed. Only when a budget of queries to the user is exhausted does it attempt to repair the bug. A query can be thought of as the following question: "When executing this alternative test input, the program produces the following output; is the bug observed?" Through systematic queries, LEARN2FIX trains an automatic bug oracle that becomes increasingly more...

10.1109/icst46399.2020.00036 article EN 2020-08-05

Binary analysis is a well-investigated area in software engineering and security. Given real-world program binaries, generating test inputs which cause the binaries to crash is crucial. Generation of crashing inputs has many applications, including off-line analysis of software prior to deployment, or online analysis of software patches as they are inserted. In this work, we present a method for generating inputs which reach a given potentially crashing location. Such potentially crashing locations can be found by a separate static analysis (or by gleaning crash reports submitted by internal / external users) and serve as the input to our...

10.5555/2818754.2818862 article EN International Conference on Software Engineering 2015-05-16

The statefulness property of network protocol implementations poses a unique challenge for testing and verification techniques, including fuzzing. Stateful fuzzers tackle this challenge by leveraging state models to partition the state space and assist the test generation process. Since not all states are equally important and fuzzing campaigns have time limits, fuzzers need effective state selection algorithms to prioritize progressive states over others. Several state selection algorithms have been proposed, but they were implemented and evaluated separately on different...

10.1109/saner53432.2022.00089 article EN 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2022-03-01

Binary analysis is a well-investigated area in software engineering and security. Given real-world program binaries, generating test inputs which cause the binaries to crash is crucial. Generation of crashing inputs has many applications, including off-line analysis of software prior to deployment, or online analysis of software patches as they are inserted. In this work, we present a method for generating inputs which reach a given "potentially crashing" location. Such potentially crashing locations can be found by a separate static analysis (or by gleaning crash reports submitted by internal / external...

10.1109/icse.2015.99 article EN 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering 2015-05-01

LEARN2FIX is a human-in-the-loop interactive program repair technique, which can be applied when no bug oracle, except the user who is reporting the bug, is available. This approach incrementally learns the condition under which the bug is observed by systematic negotiation with the user. In this process, LEARN2FIX generates...

10.1109/tse.2023.3305052 article EN cc-by-nc-nd IEEE Transactions on Software Engineering 2023-08-21

Parallel coverage-guided greybox fuzzing is the most common setup for vulnerability discovery at scale. However, so far it has received little attention from the research community compared to single-mode fuzzing, leaving open several problems, particularly in its task allocation strategies. Current approaches focus on managing micro tasks at the seed input level, and their task division algorithms are either ad-hoc or static. In this paper, we leverage research on graph partitioning and search algorithms to propose a systematic and dynamic...

10.1109/ase51524.2021.9678810 article EN 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2021-11-01

Identifying security issues early is encouraged to reduce the latent negative impacts on software systems. Code review is a widely-used method that allows developers to manually inspect modified code, catching security issues during the software development cycle. However, existing code review studies often focus on known vulnerabilities, neglecting coding weaknesses, which can introduce real-world security issues that are more visible through code review. The practices of code reviews in identifying such coding weaknesses are not yet fully investigated. To better...

10.1007/s10664-024-10496-y article EN cc-by Empirical Software Engineering 2024-06-08

How can we automatically repair semantic bugs in string-processing programs? A semantic bug is an unexpected program state: The program does not crash (which can be easily detected). Instead, the program processes the input incorrectly. It produces an output which users identify as unexpected. We envision a fully automated debugging process for semantic bugs where a user reports the unexpected behavior for a given input and the machine negotiates the condition under which the program fails. During the negotiation, the machine learns to predict the user's response and in this process learns an automated oracle for semantic bugs.

10.1145/3533767.3534406 article EN 2022-07-15

Real-time embedded software often runs on a supervisory operating system software layer on top of a modern processor. Thus, to give timing guarantees on the execution time and response time of such applications, one needs to consider the timing effects of the operating system, such as system calls and interrupts, over and above modeling the timing effects of micro-architectural features such as pipeline and cache. Previous works on Worst-case Execution Time (WCET) analysis have focused on micro-architectural modeling while ignoring the operating system's timing effects. As a result, WCET analyzers only estimate the maximum un-interrupted execution time of a program. In this...

10.1109/rtss.2013.21 preprint EN 2013-12-01

Early identification of security issues in software development is vital to minimize their unanticipated impacts. Code review is a widely used manual analysis method that aims to uncover security issues along with other coding issues in software projects. While some studies suggest that automated static application security testing tools (SASTs) could enhance security issue identification, there is limited understanding of SAST's practical effectiveness in supporting secure code review. Moreover, most SAST studies rely on synthetic or fully vulnerable versions of the...

10.48550/arxiv.2407.12241 preprint EN arXiv (Cornell University) 2024-07-16

Protocol implementations are stateful, which makes them difficult to test: Sending the same test input message twice might yield a different response every time. Our proposal to consider a sequence of messages as a seed for coverage-directed greybox fuzzing, to associate each message with the corresponding protocol state, and to maximize the coverage of both the state space and the code was first published in 2020 in a short tool demonstration paper. AFLNet was the first code- and state-coverage-guided protocol fuzzer; it used the response code as an indicator of the current protocol state. Over the past...

10.48550/arxiv.2412.20324 preprint EN arXiv (Cornell University) 2024-12-28

APIs often transmit far more data to client applications than they need, and in the context of web applications, often do so over public channels. This issue, termed Excessive Data Exposure (EDE), was OWASP's third most significant API vulnerability of 2019. However, there are few automated tools, either in research or industry, to effectively find and remediate such issues. This is unsurprising as the problem lacks an explicit test oracle: the vulnerability does not manifest through abnormal program behaviours (e.g., program crashes or memory...

10.48550/arxiv.2301.09258 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01