- Topic Modeling
- Adversarial Robustness in Machine Learning
- Advanced Malware Detection Techniques
- Natural Language Processing Techniques
- Software Engineering Research
- Hate Speech and Cyberbullying Detection
- Information and Cyber Security
- Multimodal Machine Learning Applications
- Text Readability and Simplification
- Artificial Intelligence in Healthcare and Education
- Particle accelerators and beam dynamics
- Digital and Cyber Forensics
- Advanced Image and Video Retrieval Techniques
- Digital Media Forensic Detection
- Spam and Phishing Detection
- Cryptography and Data Security
- Speech Recognition and Synthesis
- Imbalanced Data Classification Techniques
- Explainable Artificial Intelligence (XAI)
- Stuttering Research and Treatment
- Modular Robots and Swarm Intelligence
- Bacillus and Francisella bacterial research
- Software Reliability and Analysis Research
- Advanced Electrical Measurement Techniques
- Magnetic confinement fusion research
Nanyang Technological University
2021-2024
Nanyang Institute of Technology
2021
Large Language Models (LLMs), like ChatGPT, have demonstrated vast potential but also introduce challenges related to content constraints and misuse. Our study investigates three key research questions: (1) the number of different prompt types that can jailbreak LLMs, (2) effectiveness prompts in circumventing LLM constraints, (3) resilience ChatGPT against these prompts. Initially, we develop a classification model analyze distribution existing prompts, identifying ten distinct patterns...
lenge in ensuring the secure and ethical usage of LLMs [31].Jailbreaking, this context, refers to strategic manipulation input prompts LLMs, devised outsmart chatbots' safeguards generate content otherwise moderated or blocked.By exploiting such carefully crafted prompts, a malicious user can induce LLM chatbots produce harmful outputs that contravene defined policies.Past efforts have been made investigate jailbreak vulnerabilities [31], [27], [62], [51].However, with rapid evolution...
Ponzi schemes, a form of scam, have been discovered in Ethereum smart contracts recent years, causing massive financial losses. Rule-based detection approaches rely on pre-defined rules with limited capabilities and domain knowledge dependency. Additionally, using static information like opcodes transactions for machine learning models fails to effectively characterize the contracts, resulting poor reliability interpretability.
Large Language Models (LLMs), renowned for their superior proficiency in language comprehension and generation, stimulate a vibrant ecosystem of applications around them. However, extensive assimilation into various services introduces significant security risks. This study deconstructs the complexities implications prompt injection attacks on actual LLM-integrated applications. Initially, we conduct an exploratory analysis ten commercial applications, highlighting constraints current attack...
To support software developers in understanding and maintaining programs, various automatic code summarization techniques have been proposed to generate a concise natural language comment for given snippet. Recently, the emergence of large models (LLMs) has led great boost performance processing tasks. Among them, ChatGPT is most popular one which attracted wide attention from engineering community. However, it still remains unclear how performs (automatic) summarization. Therefore, this...
RESTful APIs are arguably the most popular endpoints for accessing Web services. Blackbox testing is one of emerging techniques ensuring reliability APIs. The major challenge in need correct sequences API operation calls in-depth testing. To build meaningful call sequences, researchers have proposed to learn and utilize dependencies based on OpenAPI specifications. However, these either lack overall awareness how all connected or flexibility adaptively fixing learned knowledge.
Penetration testing, a crucial industrial practice for ensuring system security, has traditionally resisted automation due to the extensive expertise required by human professionals. Large Language Models (LLMs) have shown significant advancements in various domains, and their emergent abilities suggest potential revolutionize industries. In this research, we evaluate performance of LLMs on real-world penetration testing tasks using robust benchmark created from test machines with platforms....
Large Language Models (LLMS) have increasingly become central to generating content with potential societal impacts. Notably, these models demonstrated capabilities for that could be deemed harmful. To mitigate risks, researchers adopted safety training techniques align model outputs values curb the generation of malicious content. However, phenomenon "jailbreaking", where carefully crafted prompts elicit harmful responses from models, persists as a significant challenge. This research...
Pre-training, which utilizes extensive and varied datasets, is a critical factor in the success of Large Language Models (LLMs) across numerous applications. However, detailed makeup these datasets often not disclosed, leading to concerns about data security potential misuse. This particularly relevant when copyrighted material, still under legal protection, used inappropriately, either intentionally or unintentionally, infringing on rights authors. In this paper, we introduce framework...
Robotic Vehicles (RVs) have gained great popularity over the past few years. Meanwhile, they are also demonstrated to be vulnerable sensor spoofing attacks. Although a wealth of research works presented various attacks, some key questions remain unanswered: these existing complete enough cover all threats? If not, how many attacks not explored, and difficult is it realize them?This paper answers above by comprehensively systematizing knowledge against RVs. Our contributions threefold. (1) We...
Large Language Models (LLMs) have revolutionized Artificial Intelligence (AI) services due to their exceptional proficiency in understanding and generating human-like text. LLM chatbots, particular, seen widespread adoption, transforming human-machine interactions. However, these chatbots are susceptible "jailbreak" attacks, where malicious users manipulate prompts elicit inappropriate or sensitive responses, contravening service policies. Despite existing attempts mitigate such threats, our...
Large language models (LLMs) have transformed the field of natural processing, but they remain susceptible to jailbreaking attacks that exploit their capabilities generate unintended and potentially harmful content. Existing token-level techniques, while effective, face scalability efficiency challenges, especially as undergo frequent updates incorporate advanced defensive measures. In this paper, we introduce JailMine, an innovative manipulation approach addresses these limitations...
Large language models (LLMs) are vital for a wide range of applications yet remain susceptible to jailbreak threats, which could lead the generation inappropriate responses. Conventional defenses, such as refusal and adversarial training, often fail cover corner cases or rare domains, leaving LLMs still vulnerable more sophisticated attacks. We propose novel defense strategy, Safety Chain-of-Thought (SCoT), harnesses enhanced \textit{reasoning capabilities} proactive assessment harmful...
CAPTCHAs have long been essential tools for protecting applications from automated bots. Initially designed as simple questions to distinguish humans bots, they become increasingly complex keep pace with the proliferation of CAPTCHA-cracking techniques employed by malicious actors. However, advent advanced large language models (LLMs), effectiveness existing is now being undermined. To address this issue, we conducted an empirical study evaluate performance multimodal LLMs in solving and...
With the expanding application of Large Language Models (LLMs) in various domains, it becomes imperative to comprehensively investigate their unforeseen behaviors and consequent outcomes. In this study, we introduce systematically explore phenomenon “glitch tokens”, which are anomalous tokens produced by established tokenizers could potentially compromise models’ quality response. Specifically, experiment on seven top popular LLMs utilizing three distinct involving a totally 182,517 tokens....
Augmented generation techniques such as Retrieval-Augmented Generation (RAG) and Cache-Augmented (CAG) have revolutionized the field by enhancing large language model (LLM) outputs with external knowledge cached information. However, integration of vector databases, which serve a backbone for these augmentations, introduces critical challenges, particularly in ensuring accurate matching. False matching databases can significantly compromise integrity reliability LLM outputs, leading to...
Robot Operating System (ROS) has been the mainstream platform for research and development of robotic applications. This is well-known lacking security features efficiency distributed computations. To address these issues, ROS2 recently developed by utilizing Data Distribution Service (DDS) to provide support. Integrated with DDS, expected establish basis trustworthy ecosystems.
Large Language Models~(LLMs) have gained immense popularity and are being increasingly applied in various domains. Consequently, ensuring the security of these models is paramount importance. Jailbreak attacks, which manipulate LLMs to generate malicious content, recognized as a significant vulnerability. While existing research has predominantly focused on direct jailbreak attacks LLMs, there been limited exploration indirect methods. The integration plugins into notably Retrieval Augmented...
With the prevalence of text-to-image generative models, their safety becomes a critical concern. adversarial testing techniques have been developed to probe whether such models can be prompted produce Not-Safe-For-Work (NSFW) content. However, existing solutions face several challenges, including low success rate and inefficiency. We introduce Groot, first automated framework leveraging tree-based semantic transformation for models. Groot employs decomposition sensitive element drowning...
Large language models (LLMs) have exhibited remarkable capabilities in natural generation, but they also been observed to magnify societal biases, particularly those related gender. In response this issue, several benchmarks proposed assess gender bias LLMs. However, these often lack practical flexibility or inadvertently introduce biases. To address shortcomings, we GenderCARE, a comprehensive framework that encompasses innovative Criteria, Assessment, Reduction techniques, and Evaluation...
Multi-Robot Systems (MRSs) show significant advantages to deal with complex tasks efficiently. However, the system complexity inevitably enlarges attack surface and adds difficulty in guaranteeing security safety of MRSs. In this paper, we present an in-depth investigation about Byzantine threats MRSs, where some robot is untrusted. We design a practical methodology identify potential risks given MRS workload built from Robot Operating System (ROS). It consists three novel steps (requirement...