- Reinforcement Learning in Robotics
- Stock Market Forecasting Methods
- Artificial Intelligence in Games
- Financial Markets and Investment Strategies
- Network Security and Intrusion Detection
- Sports Analytics and Performance
- Information and Cyber Security
- Topic Modeling
- Complex Systems and Time Series Analysis
- Natural Language Processing Techniques
- Multi-Agent Systems and Negotiation
- Infrastructure Resilience and Vulnerability Analysis
- Time Series Analysis and Forecasting
- Model Reduction and Neural Networks
- Economic Theories and Models
- Domain Adaptation and Few-Shot Learning
- Evolutionary Algorithms and Applications
- Advanced Graph Neural Networks
- Generative Adversarial Networks and Image Synthesis
- Auction Theory and Applications
- Explainable Artificial Intelligence (XAI)
- Opinion Dynamics and Social Influence
- Simulation Techniques and Applications
- Game Theory and Applications
- Maritime Security and History
Singapore Management University
2025
Yangzhou University
2025
Nanyang Technological University
2017-2024
Chinese Research Academy of Environmental Sciences
2020
High-frequency trading (HFT) uses computer algorithms to make decisions on short time scales (e.g., second-level) and is widely used in the cryptocurrency (Crypto) market (e.g., Bitcoin). Reinforcement learning (RL) in financial research has shown stellar performance on many quantitative tasks. However, most methods focus on low-frequency trading, e.g., day-level, and cannot be directly applied to HFT because of two challenges. First, RL for HFT involves dealing with extremely long trajectories, e.g., 2.4 million steps...
Financial trading is a crucial component of the markets. It is informed by a multimodal information landscape encompassing news, prices, and Kline charts, and it encompasses diverse tasks such as quantitative trading and high-frequency trading with various assets. While advanced AI techniques like deep learning and reinforcement learning are extensively utilized in finance, their application to financial trading often faces challenges due to inadequate handling of multimodal data and limited generalizability across tasks. To address these challenges, we present...
To explore the feasibility of color Doppler ultrasound in detecting the middle meningeal artery (MMA) and to evaluate the clinical application of this method in chronic subdural hematoma (CSDH). This study collected data from 100 patients with CSDH who were admitted to our hospital between January 2023 and March 2024. Among these patients, 87 underwent drilling and drainage surgery. Based on postoperative follow-up, patients were categorized into a recurrence group and a non-recurrence group. Additionally, 80 healthy volunteers and 60 patients with acute subdural hematoma (ASDH)...
In this paper, we propose a new approach to train deep learning models using game theory concepts, including Generative Adversarial Networks (GANs) and Adversarial Training (AT), where we deploy a double-oracle framework using best response oracles. A GAN is essentially a two-player zero-sum game between the generator and the discriminator. The same concept can be applied to AT, with the attacker and the classifier as players. Training these models is challenging, as a pure Nash equilibrium may not exist, and even finding the mixed Nash equilibrium is difficult, as training algorithms for both GAN and AT have...
With the rise of online e-commerce platforms, more and more customers prefer to shop online. To sell products, platforms introduce various modules to recommend items with different properties, such as huge discounts. A web page often consists of independent modules. The ranking policies of these modules are decided by different teams and optimized individually without cooperation, which might result in competition between modules. Thus, the global policy of the whole page could be sub-optimal. In this paper, we propose a novel multi-agent cooperative...
Large language models (LLMs) have exhibited remarkable performance on various natural language processing (NLP) tasks, especially question answering. However, in the face of problems beyond the scope of their knowledge, these LLMs tend to talk nonsense with a straight face. A potential solution could be incorporating an Information Retrieval (IR) module and generating a response based on the retrieved knowledge. In this paper, we present a novel framework to assist LLMs, such as ChatGPT, to retrieve question-related...
Quantitative stock investment is a fundamental financial task that relies heavily on accurate prediction of market status and profitable decision making. Although recent advances in deep learning (DL) have shown stellar performance in capturing trading opportunities in the stochastic market, existing DL methods are unstable, with sensitivity to network initialization and hyperparameter selection. One major limitation of existing works is that decisions are made based on one individual neural predictor with high uncertainty, which...
Pirate syndicates capturing tankers to siphon oil, causing an estimated cost of $5 billion a year, have become a serious security issue for maritime traffic. In response to the threat, coast guards and navies deploy patrol boats to protect the international oil trade. However, given the vast area of the sea and the highly time- and space-dependent behaviors of both players, it remains a significant challenge to find efficient ways to deploy resources. In this paper, we address the research challenges and provide four key contributions. First, we construct...
Current value-based multi-agent reinforcement learning methods optimize individual Q values to guide individuals' behaviours via centralized training with decentralized execution (CTDE). However, such an expected, i.e., risk-neutral, Q value is not sufficient even with CTDE due to the randomness of rewards and the uncertainty in environments, which causes the failure of these methods to train coordinating agents in complex environments. To address these issues, we propose RMIX, a novel cooperative MARL method with the Conditional Value at Risk...
Financial simulators play an important role in enhancing forecasting accuracy, managing risks, and fostering strategic financial decision-making. Despite the development of financial market simulation methodologies, existing frameworks often struggle with adapting to a specialized simulation context. We pinpoint the challenges as i) current financial datasets do not contain context labels; ii) current techniques are not designed to generate financial data with context as control, which demands greater precision compared to other modalities; iii) the inherent difficulties...
Portfolio management (PM) is a fundamental financial trading task, which explores the optimal periodical reallocation of capital into different stocks to pursue long-term profits. Reinforcement learning (RL) has recently shown its potential to train profitable agents for PM through interacting with financial markets. However, existing work mostly focuses on fixed stock pools, which is inconsistent with investors' practical demand. Specifically, the target stock pool of different investors varies dramatically due to their discrepancy on market...
In this paper, the reoxidation behaviours of CrOOH and Cr(OH)3 are investigated as the major reduction products of Cr(VI). The atmospheric oxidation of Cr(III) is studied in a soil environment without manganese and hydrogen peroxide. The influence of temperature and pH value on the oxidation rate is examined by experimental methods in detail. According to the experimental results, the oxidation is promoted at high pH value; however, the process is stable with temperature. The oxidation is theoretically researched by thermodynamic calculation and density functional theory (DFT) simulation. The results...
Pursuit-evasion games on graphs model the coordination of police forces chasing a fleeing felon in real-world urban settings, using the standard framework of imperfect-information extensive-form games (EFGs). In recent years, solving EFGs has been largely dominated by Policy-Space Response Oracle (PSRO) methods due to their modularity, scalability, and favorable convergence properties. However, even these methods quickly reach their limits when facing the large combinatorial strategy spaces of pursuit-evasion games. To...
High-frequency trading (HFT) uses computer algorithms to make decisions on short time scales (e.g., second-level), which is widely used in the Cryptocurrency (Crypto) market (e.g., Bitcoin). Reinforcement learning (RL) in financial research has shown stellar performance on many quantitative tasks. However, most methods focus on low-frequency trading, e.g., day-level, which cannot be directly applied to HFT because of two challenges. First, RL for HFT involves dealing with extremely long trajectories, e.g., 2.4 million steps...
Securing networked infrastructures is important in the real world. The problem of deploying security resources to protect against an attacker in networked domains can be modeled as Network Security Games (NSGs). Unfortunately, existing approaches, including deep learning-based approaches, are inefficient at solving large-scale extensive-form NSGs. In this paper, we propose a novel learning paradigm, NSG-NFSP, to solve large-scale extensive-form NSGs based on Neural Fictitious Self-Play (NFSP). Our main contributions include: i) reforming the best response (BR)...
Since 2003, the U.S. government has spent $850 million on the Megaport Initiative, which aims at stopping nuclear smuggling in international container shipping through advanced inspection facilities including Non-Intrusive Inspection (NII) and the Mobile Radiation Detection and Identification System (MRDIS). Unfortunately, it remains a significant challenge to efficiently inspect more than 11.7 million imported containers due to limited resources. Moreover, existing work neglects the sophisticated behavior of the smuggler...
Ad hoc teamwork requires an agent to cooperate with unknown teammates without prior coordination. Many works propose to abstract teammate instances into a high-level representation of types and then pre-train the best response for each type. However, most of them do not consider the distribution of teammate instances within a type. This could expose the agent to the hidden risk of type confounding. In the worst case, a type [...] all specific [...]. This work addresses the issue from the lens of causal inference. We first theoretically demonstrate that this phenomenon is due...
Cyber attacks and the associated costs have made cybersecurity a vital part of any system. User behavior and decisions are still a major part in coping with these risks. We developed a model of optimal investment and human decisions with security measures, given that the effectiveness of each measure depends partly on the performance of the others. In an online experiment, participants classified events as malicious or non-malicious, based on the value of an observed variable. Prior to making the decisions, they had invested in three security measures - firewall, IDS...
Dams impact downstream river dynamics through flow regulation and disruption of upstream-downstream linkages. However, current dam operation is far from satisfactory due to the inability to respond to the complicated and uncertain system and the various usages of the reservoir. Even further, unsatisfactory dam operation can cause floods in downstream areas. Therefore, we leverage reinforcement learning (RL) methods to compute efficient dam operation guidelines in this work. Specifically, we build offline simulators with real data and different mathematical models for...
Distributed constraint optimization problems (DCOPs) are a powerful model for multi-agent coordination and optimization, where information and controls are distributed among multiple agents by nature. Sampling-based algorithms are important incomplete techniques for solving medium-scale DCOPs. However, they use tables to exactly store all the information (e.g., costs, confidence bounds) to facilitate sampling, which limits their scalability. This paper tackles the limitation by incorporating deep neural networks in DCOPs for the first...
Despite their impressive performance across numerous tasks, large language models (LLMs) often fail in solving simple decision-making tasks due to the misalignment of the knowledge in LLMs with environments. On the contrary, reinforcement learning (RL) agents learn policies from scratch, which makes them always align with environments but makes it difficult to incorporate prior knowledge for efficient explorations. To narrow the gap, we propose TWOSOME, a novel general online framework that deploys LLMs as decision-making agents to efficiently interact and align with embodied environments via...
Recent studies have demonstrated the success of foundation agents in specific tasks or scenarios. However, existing agents cannot generalize across different scenarios, mainly due to their diverse observation and action spaces and semantic gaps, or reliance on task-specific resources. In this work, we propose the General Computer Control (GCC) setting: building foundation agents that can master any computer task by taking only screen images (and possibly audio) of the computer as input, and producing keyboard and mouse operations as output, similar...