NFDI4DS | UHH-SEMS - Publication Details

EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading

OPENALEX - Publications

Molei Qin Shuo Sun Wentao Zhang Haochong Xia Xinrun Wang and 1 more

High-frequency trading (HFT) uses computer algorithms to make trading decisions in short time scales (e.g., second-level) and is widely used in the cryptocurrency (Crypto) market (e.g., Bitcoin). Reinforcement learning (RL) in financial research has shown stellar performance on many quantitative trading tasks. However, most RL methods focus on low-frequency trading, e.g., day-level, and cannot be directly applied to HFT because of two challenges. First, RL for HFT involves dealing with extremely long trajectories, e.g., 2.4 million steps...

10.1609/aaai.v38i13.29384 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

A Multimodal Foundation Agent for Financial Trading: Tool-Augmented, Diversified, and Generalist

Wentao Zhang Lingxuan Zhao Haochong Xia Shuo Sun Jiaze Sun and 8 more

Financial trading is a crucial component of the markets, informed by a multimodal information landscape encompassing news, prices, and Kline charts, and encompasses diverse tasks such as quantitative trading and high-frequency trading with various assets. While advanced AI techniques like deep learning and reinforcement learning are extensively utilized in finance, their application in financial trading often faces challenges due to inadequate handling of multimodal data and limited generalizability across tasks. To address these challenges, we present...

10.1145/3637528.3671801 article EN Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2024-08-24

Ultrasonic Hemodynamics of Middle Meningeal Artery in Chronic Subdural Hematoma

Xinrun Wang Tingyue Qi Yanhui Shi Wanwan Hou Zhensheng Liu and 1 more

To explore the feasibility of color Doppler ultrasound in detecting the middle meningeal artery (MMA) and to evaluate the clinical application of this method in chronic subdural hematoma (CSDH). This study collected data from 100 patients with CSDH who were admitted to our hospital between January 2023 and March 2024. Among these patients, 87 underwent drilling and drainage surgery. Postoperative follow-up categorized them into a recurrence group and a non-recurrence group. Additionally, 80 healthy volunteers and 60 acute subdural hematoma (ASDH)...

10.1016/j.wneu.2025.123793 article EN cc-by World Neurosurgery 2025-03-14

Double Oracle Neural Architecture Search for Game Theoretic Deep Learning Models

Aye Phyu Phyu Aung Xinrun Wang Ruiyu Wang Hau Chan Bo An and 2 more

In this paper, we propose a new approach to train deep learning models using game theory concepts, including Generative Adversarial Networks (GANs) and Adversarial Training (AT), where we deploy a double-oracle framework using best response oracles. A GAN is essentially a two-player zero-sum game between the generator and the discriminator. The same concept can be applied to AT with the attacker and the classifier as players. Training these models is challenging, as a pure Nash equilibrium may not exist and even finding the mixed Nash equilibrium is difficult, as training algorithms for both have...

10.1109/tip.2025.3558420 article EN IEEE Transactions on Image Processing 2025-01-01

Learning to Collaborate in Multi-Module Recommendation via Multi-Agent Reinforcement Learning without Communication

Xu He Bo An Yanghua Li Haikai Chen Rundong Wang and 4 more

With the rise of online e-commerce platforms, more and more customers prefer to shop online. To sell more products, online platforms introduce various modules to recommend items with different properties such as huge discounts. A web page often consists of different independent modules. The ranking policies of these modules are decided by different teams and optimized individually without cooperation, which might result in competition between modules. Thus, the global policy of the whole page could be sub-optimal. In this paper, we propose a novel multi-agent cooperative...

10.1145/3383313.3412233 article EN 2020-09-19

keqing: knowledge-based question answering is a nature chain-of-thought mentor of LLM

Chaojie Wang Yishi Xu Zhong Peng Chenxi Zhang Bo Chen and 3 more

Large language models (LLMs) have exhibited remarkable performance on various natural language processing (NLP) tasks, especially for question answering. However, in the face of problems beyond the scope of their knowledge, these LLMs tend to talk nonsense with a straight face, where a potential solution could be incorporating an Information Retrieval (IR) module and generating a response based on the retrieved knowledge. In this paper, we present a novel framework to assist LLMs, such as ChatGPT, to retrieve question-related...

10.48550/arxiv.2401.00426 preprint EN cc-by arXiv (Cornell University) 2024-01-01

Mastering Stock Markets with Efficient Mixture of Diversified Trading Experts

Shuo Sun Xinrun Wang Wanqi Xue Xiaoxuan Lou Bo An

Quantitative stock investment is a fundamental financial task that highly relies on accurate prediction of market status and profitable investment decision making. Although recent advances in deep learning (DL) have shown stellar performance in capturing trading opportunities in the stochastic stock market, existing DL methods are unstable, with sensitivity to network initialization and hyperparameter selection. One major limitation of existing works is that investment decisions are made based on one individual neural network predictor with high uncertainty, which...

10.1145/3580305.3599424 article EN cc-by Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

Catching Captain Jack: Efficient Time and Space Dependent Patrols to Combat Oil-Siphoning in International Waters

Xinrun Wang Bo An Martin Strobel Fookwai Kong

Pirate syndicates capturing tankers to siphon oil, causing an estimated cost of $5 billion a year, has become a serious security issue for maritime traffic. In response to the threat, coast guards and navies deploy patrol boats to protect international oil trade. However, given the vast area of the sea and the highly time and space dependent behaviors of both players, it remains a significant challenge to find efficient ways to deploy patrol resources. In this paper, we address the research challenges and provide four key contributions. First, we construct...

10.1609/aaai.v32i1.11291 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-25

RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents

Wei Qiu Xinrun Wang Runsheng Yu Xu He Rundong Wang and 3 more

Current value-based multi-agent reinforcement learning methods optimize individual Q values to guide individuals' behaviours via centralized training with decentralized execution (CTDE). However, such an expected, i.e., risk-neutral, Q value is not sufficient even with CTDE due to the randomness of rewards and the uncertainty in environments, which causes the failure of these methods to train coordinating agents in complex environments. To address these issues, we propose RMIX, a novel cooperative MARL method with the Conditional Value at Risk...

10.48550/arxiv.2102.08159 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Market-GAN: Adding Control to Financial Market Data Generation with Semantic Context

Haochong Xia Shuo Sun Xinrun Wang Bo An

Financial simulators play an important role in enhancing forecasting accuracy, managing risks, and fostering strategic financial decision-making. Despite the development of financial market simulation methodologies, existing frameworks often struggle with adapting to a specialized simulation context. We pinpoint the challenges as i) current financial datasets do not contain context labels; ii) current techniques are not designed to generate financial data with context as control, which demands greater precision compared to other modalities; iii) the inherent difficulties...

10.1609/aaai.v38i14.29531 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Reinforcement Learning with Maskable Stock Representation for Portfolio Management in Customizable Stock Pools

Wentao Zhang Yilei Zhao Shuo Sun Jie Ying Yonggang Xie and 3 more

Portfolio management (PM) is a fundamental financial trading task, which explores the optimal periodical reallocation of capitals into different stocks to pursue long-term profits. Reinforcement learning (RL) has recently shown its potential to train profitable agents for PM through interacting with financial markets. However, existing work mostly focuses on fixed stock pools, which is inconsistent with investors' practical demand. Specifically, the target stock pool of different investors varies dramatically due to their discrepancy on market...

10.1145/3589334.3645615 article EN Proceedings of the ACM Web Conference 2024 2024-05-08

Thermodynamic investigation with chemical kinetic analysis on the reoxidation phenomenon of the Cr(III) in air

Qining Liu Honghui Liu Hui-Xia Chen Xinrun Wang Dahai Hu and 2 more

In this paper, the reoxidation behaviours of CrOOH and Cr(OH)3 are investigated as the major reduction products of Cr(VI). The atmospheric oxidation of Cr(III) is studied in the environment of soil without manganese and hydrogen peroxide. The influence of temperature and pH value on the reoxidation rate is examined by experimental methods in detail. According to the experimental results, the reoxidation is promoted with high pH value; however, the process is stable with temperature. The reoxidation is theoretically researched by thermodynamic calculation and density functional theory (DFT) simulation. The results...

10.1039/d0ra01403f article EN cc-by-nc RSC Advances 2020-01-01

Solving Large-Scale Pursuit-Evasion Games Using Pre-trained Strategies

Shuxin Li Xinrun Wang Youzhi Zhang Wanqi Xue Jakub Černý and 1 more

Pursuit-evasion games on graphs model the coordination of police forces chasing a fleeing felon in real-world urban settings, using the standard framework of imperfect-information extensive-form games (EFGs). In recent years, solving EFGs has been largely dominated by Policy-Space Response Oracle (PSRO) methods due to their modularity, scalability, and favorable convergence properties. However, even these methods quickly reach their limits when facing the large combinatorial strategy spaces of pursuit-evasion games. To...

10.1609/aaai.v37i10.26369 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading

Molei Qin Shuo Sun Wentao Zhang Haochong Xia Xinrun Wang and 1 more

High-frequency trading (HFT) uses computer algorithms to make trading decisions in short time scales (e.g., second-level) and is widely used in the cryptocurrency (Crypto) market (e.g., Bitcoin). Reinforcement learning (RL) in financial research has shown stellar performance on many quantitative trading tasks. However, most RL methods focus on low-frequency trading, e.g., day-level, and cannot be directly applied to HFT because of two challenges. First, RL for HFT involves dealing with extremely long trajectories, e.g., 2.4 million steps...

10.48550/arxiv.2309.12891 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Solving Large-Scale Extensive-Form Network Security Games via Neural Fictitious Self-Play

Wanqi Xue Youzhi Zhang Shuxin Li Xinrun Wang Bo An and 1 more

Securing networked infrastructures is important in the real world. The problem of deploying security resources to protect against an attacker in networked domains can be modeled as Network Security Games (NSGs). Unfortunately, existing approaches, including the deep learning-based approaches, are inefficient at solving large-scale extensive-form NSGs. In this paper, we propose a novel learning paradigm, NSG-NFSP, to solve large-scale extensive-form NSGs based on Neural Fictitious Self-Play (NFSP). Our main contributions include: i) reforming the best response (BR)...

10.24963/ijcai.2021/511 article EN 2021-08-01

Stop Nuclear Smuggling Through Efficient Container Inspection

Xinrun Wang Qingyu Guo Bo An

Since 2003, the U.S. government has spent $850 million on the Megaport Initiative, which aims at stopping nuclear smuggling in international container shipping through advanced inspection facilities including Non-Intrusive Inspection (NII) and Mobile Radiation Detection and Identification System (MRDIS). Unfortunately, it remains a significant challenge to efficiently inspect more than 11.7 million containers imported to the U.S. due to the limited inspection resources. Moreover, existing work in container inspection neglects the sophisticated behavior of the smuggler...

10.5555/3091125.3091221 article EN Adaptive Agents and Multi-Agents Systems 2017-05-08

Controlling Type Confounding in Ad Hoc Teamwork with Instance-wise Teammate Feedback Rectification

Xing Dong Pengjie Gu Zheng Qian Xinrun Wang Shanqi Liu and 3 more

Ad hoc teamwork requires an agent to cooperate with unknown teammates without prior coordination. Many works propose to abstract teammate instances into a high-level representation of types and then pre-train the best response for each type. However, most of them do not consider the distribution of teammate instances within a type. This could expose the agent to the hidden risk of type confounding. In the worst case, the best response for an abstract teammate type could be the worst response for all specific instances of that type. This work addresses the issue from the lens of causal inference. We first theoretically demonstrate that this phenomenon is due...

10.48550/arxiv.2306.10944 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Learning to Collaborate in Multi-Module Recommendation via Multi-Agent Reinforcement Learning without Communication

Xu He Bo An Yanghua Li Haikai Chen Rundong Wang and 4 more

With the rise of online e-commerce platforms, more and more customers prefer to shop online. To sell more products, online platforms introduce various modules to recommend items with different properties such as huge discounts. A web page often consists of different independent modules. The ranking policies of these modules are decided by different teams and optimized individually without cooperation, which might result in competition between modules. Thus, the global policy of the whole page could be sub-optimal. In this paper, we propose a novel multi-agent cooperative...

10.48550/arxiv.2008.09369 preprint EN other-oa arXiv (Cornell University) 2020-01-01

User detection of threats with different security measures

Yoav Ben-Yaakov Joachim Meyer Xinrun Wang Bo An

Cyber attacks and the associated costs made cybersecurity a vital part of any system. User behavior and decisions are still a major part in coping with these risks. We developed a model of optimal investment and human decisions with security measures, given that the effectiveness of each measure depends partly on the performance of the others. In an online experiment, participants classified events as malicious or non-malicious, based on the value of an observed variable. Prior to making the decisions, they had invested in three security measures - a firewall, an IDS...

10.1109/ichms49158.2020.9209426 article EN 2020 IEEE International Conference on Human-Machine Systems (ICHMS) 2020-09-01

Efficient Reservoir Management through Deep Reinforcement Learning

Xinrun Wang Tarun Nair Haoyang Li Yuh Sheng Reuben Wong Nachiket Kelkar and 5 more

Dams impact downstream river dynamics through flow regulation and disruption of upstream-downstream linkages. However, current dam operation is far from satisfactory due to the inability to respond to the complicated and uncertain dynamics of the upstream-downstream system and the various usages of the reservoir. Even further, unsatisfactory dam operation can cause floods in downstream areas. Therefore, we leverage reinforcement learning (RL) methods to compute efficient dam operation guidelines in this work. Specifically, we build offline simulators with real data and different mathematical models for...

10.48550/arxiv.2012.03822 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Neural Regret-Matching for Distributed Constraint Optimization Problems

Yanchen Deng Runsheng Yu Xinrun Wang Bo An

Distributed constraint optimization problems (DCOPs) are a powerful model for multi-agent coordination and optimization, where information and controls are distributed among multiple agents by nature. Sampling-based algorithms are important incomplete techniques for solving medium-scale DCOPs. However, they use tables to exactly store all the information (e.g., costs, confidence bounds) to facilitate sampling, which limits their scalability. This paper tackles the limitation by incorporating deep neural networks in solving DCOPs for the first...

10.24963/ijcai.2021/21 article EN 2021-08-01

True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning

Weihao Tan Wentao Zhang Shanqi Liu Longtao Zheng Xinrun Wang and 1 more

Despite the impressive performance across numerous tasks, large language models (LLMs) often fail in solving simple decision-making tasks due to the misalignment of the knowledge in LLMs with environments. On the contrary, reinforcement learning (RL) agents learn policies from scratch, which makes them always align with environments but makes it difficult to incorporate prior knowledge for efficient explorations. To narrow the gap, we propose TWOSOME, a novel general online framework that deploys LLMs as decision-making agents to efficiently interact and align with embodied environments via...

10.48550/arxiv.2401.14151 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Towards General Computer Control: A Multimodal Agent for Red Dead Redemption II as a Case Study

Weihao Tan Ziluo Ding W. Zhang Boyu Li Bohan Zhou and 11 more

Recent studies have demonstrated the success of foundation agents in specific tasks or scenarios. However, existing agents cannot generalize across different scenarios, mainly due to their diverse observation and action spaces and semantic gaps, or reliance on task-specific resources. In this work, we propose the General Computer Control (GCC) setting: building foundation agents that can master any computer task by taking only screen images (and possibly audio) of the computer as input, and producing keyboard and mouse operations as output, similar...

10.48550/arxiv.2403.03186 preprint EN arXiv (Cornell University) 2024-03-05