- Reinforcement Learning in Robotics
- Opinion Dynamics and Social Influence
- Autonomous Vehicle Technology and Safety
- Network Security and Intrusion Detection
- Complex Network Analysis Techniques
- Advanced Multi-Objective Optimization Algorithms
- Transportation and Mobility Innovations
- Robotic Path Planning Algorithms
- Military Defense Systems Analysis
- Advanced Decision-Making Techniques
- Vehicular Ad Hoc Networks (VANETs)
- Economic Theories and Models
- Traffic Control and Management
- Advanced Control Systems Optimization
- Mobile Agent-Based Network Management
- Game Theory and Applications
- Evolutionary Algorithms and Applications
- Optimization and Search Problems
- Network Traffic and Congestion Control
- Anomaly Detection Techniques and Applications
- Metaheuristic Optimization Algorithms Research
- Advanced Text Analysis Techniques
- Neural Networks and Reservoir Computing
- Traffic Prediction and Management Techniques
- AI and Big Data Applications
Beihang University
2023-2025
State Grid Corporation of China (China)
2024
Ministry of Education of the People's Republic of China
2023-2024
Ji Hua Laboratory
2023
Nanjing University of Posts and Telecommunications
2023
Peng Cheng Laboratory
2023
The sparsity of reward feedback remains a challenging problem in online deep reinforcement learning (DRL). Previous approaches have utilized temporal credit assignment (CA) to achieve impressive results in multiple hard tasks. However, many CA methods relied on complex architectures or introduced sensitive hyperparameters to estimate the impact of state-action pairs. Meanwhile, the premise for the feasibility of these methods is to obtain trajectories with sparse rewards, which can be troublesome in sparse-reward environments with large...
Aiming at the problem of integrated task assignment and trajectory planning for a massive number of agents in a scenario with different priority nodes and multiple static obstacles, this paper proposes a general framework based on bilayer-coupled mean field games, which couples the minimum cost of an agent and the process to achieve a reasonable, globally optimal, targeted, and adjustable result. In the proposed framework, firstly, a multi-population game is used to plan the optimal trajectories between each pair of adjacent nodes, and the costs are calculated....
Link flooding attacks (LFAs) have always been a security concern, as the impact of volumetric attacks on transit links is increasingly severe. Capacity expansion, while effective in combating LFAs, involves considerable deployment costs. Therefore, how to efficiently manage link resources among spatio-temporally dynamic customers remains a challenge for Internet service providers (ISPs). In this paper, we study a differential pricing strategy for bandwidth allocation with LFA resilience by leveraging...
Current adaptive traffic signal control methods based on centralized deep reinforcement learning are not applicable in large-scale environments. The scalability problem is overcome by assigning global control to each local RL agent through multi-agent reinforcement learning, but the environment then becomes partially observable and non-stationary from the perspective of each agent due to limited communication between agents. In this paper, we propose a multi-agent framework called Forgetful Priority Weighed Double Deep...
Reinforcement learning (RL) with sparse and deceptive rewards is challenging because non-zero rewards are rarely obtained. Hence, the gradient calculated by the agent can be stochastic and without valid information. Recent studies that utilize memory buffers of previous experiences can lead to a more efficient learning process. However, existing methods often require these experiences to be successful and may overly exploit them, which can cause the agent to adopt suboptimal behaviors. This paper develops an approach that uses diverse past trajectories for faster...
In this paper, we consider the social optimal problem of discrete-time finite-state-space mean field games (referred to as [1]). Unlike the individual optimization of their own cost function in competitive models, in the model we consider, individuals aim to optimize the social cost by finding a fixed point of the distribution to achieve equilibrium in the game. We provide a sufficient condition for the existence and uniqueness of the strategies used to minimize the cost. According to the definition of the social optimum and the derived properties of the cost, conditions for solutions under initial-terminal...
After the rise of education reform in China, more and more schools began to pay attention to the construction of practice bases. As an indispensable auxiliary part of each enterprise's development, finance has naturally received high attention. Especially with the advent of the big data era, the demand for high-quality compound financial talents is growing rapidly. However, the practical skills cultivated in the traditional practice bases built by various universities are relatively weak, and it is difficult to really fill the needs of modern enterprises....
Deep reinforcement learning (DRL) faces significant challenges in addressing hard-exploration problems in tasks with sparse or deceptive rewards and large state spaces. These challenges severely limit the practical application of DRL. Most previous exploration methods relied on complex architectures to estimate state novelty or introduced sensitive hyperparameters, resulting in instability. To mitigate these issues, we propose an efficient adaptive trajectory-constrained exploration strategy for DRL. The proposed method guides the policy...