- Reinforcement Learning in Robotics
- Cloud Computing and Resource Management
- IoT and Edge/Fog Computing
- Distributed and Parallel Computing Systems
- Mobile Crowdsensing and Crowdsourcing
- Distributed Control Multi-Agent Systems
- Adversarial Robustness in Machine Learning
- Ethics and Social Impacts of AI
- Image and Video Quality Assessment
- Advanced Neural Network Applications
- Model-Driven Software Engineering Techniques
- Hydrological Forecasting Using AI
- Blockchain Technology Applications and Security
- Scheduling and Optimization Algorithms
- Cognitive Science and Mapping
- Topic Modeling
- Explainable Artificial Intelligence (XAI)
- Adaptive Dynamic Programming Control
- Semantic Web and Ontologies
- Multi-Criteria Decision Making
- Neural Networks and Reservoir Computing
- Evaluation Methods in Various Fields
- Retinal Imaging and Analysis
- Advanced Graph Neural Networks
- Domain Adaptation and Few-Shot Learning
East China Normal University
2021-2023
Shanghai Maritime University
2023
Large language models (LLMs) have demonstrated remarkable capabilities across various domains, especially in text processing and generative tasks. Recent advancements in the reasoning of state-of-the-art LLMs, such as OpenAI-o1, have significantly broadened their applicability, particularly in complex problem-solving and logical inference. However, most existing LLMs struggle with notable limitations in handling graph combinatorial optimization (GCO) problems. To bridge this gap, we formally define Optimal...
The Satisfiability (SAT) problem is a core challenge with significant applications in software engineering, including automated testing, configuration management, and program verification. This paper presents SolSearch, a novel framework that harnesses large language models (LLMs) to discover and optimize SAT-solving strategies automatically. Leveraging a curriculum-based, trial-and-error process, SolSearch enables the LLM to iteratively modify and generate SAT solver code, thereby improving solving...
This work explores the large-scale multi-agent communication mechanism under a multi-agent reinforcement learning (MARL) setting. We summarize the general categories of topology for communication structures in the MARL literature, which are often manually specified. Then we propose a novel framework termed Learning Structured Communication (LSC) by using a more flexible and efficient communication topology. Our framework allows adaptive agent grouping to form different hierarchical formations over episodes, which is generated by an auxiliary task combined...
Oversubscription is a common practice for improving cloud resource utilization. It allows the cloud service provider to sell more resources than the physical limit, assuming not all users would fully utilize the resources simultaneously. However, how to design an oversubscription policy that improves utilization while satisfying some safety constraints remains an open problem. Existing methods and industrial practices are over-conservative, ignoring the coordination of diverse resource usage patterns and probabilistic constraints. To...
Oversubscription is a prevalent practice in cloud services where the system offers more virtual resources, such as virtual cores in virtual machines, to users or applications than its available physical capacity, in order to reduce revenue loss due to unused/redundant capacity. While oversubscription can potentially lead to a significant enhancement in efficient resource utilization, the caveat is that it comes with the risks of overloading and introducing jitter at the level of physical nodes if all the co-located virtual machines have high utilization. Thus...
A novel simulator called VMAgent is introduced to help RL researchers better explore new methods, especially for virtual machine scheduling. VMAgent is inspired by practical virtual machine (VM) scheduling tasks and provides an efficient simulation platform that can reflect the real situations of cloud computing. Three scenarios (fading, recovering, and expansion) are concluded from practical cloud computing and correspond to many reinforcement learning challenges (high dimensional state and action spaces, high non-stationarity, and life-long demand)....
Fairness has been taken as a critical metric in machine learning models and is considered an important component of trustworthy machine learning. In this paper, we focus on obtaining fairness for popular link prediction tasks, which is measured by dyadic fairness. A novel pre-processing methodology is proposed to establish dyadic fairness through data repairing based on optimal transport theory. With the well-established theoretical connection between dyadic fairness for graph link prediction and a conditional distribution alignment problem, the repairing scheme can be...
Non-stationarity is one thorny issue in cooperative multi-agent reinforcement learning (MARL). One of the reasons is the policy changes of agents during the learning process. Some existing works have discussed various consequences caused by non-stationarity with several kinds of measurement indicators. This makes the objectives or goals of existing algorithms inevitably inconsistent and disparate. In this paper, we introduce a novel notion, the $\delta$-measurement, to explicitly measure the non-stationarity of a policy sequence, which can be further proved to be bounded...
Virtual machine (VM) scheduling is one of the critical tasks in cloud computing. Many works have attempted to incorporate machine learning, especially reinforcement learning, to empower VM scheduling procedures. Although improved results are shown in several demo simulators, the performances in real-world scenarios are still underexploited. In this paper, we design a practical cloud simulation platform, i.e., VMAgent, to assist researchers in developing their methods on the VM scheduling problem. VMAgent consists of three components: simulator, scheduler, and visualizer. The...
When solving a complex task, humans will spontaneously form teams to complete different parts of the whole task, respectively. Meanwhile, the cooperation between teammates will improve efficiency. However, for current cooperative MARL methods, the team is constructed through either heuristics or end-to-end blackbox optimization. In order to improve the efficiency of cooperation and exploration, we propose a structured diversification emergence MARL framework named Rochico based on reinforced organization control and hierarchical consensus...
Oil detection technology improves the reliability of machinery or equipment. The physical and chemical indicators of the fluid can reflect the cause of failure in various aspects, which can prevent major accidents to the greatest extent by setting up a fault tree. Owing to the lack of data, it is difficult to accurately obtain the basic event probabilities, which makes it difficult to diagnose faults. The expert evaluation method and aggregated fuzzy numbers are used to obtain the exact probability, where the probability is evaluated as the subjective will of the expert. To improve...
Over-generalization is a thorny issue in cognitive science, where people may become overly cautious due to past experiences. Agents in multi-agent reinforcement learning (MARL) have also been found to suffer relative over-generalization (RO) as people do and get stuck in sub-optimal cooperation. Recent methods have shown that assigning reasoning ability to agents can mitigate RO algorithmically and empirically, but there has been a lack of theoretical understanding of RO, let alone designing provably RO-free methods. This paper...
Minimizing the impact of clean energy volatility and maximizing the benefits to owners are hot issues to be solved in current smart systems. This paper introduces a risk factor based on master-slave game theory, establishes an optimal scheduling model of the system with the system as the leader and end-users as followers, and uses the conditional value-at-risk theory in economics to quantitatively analyze the cost brought by the uncertainty of output before the day. Using this factor, a two-tier dispatching model for smart systems is established with the introduction of the model....
The formidable capacity for zero- or few-shot decision-making in language agents encourages us to pose a compelling question: Can language agents be alternatives to PPO agents in traditional sequential decision-making tasks? To investigate this, we first take environments collected in OpenAI Gym as our testbeds and ground them to textual environments that construct the TextGym simulator. This allows for straightforward and efficient comparisons between PPO agents and language agents, given the widespread adoption of OpenAI Gym. To ensure a fair and effective benchmarking, we introduce 5 levels of scenario...
The complexity and randomness of power load data mean that the prediction accuracy of a single forecasting model cannot meet the requirements of the current power grid. In this paper, aiming at the inherent nonlinear characteristics of residential load data, an accurate load forecasting method based on an ARIMA-BPNN combined model is proposed. Firstly, by treating the data sequence formed by resident load over time as a random sequence, the ARIMA model is used to approximate the sequence; secondly, since the ARIMA time series model has the defect of "neglecting nonlinear characteristics", a BP neural network is introduced for law mining to establish...
With the rapid development of cloud computing, virtual machine scheduling has become one of the most important but challenging issues for the cloud computing community, especially for practical heterogeneous request sequences. By analyzing the impact of request heterogeneity on some popular heuristic schedulers, it can be found that existing scheduling algorithms cannot handle request heterogeneity properly and efficiently. In this paper, a plug-and-play virtual machine scheduling intensifier, called Resource Assigner (ReAssigner), is proposed to enhance the scheduling efficiency of any given scheduler...