- Reinforcement Learning in Robotics
- Machine Learning and ELM
- Domain Adaptation and Few-Shot Learning
- Topic Modeling
- Adaptive Dynamic Programming Control
- Sparse and Compressive Sensing Techniques
- Distributed Control Multi-Agent Systems
- Medical Image Segmentation Techniques
- Advanced Neural Network Applications
- Cyclone Separators and Fluid Dynamics
- Advanced Fluorescence Microscopy Techniques
- Multimodal Machine Learning Applications
- Metaheuristic Optimization Algorithms Research
- Photoacoustic and Ultrasonic Imaging
- Text and Document Classification Technologies
- Aerodynamics and Acoustics in Jet Flows
- Explainable Artificial Intelligence (XAI)
- Aerosol Filtration and Electrostatic Precipitation
- Advanced Algorithms and Applications
- Natural Language Processing Techniques
- Model Reduction and Neural Networks
- Optimization and Search Problems
- Grey System Theory Applications
- Artificial Intelligence in Games
- Neural Dynamics and Brain Function
Chinese University of Hong Kong, Shenzhen
2024
East China Normal University
2019-2023
Anhui University
2023
Meizu (China)
2023
Tsinghua University
2018-2022
Center for Information Technology
2022
Beijing Academy of Artificial Intelligence
2022
Tongji University
2019
Pennsylvania State University
2019
Shanghai Key Laboratory of Trustworthy Computing
2019
Existing automatic 3D image segmentation methods usually fail to meet the requirements of clinical use. Many studies have explored an interactive strategy to improve segmentation performance by iteratively incorporating user hints. However, the dynamic process of successive interactions is largely ignored. We here propose to model the dynamic process of iterative interactive image segmentation as a Markov decision process (MDP) and solve it with reinforcement learning (RL). Unfortunately, it is intractable to use single-agent RL for voxel-wise prediction due to the large exploration space. To reduce the exploration space...
Poetry is one of the most beautiful forms of human language art. As a crucial step towards computer creativity, automatic poetry generation has drawn researchers' attention for decades. In recent years, some neural models have made remarkable progress in this task. However, they are all based on maximum likelihood estimation, which only learns common patterns of the corpus and results in a loss-evaluation mismatch. Human experts evaluate poetry in terms of specific criteria, instead of word-level likelihood. To handle...
Large language models (LLMs) have demonstrated remarkable capabilities across various domains, especially in text processing and generative tasks. Recent advancements in the reasoning capabilities of state-of-the-art LLMs, such as OpenAI-o1, have significantly broadened their applicability, particularly in complex problem-solving and logical inference. However, most existing LLMs struggle with notable limitations in handling graph combinatorial optimization (GCO) problems. To bridge this gap, we formally define the Optimal...
Recently, transformer-based methods have been introduced to estimate 3D human pose from multiple views by aggregating the spatial-temporal information of human joints to achieve the lifting of 2D to 3D. However, previous approaches cannot model the inter-frame correspondence of each view's joint individually, nor can they directly consider all view interactions at each time, leading to insufficient learning of multi-view associations. To address this issue, we propose a Spatial-View-Temporal transformer (SVTformer) to decouple...
Multi-scenario & multi-task learning has been widely applied to many recommendation systems in industrial applications, wherein an effective and practical approach is to carry out multi-scenario transfer learning on the basis of the Mixture-of-Expert (MoE) architecture. However, the MoE-based method, which aims to project all information into the same feature space, cannot effectively deal with the complex relationships inherent among various scenarios and tasks, resulting in unsatisfactory performance. To tackle the problem, we propose...
Nonconvex optimization problems have recently attracted significant attention. However, both efficient algorithms and solid theory are still very limited. The difficulty is even more pronounced for structured large-scale problems in many real-world applications. This article proposes an application-driven algorithmic framework with distributed parallel techniques, which jointly handles the high dimensionality of model parameters and training data. The theoretical convergence of our algorithm is established under moderate...
Existing multi-hypothesis (MH) prediction algorithms in compressed video sensing (CVS) are all deployed in the measurement domain, which restricts the flexibility of block partitioning in the reconstruction process and decreases accuracy. To address this issue, this paper proposes a two-stage multi-hypothesis reconstruction (2sMHR) scheme which deploys MH prediction in the measurement domain and the pixel domain successively. Two implementation schemes, a GOP-wise scheme and a frame-wise scheme, are developed for 2sMHR. Furthermore, a new weighted metric combining the Euclidean distance and the correlation coefficient is...
Traditional centralized multi-agent reinforcement learning (MARL) algorithms are sometimes impractical in complicated applications, due to non-interactivity between agents, the curse of dimensionality, and computation complexity. Hence, several decentralized MARL algorithms are motivated. However, existing decentralized methods only handle the fully cooperative setting, where massive information needs to be transmitted in training. The block coordinate gradient descent scheme they used for successive independent actor and critic steps...
In recent years, reinforcement learning has achieved excellent results in low-dimensional static action spaces such as games and simple robotics. However, the action space is usually composite, composed of multiple sub-actions with different functions, and time-varying for practical tasks. The existing sub-actions might be temporarily invalid due to the external environment, while unseen sub-actions can be added to the current system. To solve the robustness and transferability problems of time-varying composite action spaces, we propose a structured...
The difficulty of appropriately assigning credit is particularly heightened in cooperative MARL with sparse reward, due to the concurrent time and structural scales involved. Automatic subgoal generation (ASG) has recently emerged as a viable MARL approach inspired by utilizing subgoals in intrinsically motivated reinforcement learning. However, end-to-end learning of complex task planning from sparse rewards without prior knowledge undoubtedly requires massive training samples. Moreover, diversity-promoting...
An information designer has precise information about consumers' preferences over products sold by oligopolists. The designer chooses what information to reveal to differentiated firms who, then, compete on price by making personalized offers. We ask what market outcomes the designer can achieve. The information designer is a metaphor for an internet platform who collects data about users and sells it to firms who can, in turn, target discounts and promotions towards different consumers. Our analysis provides new benchmarks demonstrating the power that users' data can endow internet platforms with. These...
Social dilemmas can be considered situations where individual rationality leads to collective irrationality. The multi-agent reinforcement learning community has leveraged ideas from social science, such as social value orientations (SVO), to solve social dilemmas in complex cooperative tasks. In this paper, by first introducing the typical "division of labor or roles" mechanism in human society, we provide a promising solution for intertemporal social dilemmas (ISD) with SVOs. A novel framework, called Learning Roles with Emergent SVOs...
This work explores the large-scale multi-agent communication mechanism under a multi-agent reinforcement learning (MARL) setting. We summarize the general categories of topology for communication structures in the MARL literature, which are often manually specified. Then we propose a novel framework termed Learning Structured Communication (LSC) by using a more flexible and efficient communication topology. Our framework allows adaptive agent grouping to form different hierarchical formations over episodes, which is generated by an auxiliary task combined...
A multihypothesis-based residual reconstruction scheme (MHRR) is presented in compressed video sensing (CVS). The first predicted by a novel multihypothesis (MH) prediction method and the second then reconstructed independently. In the proposed MHRR, a method of generating hypothesis blocks in the pixel domain is offered: the hypothesis blocks are obtained by a pixel-domain ME technique and the linear weights are calculated in the measurement domain, which can combine the advantages of pixel-domain and measurement-domain MH prediction. Simulation results show that MHRR can achieve higher performance...
In spite of the success of existing meta reinforcement learning methods, they still have difficulty in learning a policy effectively for RL problems with sparse reward. In this respect, we develop a novel meta reinforcement learning framework called Hyper-Meta RL (HMRL) for sparse reward RL problems. It consists of three modules, including a cross-environment state embedding module, which constructs a common state space to adapt to different environments; a state-based, environment-specific reward shaping module, which extends the original trajectory by cross-environmental knowledge...