- Advanced Neural Network Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Image Processing Techniques
- Multimodal Machine Learning Applications
- Advanced Vision and Imaging
- Advanced Image and Video Retrieval Techniques
- Image and Signal Denoising Methods
- Image Processing Techniques and Applications
- Generative Adversarial Networks and Image Synthesis
- Adversarial Robustness in Machine Learning
- CCD and CMOS Imaging Sensors
- Interconnection Networks and Systems
- Human Pose and Action Recognition
- Video Surveillance and Tracking Methods
- Advanced Computational Techniques and Applications
- Geological Modeling and Analysis
- Software-Defined Networks and 5G
- Optical measurement and interference techniques
- Advanced Memory and Neural Computing
- Embedded Systems Design Techniques
- Visual Attention and Saliency Detection
- Anomaly Detection Techniques and Applications
- Simulation and Modeling Applications
- Crystallography and molecular interactions
- CO2 Reduction Techniques and Catalysts
- University of Technology Sydney (2022-2025)
- University of Electronic Science and Technology of China (2018-2025)
- Peptcell (United Kingdom) (2024)
- Peking University (2023-2024)
- Beijing National Laboratory for Molecular Sciences (2023-2024)
- Beijing Institute of Technology (2022-2023)
- Civil Aviation University of China (2023)
- Australian Regenerative Medicine Institute (2022)
- Monash University (2019-2022)
- Kuaishou (China) (2022)
Object detection in high-resolution aerial images is a challenging task because of 1) the large variation in object size, and 2) the non-uniform distribution of objects. A common solution is to divide the image into small (uniform) crops and then apply detection on each crop. In this paper, we investigate the cropping strategy to address these challenges. Specifically, we propose the Density-Map guided Network (DMNet), which is inspired by the observation that the density map of an image presents how objects distribute in terms of the pixel intensity of the map. As...
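To make the density-guided cropping idea concrete, here is a minimal sketch, not the DMNet implementation. It assumes a density map is already available as a 2D list of non-negative numbers; the function name `density_crops`, the `threshold` parameter, and the use of 4-connected regions are all hypothetical choices made for illustration.

```python
from collections import deque


def density_crops(density, threshold):
    """Return inclusive (top, left, bottom, right) boxes, one per
    4-connected region whose density is at least `threshold`.

    Hypothetical helper: the thresholding and connectivity rules are
    assumptions for illustration, not taken from the abstract.
    """
    rows, cols = len(density), len(density[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or density[r][c] < threshold:
                continue
            # Breadth-first flood fill over one dense region,
            # tracking the extent of the region as we go.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and density[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


if __name__ == "__main__":
    toy_map = [
        [0, 0, 0, 0],
        [0, 5, 6, 0],
        [0, 0, 0, 0],
        [9, 0, 0, 0],
    ]
    print(density_crops(toy_map, threshold=1))
```

Each returned box could then be cut out of the image and passed to a detector, so that crops follow where objects concentrate rather than a uniform grid.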
Current dynamic networks and pruning methods have shown their promising capability in reducing theoretical computation complexity. However, sparse patterns on convolutional filters fail to achieve actual acceleration in real-world implementation, due to the extra burden of indexing, weight-copying, or zero-masking. Here, we explore a network slimming regime, named Dynamic Slimmable Network (DS-Net), which aims to achieve good hardware-efficiency via dynamically adjusting filter numbers at test time with...
Temporal modeling is crucial for video super-resolution. Most of the super-resolution methods adopt optical flow or deformable convolution for explicit motion compensation. However, such temporal techniques increase model complexity and might fail in case of occlusion or complex motion, resulting in serious distortion and artifacts. In this paper, we propose to explore the role of explicit temporal difference in both LR and HR space. Instead of directly feeding consecutive frames into a VSR model, we compute the difference between frames and divide those...
Dynamic networks have shown their promising capability in reducing theoretical computation complexity by adapting their architectures to the input during inference. However, their practical runtime usually lags behind the theoretical acceleration due to inefficient sparsity. In this paper, we explore a hardware-efficient dynamic inference regime, named weight slicing, that can be generalized well on multiple dimensions in both CNNs and transformers (e.g. kernel size, embedding dimension, number of heads, etc.). Instead...
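As a rough illustration of why slicing can be friendlier to hardware than sparse masking, the sketch below applies a dense layer using only a contiguous leading slice of its output units. This is one reading of the term "weight slicing", not the paper's method: the function `sliced_linear` and its `width_ratio` parameter are hypothetical, and how the ratio would be chosen per input is left out entirely.

```python
def sliced_linear(weight, bias, x, width_ratio):
    """Apply a dense layer with only the first fraction of its output units.

    `weight` is a list of rows (one per output unit), `bias` a list of the
    same length, and `x` the input vector. `width_ratio` in (0, 1] is a
    hypothetical knob selecting how many leading units stay active.
    """
    full = len(weight)
    keep = max(1, int(full * width_ratio))
    # Contiguous slices: no gather index, weight copy by index, or zero
    # mask is involved, only a prefix of the stored parameters.
    active_weight, active_bias = weight[:keep], bias[:keep]
    return [
        sum(w * v for w, v in zip(row, x)) + b
        for row, b in zip(active_weight, active_bias)
    ]


if __name__ == "__main__":
    weight = [[1, 0], [0, 1], [1, 1], [2, 2]]
    bias = [0, 0, 1, 1]
    x = [3, 4]
    print(sliced_linear(weight, bias, x, 0.5))  # half-width layer
    print(sliced_linear(weight, bias, x, 1.0))  # full-width layer
```

The same prefix-slicing idea could in principle be applied along other dimensions the abstract lists, such as embedding dimension or number of heads, but that extension is not shown here.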
Continual learning is a challenging real-world problem for constructing a mature AI system when data are provided in a streaming fashion. Despite recent progress in continual classification, research on continual object detection is impeded by the diverse sizes and numbers of objects in each image. Different from previous works that tune the whole network for all tasks, in this work, we present a simple and flexible framework for continual object detection via pRotOtypical taSk corrElaTion guided gaTing mechAnism (ROSETTA). Concretely, a unified framework is shared by all tasks while...
Recently, there has been a surge of interest in visual transformers that reduce the computational cost by limiting the calculation of self-attention to a local window. Most current work uses a fixed single-scale window for modeling by default, ignoring the impact of window size on model performance. However, this may limit the modeling potential of these window-based models for multi-scale information. In this paper, we propose a novel method, named Dynamic Window Vision Transformer (DW-ViT). The dynamic window strategy proposed by DW-ViT goes beyond the model that employs a single...
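The effect of window size on local self-attention can be sketched with plain index arithmetic. The helpers below are hypothetical illustrations, not DW-ViT code: `window_partition` groups the token positions of a grid into non-overlapping square windows, and `attention_pairs` counts the query-key pairs that window-restricted attention would evaluate, assuming every token attends to every token in its own window.

```python
def window_partition(height, width, window):
    """Group (row, col) token positions into non-overlapping square windows."""
    if height % window or width % window:
        raise ValueError("grid size must be divisible by the window size")
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            windows.append([
                (y, x)
                for y in range(top, top + window)
                for x in range(left, left + window)
            ])
    return windows


def attention_pairs(height, width, window):
    """Count query-key pairs when attention is restricted to each window."""
    return sum(len(w) ** 2 for w in window_partition(height, width, window))


if __name__ == "__main__":
    for size in (2, 4, 8):
        print(size, attention_pairs(8, 8, size))
```

Smaller windows evaluate fewer pairs but see less context per token, which is the trade-off that motivates looking at more than one window size.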
In this study, single Ni2 clusters (two Ni atoms bridged by a lattice oxygen) are successfully synthesized on monolayered CuO. They exhibit remarkable activity toward low-temperature CO2 thermal dissociation, in contrast to cationic Ni atoms that nondissociatively adsorb CO2 and metallic Ni ones that are chemically inert for CO2 adsorption. Density functional theory calculations reveal that the Ni2 clusters can significantly alter the spatial symmetry of their unoccupied frontier orbitals to match the occupied counterpart of the CO2 molecule and enable its...
This study aims to explore the performance and challenges of an end-to-end autonomous driving decision model based on a deep convolutional neural network (CNN) in practical applications. Firstly, this paper introduces the basic principles of convolutional neural networks and their application background in autonomous driving. Subsequently, it describes in detail the spatial feature extraction models based on convolutional neural networks, including the PilotNet baseline and transfer learning. Based on this, a model for the longitudinal and lateral control of intelligent vehicles is constructed. The are...
Dynamic inference, which adaptively allocates computational budgets for different samples, is a prevalent approach for achieving efficient action recognition. Current studies primarily focus on a data-efficient regime that reduces spatial or temporal redundancy, or their combination, by selecting partial video data, such as clips, frames, or patches. However, these approaches often utilize fixed and computationally expensive networks. From a different perspective, this article introduces a novel model-efficient...
Recent advances in hand-crafted neural architectures for visual recognition underscore the pressing need to explore architecture designs comprising diverse building blocks. Concurrently, neural architecture search (NAS) methods have gained traction as a means to alleviate human efforts. Nevertheless, the question of whether NAS methods can efficiently and effectively manage diversified search spaces featuring disparate candidates, such as Convolutional Neural Networks (CNNs) and transformers, remains open. In this work, we...
Recent advances in vision Transformers (ViTs) have come with a voracious appetite for computing power, highlighting the urgent need to develop efficient training methods for ViTs. Progressive learning, a training scheme where the model capacity grows progressively during training, has started showing its ability in efficient training. In this paper, we take a practical step towards efficient training of ViTs by customizing and automating progressive learning. First, we develop a strong manual baseline for progressive learning of ViTs, by introducing momentum growth (MoGrow)...
Recently, great success has been made in learning visual representations from text supervision, facilitating the emergence of text-supervised semantic segmentation. However, existing works focus on pixel grouping and cross-modal semantic alignment, while ignoring the correspondence among multiple augmented views of the same image. To overcome such limitation, we propose multi-View Consistent learning (ViewCo) for text-supervised semantic segmentation. Specifically, we first propose text-to-views consistency modeling to learn correspondence for multiple views of the same input image. Additionally,...
Recently, many human-designed and automatically searched neural networks have been applied to image denoising. However, previous works intend to handle all noisy images in a pre-defined static network architecture, which inevitably leads to high computational complexity for good denoising quality. Here, we present a dynamic slimmable denoising network (DDS-Net), a general method to achieve good denoising quality with less computational complexity, via dynamically adjusting the channel configurations of networks at test time with respect to different noisy images. Our...
This paper proposes a novel video inpainting method. We make three main contributions: First, we extend previous Transformers with patch alignment by introducing Deformed Patch-based Homography (DePtH), which improves patch-level feature alignments without additional supervision and benefits challenging scenes with various deformation. Second, we introduce Mask Pruning-based Patch Attention (MPPA) to improve patch-wise feature matching by pruning out less essential features using a saliency map. MPPA enhances...
Recently, semantic segmentation models trained with image-level text supervision have shown promising results in challenging open-world scenarios. However, these models still face difficulties in learning fine-grained semantic alignment at the pixel level and predicting accurate object masks. To address this issue, we propose MixReorg, a novel and straightforward pre-training paradigm for semantic segmentation that enhances a model's ability to reorganize patches mixed across images, exploring both local visual relevance and global...
The rapid advancements in Large Vision Models (LVMs), such as Vision Transformers (ViTs) and diffusion models, have led to an increasing demand for computational resources, resulting in substantial financial and environmental costs. This growing challenge highlights the necessity of developing efficient training methods for LVMs. Progressive learning, a training strategy in which model capacity gradually increases during training, has shown potential in addressing these challenges. In this paper, we present an advanced...
Image inpainting aims to inpaint missing pixels of an image naturally and realistically. Previous deep learning approaches typically require specific design for different types of masks and cannot generalize well to multiple scenarios simultaneously. Thus, on top of the most common stroke-type mask approaches, we propose in this paper a unified framework to handle multiple types of masks simultaneously (e.g. strokes, object shapes, extrapolation, dense periodic grids, etc.). We address the problem by proposing a progressive scheme Semantic...
Software-Defined Networking (SDN) brings new opportunities to improve the network performance of Wide Area Networks (WANs). To enhance the control plane's processing ability, a Software-Defined Wide Area Network (SD-WAN) usually employs multiple SDN controllers to control a large-scale network. The controllers keep a consistent network state with each other via controller synchronization. A synchronization involves all controllers, and the controllers cannot operate until the synchronization is done. Therefore, existing synchronization schemes could affect network performance and lead to high resource consumption, thus increasing...