- Visual Attention and Saliency Detection
- Advanced Image and Video Retrieval Techniques
- Colorectal Cancer Screening and Detection
- Image Enhancement Techniques
- Radiomics and Machine Learning in Medical Imaging
- COVID-19 diagnosis using AI
- Multimodal Machine Learning Applications
- Advanced Neural Network Applications
- Olfactory and Sensory Function Studies
- Image and Video Quality Assessment
- Domain Adaptation and Few-Shot Learning
- Adversarial Robustness in Machine Learning
- Image Retrieval and Classification Techniques
- Handwritten Text Recognition Techniques
- Face Recognition and Perception
- AI in cancer detection
- Generative Adversarial Networks and Image Synthesis
- Digital Media Forensic Detection
- Video Surveillance and Tracking Methods
- Mycobacterium research and diagnosis
- Advanced Image Processing Techniques
- Anomaly Detection Techniques and Applications
- Vehicle License Plate Recognition
- Speech and dialogue systems
- Gaze Tracking and Assistive Technology
Australian National University
2022-2024
Wuhan University
2020-2023
Alibaba Group (China)
2023
Inception Institute of Artificial Intelligence
2021
University of Hong Kong
2021
Hong Kong University of Science and Technology
2021
Coronavirus Disease 2019 (COVID-19) spread globally in early 2020, causing the world to face an existential health crisis. Automated detection of lung infections from computed tomography (CT) images offers great potential to augment the traditional healthcare strategy for tackling COVID-19. However, segmenting infected regions from CT slices faces several challenges, including high variation in infection characteristics and low intensity contrast between infections and normal tissues. Further, collecting a large amount...
We present a comprehensive study on a new task named camouflaged object detection (COD), which aims to identify objects that are "seamlessly" embedded in their surroundings. The high intrinsic similarities between the target object and the background make COD far more challenging than the traditional object detection task. To address this issue, we elaborately collect a novel dataset, called COD10K, which comprises 10,000 images covering camouflaged objects in various natural scenes, over 78 object categories. All images are densely annotated with category, bounding-box,...
We present the first systematic study on concealed object detection (COD), which aims to identify objects that are visually embedded in their background. The high intrinsic similarities between the concealed objects and their background make COD far more challenging than traditional object detection/segmentation. To better understand this task, we collect a large-scale dataset, called COD10K, which consists of 10,000 images covering concealed objects in diverse real-world scenarios from 78 object categories. Further, we provide rich annotations including...
Camouflaged object segmentation (COS) aims to identify objects that are "perfectly" assimilated into their surroundings, which has a wide range of valuable applications. The key challenge of COS is that there exist high intrinsic similarities between the candidate objects and the noise background. In this paper, we strive to embrace challenges towards effective and efficient COS. To this end, we develop a bio-inspired framework, termed Positioning and Focus Network (PFNet), which mimics the process of predation in nature. Specifically, our PFNet...
This paper proposes a novel joint learning and densely-cooperative fusion (JL-DCF) architecture for RGB-D salient object detection. Existing models usually treat RGB and depth as independent information and design separate networks for feature extraction from each. Such schemes can easily be constrained by a limited amount of training data or over-reliance on an elaborately-designed training process. In contrast, our JL-DCF learns from both RGB and depth inputs through a Siamese network. To this end, we propose two effective...
Existing RGB-D salient object detection (SOD) models usually treat RGB and depth as independent information and design separate networks for feature extraction from each. Such schemes can easily be constrained by a limited amount of training data or over-reliance on an elaborately designed training process. Inspired by the observation that RGB and depth modalities actually present certain commonality in distinguishing salient objects, a novel joint learning and densely cooperative fusion (JL-DCF) architecture is designed to learn from both RGB and depth inputs...
Camouflaged object detection (COD) aims to identify the objects that conceal themselves in natural scenes. Accurate COD suffers from a number of challenges associated with low boundary contrast and the large variation of object appearances, e.g., object size and shape. To address these challenges, we propose a novel Context-aware Cross-level Fusion Network ($\text{C}^{2}\text{F}$...
This paper introduces deep gradient network (DGNet), a novel deep framework that exploits object gradient supervision for camouflaged object detection (COD). It decouples the task into two connected branches, i.e., a context encoder and a texture encoder. The essential connection is the gradient-induced transition, representing a soft grouping between context and texture features. Benefiting from the simple but efficient framework, DGNet outperforms existing state-of-the-art COD models by a large margin. Notably, our efficient version, DGNet-S, runs in...
In this article, we conduct a comprehensive study on the co-salient object detection (CoSOD) problem for images. CoSOD is an emerging and rapidly growing extension of salient object detection (SOD), which aims to detect the co-occurring salient objects in a group of images. However, existing CoSOD datasets often have a serious data bias, assuming that each group of images contains salient objects of similar visual appearances. This bias can lead to idealized settings, and the effectiveness of models trained on existing datasets can be impaired in real-life situations, where similarities are usually...
We present the first comprehensive video polyp segmentation (VPS) study in the deep learning era. Over the years, developments in VPS have not moved forward with ease due to the lack of a large-scale dataset with fine-grained segmentation annotations. To address this issue, we first introduce a high-quality frame-by-frame annotated VPS dataset, named SUN-SEG, which contains 158,690 colonoscopy video frames from the well-known SUN-database. We provide additional annotation covering diverse types, i.e., attribute, object mask, boundary,...
Concealed scene understanding (CSU) is a hot computer vision topic aiming to perceive objects exhibiting camouflage. The current boom in terms of techniques and applications warrants an up-to-date survey. This can help researchers better understand the global CSU field, including both current achievements and remaining challenges. This paper makes four contributions: (1) For the first time, we present a comprehensive survey of deep learning techniques aimed at CSU, including a taxonomy, task-specific challenges, and ongoing developments. (2)...
Appearance and motion are two important sources of information in video object segmentation (VOS). Previous methods mainly focus on using simplex solutions, lowering the upper bound of feature collaboration among and across these two cues. In this paper, we study a novel framework, termed FSNet (Full-duplex Strategy Network), which designs a relational cross-attention module (RCAM) to achieve bidirectional message propagation across embedding subspaces. Furthermore, the bidirectional purification module (BPM) is introduced to update...
RGB-D salient object detection (SOD) has recently attracted increasing research interest by benefiting conventional RGB SOD with extra depth information. However, existing RGB-D SOD models often fail to perform well in terms of both efficiency and accuracy, which hinders their potential applications on mobile devices and real-world problems. An underlying challenge is that the model accuracy usually degrades when the model is simplified to have few parameters. To tackle this dilemma, and also inspired by the fact that depth quality is a key...
Coronavirus Disease 2019 (COVID-19) spread globally in early 2020, causing the world to face an existential health crisis. Automated detection of lung infections from computed tomography (CT) images offers great potential to augment the traditional healthcare strategy for tackling COVID-19. However, segmenting infected regions from CT slices faces several challenges, including high variation in infection characteristics and low intensity contrast between infections and normal tissues. Further, collecting...
Co-salient object detection (CoSOD) is a newly emerging and rapidly growing branch of salient object detection (SOD), which aims to detect the co-occurring salient objects in multiple images. However, existing CoSOD datasets often have a serious data bias, which assumes that each group of images contains salient objects of similar visual appearances. This bias results in idealized settings, and the effectiveness of the models trained on existing datasets may be impaired in real-life situations, where the similarity is usually semantic or conceptual. To tackle this issue, we first...
Google's Bard has emerged as a formidable competitor to OpenAI's ChatGPT in the field of conversational AI. Notably, Bard has recently been updated to handle visual inputs alongside text prompts during conversations. Given Bard's impressive track record in handling textual inputs, we explore its capabilities in understanding and interpreting visual data (images) conditioned by text questions. This exploration holds the potential to unveil new insights and challenges for Bard and other forthcoming multi-modal Generative models, especially...
We present a masked vision-language transformer (MVLT) for fashion-specific multi-modal representation. Technically, we simply utilize the vision transformer architecture to replace the bidirectional encoder representations from Transformers (BERT) in the pre-training model, making MVLT the first end-to-end framework for the fashion domain. Besides, we designed masked image reconstruction (MIR) for a fine-grained understanding of fashion. MVLT is an extensible and convenient architecture that admits raw multi-modal inputs without extra pre-processing models...
The advent of large vision-language models (LVLMs) represents a remarkable advance in the quest for artificial general intelligence. However, the models' effectiveness in both specialized and general tasks warrants further investigation. This paper endeavors to evaluate the competency of popular LVLMs in specialized and general tasks, respectively, aiming to offer a comprehensive understanding of these novel models. To gauge their effectiveness in specialized tasks, we employ six challenging tasks in three different application scenarios: natural, healthcare, and industrial. These...
Camouflaged Object Detection (COD) aims to detect objects with similar patterns (e.g., texture, intensity, colour, etc.) to their surroundings, and has recently attracted growing research interest. As camouflaged objects often present very ambiguous boundaries, how to determine object locations as well as their weak boundaries is challenging and also the key to this task. Inspired by the biological visual perception process when a human observer discovers camouflaged objects, this paper proposes a novel edge-based reversible re-calibration...