- AI in cancer detection
- Advanced Neural Network Applications
- Cell Image Analysis Techniques
- Advanced Image and Video Retrieval Techniques
- Digital Imaging for Blood Diseases
- Image Retrieval and Classification Techniques
- Domain Adaptation and Few-Shot Learning
- Radiomics and Machine Learning in Medical Imaging
- Medical Image Segmentation Techniques
- Image Processing Techniques and Applications
- Multimodal Machine Learning Applications
- Generative Adversarial Networks and Image Synthesis
- COVID-19 diagnosis using AI
- Advanced Neuroimaging Techniques and Applications
- Topic Modeling
- Anomaly Detection Techniques and Applications
- Adversarial Robustness in Machine Learning
- Brain Tumor Detection and Classification
- Video Surveillance and Tracking Methods
- Advanced Graph Neural Networks
- Human Pose and Action Recognition
- Advanced Image Processing Techniques
- Natural Language Processing Techniques
- Video Analysis and Summarization
- Face recognition and analysis
UNSW Sydney
2018-2025
Hebei Agricultural University
2025
Northeast Forestry University
2025
Google (United States)
2010-2024
Siemens (China)
2023-2024
Tongji University
2024
Anhui Water Conservancy and Hydropower Survey and Design Institute
2022-2024
First Affiliated Hospital of Anhui Medical University
2024
Anhui Medical University
2024
Shanghai Changzheng Hospital
2024
The goal of this paper is to serve as a guide for selecting a detection architecture that achieves the right speed/memory/accuracy balance for a given application and platform. To this end, we investigate various ways to trade accuracy for speed and memory usage in modern convolutional object detection systems. A number of successful systems have been proposed in recent years, but apples-to-apples comparisons are difficult due to different base feature extractors (e.g., VGG, Residual Networks), different default image resolutions, as well as different hardware...
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure...
Learning fine-grained image similarity is a challenging task. It needs to capture between-class and within-class image differences. This paper proposes a deep ranking model that employs deep learning techniques to learn a similarity metric directly from images. It has higher learning capability than models based on hand-crafted features. A novel multiscale network structure has been developed to describe the images effectively. An efficient triplet sampling algorithm is also proposed to learn the model with distributed asynchronized stochastic gradient....
Existing image classification datasets used in computer vision tend to have a uniform distribution of images across object categories. In contrast, the natural world is heavily imbalanced, as some species are more abundant and easier to photograph than others. To encourage further progress in challenging real-world conditions, we present the iNaturalist species classification and detection dataset, consisting of 859,000 images from over 5,000 different species of plants and animals. It features visually similar species, captured in a wide variety of situations, from all...
"If I provide you a face image of mine (without telling you the actual age when I took the picture) and a large amount of face images that I crawled (containing labeled faces of different ages but not necessarily paired), can you show me what I would look like when I am 80 or what I was like when I was 5?" The answer is probably "No." Most existing face aging works attempt to learn the transformation between age groups and thus require paired samples as well as a labeled query image. In this paper, we look at the problem from a generative modeling perspective such that no paired samples are required. In addition, given an...
In this paper we address the issue of output instability of deep neural networks: small perturbations in the visual input can significantly distort the feature embeddings and output of a neural network. Such instability affects many deep architectures with state-of-the-art performance on a wide range of computer vision tasks. We present a general stability training method to stabilize deep networks against small input distortions that result from various types of common image processing, such as compression, rescaling, and cropping. We validate our method by stabilizing the state...
Transferring the knowledge learned from large scale datasets (e.g., ImageNet) via fine-tuning offers an effective solution for domain-specific fine-grained visual categorization (FGVC) tasks (e.g., recognizing bird species or car make & model). In such scenarios, data annotation often calls for specialized domain knowledge and thus is difficult to scale. In this work, we first tackle a problem in large scale FGVC. Our method won first place in the iNaturalist 2017 large scale species classification challenge. Central to the success of our approach is a training scheme...
The accurate identification of malignant lung nodules on chest CT is critical for the early detection of lung cancer, which also offers patients the best chance of cure. Deep learning methods have recently been successfully introduced to computer vision problems, although substantial challenges remain in the detection of malignant nodules due to the lack of large training data sets. In this paper, we propose a multi-view knowledge-based collaborative (MV-KBC) deep model to separate malignant from benign nodules using limited chest CT data. Our model learns 3-D lung nodule characteristics by...
Discrete point cloud objects lack sufficient shape descriptors of 3D geometries. In this paper, we present a novel method for aggregating hypothetical curves in point clouds. Sequences of connected points (curves) are initially grouped by taking guided walks in the point clouds, and then subsequently aggregated back to augment their pointwise features. We provide an effective implementation of the proposed aggregation strategy, including a novel curve grouping operator followed by a curve aggregation operator. Our method was benchmarked on several...
Given an arbitrary face image and an arbitrary speech clip, the proposed work attempts to generate a talking face video with accurate lip synchronization. Existing works either do not consider temporal dependency across video frames, thus yielding abrupt facial and lip movement, or are limited to the generation of talking face video for a specific person, thus lacking generalization capacity. We propose a novel conditional recurrent generation network that incorporates both image and audio features in the recurrent unit for temporal dependency. To achieve both image- and video-realism, a pair of spatial-temporal...
Guided image synthesis enables everyday users to create and edit photo-realistic images with minimum effort. The key challenge is balancing faithfulness to the user input (e.g., hand-drawn colored strokes) and realism of the synthesized image. Existing GAN-based methods attempt to achieve such balance using either conditional GANs or GAN inversions, which are challenging and often require additional training data or loss functions for individual applications. To address these issues, we introduce a new image synthesis and editing...
Background and Hypothesis: Neuroimaging studies investigating the neural substrates of auditory verbal hallucinations (AVH) in schizophrenia have yielded mixed results, which may be reconciled by network localization. We sought to examine whether AVH-state and AVH-trait brain alterations in schizophrenia localize to common or distinct networks. Study Design: We initially identified AVH-state and AVH-trait brain alterations in schizophrenia reported in 48 previous studies. By integrating these affected brain locations with large-scale discovery and validation resting-state...
In this paper, we propose a new classification method for five categories of lung tissues in high-resolution computed tomography (HRCT) images, with feature-based image patch approximation. We design two new feature descriptors for higher feature descriptiveness, namely the rotation-invariant Gabor-local binary patterns (RGLBP) texture descriptor and the multi-coordinate histogram of oriented gradients (MCHOG) gradient descriptor. Together with intensity features, each image patch is then labeled based on its feature approximation from...
Human actions capture a wide variety of interactions between people and objects. As a result, the set of possible actions is extremely large and it is difficult to obtain sufficient training examples for all actions. However, we could compensate for this sparsity in supervision by leveraging the rich semantic relationship between different actions. A single action is often composed of other smaller actions and is exclusive of certain others. We need a method which can reason about such relationships and extrapolate unobserved actions from known actions. Hence, we propose a novel neural...
Accurate and reliable segmentation of the prostate gland using magnetic resonance (MR) imaging has critical importance for the diagnosis and treatment of prostate diseases, especially prostate cancer. Although many automated segmentation approaches, including those based on deep learning, have been proposed, the segmentation performance still has room for improvement due to the large variability in image appearance, imaging interference, and anisotropic spatial resolution. In this paper, we propose the 3D adversarial pyramid anisotropic convolutional deep neural network (3D APA-Net) for prostate segmentation in MR images....
Most existing recommender systems leverage user behavior data of one type only, such as the purchase behavior in E-commerce that is directly related to the business Key Performance Indicator (KPI) of conversion rate. Besides the key behavioral data, we argue that other forms of user behaviors also provide valuable signal, such as views, clicks, adding a product to shopping carts and so on. They should be taken into account properly to improve the quality of recommendation for users. In this work, we contribute a new solution named NMTR, short for Neural Multi-Task...