Wenming Tan

ORCID: 0000-0003-1338-4536
Research Areas
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Advanced Image and Video Retrieval Techniques
  • Video Surveillance and Tracking Methods
  • Multimodal Machine Learning Applications
  • Human Pose and Action Recognition
  • Anomaly Detection Techniques and Applications
  • Image Retrieval and Classification Techniques
  • Machine Learning and ELM
  • Robotics and Sensor-Based Localization
  • Handwritten Text Recognition Techniques
  • AI in cancer detection
  • Radiomics and Machine Learning in Medical Imaging
  • Remote Sensing and LiDAR Applications
  • Sparse and Compressive Sensing Techniques
  • Advanced Image Processing Techniques
  • Medical Imaging Techniques and Applications
  • Image Enhancement Techniques
  • 3D Surveying and Cultural Heritage
  • Parallel Computing and Optimization Techniques
  • Infrared Thermography in Medicine
  • Currency Recognition and Detection
  • Analytical Methods in Pharmaceuticals
  • Natural Language Processing Techniques
  • Flavonoids in Medical Research

Jiangmen Central Hospital
2009-2024

BeiGene (China)
2023

Hikvision (China)
2020-2022

Zhejiang University
2022

InferVision (China)
2021

University of Science and Technology of China
2021

Sichuan Center for Disease Control and Prevention
2020

Vancouver Coastal Health
2016

Radiation Oncology Institute
2008

Deutsches Institut für Normung
2005

Current methods of multi-person pose estimation typically treat the localization and association of body joints separately. In this paper, we propose the first fully end-to-end multi-person Pose Estimation framework with TRansformers, termed PETR. Our method views pose estimation as a hierarchical set prediction problem and effectively removes the need for many hand-crafted modules like RoI cropping, NMS and grouping post-processing. In PETR, multiple queries are learned to directly reason about full-body poses. Then a joint decoder is utilized...

10.1109/cvpr52688.2022.01079 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Fast and precise object detection for high-resolution aerial images has been a challenging task over the years. Due to the sharp variations in scale, rotation, and aspect ratio, most existing methods are inefficient and imprecise. In this paper, we propose a different approach, the polar method. We locate an object by its centre-point, direct it by four angles, and measure it by a ratio system. Our coordinate-based method, PolarDet, is a faster, simpler, and more accurate one-stage detector. Also, our detector introduces a sub-pixel centre...

10.1080/01431161.2021.1931535 article EN cc-by-nc-nd International Journal of Remote Sensing 2021-06-23

Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, Shiliang Pu, Fei Wu. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.596 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Multi-person pose estimation is an attractive and challenging task. Existing methods are mostly based on two-stage frameworks, which include top-down and bottom-up methods. Two-stage methods either suffer from high computational redundancy for additional person detectors or need to group keypoints heuristically after predicting all the instance-agnostic keypoints. The single-stage paradigm aims to simplify the multi-person pose estimation pipeline and receives a lot of attention. However, recent single-stage methods have the limitation of low...

10.1145/3474085.3475447 article EN Proceedings of the 30th ACM International Conference on Multimedia 2021-10-17

Camera and LIDAR are both important sensor modalities for real-world applications, especially autonomous driving. The sensors provide complementary information and make fusion possible. However, the progress of early-fusion is very slow due to the limitations of viewpoint misalignment, feature misalignment and data volume alignment, so that its performance is also low. In this work, we propose a novel early-fusion pipeline: an early-fusion method of range image and RGB image to enhance 3D object detection. It takes full advantage of LIDAR's view, point...

10.1109/jsen.2021.3127626 article EN IEEE Sensors Journal 2021-11-11

This paper presents an end-to-end instance segmentation framework, termed SOIT, that Segments Objects with Instance-aware Transformers. Inspired by DETR, our method views instance segmentation as a direct set prediction problem and effectively removes the need for many hand-crafted components like RoI cropping, one-to-many label assignment, and non-maximum suppression (NMS). In SOIT, multiple queries are learned to directly reason a set of object embeddings of semantic category, bounding-box location, and pixel-wise mask in parallel...

10.1609/aaai.v36i3.20227 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

3D vehicle detection based on multi-modal fusion is an important task of many applications such as autonomous driving. Although significant progress has been made, we still observe two aspects that call for further improvement: First, what extra information can be obtained from the images to complement the point clouds in these tasks is seldom explored by previous works. Second, most fusion modules can only be used in their designed network, lacking universality. In this work, we propose PointAttentionFusion and...

10.1109/itsc55140.2022.9922104 article EN 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC) 2022-10-08

It is still challenging to detect and locate anomalies by models trained only with normal samples. Methods using image reconstruction as a pretext task can provide precise localization but suffer from harnessing the capability on unseen anomalies. This paper proposes a new framework of Multi-Task Hard example Mining (MTHM) for anomaly detection and localization. The self-supervised multi-task setting creatively takes advantage of the competition among different tasks to learn more compact and efficient...

10.1109/tim.2023.3276529 article EN IEEE Transactions on Instrumentation and Measurement 2023-01-01

Current 6D pose estimation methods focus on handling objects that are previously trained, which limits their applications in the real dynamic world. To this end, we propose a geometry correspondence-based framework, termed GCPose, to estimate the 6D pose of arbitrary unseen objects without any re-training. Specifically, the proposed method draws the idea from point cloud registration and resorts to object-agnostic features to establish the 3D-3D correspondences between the object-scene point cloud and the object-model point cloud. Then the pose parameters are solved by...

10.1109/iccv51070.2023.01291 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Most existing knowledge distillation methods for semantic segmentation focus on extracting various sophisticated knowledge from raw features. However, such knowledge is usually manually designed and relies on prior knowledge, as in traditional feature engineering. In this paper, we aim to propose a simple and effective feature distillation method using raw features. To this end, we revisit the pioneering work in feature distillation, FitNets, which simply minimizes the mean squared error (MSE) loss between the teacher and student features. Our experiments show that this naive method yields good results, even...

10.1109/wacv57701.2024.00119 article EN 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024-01-03
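The FitNets-style objective mentioned in the abstract above, an MSE loss between teacher and student features, can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: it assumes both feature maps have already been flattened to the same length (any projection layer needed to match shapes is omitted), and the name `mse_feature_loss` and the sample values are hypothetical.

```python
from typing import Sequence


def mse_feature_loss(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared error between two flattened feature maps.

    Assumption: both inputs are already the same length; a real
    distillation setup may need a projection layer to match shapes.
    """
    if len(student) != len(teacher):
        raise ValueError("feature maps must have the same number of elements")
    if len(student) == 0:
        raise ValueError("feature maps must be non-empty")
    squared_errors = ((s - t) ** 2 for s, t in zip(student, teacher))
    return sum(squared_errors) / len(student)


# Hypothetical flattened features: only the last element differs (by 2.0).
student_feat = [0.0, 1.0, 2.0, 3.0]
teacher_feat = [0.0, 1.0, 2.0, 5.0]
loss = mse_feature_loss(student_feat, teacher_feat)
print(loss)  # (0 + 0 + 0 + 4) / 4 = 1.0
```

In training, this value would be minimized with respect to the student's parameters while the teacher stays fixed.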

Performance of current point cloud-based outdoor 3D object detection relies heavily on large-scale high-quality annotations. However, such annotations are usually expensive to collect, and outdoor scenes easily accumulate massive unlabeled data containing rich scenes. Semi-supervised learning is an effective alternative to utilize both labeled and unlabeled data, but remains unexplored in outdoor 3D object detection. Inspired by indoor semi-supervised 3D object detection methods, SESS and 3DIoUMatch, we propose ATF-3D, a semi-supervised framework for outdoor 3D object detection. Specifically, we design...

10.1109/lra.2022.3187496 article EN IEEE Robotics and Automation Letters 2022-06-30

The benign and malignant (BM) classification of breast masses based on mammography is a key step in the diagnosis of early breast cancer and an effective way to improve the survival rate of patients. Nevertheless, due to the differences in size, shape and texture and the visual similarity between the same category, it is difficult to obtain a robust classification model using conventional deep learning methods. To address this problem, we proposed a Multi-Tasking U-shaped Network (MT-UNet), which contains three ideas: 1) the architecture constructed can well adapt...

10.1109/access.2020.3042889 article EN cc-by IEEE Access 2020-01-01

Pseudo bounding boxes from the self-training paradigm are inevitably noisy for semi-supervised object detection. To cope with that, a dual decoupling training framework is proposed in the present study, i.e. clean and noisy data decoupling, and classification and localization task decoupling. In the first decoupling, two-level thresholds are used to categorize pseudo boxes into three groups, i.e. clean backgrounds, noisy foregrounds and clean foregrounds. With a specially designed noise-bypass head focusing on noisy data, the backbone networks can extract coarse but...

10.1609/aaai.v36i3.20264 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28
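The two-level thresholding described in the abstract above can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the abstract does not state which quantity is thresholded, so a per-box confidence score is assumed, and the threshold values, the group names, and the function `split_pseudo_boxes` are hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Sequence


def split_pseudo_boxes(
    scores: Sequence[float], low: float = 0.3, high: float = 0.7
) -> Dict[str, List[int]]:
    """Split pseudo boxes into three groups using two thresholds.

    Assumptions (not from the paper): each box has a confidence score in
    [0, 1]; scores below `low` count as background, scores at or above
    `high` as clean foreground, and the rest as noisy foreground.
    Returns the indices of the boxes in each group.
    """
    if not 0.0 <= low < high <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low < high <= 1")
    groups: Dict[str, List[int]] = {
        "background": [],
        "noisy_foreground": [],
        "clean_foreground": [],
    }
    for index, score in enumerate(scores):
        if score < low:
            groups["background"].append(index)
        elif score < high:
            groups["noisy_foreground"].append(index)
        else:
            groups["clean_foreground"].append(index)
    return groups


# Hypothetical confidence scores for five pseudo boxes.
groups = split_pseudo_boxes([0.05, 0.45, 0.92, 0.30, 0.70])
print(groups)
# {'background': [0], 'noisy_foreground': [1, 3], 'clean_foreground': [2, 4]}
```

A training loop could then route the middle group to a separate head, in the spirit of the noise-bypass head the abstract mentions, though that routing is not shown here.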

Post-training quantization (PTQ) is an effective compression method to reduce the model size and computational cost. However, quantizing a model into a low-bit one, e.g., lower than 4, is difficult and often results in non-negligible performance degradation. To address this, we investigate the loss landscapes of quantized networks with various bit-widths. We show that a network with a more ragged loss surface is more easily trapped into bad local minima, which mostly appears in low-bit quantization. A deeper analysis indicates that the ragged surface is caused by...

10.1109/cvpr52729.2023.01554 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Phosphodiesterase 5 (PDE5) is a cGMP-specific hydrolytic enzyme and is widely distributed in versatile tissues. PDE5 has been identified as a valid therapeutic target for treating erectile dysfunction and pulmonary arterial hypertension (PAH). Herein, hit-to-lead structural optimizations were performed on the PDE1 inhibitor

10.1021/acs.jmedchem.4c02123 article EN Journal of Medicinal Chemistry 2024-12-05

In recent years, significant progress has been made on the research of crowd counting. However, because of the challenging scale variations and complex scenes that exist in crowds, neither traditional convolution networks nor Transformer architectures with fixed-size attention could handle the task well. To address this problem, this paper proposes a scene-adaptive attention network, termed SAANet. First of all, we design a deformable attention in-built Transformer backbone, which learns adaptive feature representations with deformable sampling locations and dynamic...

10.48550/arxiv.2112.15509 preprint EN cc-by arXiv (Cornell University) 2021-01-01

3D vehicle detection based on multi-modal fusion is an important task of many applications such as autonomous driving. Although significant progress has been made, we still observe two aspects that need further improvement: First, the specific gain that camera images can bring is seldom explored by previous works. Second, many fusion algorithms run slowly, although speed is essential for applications with high real-time requirements (autonomous driving). To this end, we propose an end-to-end trainable single-stage multi-modal feature adaptive...

10.48550/arxiv.2009.10945 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Current methods for open-vocabulary object detection (OVOD) rely on a pre-trained vision-language model (VLM) to acquire the recognition ability. In this paper, we propose a simple yet effective framework to Distill the Knowledge from the VLM to a DETR-like detector, termed DK-DETR. Specifically, we present two ingenious distillation schemes named semantic knowledge distillation (SKD) and relational knowledge distillation (RKD). To utilize the rich knowledge from the VLM systematically, SKD transfers the semantic knowledge explicitly, while RKD exploits implicit relationship information between...

10.1109/iccv51070.2023.00598 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

The accurate segmentation of breast masses in mammography images is a key step in the diagnosis of early breast cancer. To solve the problem of the various shapes and sizes of breast masses, this paper proposes a cascaded UNet architecture, which is referred to as CasUNet. CasUNet contains six UNet subnetworks, the network depth increases from 1 to 6, and the output features between adjacent subnetworks are cascaded. Furthermore, we have integrated the channel attention mechanism based on CasUNet, hoping that it can focus on the important feature maps. Aiming...

10.1109/icip42928.2021.9506159 article EN 2021 IEEE International Conference on Image Processing (ICIP) 2021-08-23