Kan Wu

ORCID: 0000-0002-0663-3410
Research Areas
  • Advanced Neural Network Applications
  • Advanced Image and Video Retrieval Techniques
  • Visual Attention and Saliency Detection
  • Video Surveillance and Tracking Methods
  • Topic Modeling
  • Image Retrieval and Classification Techniques
  • Expert finding and Q&A systems
  • CCD and CMOS Imaging Sensors
  • Mobile Crowdsensing and Crowdsourcing
  • Medical Image Segmentation Techniques
  • Image and Object Detection Techniques
  • Dental Radiography and Imaging
  • Advanced Memory and Neural Computing
  • Advanced Algorithms and Applications
  • Advanced Vision and Imaging
  • Advanced Graph Neural Networks
  • Parallel Computing and Optimization Techniques
  • Human Mobility and Location-Based Analysis
  • Robotics and Sensor-Based Localization
  • Remote Sensing and Land Use
  • Advanced Data Storage Technologies
  • Complex Network Analysis Techniques
  • Advanced Sensor and Control Systems
  • Evaluation Methods in Various Fields
  • 3D Shape Modeling and Analysis

Sun Yat-sen University
2018-2022

Microsoft Research (United Kingdom)
2022

Microsoft Research Asia (China)
2021

University of Hong Kong
2018-2019

Chinese University of Hong Kong
2019

Tsinghua University
2013-2018

Relative position encoding (RPE) is important for transformer to capture sequence ordering of input tokens. General efficacy has been proven in natural language processing. However, in computer vision, its efficacy is not well studied and even remains controversial, e.g., whether relative position encoding can work equally well as absolute position? In order to clarify this, we first review existing relative position encoding methods and analyze their pros and cons when applied in vision transformers. We then propose new relative position encoding methods dedicated to 2D images, called image RPE (iRPE)....

10.1109/iccv48922.2021.00988 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Object tracking has achieved significant progress over the past few years. However, state-of-the-art trackers become increasingly heavy and expensive, which limits their deployments in resource-constrained applications. In this work, we present LightTrack, which uses neural architecture search (NAS) to design more lightweight and efficient object trackers. Comprehensive experiments show that our LightTrack is effective. It can find trackers that achieve superior performance compared to handcrafted SOTA trackers, such...

10.1109/cvpr46437.2021.01493 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Vision Transformer (ViT) models have recently drawn much attention in computer vision due to their high model capability. However, ViT models suffer from a huge number of parameters, restricting their applicability on devices with limited memory. To alleviate this problem, we propose MiniViT, a new compression framework, which achieves parameter reduction in vision transformers while retaining the same performance. The central idea of MiniViT is to multiplex the weights of consecutive transformer blocks. More specifically, we make...

10.1109/cvpr52688.2022.01183 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

The authors propose an automatic method for extracting objects with fine quality from photographs. The authors’ method starts with finding bounding boxes that enclose potential objects, which is achievable by state-of-the-art object proposal methods. To further segment objects within the obtained boxes, the authors propose a new multi-pass level-set method based on saliency detection and foreground pixel classification. The level-set function is initially constructed with respect to the automatically detected salient parts within the box, which eliminates user interaction and predicts...

10.1049/iet-ipr.2017.1144 article EN IET Image Processing 2018-02-15

Vision Transformer has shown great visual representation power in substantial vision tasks such as recognition and detection, and thus has been attracting fast-growing efforts on manually designing more effective architectures. In this paper, we propose to use neural architecture search to automate this process, by searching not only the architecture but also the search space. The central idea is to gradually evolve different search dimensions guided by their E-T Error computed using a weight-sharing supernet. Moreover, we provide design...

10.48550/arxiv.2111.14725 preprint EN other-oa arXiv (Cornell University) 2021-01-01

In online question-and-answer (QA) websites like Quora, one central issue is to find (invite) users who are able to provide answers to a given question and at the same time would be unlikely to say "no" to the invitation. The challenge is how to trade off the matching degree between users’ expertise and the question topic, and the likelihood of positive response from the invited users. In this paper, we formally formulate the problem and develop a weakly supervised factor graph (WeakFG) model to address the problem. The model explicitly captures expertise matching degree between questions and users. To model the likelihood that an invited user...

10.24963/ijcai.2018/534 article EN 2018-07-01

Code size is an increasing concern on resource-constrained systems, ranging from embedded devices to cloud servers. To address the issue, lowering memory occupancy has become a priority in developing and deploying applications, and accordingly compiler-based optimizations have been proposed to reduce program footprint. However, prior arts are generally dealing with source codes or intermediate representations, and are thus very limited in scope in real scenarios where only binary files are commonly provided. To fill...

10.1145/3519941.3535072 article EN 2022-06-10

A person’s career trajectory is composed of her/his past work or educational affiliations (institutions) at different points of time. Knowing people’s, especially scholars’, career trajectories can help the government make more scientific strategies to allocate resources and attract talent, and help companies make smart recruiting plans. It could also support individuals to find appropriate co-researchers or job opportunities. The paper focuses on inferring career trajectories in the academic social network. For about 1/3 of authors not having any...

10.24963/ijcai.2018/499 article EN 2018-07-01

Relative position encoding (RPE) is important for transformer to capture sequence ordering of input tokens. General efficacy has been proven in natural language processing. However, in computer vision, its efficacy is not well studied and even remains controversial, e.g., whether relative position encoding can work equally well as absolute position? In order to clarify this, we first review existing relative position encoding methods and analyze their pros and cons when applied in vision transformers. We then propose new relative position encoding methods dedicated to 2D images, called image RPE (iRPE)....

10.48550/arxiv.2107.14222 preprint EN cc-by arXiv (Cornell University) 2021-01-01

In online question-and-answer (QA) websites like Quora, one central issue is to find (invite) users who are able to provide answers to a given question and at the same time would be unlikely to say "no" to the invitation. The challenge is how to trade off the matching degree between users' expertise and the question topic, and the likelihood of positive response from the invited users. In this paper, we formally formulate the problem and develop a weakly supervised factor graph (WeakFG) model to address the problem. The model explicitly captures expertise matching degree between questions and users. To model the likelihood that an invited user...

10.48550/arxiv.1611.04363 preprint EN other-oa arXiv (Cornell University) 2016-01-01

The International Aerial Robotics Competition (IARC) aims to move the state-of-the-art in aerial robotics forward through mission challenges. In IARC Mission 7, the aerial robot will navigate without external navigation aids, interact with autonomous ground robots, and avoid dynamic obstacles in the herding problem. In the 2017 competition, our team first accomplished interactively herding one iRobot to the end of the arena with an accurate and rapid touch, and won first place in system control. This paper presents the self-localization, control...

10.1109/rcar.2018.8621785 article EN 2018 IEEE International Conference on Real-time Computing and Robotics (RCAR) 2018-08-01

The collection of internet images has been growing at an astonishing speed. It is undoubted that these images contain rich visual information that can be useful in many applications, such as visual media creation and data-driven image synthesis. In this paper, we focus on the methodologies for building a visual object database from internet images. Such a database is built to contain a large number of high-quality objects that can help with various applications. Our method is based on dense proposal generation and objectness-based re-ranking. A novel deep convolutional...

10.48550/arxiv.1904.00641 preprint EN other-oa arXiv (Cornell University) 2019-01-01

The collection of internet images has been growing at an astonishing speed. It is undoubted that these images contain rich visual information that can be useful in many applications, such as visual media creation and data-driven image synthesis. In this article, we focus on the methodologies for building a visual object database from internet images. Such a database is built to contain a large number of high-quality objects that can help with various applications. Our method is based on dense proposal generation and objectness-based re-ranking. A novel deep convolutional...

10.1145/3318463 article EN ACM Transactions on Multimedia Computing Communications and Applications 2019-08-08

Object tracking has achieved significant progress over the past few years. However, state-of-the-art trackers become increasingly heavy and expensive, which limits their deployments in resource-constrained applications. In this work, we present LightTrack, which uses neural architecture search (NAS) to design more lightweight and efficient object trackers. Comprehensive experiments show that our LightTrack is effective. It can find trackers that achieve superior performance compared to handcrafted SOTA trackers, such...

10.48550/arxiv.2104.14545 preprint EN other-oa arXiv (Cornell University) 2021-01-01