- Advanced Neural Network Applications
- Advanced Image and Video Retrieval Techniques
- Visual Attention and Saliency Detection
- Video Surveillance and Tracking Methods
- Topic Modeling
- Image Retrieval and Classification Techniques
- Expert finding and Q&A systems
- CCD and CMOS Imaging Sensors
- Mobile Crowdsensing and Crowdsourcing
- Medical Image Segmentation Techniques
- Image and Object Detection Techniques
- Dental Radiography and Imaging
- Advanced Memory and Neural Computing
- Advanced Algorithms and Applications
- Advanced Vision and Imaging
- Advanced Graph Neural Networks
- Parallel Computing and Optimization Techniques
- Human Mobility and Location-Based Analysis
- Robotics and Sensor-Based Localization
- Remote Sensing and Land Use
- Advanced Data Storage Technologies
- Complex Network Analysis Techniques
- Advanced Sensor and Control Systems
- Evaluation Methods in Various Fields
- 3D Shape Modeling and Analysis
Sun Yat-sen University
2018-2022
Microsoft Research (United Kingdom)
2022
Microsoft Research Asia (China)
2021
University of Hong Kong
2018-2019
Chinese University of Hong Kong
2019
Tsinghua University
2013-2018
Relative position encoding (RPE) is important for transformers to capture the sequence ordering of input tokens. Its general efficacy has been proven in natural language processing. However, in computer vision, its efficacy is not well studied and even remains controversial, e.g., whether relative position encoding can work equally well as absolute position encoding. In order to clarify this, we first review existing relative position encoding methods and analyze their pros and cons when applied to vision transformers. We then propose new relative position encoding methods dedicated to 2D images, called image RPE (iRPE)....
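As a rough illustration of what a relative position encoding over a 2D image grid involves, the sketch below maps every (query, key) token pair to a bucket id based on their row and column offsets. It is a generic construction, not the iRPE method from the abstract (which is truncated here); the function names `relative_position_index` and `num_buckets` are hypothetical.

```python
def num_buckets(height, width):
    """Number of distinct (dy, dx) offsets on a height x width token grid."""
    return (2 * height - 1) * (2 * width - 1)


def relative_position_index(height, width):
    """Return an N x N table (N = height * width) of bucket ids.

    Entry [q][k] depends only on the row/column offset between token q and
    token k, so pairs with the same relative offset share one bucket. In a
    transformer, each bucket would index one learnable bias or embedding
    (an assumption of this sketch, not a detail taken from the abstract).
    """
    coords = [(y, x) for y in range(height) for x in range(width)]
    table = []
    for qy, qx in coords:
        row = []
        for ky, kx in coords:
            dy = qy - ky + (height - 1)  # shifted into [0, 2*height - 2]
            dx = qx - kx + (width - 1)   # shifted into [0, 2*width - 2]
            row.append(dy * (2 * width - 1) + dx)
        table.append(row)
    return table


if __name__ == "__main__":
    index = relative_position_index(2, 3)
    print(len(index), num_buckets(2, 3))
```

Unlike an absolute encoding, which assigns one vector per grid position, this table assigns one entry per offset, so the same entry is reused wherever two tokens stand in the same spatial relation.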
Object tracking has achieved significant progress over the past few years. However, state-of-the-art trackers have become increasingly heavy and expensive, which limits their deployment in resource-constrained applications. In this work, we present LightTrack, which uses neural architecture search (NAS) to design more lightweight and efficient object trackers. Comprehensive experiments show that our LightTrack is effective. It can find trackers that achieve superior performance compared to handcrafted SOTA trackers, such...
Vision Transformer (ViT) models have recently drawn much attention in computer vision due to their high model capability. However, ViT models suffer from a huge number of parameters, restricting their applicability on devices with limited memory. To alleviate this problem, we propose MiniViT, a new compression framework, which achieves parameter reduction in vision transformers while retaining the same performance. The central idea of MiniViT is to multiplex the weights of consecutive transformer blocks. More specifically, we make...
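The parameter saving from reusing weights across consecutive blocks can be sketched with a toy count. This shows only plain weight sharing; the abstract is truncated before it describes what MiniViT does beyond that, so nothing below should be read as the MiniViT procedure. `ToyBlock`, `build_stack`, and `count_unique_params` are hypothetical names.

```python
class ToyBlock:
    """Stand-in for a transformer block holding one dim x dim weight matrix."""

    def __init__(self, dim):
        self.dim = dim
        self.weight = [[0.0] * dim for _ in range(dim)]

    def num_params(self):
        return self.dim * self.dim


def build_stack(depth, dim, share_every):
    """Build `depth` blocks where each run of `share_every` consecutive
    blocks reuses the same ToyBlock object (and therefore the same weights)."""
    layers = []
    shared = None
    for i in range(depth):
        if i % share_every == 0:
            shared = ToyBlock(dim)  # start a new group of shared weights
        layers.append(shared)
    return layers


def count_unique_params(layers):
    """Count parameters once per distinct block object."""
    unique = {id(layer): layer for layer in layers}
    return sum(layer.num_params() for layer in unique.values())


if __name__ == "__main__":
    baseline = count_unique_params(build_stack(12, 8, share_every=1))
    shared = count_unique_params(build_stack(12, 8, share_every=3))
    print(baseline, shared)
```

With 12 blocks shared in groups of three, the stack keeps its depth but stores only four distinct weight sets.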
The authors propose an automatic method for extracting objects with fine quality from photographs. The authors' method starts by finding bounding boxes that enclose potential objects, which is achievable by state-of-the-art object proposal methods. To further segment the objects within the obtained boxes, the authors propose a new multi-pass level-set method based on saliency detection and foreground pixel classification. The level-set function is initially constructed with respect to the automatically detected salient parts within the box, which eliminates user interaction and predicts...
Vision Transformer has shown great visual representation power in substantial vision tasks such as recognition and detection, and has thus been attracting fast-growing efforts on manually designing more effective architectures. In this paper, we propose to use neural architecture search to automate this process, by searching not only the architecture but also the search space. The central idea is to gradually evolve different search dimensions guided by their E-T Error computed using a weight-sharing supernet. Moreover, we provide design...
In online question-and-answer (QA) websites like Quora, one central issue is to find (invite) users who are able to provide answers to a given question and at the same time would be unlikely to say "no" to the invitation. The challenge is how to trade off the matching degree between users' expertise and the question topic against the likelihood of a positive response from the invited users. In this paper, we formally formulate the problem and develop a weakly supervised factor graph (WeakFG) model to address the problem. The model explicitly captures the matching degree between questions and users. To model the likelihood that a user...
Code size is an increasing concern on resource-constrained systems, ranging from embedded devices to cloud servers. To address the issue, lowering memory occupancy has become a priority in developing and deploying applications, and accordingly compiler-based optimizations have been proposed to reduce program footprint. However, prior arts generally deal with source code or intermediate representations, and are thus very limited in scope in real scenarios where only binary files are commonly provided. To fill...
A person's career trajectory is composed of her/his past work or educational affiliations (institutions) at different points in time. Knowing people's, especially scholars', career trajectories can help the government make more scientific strategies to allocate resources and attract talent, and help companies make smart recruiting plans. It could also support individuals in finding appropriate co-researchers or job opportunities. The paper focuses on inferring career trajectories in the academic social network. For about 1/3 of authors not having any...
The International Aerial Robotics Competition (IARC) aims to move the state-of-the-art in aerial robotics forward through mission challenges. In IARC Mission 7, the aerial robot must navigate without external navigation aids, interact with autonomous ground robots, and avoid dynamic obstacles in the herding problem. In the 2017 competition, our team was the first to interactively herd one iRobot to the end of the arena with accurate and rapid touch, and won first place in system control. This paper presents the self-localization, control...
The collection of internet images has been growing at an astonishing speed. Undoubtedly, these images contain rich visual information that can be useful in many applications, such as media creation and data-driven image synthesis. In this paper, we focus on the methodologies for building an object database from internet images. Such a database is built to contain a large number of high-quality objects that can help with various applications. Our method is based on dense proposal generation and objectness-based re-ranking. A novel deep convolutional...
The collection of internet images has been growing at an astonishing speed. Undoubtedly, these images contain rich visual information that can be useful in many applications, such as media creation and data-driven image synthesis. In this article, we focus on the methodologies for building an object database from internet images. Such a database is built to contain a large number of high-quality objects that can help with various applications. Our method is based on dense proposal generation and objectness-based re-ranking. A novel deep convolutional...