- Generative Adversarial Networks and Image Synthesis
- Multimodal Machine Learning Applications
- 3D Shape Modeling and Analysis
- Human Pose and Action Recognition
- Advanced Neural Network Applications
- Video Surveillance and Tracking Methods
- Robotics and Sensor-Based Localization
- Advanced Image and Video Retrieval Techniques
- Topic Modeling
- Gait Recognition and Analysis
- Face Recognition and Analysis
- Autonomous Vehicle Technology and Safety
- Intelligent Tutoring Systems and Adaptive Learning
- Advanced Vision and Imaging
- Recommender Systems and Techniques
- Vehicle License Plate Recognition
- Online Learning and Analytics
- Speech and Audio Processing
- Advanced Computing and Algorithms
- Image and Video Stabilization
- Fashion and Cultural Textiles
- Domain Adaptation and Few-Shot Learning
- AI-based Problem Solving and Planning
- Video Analysis and Summarization
- Emotion and Mood Recognition
National Chin-Yi University of Technology
2024
Institute for Infocomm Research
2019-2024
Dalian Polytechnic University
2024
Agency for Science, Technology and Research
2019-2024
A*STAR Graduate Academy
2023
Nanyang Technological University
2015-2020
National Cheng Kung University
2017
With the increasing global popularity of self-driving cars, there is an immediate need for challenging real-world datasets for benchmarking and training various computer vision tasks such as 3D object detection. Existing datasets either represent simple scenarios or provide only day-time data. In this paper, we introduce a new A*3D dataset which consists of RGB images and LiDAR data with significant diversity of scene, time, and weather. The dataset consists of high-density images (≈ 10 times more than the pioneering KITTI dataset), heavy...
With the booming development of the online fashion industry, effective personalized recommender systems have become indispensable for the convenience they bring to customers and the profits they bring to e-commerce platforms. Estimating the user’s preference towards an outfit is at the core of a personalized recommendation system. Existing works on recommendation largely center on modelling clothing compatibility without considering the user factor, or on characterizing the user’s preference over a single item. However, how to effectively model outfits with either few or even none...
We tackle the task of street-to-shop clothing image synthesis. Given a daily person image with a particular clothing item captured in the street scenario, we aim to synthesize a frontal-facing view of that clothing item in the shop scenario. This problem has the following challenges: 1) distinct visual discrepancy between the street and shop scenarios; 2) severe shape deformation in the presence of an arbitrary human pose; 3) preservation of fine-grained details during the process of image generation. In this paper, we jointly solve these difficulties by proposing Pose-Normalized...
Cross-domain shoe image retrieval is a challenging problem, because the query photo from the street domain (daily life scenario) and the reference photo in the online domain (online shop images) have significant visual differences due to viewpoint and scale variation, self-occlusion, and cluttered background. This paper proposes the semantic hierarchy of attribute convolutional neural network (SHOE-CNN) with a three-level feature representation for discriminative shoe feature expression and efficient retrieval. The SHOE-CNN with its newly designed...
Explainable recommender systems have recently drawn increasing attention due to their capability of providing justification for a recommendation. Rather than focusing on certain topics or specific item features, the explanations generated by existing works are too general without the guidance of aspects. However, such information is not given in the practical scenario. To address this issue, we propose a novel explainable recommender system with a BERT-guided explanation generator, named ExBERT, to generate reliable explanations at a finer granularity. More specifically,...
In this paper, we aim to find exactly the same shoes given a daily shoe photo (street scenario) that matches the online shop photo (shop scenario). There are large visual differences between the street and shop scenario images. To handle the discrepancy of different scenarios, we learn a feature embedding for shoes via a viewpoint-invariant triplet network, the activations of which reflect the inherent similarity between any two shoe images. Specifically, we propose a new loss function that minimizes the distances between images of the same shoes captured from different viewpoints. Moreover, we train the proposed...
In this paper, we address the problem of matching shoes from daily life photos to exactly the same shoes in online shops. The problem is extremely challenging because of the significant visual differences between street domain images (shoe images captured in the daily life scenario) and online domain images (images in online shops taken in a controlled environment). This paper presents a semantic Shoe Attribute-Guided Convolutional Neural Network (SAG-CNN) to extract deep features. Moreover, we develop a three-level feature representation based on the SAG-CNN. The features extracted from the image,...
Learning the compatibility relationship is of vital importance to a fashion recommendation system, while existing works achieve this merely on product images but not on street images in the complex daily life scenario. In this paper, we propose a novel fashion recommendation system: given a query item of interest in the street scenario, the system can return the compatible items. More specifically, a two-stage curriculum learning scheme is developed to transfer the semantics from the product to the street outfit images. We also propose a domain-specific missing item imputation method based on style and color...
With the rapid proliferation of the Internet, it becomes a great challenge to annotate the explosive number of objects manually, especially for the fashion domain, where a massive collection of new products comes up every day. Therefore, to save human labor, it is essential to develop an automatic tagging system for those objects in a variety of appearances. In this paper, we focus on addressing the issue of shoe tagging; a novel system is proposed to predict the semantic attributes of shoe images. Given a shoe image of unknown viewpoint, our system first classifies the image into one of 6 pre-defined...
Recent years have witnessed the dramatic development of the e-fashion industry, and it has become essential to build an intelligent fashion recommender system. Most existing works on fashion recommendation focus on modeling general compatibility while ignoring user preferences. In this paper, we present a Personalized Attention Network (PAN) for fashion recommendation. The key components of PAN include a user encoder, an item encoder, and a preference predictor. To capture users' diverse interests, we develop an attention network to incorporate the learnt...
In this paper, we deal with two image-based object search tasks in the fashion domain: clothing attribute prediction and cross-domain shoe retrieval. Clothing attribute prediction is about describing the appearances of clothes via semantic attributes, and cross-domain shoe retrieval aims at retrieving the same shoe items from online stores given a daily life shoe photo. We jointly solve these two problems by a novel Subordinate Attribute Convolutional Neural Network (SA-CNN), with a newly designed loss function that systematically merges semantic attributes of closer visual appearance to...
Multimodal sentiment analysis aims to identify the emotions expressed by individuals through visual, language, and acoustic cues. However, most of the existing research efforts assume that all modalities are available during both training and testing, making their algorithms susceptible to the missing modality scenario. In this paper, we propose a novel knowledge-transfer network to translate between different modalities to reconstruct the missing audio modalities. Moreover, we develop a cross-modality attention mechanism to retain the maximal...
Knowledge tracing aims to predict students' probability of correctly answering the next question based on their interaction history. Previous methods employ left-to-right unidirectional transformers to encode students' historical behaviors into hidden representations, especially with contrastive learning methods. Using uni-directional models to model student behaviors can only learn the representation from its previous items, which restricts the representation power and capability. Inspired by the success of BERT in text understanding, we propose a...
Denoising Diffusion Probabilistic Model (DDPM) has shown great competence in image and audio generation tasks. However, there exist few attempts to employ DDPM in text generation, especially review generation under recommendation systems. Fueled by the predicted reviews' explainability, which justifies recommendations and could assist users to better understand the recommended items and increase the transparency of the recommendation system, we propose a Diffusion Model-based Review Generation towards EXplainable Recommendation, named Diffusion-EXR....