- Handwritten Text Recognition Techniques
- Image Retrieval and Classification Techniques
- Semantic Web and Ontologies
- Hand Gesture Recognition Systems
- Machine Learning and Data Classification
- Video Surveillance and Tracking Methods
- Natural Language Processing Techniques
- Text and Document Classification Technologies
- Topic Modeling
- AI-based Problem Solving and Planning
- Human Pose and Action Recognition
- Image Processing and 3D Reconstruction
- Machine Learning and Algorithms
Nanyang Technological University
2020
Connectionist Temporal Classification (CTC) and the attention mechanism are the two main approaches used in recent scene text recognition work. Compared with attention-based methods, a CTC decoder has a much shorter inference time, yet lower accuracy. To design an efficient and effective model, we propose the guided training of CTC (GTC), where the CTC model learns better alignment and feature representations from more powerful attentional guidance. With the benefit of guided training, the CTC model achieves robust and accurate prediction for both...
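The abstract attributes the short inference time to the CTC decoder. A minimal sketch of standard greedy CTC decoding illustrates why: it is a single pass that merges repeated per-frame labels and drops blanks. This is the generic textbook procedure, not the GTC method itself; the function name `ctc_greedy_decode` and the choice of `0` as the blank index are assumptions made for illustration.

```python
from typing import List, Sequence


def ctc_greedy_decode(frame_label_ids: Sequence[int], blank: int = 0) -> List[int]:
    """Collapse a per-frame argmax label sequence into an output sequence.

    Hypothetical helper: merges consecutive repeated labels, then removes
    the blank symbol (assumed here to be index 0).
    """
    decoded: List[int] = []
    previous = blank
    for label in frame_label_ids:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


if __name__ == "__main__":
    # A blank between two identical labels keeps both occurrences.
    print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0]))
```

Because no step depends on previously emitted characters, decoding is one linear scan over the frames, whereas an attention decoder typically generates characters one step at a time.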
Large language models (LLMs) have shown impressive capabilities in real-world applications. The capability of in-context learning (ICL) allows us to adapt an LLM to downstream tasks by including input-label exemplars in the prompt without model fine-tuning. However, the quality of these exemplars greatly impacts performance, highlighting the need for an effective automated exemplar selection method. Recent studies have explored retrieval-based approaches to select exemplars tailored to individual test queries, which can be undesirable due...
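As a rough illustration of what "including input-label exemplars in the prompt" means, the sketch below assembles an ICL prompt from a fixed list of exemplars and a test query. The function name `build_icl_prompt`, the `Input:`/`Label:` template, and the sample data are all assumptions for illustration; the abstract does not specify a prompt format or a selection method.

```python
from typing import List, Tuple


def build_icl_prompt(exemplars: List[Tuple[str, str]], query: str,
                     instruction: str = "") -> str:
    """Format input-label exemplars followed by an unlabeled query.

    Hypothetical template: each exemplar is rendered as an Input/Label
    pair, and the query ends with an empty Label for the model to fill.
    """
    blocks: List[str] = []
    if instruction:
        blocks.append(instruction)
    for text, label in exemplars:
        blocks.append(f"Input: {text}\nLabel: {label}")
    # The query uses the same template but leaves the label open.
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = [("great movie", "positive"), ("boring plot", "negative")]
    print(build_icl_prompt(demo, "surprisingly fun", "Classify the sentiment."))
```

Exemplar selection then amounts to choosing which pairs go into `exemplars`, and in what order, without touching the model weights.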
Large language models (LLMs) have shown remarkable instruction-following capabilities and achieved impressive performance in various applications. However, the performance of LLMs depends heavily on the instructions given to them, which are typically manually tuned with substantial human effort. Recent work has used the query-efficient Bayesian optimization (BO) algorithm to automatically optimize the instructions given to black-box LLMs. However, BO usually falls short when optimizing highly sophisticated (e.g., high-dimensional) objective...
Human pose assessment and correction play a crucial role in applications across various fields, including computer vision, robotics, sports analysis, healthcare, and entertainment. In this paper, we propose a Spatial-Temporal Transformer based Framework (STTF) for human pose assessment and correction in education scenarios such as physical exercises and science experiments. The framework comprises skeletal tracking, pose estimation, posture assessment, and posture correction modules to educate students with professional, quick-to-fix feedback. We also create...