Wenyang Hu

ORCID: 0009-0008-6189-7890
Research Areas
  • Handwritten Text Recognition Techniques
  • Image Retrieval and Classification Techniques
  • Semantic Web and Ontologies
  • Hand Gesture Recognition Systems
  • Machine Learning and Data Classification
  • Video Surveillance and Tracking Methods
  • Natural Language Processing Techniques
  • Text and Document Classification Technologies
  • Topic Modeling
  • AI-based Problem Solving and Planning
  • Human Pose and Action Recognition
  • Image Processing and 3D Reconstruction
  • Machine Learning and Algorithms

Nanyang Technological University
2020

Connectionist Temporal Classification (CTC) and the attention mechanism are the two main approaches used in recent scene text recognition works. Compared with attention-based methods, a CTC decoder has a much shorter inference time, yet lower accuracy. To design an efficient and effective model, we propose the guided training of CTC (GTC), where the CTC model learns a better alignment and feature representations from a more powerful attentional guidance. With the benefit of guided training, the CTC model achieves robust and accurate prediction for both...

10.1609/aaai.v34i07.6735 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Large language models (LLMs) have shown impressive capabilities in real-world applications. The capability of in-context learning (ICL) allows us to adapt an LLM to downstream tasks by including input-label exemplars in the prompt without model fine-tuning. However, the quality of these exemplars greatly impacts performance, highlighting the need for an effective automated exemplar selection method. Recent studies have explored retrieval-based approaches to select exemplars tailored to individual test queries, which can be undesirable due...

10.48550/arxiv.2405.16122 preprint EN arXiv (Cornell University) 2024-05-25

Connectionist Temporal Classification (CTC) and the attention mechanism are the two main approaches used in recent scene text recognition works. Compared with attention-based methods, a CTC decoder has a much shorter inference time, yet lower accuracy. To design an efficient and effective model, we propose the guided training of CTC (GTC), where the CTC model learns a better alignment and feature representations from a more powerful attentional guidance. With the benefit of guided training, the CTC model achieves robust and accurate prediction for both...

10.48550/arxiv.2002.01276 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Large language models (LLMs) have shown remarkable instruction-following capabilities and achieved impressive performances in various applications. However, the performances of LLMs depend heavily on the instructions given to them, which are typically manually tuned with substantial human efforts. Recent work has used the query-efficient Bayesian optimization (BO) algorithm to automatically optimize the instructions given to black-box LLMs. However, BO usually falls short when optimizing highly sophisticated (e.g., high-dimensional) objective...

10.48550/arxiv.2310.02905 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Human pose assessment and correction play a crucial role in applications across various fields, including computer vision, robotics, sports analysis, healthcare, and entertainment. In this paper, we propose a Spatial-Temporal Transformer based Framework (STTF) for human pose assessment and correction in education scenarios such as physical exercises and science experiments. The framework comprises skeletal tracking, pose estimation, posture assessment, and posture correction modules to educate students with professional, quick-to-fix feedback. We also create...

10.48550/arxiv.2311.00401 preprint EN other-oa arXiv (Cornell University) 2023-01-01