NFDI4DS | UHH-SEMS - Publication Details

GTC: Guided Training of CTC towards Efficient and Accurate Scene Text Recognition

OPENALEX - Publications

Wenyang Hu, Xiaocong Cai, Jun Hou, Shuai Yi, Zhiping Lin

Connectionist Temporal Classification (CTC) and the attention mechanism are the two main approaches used in recent scene text recognition works. Compared with attention-based methods, the CTC decoder has a much shorter inference time, yet lower accuracy. To design an efficient and effective model, we propose the guided training of CTC (GTC), where the CTC model learns better alignment and feature representations from more powerful attentional guidance. With the benefit of guided training, the CTC model achieves robust and accurate prediction for both...

10.1609/aaai.v34i07.6735 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars

Zhaoxuan Wu, Xiaoqiang Lin, Zhongxiang Dai, Wenyang Hu, Yao Shu, and 3 more

Large language models (LLMs) have shown impressive capabilities in real-world applications. The capability of in-context learning (ICL) allows us to adapt an LLM to downstream tasks by including input-label exemplars in the prompt without model fine-tuning. However, the quality of these exemplars greatly impacts performance, highlighting the need for an effective automated exemplar selection method. Recent studies have explored retrieval-based approaches to select exemplars tailored to individual test queries, which can be undesirable due...

10.48550/arxiv.2405.16122 preprint EN arXiv (Cornell University) 2024-05-25

GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition

Wenyang Hu, Xiaocong Cai, Jun Hou, Shuai Yi, Zhiping Lin

Connectionist Temporal Classification (CTC) and the attention mechanism are the two main approaches used in recent scene text recognition works. Compared with attention-based methods, the CTC decoder has a much shorter inference time, yet lower accuracy. To design an efficient and effective model, we propose the guided training of CTC (GTC), where the CTC model learns better alignment and feature representations from more powerful attentional guidance. With the benefit of guided training, the CTC model achieves robust and accurate prediction for both...

10.48550/arxiv.2002.01276 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Use Your INSTINCT: INSTruction optimization usIng Neural bandits Coupled with Transformers

Xiaoqiang Lin, Zhaoxuan Wu, Zhongxiang Dai, Wenyang Hu, Yao Shu, and 3 more

Large language models (LLMs) have shown remarkable instruction-following capabilities and achieved impressive performances in various applications. However, the performances of LLMs depend heavily on the instructions given to them, which are typically manually tuned with substantial human efforts. Recent work has used the query-efficient Bayesian optimization (BO) algorithm to automatically optimize the instructions given to black-box LLMs. However, BO usually falls short when optimizing highly sophisticated (e.g., high-dimensional) objective...

10.48550/arxiv.2310.02905 preprint EN other-oa arXiv (Cornell University) 2023-01-01

A Spatial-Temporal Transformer based Framework For Human Pose Assessment And Correction in Education Scenarios

Wenyang Hu, Kai Liu, Libin Liu, Huiliang Shang

Human pose assessment and correction play a crucial role in applications across various fields, including computer vision, robotics, sports analysis, healthcare, and entertainment. In this paper, we propose a Spatial-Temporal Transformer based Framework (STTF) for human pose assessment and correction in education scenarios such as physical exercises and science experiments. The framework comprises skeletal tracking, pose estimation, posture assessment, and posture correction modules to educate students with professional, quick-to-fix feedback. We also create...

10.48550/arxiv.2311.00401 preprint EN other-oa arXiv (Cornell University) 2023-01-01