Wenhao Yu

ORCID: 0000-0002-9671-8652
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Multimodal Machine Learning Applications
  • Autonomous Vehicle Technology and Safety
  • Reinforcement Learning in Robotics
  • Energetic Materials and Combustion
  • Combustion and Detonation Processes
  • Advanced Image and Video Retrieval Techniques
  • Rocket and propulsion systems research
  • Machine Learning and Data Classification
  • Modular Robots and Swarm Intelligence
  • Intelligent Tutoring Systems and Adaptive Learning
  • Traffic and Road Safety
  • Robotic Locomotion and Control
  • Adversarial Robustness in Machine Learning
  • Traffic control and management
  • Traffic Prediction and Management Techniques
  • Robot Manipulation and Learning
  • Anomaly Detection Techniques and Applications
  • Power System Optimization and Stability
  • Online Learning and Analytics
  • Semantic Web and Ontologies
  • Robotic Path Planning Algorithms
  • Robotics and Sensor-Based Localization
  • Magnesium Alloys: Properties and Applications

Tsinghua University
2023-2025

University of Science and Technology of China
2024

Google (United States)
2023-2024

DeepMind (United Kingdom)
2024

Imperial College London
2023-2024

Shandong University of Technology
2024

Engineering Systems (United States)
2020-2023

University of Memphis
2020-2023

Institute of Electrical and Electronics Engineers
2020-2023

University of Notre Dame
2022-2023

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities, and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large collection of permissively licensed GitHub repositories with inspection tools and an opt-out process. We fine-tuned StarCoderBase on 35B Python...

10.48550/arxiv.2305.06161 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

Defined traffic laws must be respected by all vehicles when driving on the road, including self-driving vehicles without human drivers. Nevertheless, the ambiguity of human-oriented traffic laws, particularly compliance thresholds, poses a significant challenge to the implementation of regulations on self-driving vehicles, especially in detecting illegal driving behaviors. To address these challenges, here we present a trigger-based hierarchical online monitor for self-assessment of driving behavior, which aims to improve the rationality and real-time...

10.1038/s41467-024-44694-5 article EN cc-by Nature Communications 2024-01-09

Large language models (LLMs) have demonstrated exciting progress in acquiring diverse new capabilities through in-context learning, ranging from logical reasoning to code-writing. Robotics researchers have also explored using LLMs to advance the capabilities of robotic control. However, since low-level robot actions are hardware-dependent and underrepresented in LLM training corpora, existing efforts in applying LLMs to robotics have largely treated LLMs as semantic planners or relied on human-engineered control primitives to interface...

10.48550/arxiv.2306.08647 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in various fields, including science, engineering, finance, and everyday life. The development of artificial intelligence (AI) systems capable of solving math problems and proving theorems in language has garnered significant interest in the fields of machine learning and natural language processing. For example, mathematics serves as a testbed for aspects of reasoning that are challenging for powerful deep learning models, driving new algorithmic and modeling advances. On the other hand,...

10.18653/v1/2023.acl-long.817 article EN cc-by 2023-01-01

Humanoid robots have great potential to perform various human-level skills. These skills involve locomotion, manipulation, and cognitive capabilities. Driven by advances in machine learning and the strength of existing model-based approaches, these capabilities have progressed rapidly, but often separately. Therefore, a timely overview of current progress and future trends in this fast-evolving field is essential. This survey first summarizes the planning and control that have been the backbone of humanoid robotics for the past three...

10.48550/arxiv.2501.02116 preprint EN arXiv (Cornell University) 2025-01-03

Answering open-domain questions requires world knowledge about in-context entities. As pre-trained Language Models (LMs) lack the power to store all required knowledge, external knowledge sources, such as knowledge graphs, are often used to augment LMs. In this work, we propose knOwledge REasOning empowered Language Model (OREO-LM), which consists of a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs to interact with a differentiable Knowledge Graph Reasoning module collaboratively. In this way,...

10.18653/v1/2022.emnlp-main.650 article EN cc-by Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022-01-01

In this paper, we present a novel approach for distilling math word problem solving capabilities from large language models (LLMs) into smaller, more efficient student models. Our approach is designed to consider the student model's weaknesses and foster a tailored learning experience by generating targeted exercises aligned with educational science principles, such as knowledge tracing and personalized learning. Concretely, we let GPT-3 be a math tutor and run two steps iteratively: 1) assessing the student model's current learning status on a GPT-generated...

10.18653/v1/2023.emnlp-main.889 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

10.18653/v1/2024.emnlp-main.813 article EN Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01

The current Open-Domain Question Answering (ODQA) model paradigm often contains a retrieving module and a reading module. Given an input question, the reading module predicts the answer from the relevant passages, which are retrieved by the retriever. The recently proposed Fusion-in-Decoder (FiD), which is built on top of the pretrained generative model T5, achieves state-of-the-art performance in the reading module. Although effective, it remains constrained by inefficient attention over all retrieved passages, which contain a lot of noise. In this work, we propose a novel method, KG-FiD, which filters...

10.48550/arxiv.2110.04330 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Modern text classification methods heavily rely on contextual embeddings from large language models (LLMs). Compared to human-engineered features, these embeddings provide automatic and effective representations for classification model training. However, they also introduce a challenge: we lose the ability to manually remove unintended features, such as sensitive or task-irrelevant features, to guarantee regulatory compliance or improve the generalizability of classification models. This limitation arises because LLM embeddings are opaque and difficult to interpret. In this paper,...

10.48550/arxiv.2502.14133 preprint EN arXiv (Cornell University) 2025-02-19

Driving comfort is a crucial consideration in the automotive industry. In the realm of autonomous driving, comfort has always been a factor that requires continuous improvement. A common approach to improving driving comfort is through the optimization of local path planning. Nevertheless, it is imperative to recognize that macroscopic factors, including traffic flow and road conditions, wield a substantial influence on driving comfort. For instance, complex traffic scenarios increase the possibility of emergency braking, thereby affecting comfort. Consequently,...

10.1038/s41467-025-57845-z article EN cc-by-nc-nd Nature Communications 2025-03-19

Manipulating human poses based on natural language is an emerging research field that has traditionally focused on coarse commands such as “walking” or “dancing.” However, fine-grained pose manipulation, like instructing “put both hands in front of the stomach,” remains underexplored. In this paper, we introduce PoseLLaVA, a pioneering model that integrates SMPL-based pose representations into the multimodal LLaVA framework. Through a novel pose encoder and decoder mechanism, PoseLLaVA achieves precise alignment...

10.1609/aaai.v39i3.32302 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Retrieval-augmented generation (RAG) methods have been receiving increasing attention from the NLP community and have achieved state-of-the-art performance on many NLP downstream tasks. Compared with conventional pre-trained generation models, RAG methods have remarkable advantages such as easy knowledge acquisition, strong scalability, and low training cost. Although existing RAG models have been applied to various knowledge-intensive NLP tasks, such as open-domain QA and dialogue systems, most of the work has focused on retrieving unstructured text documents...

10.18653/v1/2022.naacl-srw.7 article EN cc-by 2022-01-01

Large language models (LLMs) exhibit remarkable performance across various NLP tasks. However, they often generate incorrect or hallucinated information, which hinders their practical applicability in real-world scenarios. Human feedback has been shown to effectively enhance the factuality and quality of generated content, addressing some of these limitations. However, this approach is resource-intensive, involving manual input and supervision, which can be time-consuming and expensive. Moreover, it cannot be provided...

10.48550/arxiv.2305.14002 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Large language models (LLMs) have demonstrated the potential to perform high-level planning. Yet, it remains a challenge for LLMs to comprehend low-level commands, such as joint angle targets or motor torques. This paper proposes an approach to use foot contact patterns as an interface that bridges human commands in natural language and a locomotion controller that outputs these low-level commands. This results in an interactive system for quadrupedal robots that allows users to craft diverse locomotion behaviors flexibly. We contribute an LLM prompt design, a reward...

10.48550/arxiv.2306.07580 preprint EN cc-by arXiv (Cornell University) 2023-01-01

In unanticipated obstacle scenarios at intersections, the safety and mobility of autonomous vehicles (AVs) are negatively impacted due to the conflict between traffic law compliance and obstacle avoidance. To solve this problem, an obstacle avoidance motion planning algorithm based on the artificial potential field (APF) is proposed. An APF-switching logic is utilized to design the motion planning framework. Collision risk and travel delay are quantified as the switching triggers. The intersection traffic laws are digitalized and classified to construct compliance-oriented...

10.3390/app14041626 article EN cc-by Applied Sciences 2024-02-17

10.18653/v1/2024.emnlp-main.845 article NL Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01

A common thread of open-domain question answering (QA) models employs a retriever-reader pipeline that first retrieves a handful of relevant passages from Wikipedia and then peruses the passages to produce an answer. However, even state-of-the-art readers fail to capture the complex relationships between entities appearing in questions and retrieved passages, leading to answers that contradict the facts. In light of this, we propose a novel knowledge graph enhanced passage reader, namely Grape, to improve the reader performance for open-domain QA....

10.18653/v1/2022.findings-emnlp.13 article EN cc-by 2022-01-01