Wenkai Chen

ORCID: 0000-0003-0169-8896
Research Areas
  • Multimodal Machine Learning Applications
  • Robot Manipulation and Learning
  • Hand Gesture Recognition Systems
  • Topic Modeling
  • Natural Language Processing Techniques
  • Soft Robotics and Applications
  • Image and Video Stabilization
  • Advanced Image and Video Retrieval Techniques
  • Muscle activation and electromyography studies
  • Advanced Memory and Neural Computing
  • Image Retrieval and Classification Techniques
  • Thermochemical Biomass Conversion Processes
  • Image and Signal Denoising Methods
  • Advanced Algorithms and Applications
  • AI in cancer detection
  • COVID-19 Pandemic Impacts
  • Reinforcement Learning in Robotics
  • Higher Education and Teaching Methods
  • Phonetics and Phonology Research
  • Biodiesel Production and Applications
  • Video Surveillance and Tracking Methods
  • Colorectal Cancer Screening and Detection
  • Advanced Image Processing Techniques
  • Catalysts for Methane Reforming
  • Energy Load and Power Forecasting

Kunshan Govisionox Optoelectronic (China)
2024

Tsinghua University
2024

Universität Hamburg
2022-2024

Center for Information Technology
2024

Tianjin University of Technology
2023

Vector Institute
2023

University of British Columbia
2023

Beijing University of Posts and Telecommunications
2021

Shanghai Jiao Tong University
2020

Harvard University Press
2019

Although great progress has been made in generic object detection by advanced deep learning techniques, detecting small objects from images is still a difficult and challenging problem in the field of computer vision due to the limited size, less appearance and geometry cues, and the lack of large-scale datasets of small targets. Improving the performance of small object detection has wider significance in many real-world applications, such as self-driving cars, unmanned aerial vehicles, and robotics. In this article, the first-ever survey of recent studies...

10.1109/tsmc.2020.3005231 article EN IEEE Transactions on Systems Man and Cybernetics Systems 2020-07-17

With the advanced development of image processing technology, visible light positioning (VLP) systems based on image sensors have attracted more and more attention. However, as a commonly used receiver, the traditional CMOS camera has limited dynamic range and high latency, and is susceptible to various lighting and environmental factors. Moreover, the computational cost of image processing is unavoidable for most VLP systems. In our work, a novel VLP system using an event-based neuromorphic vision sensor (event camera) as the receiver is proposed. Due to the low...

10.1109/jsen.2020.2990752 article EN IEEE Sensors Journal 2020-04-27

Robotic rigid contact-rich manipulation in an unstructured dynamic environment requires an effective resolution for smart manufacturing. As the most common use case in the intelligence industry, a lot of studies based on reinforcement learning (RL) algorithms have been conducted to improve the performance of single peg-in-hole assembly. However, existing RL methods are difficult to apply to multiple peg-in-hole issues due to more complicated geometric and physical constraints. In addition, previously limited solutions assembly...

10.1109/tcyb.2023.3310505 article EN IEEE Transactions on Cybernetics 2023-09-15

Currently, robotic grasping methods based on sparse partial point clouds have attained excellent performance on various objects. However, they often generate wrong grasping candidates due to the lack of geometric information on the object. In this work, we propose a novel and robust shape completion model (TransSC). This model has a transformer-based encoder to explore more point-wise features and a manifold-based decoder to exploit more object details using a segmented partial point cloud as input. Quantitative experiments verify...

10.1007/s10846-022-01586-4 article EN cc-by Journal of Intelligent & Robotic Systems 2022-02-26

Currently, task-oriented grasp detection approaches are mostly based on pixel-level affordance detection and semantic segmentation. These approaches heavily rely on the accuracy of a 2D affordance mask, and the generated grasp candidates are restricted to a small workspace. To mitigate these limitations, we first construct a novel affordance-based grasp dataset and propose a 6-DoF task-oriented grasp detection framework, which takes the observed object point cloud as input and predicts diverse 6-DoF grasp poses for different tasks. Specifically, our implicit estimation network visual in this framework...

10.1109/iros47612.2022.9981900 article EN 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022-10-23

As an alternative liquid fuel, Fischer–Tropsch (FT) diesel has received significant attention due to its characteristics of high efficiency and low emission. In this study, a surrogate fuel containing iso-hexadecane and n-dodecane with a mole ratio of 0.16:0.84 is formulated for real FT diesel by mimicking its combustion-related physicochemical properties. Mechanisms of these two components are developed based on the decoupling methodology: skeletal sub-mechanisms describing the cracking process are constructed and combined...

10.1177/0957650919897474 article EN Proceedings of the Institution of Mechanical Engineers Part A Journal of Power and Energy 2020-01-06

Yilin Niu, Fei Huang, Jiaming Liang, Wenkai Chen, Xiaoyan Zhu, Minlie Huang. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

10.18653/v1/2021.acl-long.237 article EN cc-by 2021-01-01

With the development of display technology, people's needs for long lifetime are getting higher and higher. Given the long development period of organic materials, how to improve lifetime within a certain materials system becomes more important. In this paper, we achieve this by varying the ratio of nitrogen to oxygen, as well as changing the plasma power, thereby improving the performance of mobile displays.

10.1002/sdtp.17285 article EN SID Symposium Digest of Technical Papers 2024-04-01

Affordance understanding, the task of identifying actionable regions on 3D objects, plays a vital role in allowing robotic systems to engage with and operate within the physical world. Although Visual Language Models (VLMs) have excelled in high-level reasoning and long-horizon planning for robotic manipulation, they still fall short in grasping the nuanced physical properties required for effective human-robot interaction. In this paper, we introduce PAVLM (Point cloud Affordance Vision-Language Model), an innovative framework that...

10.48550/arxiv.2410.11564 preprint EN arXiv (Cornell University) 2024-10-15

In underwater images of a swimming pool, the S/N is low and the edges are fuzzy. If a traditional restoration method is applied directly, the result is not as expected, with weaknesses such as enlarged edges and incomplete denoising. Exploiting the high stability of robust estimation against noise, we combined the robust estimation technique with the restoration method and proposed an adaptive smoothing algorithm. This algorithm sets up a model based on the M-estimator and treats restoration as a position estimation problem. Experiments on swimming pool images proved that the approach can...

10.1109/dmamh.2007.82 article EN 2007-12-01

The capability for robotic systems to rearrange objects based on human instructions represents a critical step towards realizing embodied intelligence. Recently, diffusion-based learning has shown significant advancements in the field of data generation, while prompt-based learning has proven effective in formulating robot manipulation strategies. However, prior solutions for rearrangement have overlooked the significance of integrating human preferences and optimizing efficiency. Additionally, traditional approaches...

10.36227/techrxiv.171171867.71552439/v1 preprint EN cc-by 2024-03-29

Composed Image Retrieval (CIR) is a complex task that retrieves images using a query, which is configured with an image and a caption that describes desired modifications to that image. Supervised CIR approaches have shown strong performance, but their reliance on expensive manually-annotated datasets restricts their scalability and broader applicability. To address these issues, previous studies have proposed pseudo-word token-based Zero-Shot CIR (ZS-CIR) methods, which utilize a projection module to map images to word tokens. However, we...

10.48550/arxiv.2405.00571 preprint EN arXiv (Cornell University) 2024-05-01

The event-based camera is a novel neuromorphic vision sensor that can perceive different dynamic behaviors due to its low latency, asynchronous data stream, and high dynamic range characteristics. There has been much work based on event cameras to solve problems such as object tracking, visual odometry, and gesture recognition. However, the adoption of event cameras to analyze hand-object action in dynamic environments, a problem that regular CMOS cameras cannot handle, is still lacking relevant research. This work presents a richly annotated...

10.1109/tii.2024.3393007 article EN IEEE Transactions on Industrial Informatics 2024-05-09

10.1109/globecom52923.2024.10901777 article EN GLOBECOM 2022 - 2022 IEEE Global Communications Conference 2024-12-08

LLMs have demonstrated impressive zero-shot performance on NLP tasks thanks to the knowledge they acquired in their training. In multiple-choice QA tasks, the LM probabilities are used as an imperfect measure of the plausibility of each answer choice. One of the major limitations of the basic score is that it treats all words as equally important. We propose CASE, a Commonsense-Augmented Score with an Expanded Answer Space. CASE addresses this limitation by assigning importance weights for individual words based on their semantic...

10.18653/v1/2023.findings-emnlp.180 article EN cc-by 2023-01-01

In underwater images of a swimming pool, the S/N is low and the edges are fuzzy. If a traditional restoration method is applied directly, the result is not as expected, with weaknesses such as enlarged edges and incomplete denoising. Exploiting the high stability of robust estimation against noise, we combined the robust estimation technique with the restoration method and proposed an adaptive smoothing algorithm. This algorithm sets up a model based on the M-estimator and treats restoration as a position estimation problem. Experiments on swimming pool images proved that the approach can...

10.1109/dmamh.2007.4414522 article EN 2007-12-01

Unsupervised commonsense question answering is appealing since it does not rely on any labeled task data. Among existing work, a popular solution is to use pre-trained language models to score candidate choices directly conditioned on the question or context. However, such scores from language models can be easily affected by irrelevant factors, such as word frequencies, sentence structures, etc. These distracting factors may not only mislead the model to choose the wrong answer but also make it oversensitive to lexical perturbations in candidate answers. In...

10.48550/arxiv.2105.14781 preprint EN other-oa arXiv (Cornell University) 2021-01-01

This paper collected earthquake case data from mainland China from 1993 to 2015. We used three geographical detectors (risk detector, factor detector, and interaction detector), based on spatial variation analysis of some potential factors, to quantitatively assess their effects on mortality. It was found that three factors are responsible for earthquake mortality: earthquake intensity, house destruction area, and indoor rate. No effect was observed for the population density or the level of economic development of the earthquake area. There is interaction between the factors, mainly...

10.2991/wrarm-17.2017.8 article EN cc-by-nc 2017-01-01

LLMs have demonstrated impressive zero-shot performance on NLP tasks thanks to the knowledge they acquired in their training. In multiple-choice QA tasks, the LM probabilities are used as an imperfect measure of the plausibility of each answer choice. One of the major limitations of the basic score is that it treats all words as equally important. We propose CASE, a Commonsense-Augmented Score with an Expanded Answer Space. CASE addresses this limitation by assigning importance weights for individual words based on their semantic...

10.48550/arxiv.2311.01684 preprint EN other-oa arXiv (Cornell University) 2023-01-01