Ye Wang

ORCID: 0000-0003-3169-0211
Research Areas
  • Speech Recognition and Synthesis
  • Service-Oriented Architecture and Web Services
  • Software System Performance and Reliability
  • Multimodal Machine Learning Applications
  • Caching and Content Delivery
  • Business Process Modeling and Analysis
  • Neural Networks and Applications
  • Recommender Systems and Techniques
  • Graph Theory and Algorithms
  • Music and Audio Processing
  • Software Engineering Techniques and Practices
  • Domain Adaptation and Few-Shot Learning
  • Advanced Database Systems and Queries
  • Web Data Mining and Analysis
  • Data Quality and Management
  • Software Engineering Research
  • Speech and Audio Processing
  • Human Pose and Action Recognition
  • Advanced Software Engineering Methodologies
  • Video Analysis and Summarization
  • Knowledge Management and Sharing
  • Speech and Dialogue Systems
  • Data Stream Mining Techniques

Zhejiang Gongshang University
2020-2024

Zhejiang University
2023

Multi-media communications facilitate global interaction among people. However, although researchers have explored cross-lingual translation techniques such as machine translation and audio speech translation to overcome language barriers, there is still a shortage of studies on visual speech. This lack of research is mainly due to the absence of datasets containing translated text pairs. In this paper, we present AVMuST-TED, the first dataset for Audio-Visual Multilingual Speech Translation, derived from TED talks. Nonetheless, not...

10.1109/iccv51070.2023.01442 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Current video captioning efforts mostly focus on describing a single video, while the need for captioning videos in groups has increased considerably. In this study, we propose a new task, group video captioning, which aims to infer the desired content among a group of target videos and describe it with another group of related reference videos. This task requires the model to effectively summarize the target videos while accurately distinguishing them from the compared reference videos, which becomes more difficult as video length increases. To solve this problem, 1) First, an efficient relational approximation...

10.1109/iccv51070.2023.01402 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

At present, Mashup development has attracted much attention in the field of software engineering. The focus of this article is to use existing open APIs to meet the needs of developers. Therefore, how to select the most appropriate API for a specific user requirement is a crucial problem to be solved. We propose a Hybrid Open API Selection Approach (HyOASAM), which consists of two basic approaches: one is a user-story-driven discovery approach, and the other is a multidimensional-information-matrix- (MIM-) based recommendation approach. The...

10.1155/2020/4984375 article EN Mathematical Problems in Engineering 2020-04-25

The conventional pipeline of multimodal learning consists of three stages: encoding, fusion, and decoding. Most existing methods for the missing-modality condition focus on the first stage and aim to learn an invariant representation or reconstruct features. However, these methods rely on strong assumptions (i.e., all pre-defined modalities are available for each input sample during training, and the number of modalities is fixed). To solve this problem, we propose a simple yet effective method called Interaction Augmented...

10.1145/3581783.3612291 article EN 2023-10-26