Shaozu Yuan

ORCID: 0000-0001-5084-7064
Research Areas
  • Multimodal Machine Learning Applications
  • Generative Adversarial Networks and Image Synthesis
  • Computer Graphics and Visualization Techniques
  • Advanced Image and Video Retrieval Techniques
  • Human Pose and Action Recognition
  • Natural Language Processing Techniques
  • Topic Modeling
  • Video Analysis and Summarization
  • Image Retrieval and Classification Techniques
  • Image and Signal Denoising Methods
  • Domain Adaptation and Few-Shot Learning
  • Speech and Dialogue Systems
  • Aesthetic Perception and Analysis
  • Human Motion and Animation
  • Handwritten Text Recognition Techniques
  • Advanced Vision and Imaging
  • Sentiment Analysis and Opinion Mining
  • Video Surveillance and Tracking Methods
  • Seismic Imaging and Inversion Techniques
  • Advanced Technologies in Various Fields
  • Hate Speech and Cyberbullying Detection
  • 3D Surveying and Cultural Heritage
  • Anomaly Detection Techniques and Applications
  • Drilling and Well Engineering
  • 3D Shape Modeling and Analysis

Jingdong (China)
2020-2025

Meizu (China)
2025

JDSU (United States)
2024

China University of Petroleum, East China
2020-2021

Tohoku University
2007

Yiwei Wei, Shaozu Yuan, Ruosong Yang, Lei Shen, Zhangmeizhi Li, Longbiao Wang, Meng Chen. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

10.18653/v1/2023.acl-long.287 article EN cc-by 2023-01-01

The significance of visual emotion distribution learning (VEDL) has surged, particularly with the growing inclination to convey emotions through images. The key to VEDL lies in capturing both low- and high-level features within the same content, thus promoting the model for salient and subtle awareness. To learn involved images, most previous works coarse semantic knowledge unbiased filtering. Consequently, they focus on the entire scene and suffer from redundancy of semantic-irrelevant information, which diminishes...

10.1609/aaai.v39i9.32965 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Human conversations are complicated and building a human-like dialogue agent is an extremely challenging task. With the rapid development of deep learning techniques, data-driven models become more prevalent, which need a huge amount of real conversation data. In this paper, we construct a large-scale real scenario Chinese e-commerce conversation corpus, JDDC, with more than 1 million multi-turn dialogues, 20 million utterances, and 150 million words. The dataset reflects several characteristics of human-human conversations, e.g., goal-driven,...

10.48550/arxiv.1911.09969 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Previous works on font generation mainly focus on the standard print fonts where the character's shape is stable and strokes are clearly separated. There is rare research on brush hand-writing generation, which involves holistic structure changes and complex transfer. To address this issue, we propose a novel GAN-based image translation model by integrating skeleton information. We first extract the skeleton from training images, then design an encoder to extract the corresponding features. A self-attentive refined attention module...

10.1109/icme52920.2022.9859964 article EN 2022 IEEE International Conference on Multimedia and Expo (ICME) 2022-07-18

Multimodal sarcasm detection, aiming to detect the ironic sentiment within multimodal social data, has gained substantial popularity in both the natural language processing and computer vision communities. Recently, graph-based studies that draw sentimental relations have made notable advancements. However, they neglected exploiting global semantic congruity from existing instances to facilitate the prediction, which ultimately hinders the model's performance. In this paper, we introduce a new inference...

10.1609/aaai.v38i8.28766 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

In this paper, we propose a novel bicubic method for digital image interpolation. Since the conventional method does not consider local features, the interpolated images it produces often have a blurring problem. The proposed method adopts both asymmetry features and gradient features of an image in the interpolation processing. Experimental results show that the proposed method can obtain high-accuracy interpolated images.

10.1093/ietfec/e90-a.8.1611 article EN IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences 2007-08-01
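
For context, the conventional bicubic interpolation that the abstract above takes as its starting point can be sketched in one dimension as follows. This is a minimal illustration of the standard cubic convolution kernel, not the proposed method: the asymmetry and gradient features are not modelled, and the function names, the kernel parameter a = -0.5, and the edge clamping are assumptions made for this example.

```python
import math


def keys_kernel(x: float, a: float = -0.5) -> float:
    """Standard cubic convolution kernel; a = -0.5 is a common (assumed) choice."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def interpolate_1d(samples, t):
    """Interpolate `samples` at real-valued position t from the four nearest taps.

    Indices that fall outside the signal are clamped to the nearest edge sample.
    """
    base = math.floor(t)
    last = len(samples) - 1
    total = 0.0
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), last)  # edge replication (an assumption of this sketch)
        total += samples[idx] * keys_kernel(t - k)
    return total


def resize_row(samples, factor):
    """Resample one row by `factor` using the hypothetical helper above."""
    out_len = int(len(samples) * factor)
    return [interpolate_1d(samples, i / factor) for i in range(out_len)]
```

Because the kernel weights depend only on the distance to each tap, every pixel is treated the same way regardless of local edges, which is one plausible reading of why the abstract reports blurring for the conventional method.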

Image caption based on reinforcement learning (RL) methods has achieved significant success recently. Most of these methods take the CIDEr score as the reward of the RL algorithm to compute gradients, thus refining the image caption baseline model. However, the CIDEr score is not the sole criterion to judge the quality of a generated caption. In this paper, a Hierarchical Attention Fusion (HAF) model is presented for RL, where multi-level feature maps of Resnet are integrated with hierarchical attention. A Revaluation network (REN) is exploited for revaluating by...

10.1109/access.2020.2981513 article EN cc-by IEEE Access 2020-01-01

Emotion plays a critical role in calligraphy composition, which makes the artwork impressive and gives it a soul. However, previous research on calligraphy generation all neglected emotion as a major contributor to the artistry of calligraphy. Such defects prevent them from generating aesthetic, stylistic, and diverse artworks, producing only a static handwriting font library instead. To address this problem, we propose a novel cross-modal approach to generate stylistic Chinese calligraphy driven by different emotions automatically. We firstly...

10.1145/3474085.3475711 article EN Proceedings of the 29th ACM International Conference on Multimedia 2021-10-17

In this paper, we present a novel system (denoted as Polaca) to generate poetic Chinese landscape painting with calligraphy. Unlike previous single image-to-image generation, Polaca takes classic poetry as input and outputs an artistic image with the corresponding calligraphy. It is equipped with three different modules to complete the whole piece of artwork: the first is a text-to-image module that generates the landscape painting image, the second generates the stylistic calligraphy, and the third is a fusion module that fuses the two images into an aesthetic artwork.

10.24963/ijcai.2022/696 article EN Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022-07-01

Applying the deformable transformer to dense video captioning has achieved great success recently. However, it only explores local-perspective perception by attending to a small set of key sampling points, which will make the decoder short-sighted and generate semantically incoherent, contradictory captions for a long video. In this paper, we propose a novel Multi-Perspective Perception Network to improve this problem. We first introduce a hierarchical temporal-spatial summary method global-perspective context each...

10.2139/ssrn.4346395 article EN 2023-01-01

Recently, Visual Question Answering (VQA), which is required to generate the answer by understanding both visual and textual content, has attracted considerable research interest. Most existing works extract features with a CNN network and learn its feature embedding with an attention mechanism. However, this mechanism may ignore the interaction between entities in the image, which has a fuzzy impact on answer generation. To better explore the relationship between different entities, a novel Nested Attention Network with Graph Filtering (NANGF) is proposed....

10.1109/icassp49357.2023.10096849 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

10.1016/j.engappai.2024.108547 article EN Engineering Applications of Artificial Intelligence 2024-05-17

Supplementing product attribute information is a critical step for e-commerce platforms, which further benefits various downstream tasks, including recommendation, search, and knowledge graph construction. Intuitively, the visual information available on e-commerce platforms can effectively function as a primary source for certain attributes. However, existing works either extract values solely from textual descriptions or leverage limited visual information (e.g., image features or optical character recognition tokens) to assist...

10.1109/tmm.2024.3407667 article EN IEEE Transactions on Multimedia 2024-01-01

In this paper, we present a novel system (denoted as Polaca) to generate poetic Chinese landscape painting with calligraphy. Unlike previous single image-to-image generation, Polaca takes classic poetry as input and outputs an artistic image with the corresponding calligraphy. It is equipped with three different modules to complete the whole piece of artwork: the first is a text-to-image module that generates the landscape painting image, the second generates the stylistic calligraphy, and the third is a fusion module that fuses the two images into an aesthetic artwork.

10.48550/arxiv.2305.04719 preprint EN other-oa arXiv (Cornell University) 2023-01-01

We present a novel Chinese calligraphy artwork composition system (MaLiang) which can generate aesthetic, stylistic and diverse images based on the emotion status from input text. Different from previous research, it is the first work to endow synthesis with the ability to express fickle emotions and to composite a whole piece of discourse-level artwork instead of single character images. The system consists of three modules: detection, image generation, and layout prediction. As a creative form of interactive art, MaLiang has been exhibited in...

10.1145/3394171.3416338 article EN Proceedings of the 28th ACM International Conference on Multimedia 2020-10-12

In this paper, we use a modified Gaussian filter to improve the enlargement accuracy of the arbitrary scale LP method, which is based on the Laplacian pyramid representation (the so-called “LP method”). The parameters of the proposed algorithm are extracted through theoretical analysis and an experimental estimation. Experimental results show that the proposed algorithm is effective for the LP method.

10.1093/ietfec/e90-a.5.1115 article EN IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences 2007-05-01
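
As background for the abstract above, a basic Laplacian pyramid on a one-dimensional signal might look like the sketch below. It uses a fixed 5-tap binomial approximation of a Gaussian and a factor of two between levels; the paper's modified Gaussian filter and its arbitrary scale enlargement are not reproduced here, and all function names, the kernel, and the edge handling are assumptions chosen for illustration.

```python
# 5-tap binomial kernel, a common stand-in for a Gaussian (assumed, not the paper's filter)
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def blur(signal):
    """Convolve with KERNEL, clamping indices at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(KERNEL):
            idx = min(max(i + j - 2, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out


def pyr_reduce(signal):
    """Low-pass filter, then keep every second sample."""
    return blur(signal)[::2]


def pyr_expand(signal, length):
    """Insert zeros between samples, low-pass filter, and rescale by 2."""
    up = [0.0] * length
    for i, v in enumerate(signal):
        if 2 * i < length:
            up[2 * i] = v
    return [2.0 * v for v in blur(up)]


def build_pyramid(signal, levels):
    """Return (band-pass residuals from fine to coarse, coarsest low-pass signal)."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        smaller = pyr_reduce(current)
        predicted = pyr_expand(smaller, len(current))
        bands.append([c - p for c, p in zip(current, predicted)])
        current = smaller
    return bands, current


def reconstruct(bands, top):
    """Invert build_pyramid by expanding and adding residuals back, coarse to fine."""
    current = top
    for band in reversed(bands):
        predicted = pyr_expand(current, len(band))
        current = [b + p for b, p in zip(band, predicted)]
    return current
```

Each residual band stores what the expanded coarser level fails to predict, so adding the bands back in reverse order recovers the input up to floating-point rounding. The choice of low-pass filter affects how much detail ends up in each band, which is the component the abstract says it modifies.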

Sketch plays a critical role in the human art creation process. As one of the functions of sketch, text-to-sketch may help artists to catch fleeting inspirations efficiently. Different from traditional text2image tasks, sketches consist of only a set of sparse lines and depend on very strict edge information, which requires the model to understand text descriptions accurately and control shape and texture at a fine-grained granularity. However, there was rare previous research on the challenging text2sketch task. In this paper, we...

10.1109/iccvw54120.2021.00277 article EN 2021-10-01