NFDI4DS | UHH-SEMS - Publication Details

Zhenyu Yang

ORCID: 0000-0002-6588-3014

About

Contact & Profiles

A5100433327

Research Areas

Topic Modeling
Natural Language Processing Techniques
Multimodal Machine Learning Applications
Advanced Text Analysis Techniques
Advanced Graph Neural Networks
Advanced Data Storage Technologies
Machine Learning in Materials Science
Speech Recognition and Synthesis
Model-Driven Software Engineering Techniques
Simulation Techniques and Applications
Software Engineering Research
Bioinformatics and Genomic Networks
Computational Drug Discovery Methods
Cryptography and Data Security
Complex Network Analysis Techniques
Machine Learning and Data Classification
AI in cancer detection
Network Security and Intrusion Detection
Soil Mechanics and Vehicle Dynamics
Text Readability and Simplification
Speech and dialogue systems
Internet Traffic Analysis and Secure E-voting
Text and Document Classification Technologies
Domain Adaptation and Few-Shot Learning
Software Testing and Debugging Techniques

Shandong University of Technology
2024

Macquarie University
2023-2024

China Mobile (China)
2022-2023

University Town of Shenzhen
2023

Tsinghua University
2022-2023

NetEase (China)
2022

Xinjiang Technical Institute of Physics & Chemistry
2021

Chinese Academy of Sciences
2021

University of Chinese Academy of Sciences
2021

UNSW Sydney
2020-2021

Privacy-Preserving Query over Encrypted Graph-Structured Data in Cloud Computing

OPENALEX - Publications

Ning Cao Zhenyu Yang Cong Wang Kui Ren Wenjing Lou

In the emerging cloud computing paradigm, data owners become increasingly motivated to outsource their complex data management systems from local sites to the commercial public cloud for great flexibility and economic savings. For the consideration of users' privacy, sensitive data have to be encrypted before outsourcing, which makes effective data utilization a very challenging task. In this paper, for the first time, we define and solve the problem of privacy-preserving query over encrypted graph-structured data in cloud computing (PPGQ), and establish a set of strict privacy...

10.1109/icdcs.2011.84 article EN 2011-06-01

LT codes-based secure and reliable cloud storage service

Ning Cao Shucheng Yu Zhenyu Yang Wenjing Lou Yantian Hou

With the increasing adoption of cloud computing for data storage, assuring data service reliability, in terms of data correctness and availability, has been outstanding. While redundancy can be added into the data for reliability, the problem becomes challenging in the "pay-as-you-use" cloud paradigm, where we always want to efficiently resolve it for both corruption detection and data repair. Prior distributed storage systems based on erasure codes or network coding techniques have either high decoding computational cost for data users, or too much burden of data repair and being...

10.1109/infcom.2012.6195814 article EN 2012-03-01

Minimum Entropy Principle Guided Graph Neural Networks

Zhenyu Yang Ge Zhang Jia Wu Jian Yang Quan Z. Sheng and 4 more

Graph neural networks (GNNs) are now the mainstream method for mining graph-structured data and learning low-dimensional node- and graph-level embeddings to serve downstream tasks. However, limited by the bottleneck of interpretability that deep neural networks present, existing GNNs have ignored the issue of estimating the appropriate number of dimensions for the embeddings. Hence, we propose a novel framework called Minimum Graph Entropy principle-guided Dimension Estimation, i.e. MGEDE, that learns the appropriate embedding dimensions for both node and graph representations. In...

10.1145/3539597.3570467 article EN 2023-02-22

An adaptive imputation method of missing data for sparsely retrieved dropouts in treatment policy strategy

Chongfeng Yuan Zhenyu Yang Jiaqing Liu Xiaozhou Li Bokai Chen and 3 more

10.1016/j.cct.2025.107886 article EN Contemporary Clinical Trials 2025-03-01

GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

Jian Ma Yong-lin Deng Chen Chen Nanyang Du H. Lu and 1 more

Posters serve an essential function in marketing and advertising by improving visual communication and brand visibility, thus significantly contributing to industrial design. With the latest developments in controllable T2I diffusion models, research interest has surged in text rendering within synthesized images. Although text rendering accuracy has seen advancements, automatic poster generation remains a relatively untapped area. This paper presents an automatic poster generation framework featuring text rendering capabilities through the use of LLMs. Our framework employs...

10.1609/aaai.v39i6.32636 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

NARRepair: Non-Autoregressive Code Generation Model for Automatic Program Repair

Zhenyu Yang Zhen Yang Zhongxing Yu

With the advancement of deep learning techniques, the performance of Automatic Program Repair (APR) techniques has reached a new level. Previous deep learning-based APR techniques essentially modified program sentences in the Autoregressive (AR) manner, which predicts future values based on past values. Due to the manner of word-by-word generation, the AR-based APR technique has a huge time delay. This negative consequence overshadows the widespread adoption of APR techniques in real-life software development. To address the issue, we aim to apply the Non-Autoregressive (NAR)...

10.48550/arxiv.2406.16526 preprint EN arXiv (Cornell University) 2024-06-24

E-code: Mastering efficient code generation through pretrained models and expert encoder group

Yue Pan Chen Lyu Zhenyu Yang Lantian Li Qi Liu and 1 more

10.1016/j.infsof.2024.107602 article EN Information and Software Technology 2024-10-20

Low-rank and sparse representation based learning for cancer survivability prediction

Jie Yang Jun Ma Khin Than Win Junbin Gao Zhenyu Yang

10.1016/j.ins.2021.10.013 article EN Information Sciences 2021-10-07

Generating Coherent Narratives by Learning Dynamic and Discrete Entity States with a Contrastive Framework

Jian Guan Zhenyu Yang Rongsheng Zhang Zhipeng Hu Minlie Huang

Despite advances in generating fluent texts, existing pretraining models tend to attach incoherent event sequences to involved entities when generating narratives such as stories and news. We conjecture that such issues result from representing entities as static embeddings of superficial words, while neglecting to model their ever-changing states, i.e., the information they carry, as the text unfolds. Therefore, we extend the Transformer model to dynamically conduct entity state updates and sentence realization for narrative generation. We propose a...

10.1609/aaai.v37i11.26509 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

State of the Art and Potentialities of Graph-level Learning

Zhenyu Yang Ge Zhang Jia Wu Jian Yang Quan Z. Sheng and 7 more

Graphs have a superior ability to represent relational data, like chemical compounds, proteins, and social networks. Hence, graph-level learning, which takes a set of graphs as input, has been applied to many tasks including comparison, regression, classification, and more. Traditional approaches to learning a set of graphs heavily rely on hand-crafted features, such as substructures. But while these methods benefit from good interpretability, they often suffer from computational bottlenecks as they cannot skirt the graph isomorphism...

10.48550/arxiv.2301.05860 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Risk Factors Discovery for Cancer Survivability Analysis Using Graph-Rule Mining

Chaoyu Yang Jie Yang Zhenyu Yang

Mining and understanding patients’ disease-development pattern is a major healthcare need. A huge number of research studies have focused on medical resource allocation, survivability prediction, risk management of diagnosis, etc. In this article, we are specifically interested in discovering risk factors for patients with a high probability of developing cancers. We propose a systematic data-driven algorithm that is built around the idea of association rule mining. More precisely, the rule-mining method is firstly applied...

10.1155/2020/2384130 article EN cc-by Mathematical Problems in Engineering 2020-07-31

Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation

Pei Ke Haozhe Ji Zhenyu Yang Yi Huang Junlan Feng and 2 more

Despite the success of text-to-text pre-trained models in various natural language generation (NLG) tasks, the generation performance is largely restricted by the number of labeled data in downstream tasks, particularly in data-to-text generation tasks. Existing works mostly utilize abundant unlabeled structured data to conduct unsupervised pre-training for task adaption, which fail to model the complex relationship between source structured data and target texts. Thus, we introduce self-training as a better few-shot learner than task-adaptive pre-training,...

10.24963/ijcai.2022/580 article EN Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022-07-01

GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

Jian Ma Yong-lin Deng Chen Chen H. Lu Zhenyu Yang

Posters play a crucial role in marketing and advertising, contributing significantly to industrial design by enhancing visual communication and brand visibility. With recent advances in controllable text-to-image diffusion models, more concise research is now focusing on rendering text within synthetic images. Despite improvements in text rendering accuracy, the field of end-to-end poster generation remains underexplored. This complex task involves striking a balance between text rendering accuracy and automated layout to produce...

10.48550/arxiv.2407.02252 preprint EN arXiv (Cornell University) 2024-07-02

E-code: Mastering Efficient Code Generation through Pretrained Models and Expert Encoder Group

Yue Pan Chen Lyu Zhenyu Yang Lantian Li Qi Liu and 1 more

Context: With the waning of Moore's Law, the software industry is placing increasing importance on finding alternative solutions for continuous performance enhancement. The significance and research results of software performance optimization have been on the rise in recent years, especially with the advancement propelled by Large Language Models (LLMs). However, traditional strategies for rectifying performance flaws have shown significant limitations at the competitive code efficiency optimization level, and research on this topic is surprisingly scarce. Objective: This study aims to...

10.48550/arxiv.2408.12948 preprint EN arXiv (Cornell University) 2024-08-23

Design and Testing of a Closed Multi-Channel Air-Blowing Seedling Pick-Up Device for an Automatic Vegetable Transplanter

Bingchao Zhang Xiangyu Wen Yongshuang Wen Xinglong Wang Haoqi Zhu and 2 more

In this study, a closed multi-channel air-blowing plug seedling pick-up device and a combined tray were designed to address the issues of complex structure, high damage rates, and low efficiency in fully automated vegetable transplanter systems. The device operates by sealing the seedlings in the seedling cup, where compressed air is channeled into the sealed cavity through multiple passages during the pick-up process. The upper surface of the seedling is subjected to a uniform force, overcoming the friction and adhesion between the seedling and the tray. This process presses the seedling into the guide tube,...

10.3390/agriculture14101688 article EN cc-by Agriculture 2024-09-26

LaMemo: Language Modeling with Look-Ahead Memory

Haozhe Ji Rongsheng Zhang Zhenyu Yang Zhipeng Hu Minlie Huang

Haozhe Ji, Rongsheng Zhang, Zhenyu Yang, Zhipeng Hu, Minlie Huang. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.

10.18653/v1/2022.naacl-main.422 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

Semantic-Enhanced Explainable Finetuning for Open-Domain Dialogues

Chen Wu Yinhe Zheng Yida Wang Zhenyu Yang Minlie Huang

This paper proposes to combine pretrained language models with the modular dialogue paradigm for open-domain dialogue modeling. Our method, semantic-enhanced finetuning, instantiates conversation understanding, planning, and response generation as a language model finetuning task. At inference, we disentangle semantic and token variations by specifying sampling methods and constraints for each module separately. For training and evaluation, we present X-Weibo, a Chinese multi-turn open-domain dialogue dataset with automatic annotation for emotions, DAs,...

10.48550/arxiv.2106.03065 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Multi-view learning for context-aware extractive summarization

Zhenyu Yang Jie Yang Brian Yecies Wanqing Li

Natural language processing techniques have witnessed a notable success in many applications, such as dialogue generation, machine translation, and document summarization. Among them, the study of document summarization is an active research area, which aims to generate a concise version of the original document to preserve its most informative content. However, existing work fails to consider the interaction among raw data (in terms of contextual similarity and difference), which results in inaccurate and even conflicting outcomes. In this paper,...

10.1109/ssci47803.2020.9308369 article EN 2021 IEEE Symposium Series on Computational Intelligence (SSCI) 2020-12-01

Generating Coherent Narratives by Learning Dynamic and Discrete Entity States with a Contrastive Framework

Jian Guan Zhenyu Yang Rongsheng Zhang Zhipeng Hu Minlie Huang

10.48550/arxiv.2208.03985 preprint EN other-oa arXiv (Cornell University) 2022-01-01

RTJTN: Relational Triplet Joint Tagging Network for Joint Entity and Relation Extraction

Zhenyu Yang Lei Wang Bo Ma Yating Yang Rui Dong and 1 more

Extracting entities and relations from unstructured sentences is one of the most concerned tasks in the field of natural language processing. However, existing works process entity and relation information in a certain order and suffer from error iteration. In this paper, we introduce a relational triplet joint tagging network (RTJTN), which is divided into a joint tagging layer and a judgment layer. In the joint tagging layer, instead of extracting entities and relations separately, we propose a tagging method that allows the model to simultaneously extract entities and relations to prevent error iteration; and, to solve the overlapping...

10.1155/2021/3447473 article EN Computational Intelligence and Neuroscience 2021-01-01

LaMemo: Language Modeling with Look-Ahead Memory

Haozhe Ji Rongsheng Zhang Zhenyu Yang Zhipeng Hu Minlie Huang

Although Transformers with fully connected self-attentions are powerful to model long-term dependencies, they are struggling to scale to long texts with thousands of words in language modeling. One of the solutions is to equip the model with a recurrence memory. However, existing approaches directly reuse hidden states from the previous segment that encodes contexts in a uni-directional way. As a result, this prohibits the memory to dynamically interact with the current context that provides up-to-date information for token prediction. To remedy this issue, we...

10.48550/arxiv.2204.07341 preprint EN other-oa arXiv (Cornell University) 2022-01-01
