NFDI4DS | UHH-SEMS - Publication Details

Wentao Zhang

ORCID: 0000-0002-7532-5550

About

Contact & Profiles

A5008772211

Research Areas

Advanced Graph Neural Networks
Topic Modeling
Machine Learning and Data Classification
Natural Language Processing Techniques
Recommender Systems and Techniques
Advanced Neural Network Applications
Graph Theory and Algorithms
Domain Adaptation and Few-Shot Learning
Machine Learning and Algorithms
Complex Network Analysis Techniques
Text and Document Classification Technologies
Advanced Image and Video Retrieval Techniques
Generative Adversarial Networks and Image Synthesis
Machine Learning in Materials Science
Lymphoma Diagnosis and Treatment
Systemic Lupus Erythematosus Research
Semantic Web and Ontologies
Data Stream Mining Techniques
Salivary Gland Tumors Diagnosis and Treatment
Multimodal Machine Learning Applications
Rheumatoid Arthritis Research and Therapies
Sentiment Analysis and Opinion Mining
Neural Networks and Applications
Gene expression and cancer classification
Salivary Gland Disorders and Functions

Peking University
2011-2025

Peking Union Medical College Hospital
2011-2024

Chinese Academy of Medical Sciences & Peking Union Medical College
2012-2024

HEC Montréal
2023-2024

Beijing Institute of Big Data Research
2020-2024

UNSW Sydney
2024

Mila - Quebec Artificial Intelligence Institute
2023-2024

Shenzhen University Health Science Center
2024

National Clinical Research Center for Digestive Diseases
2024

Lenovo (China)
2023

American College of Rheumatology classification criteria for Sjögren's syndrome: A data‐driven, expert consensus approach in the Sjögren's International Collaborative Clinical Alliance Cohort

OPENALEX - Publications

Stephen Shiboski Caroline H. Shiboski Lindsey A. Criswell Alan N. Baer Stephen Challacombe and 26 more

We propose new classification criteria for Sjögren's syndrome (SS), which are needed considering the emergence of biologic agents as potential treatments and their associated comorbidity. These criteria target individuals with signs/symptoms suggestive of SS. Criteria are based on expert opinion elicited using the nominal group technique and on analyses of data from the Sjögren's International Collaborative Clinical Alliance. Preliminary validation included comparisons with classifications based on the American–European Consensus Group (AECG) criteria,...

10.1002/acr.21591 article EN Arthritis Care & Research 2012-01-09

Chinese SLE Treatment and Research group (CSTAR) registry: I. Major clinical characteristics of Chinese patients with systemic lupus erythematosus

Ming Li Wentao Zhang Xiaomei Leng Zhijun Li Zhen-Nan Ye and 10 more

The Chinese systemic lupus erythematosus (SLE) treatment and research group (CSTAR) provides major clinical characteristics of SLE in China and establishes a platform to provide resources for future basic studies. CSTAR originated as a multicentre, consecutive, prospective design. The data were collected online from 104 rheumatology centers, which covered 30 provinces in China. The registered patients were required to meet four or more of the American College of Rheumatology (ACR) criteria for the classification of SLE. All centers...

10.1177/0961203313499086 article EN Lupus 2013-08-20

Graph Neural Networks in Recommender Systems: A Survey

Shiwen Wu Fei Sun Wentao Zhang Xu Xie Bin Cui

With the explosive growth of online information, recommender systems play a key role in alleviating such information overload. Due to the important application value of recommender systems, there have always been emerging works in this field. In recommender systems, the main challenge is to learn effective user/item representations from their interactions and side information (if any). Recently, graph neural network (GNN) techniques have been widely utilized in recommender systems, since most of the information in recommender systems essentially has a graph structure and GNNs have superiority in graph representation learning. This article aims to provide...

10.48550/arxiv.2011.02260 preprint EN cc-by-nc-sa arXiv (Cornell University) 2020-01-01

Graph Attention Multi-Layer Perceptron

Wentao Zhang Ziqi Yin Zeang Sheng Yang Li Wen Ouyang and 4 more

Graph neural networks (GNNs) have achieved great success in many graph-based applications. However, the enormous size and high sparsity level of graphs hinder their applications under industrial scenarios. Although some scalable GNNs are proposed for large-scale graphs, they adopt a fixed $K$-hop neighborhood for each node, thus facing the over-smoothing issue when adopting large propagation depths for nodes within sparse regions. To tackle the above issue, we propose a new GNN architecture -- Graph Attention...

10.1145/3534678.3539121 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022-08-12

Li-rich channels as the material gene for facile lithium diffusion in halide solid electrolytes

Yang Guo-hao Xianhui Liang Shisheng Zheng Haibiao Chen Wentao Zhang and 2 more

Halide solid electrolytes have attracted intense research interest recently for application in all-solid-state lithium-ion batteries. Herein, we present a systematic first-principles study of the Li3MX6 (M: multivalent cation; X: halogen anion) halide family that unveils the link between Li-rich channels and ionic conductivity, highlighting the former as a material gene in these compounds. By screening a total of 180 halides for those with high thermodynamic stability, wide electrochemical window, low chemical...

10.1016/j.esci.2022.01.001 article EN cc-by-nc-nd eScience 2022-01-01

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

DeepSeek-AI Xiao Guo Bi Deli Chen Guanting Chen and 83 more

The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large-scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed...

10.48550/arxiv.2401.02954 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Incidence of malignancy in primary Sjogren's syndrome in a Chinese cohort

Wentao Zhang Feng Shi Sheng Yan Yan Zhao Ming Li and 4 more

To evaluate the incidence of malignancies in a cohort of Chinese patients with primary Sjögren's syndrome (pSS) and to identify risk factors for malignancy in pSS patients. A retrospective analysis was carried out in 1320 pSS patients who were recruited at Peking Union Medical College Hospital from 1990 to 2005 and followed up for an average of 4.4 years. Among them, 29 developed malignancies. Standardized incidence ratios (SIRs) were calculated along with 95% CIs. Clinical characteristics were compared between patients with and without malignancies, as well as haematological...

10.1093/rheumatology/kep404 article EN Rheumatology 2009-12-29

A BERT-BiLSTM-CRF Model for Chinese Electronic Medical Records Named Entity Recognition

Wentao Zhang Shaohua Jiang Shan Zhao Kai Hou Yang Liu and 1 more

Named entity recognition is a fundamental task in natural language processing, and many studies have been done on it in recent decades. Previous word representation methods represent words as a single vector of multiple dimensions, which ignores the ambiguity of characters in Chinese. To solve this problem, we apply a BERT-BiLSTM-CRF model to Chinese electronic medical records named entity recognition in this paper. This model enhances the semantic representation by using the BERT pre-trained model, then combines a BiLSTM network with a CRF layer, used as input for...

10.1109/icicta49267.2019.00043 article EN 2019-10-01

Reliable Data Distillation on Graph Convolutional Network

Wentao Zhang Xupeng Miao Yingxia Shao Jiawei Jiang Lei Chen and 2 more

Graph Convolutional Network (GCN) is a widely used method for learning from graph-based data. However, it fails to use the unlabeled data to its full potential, thereby hindering its ability. Given some pseudo labels of the unlabeled data, the GCN can benefit from this extra supervision. Based on Knowledge Distillation and Ensemble Learning, lots of methods use a teacher-student architecture to make better use of the unlabeled data and then make a better prediction. However, these methods introduce unnecessary training costs and a high bias in the student model if the teacher's predictions are unreliable....

10.1145/3318464.3389706 article EN 2020-05-29

PaSca: A Graph Neural Architecture Search System under the Scalable Paradigm

Wentao Zhang Yu Shen Zheyu Lin Yang Li Xiao‐Sen Li and 4 more

Graph neural networks (GNNs) have achieved state-of-the-art performance in various graph-based tasks. However, as mainstream GNNs are designed based on the message passing mechanism, they do not scale well to data size and message passing steps. Although there has been an emerging interest in the design of scalable GNNs, current research focuses on specific GNN designs rather than the general design space, limiting the discovery of potential scalable GNN models. This paper proposes PaSca, a new paradigm and system that offers a principled approach...

10.1145/3485447.3511986 article EN Proceedings of the ACM Web Conference 2022 2022-04-25

Snapshot boosting: a fast ensemble framework for deep neural networks

Wentao Zhang Jiawei Jiang Yingxia Shao Bin Cui

10.1007/s11432-018-9944-x article EN Science China Information Sciences 2019-12-24

Model Degradation Hinders Deep Graph Neural Networks

Wentao Zhang Zeang Sheng Ziqi Yin Yuezihan Jiang Yikuan Xia and 3 more

Graph Neural Networks (GNNs) have achieved great success in various graph mining tasks. However, drastic performance degradation is always observed when a GNN is stacked with many layers. As a result, most GNNs only have shallow architectures, which limits their expressive power and exploitation of deep neighborhoods. Most recent studies attribute the performance degradation to the over-smoothing issue. In this paper, we disentangle the conventional graph convolution operation into two independent operations: Propagation...

10.1145/3534678.3539374 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022-08-12

Graph Condensation: A Survey

Xinyi Gao Junliang Yu Tong Chen Guanhua Ye Wentao Zhang and 1 more

10.1109/tkde.2025.3535877 article EN IEEE Transactions on Knowledge and Data Engineering 2025-01-01

Acceleration Algorithms in GNNs: A Survey

Lu Ma Zeang Sheng Xunkai Li Xinyi Gao Zhezheng Hao and 5 more

10.1109/tkde.2025.3540787 article EN IEEE Transactions on Knowledge and Data Engineering 2025-01-01

Transfer learning in Scalable Graph Neural Network for Improved Physical Simulation

Siqi Shen Yü Liu David R. Biggs Omar Hafez Jiandong Yu and 3 more

In recent years, Graph Neural Network (GNN) based models have shown promising results in simulating the physics of complex systems. However, training dedicated graph network simulators can be costly, as most models are confined to fully supervised training, which requires extensive data generated from traditional simulators. To date, how transfer learning could improve the model performance and training efficiency has remained unexplored. In this work, we introduce a pre-training paradigm for graph network simulators. We propose scalable...

10.48550/arxiv.2502.06848 preprint EN arXiv (Cornell University) 2025-02-07

A Comprehensive Survey on Imbalanced Data Learning

Xinyi Gao Dongting Xie Yihang Zhang Zhengren Wang Conghui He and 2 more

With the expansion of data availability, machine learning (ML) has achieved remarkable breakthroughs in both academia and industry. However, imbalanced data distributions are prevalent in various types of raw data and severely hinder the performance of ML by biasing the decision-making processes. To deepen the understanding of imbalanced data and facilitate related research and applications, this survey systematically analyzes real-world data formats and organizes existing research for different data formats into four distinct categories: re-balancing, feature...

10.48550/arxiv.2502.08960 preprint EN arXiv (Cornell University) 2025-02-12

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

Ling Yang Xinchen Zhang Ye Tian Chao Shang Minghao Xu and 2 more

The remarkable success of the autoregressive paradigm has made significant advancement in Multimodal Large Language Models (MLLMs), with powerful models like Show-o, Transfusion and Emu3 achieving notable progress in unified image understanding and generation. For the first time, we uncover a common phenomenon: the understanding capabilities of MLLMs are typically stronger than their generative capabilities, with a significant gap between the two. Building on this insight, we propose HermesFlow, a simple yet general framework designed to seamlessly...

10.48550/arxiv.2502.12148 preprint EN arXiv (Cornell University) 2025-02-17

HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation

Hao Liu Zhengren Wang Xijing Chen Zhiyu Li Feiyu Xiong and 2 more

Retrieval-Augmented Generation (RAG) systems often struggle with imperfect retrieval, as traditional retrievers focus on lexical or semantic similarity rather than logical relevance. To address this, we propose HopRAG, a novel RAG framework that augments retrieval with reasoning through graph-structured knowledge exploration. During indexing, HopRAG constructs a passage graph, with text chunks as vertices and connections established via LLM-generated pseudo-queries as edges. During retrieval, it employs a retrieve-reason-prune...

10.48550/arxiv.2502.12442 preprint EN arXiv (Cornell University) 2025-02-17

MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification

Linzhuang Sun Hao Liang Jingxuan Wei Bihui Yu Tianpeng Li and 3 more

According to Test-Time Scaling, the integration of External Slow-Thinking with the Verify mechanism has been demonstrated to enhance multi-round reasoning in large language models (LLMs). However, in the multimodal (MM) domain, there is still a lack of a strong MM-Verifier. In this paper, we introduce MM-Verifier and MM-Reasoner to enhance multimodal reasoning through longer inference and more robust verification. First, we propose a two-step MM verification data synthesis method, which combines a simulation-based tree search and uses rejection sampling...

10.48550/arxiv.2502.13383 preprint EN arXiv (Cornell University) 2025-02-18

GRAIN

Wentao Zhang Zhi Yang Yexin Wang Yu Shen Yang Li and 2 more

Data selection methods, such as active learning and core-set selection, are useful tools for improving the data efficiency of deep learning models on large-scale datasets. However, recent deep learning models have moved forward from independent and identically distributed data to graph-structured data, such as social networks, e-commerce user-item graphs, and knowledge graphs. This evolution has led to the emergence of Graph Neural Networks (GNNs) that go beyond the models existing data selection methods are designed for. Therefore, we present GRAIN, an efficient framework that opens...

10.14778/3476249.3476295 article EN Proceedings of the VLDB Endowment 2021-07-01

ALG: Fast and Accurate Active Learning Framework for Graph Convolutional Networks

Wentao Zhang Yu Shen Yang Li Lei Chen Zhi Yang and 1 more

Graph Convolutional Networks (GCNs) have become state-of-the-art methods in many supervised and semi-supervised graph representation learning scenarios. In order to achieve satisfactory performance, GCNs require a sufficient amount of labeled data. However, in real-world scenarios, labeled data is often expensive to obtain. Therefore, we propose ALG, a novel Active Learning framework for GCNs, which employs domain-specific intelligence to achieve much higher performance and efficiency compared to the generic AL frameworks....

10.1145/3448016.3457325 article EN Proceedings of the 2021 International Conference on Management of Data 2021-06-09
