- Cloud Computing and Resource Management
- Software-Defined Networks and 5G
- Network Traffic and Congestion Control
- Advanced Graph Neural Networks
- Various Chemistry Research Topics
- Machine Learning in Materials Science
- Computational Drug Discovery Methods
- Scientific Computing and Data Management
- IoT-based Smart Home Systems
- Advanced Neural Network Applications
- Technology and Data Analysis
- Brain Tumor Detection and Classification
- Context-Aware Activity Recognition Systems
- Robotics and Automated Systems
- Experimental Learning in Engineering
- Online Learning and Analytics
- QR Code Applications and Technologies
- Machine Learning in Healthcare
- Advanced Data Storage Technologies
- Internet Traffic Analysis and Secure E-voting
- Data Stream Mining Techniques
- Graph Theory and Algorithms
- Parallel Computing and Optimization Techniques
- Advanced Memory and Neural Computing
- Energy Efficient Wireless Sensor Networks
Hong Kong University of Science and Technology
2020-2025
University of Hong Kong
2020-2025
Imperial College London
2022-2023
Guangzhou Experimental Station
2022-2023
Recent years have witnessed a plethora of learning-based solutions for congestion control (CC) that demonstrate better performance over traditional TCP schemes. However, they fail to provide consistently good convergence properties, including fairness, fast convergence, and stability, due to the mismatch between their objective functions and these properties. Despite being intuitive, integrating these properties into existing learning-based CC is challenging, because: 1) training environments are designed for the optimization of a single flow...
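Fairness among competing flows, one of the convergence properties named above, is commonly quantified with Jain's fairness index; a minimal sketch (the function name is ours, not from the paper):

```python
def jain_fairness(rates):
    """Jain's fairness index over per-flow throughputs: 1.0 when all
    flows get equal rates, approaching 1/n when one flow dominates."""
    n = len(rates)
    total = sum(rates)
    return total * total / (n * sum(r * r for r in rates))

# Four flows sharing a link equally are perfectly fair.
print(jain_fairness([10.0, 10.0, 10.0, 10.0]))  # 1.0
```

A learning-based CC scheme whose reward only tracks per-flow throughput can score well while this index stays low, which is the objective mismatch the abstract points at.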
Graph Neural Networks (GNNs) have emerged as powerful tools to capture structural information from graph-structured data, achieving state-of-the-art performance on applications such as recommendation, knowledge graphs, and search. Graphs in these domains typically contain hundreds of millions of nodes and billions of edges. However, previous GNN systems demonstrate poor scalability because the large, interleaved computation dependencies in training cause significant overhead under current parallelization methods. We...
Mixture-of-Expert (MoE) models outperform conventional models by selectively activating different subnets, named \emph{experts}, on a per-token basis. This gated computation generates dynamic communication patterns that cannot be determined beforehand, challenging existing GPU interconnects, which remain \emph{static} during the distributed training process. In this paper, we advocate for a first-of-its-kind system, called mFabric, that unlocks topology reconfiguration \emph{during} MoE training. Towards this vision, we first...
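The per-token gating the abstract describes can be sketched as top-k expert selection; this is an illustrative sketch of the general MoE routing idea, not mFabric's actual gate:

```python
import math

def topk_gate(logits, k=2):
    """Pick the k highest-scoring experts for one token and
    softmax-normalize their weights. Because the winning experts
    differ token to token, the resulting all-to-all traffic pattern
    is data-dependent and unknown before the batch arrives."""
    ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    mx = max(logits[i] for i in ids)
    exps = [math.exp(logits[i] - mx) for i in ids]
    s = sum(exps)
    return ids, [e / s for e in exps]

# Experts 1 and 3 win for this token; another token may route elsewhere.
print(topk_gate([0.1, 2.0, 0.3, 1.5]))
```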
Parameter/gradient exchange plays an important role in large-scale distributed machine learning (DML). However, prior solutions such as parameter server (PS) or ring-allreduce (Ring) fall short since they are not resilient to uncertainties such as oversubscription, congestion, and failures that may occur in datacenter networks (DCNs).
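For reference, the ring-allreduce baseline mentioned above can be simulated in a few lines: a reduce-scatter phase followed by an all-gather phase, each taking n-1 steps around the ring. This is a single-process simulation for illustration, not a distributed implementation:

```python
def ring_allreduce(grads):
    """Simulate ring-allreduce over n workers, each holding a gradient
    vector of length n (one chunk per worker). After reduce-scatter
    plus all-gather, every worker holds the elementwise sum."""
    n = len(grads)
    bufs = [list(g) for g in grads]
    # Reduce-scatter: at each step, worker r sends chunk (r - step) mod n
    # to its successor, which adds it in. Snapshot sends first so all
    # transfers in a step happen "simultaneously".
    for step in range(n - 1):
        sends = [(r, (r - step) % n, bufs[r][(r - step) % n]) for r in range(n)]
        for r, c, v in sends:
            bufs[(r + 1) % n][c] += v
    # All-gather: circulate the fully reduced chunks around the ring.
    for step in range(n - 1):
        sends = [(r, (r + 1 - step) % n, bufs[r][(r + 1 - step) % n]) for r in range(n)]
        for r, c, v in sends:
            bufs[(r + 1) % n][c] = v
    return bufs
```

Note that a single slow or failed link stalls the whole ring, which illustrates why the scheme is fragile under the congestion and failures the abstract highlights.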
In Machine Learning (ML) system research, efficient resource scheduling and utilization have always been an important topic given the compute-intensive nature of ML applications. In this paper, we introduce the design of TACC, a full-stack cloud infrastructure that efficiently manages and executes large-scale machine learning applications in compute clusters. TACC implements a 4-layer application workflow abstraction through which optimization techniques can be dynamically combined and applied to various types...
Communication overhead poses an important obstacle to distributed DNN training and has drawn increasing attention in recent years. Despite continuous efforts, prior solutions such as gradient compression/reduction, compute/communication overlapping, layer-wise flow scheduling, etc., are still coarse-grained and insufficient for efficient training, especially when the network is under pressure. We present DLCP, a novel solution exploiting domain-specific properties of deep learning to optimize communication...
Distributed GNN training tends to generate huge volumes of communication. To reduce communication cost, state-of-the-art sampling-based techniques sample and retrieve only a subset of nodes. However, our analysis shows that current sampling algorithms are still inefficient in network usage for distributed training, mainly because of three problems: first, they overlook the locality of sampled neighbor nodes within a cluster; second, they transfer data at the coarse-grained graph-node level; third, some mechanisms adopted...
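The sampling step under discussion is, in its plain form, uniform neighbor sampling; a minimal sketch (function and parameter names are ours) of why it drives network traffic in distributed training:

```python
import random

def sample_neighbors(adj, seeds, fanout, seed=0):
    """Uniform neighbor sampling as used by sampling-based GNN training:
    for each seed node, draw up to `fanout` neighbors. In a distributed
    setting, the union of sampled nodes that live on remote machines is
    exactly what must be fetched over the network."""
    rng = random.Random(seed)
    sampled = {}
    for v in seeds:
        nbrs = adj.get(v, [])
        sampled[v] = rng.sample(nbrs, min(fanout, len(nbrs)))
    return sampled
```

Because the draw ignores where neighbors physically reside, two sampled neighbors on the same remote machine are fetched as separate node-level requests, which is the locality and granularity inefficiency the abstract identifies.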
A rate limiter is required by the RDMA NIC (RNIC) to enforce the rate limits calculated by congestion control. The RNIC expects it to be accurate and scalable: it should precisely shape traffic for numerous flows with minimized resource consumption, thereby mitigating incasts and congestion and improving network performance. Previous works, however, fail to meet these performance requirements while achieving both accuracy and scalability.
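A common software reference point for such a shaper is the token bucket; the sketch below is illustrative of the general mechanism, not the paper's hardware design:

```python
class TokenBucket:
    """Minimal token-bucket shaper. `rate` is tokens (e.g. bytes) added
    per second; `burst` caps how many tokens can accumulate, bounding
    how much traffic may be released at once."""

    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = burst, 0.0

    def allow(self, size, now):
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False
```

Keeping one such bucket per flow is what makes per-flow accuracy costly at scale: with thousands of flows, an RNIC must store and update all the per-bucket state with very limited on-chip resources.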
Rapid and accurate prediction of molecular properties is a fundamental task in drug discovery. In recent years, deep learning-based property prediction methods have received much attention, and their successes have shown that learning representations of molecular structures by applying graph neural networks (GNNs) can achieve better results. However, most previous approaches typically focus on atomic embedding; in this paper, we propose a novel method based on atom-pair embedding and apply it to two types of tasks. Firstly, embedding is done...
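To make the atomic-versus-pair distinction concrete: an atomic embedding learns one vector per atom, whereas a pair-based method operates over the atom pairs of a molecule. A trivial enumeration sketch (purely illustrative, not the paper's method):

```python
from itertools import combinations

def atom_pairs(atoms):
    """Enumerate the unordered atom pairs of a molecule; a pair-based
    embedding learns a representation per such pair rather than per
    atom, capturing pairwise relations directly."""
    return list(combinations(atoms, 2))

# Three atoms yield three unordered pairs.
print(atom_pairs(["C", "O", "N"]))  # [('C', 'O'), ('C', 'N'), ('O', 'N')]
```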