NFDI4DS | UHH-SEMS - Publication Details

What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders

OPENALEX - Publications

Jintang Li Ruofan Wu Wangbin Sun Liang Chen Sheng Tian and 4 more

The last years have witnessed the emergence of a promising self-supervised learning strategy, referred to as masked autoencoding. However, there is a lack of theoretical understanding of how masking matters on graph autoencoders (GAEs). In this work, we present masked graph autoencoder (MaskGAE), a self-supervised learning framework for graph-structured data. Different from standard GAEs, MaskGAE adopts masked graph modeling (MGM) as a principled pretext task - masking a portion of edges and attempting to reconstruct the missing part with partially visible, unmasked...

10.1145/3580305.3599546 article EN Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

Hierarchical Dynamic Image Harmonization

Haoxing Chen Zhangxuan Gu Yaohui Li Jun Lan Changhua Meng and 2 more

Image harmonization is a critical task in computer vision, which aims to adjust the foreground to make it compatible with the background. Recent works mainly focus on using global transformations (i.e., normalization and color curve rendering) to achieve visual consistency. However, these models ignore local visual consistency and their huge model sizes limit their harmonization ability on edge devices. In this paper, we propose a hierarchical dynamic network (HDNet) to adapt features from local to global view for better feature transformation in efficient...

10.1145/3581783.3611747 article EN 2023-10-26

Revisit Targeted Model Poisoning on Federated Recommendation: Optimize via Multi-objective Transport

Jiajie Su Chaochao Chen Weiming Liu Zibin Lin Shuheng Shen and 2 more

10.1145/3626772.3657764 article EN Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024-07-10

Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective

Yunfei Liu Jintang Li Yuehe Chen Ruofan Wu Ericbk Wang and 7 more

Graph clustering, a fundamental and challenging task in graph mining, aims to classify nodes into several disjoint clusters. In recent years, graph contrastive learning (GCL) has emerged as a dominant line of research in graph clustering and advances the new state-of-the-art. However, GCL-based methods heavily rely on graph augmentations and contrastive schemes, which may potentially introduce challenges such as semantic drift and scalability issues. Another promising line of research involves the adoption of modularity maximization, a popular and effective measure for...

10.1145/3637528.3671967 article EN cc-by Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2024-08-24

Online Fraud Detection via Test-Time Retrieval-Based Representation Enrichment

Yiran Qiao Ningtao Wang Yuncong Gao Yang Yang Xing Fu and 2 more

Anti-fraud machine learning systems are perpetually confronted with the significant challenge of concept drift, driven by the continuous and intense evolution of fraudulent techniques. That is, outdated models trained on historical fraudulent behaviors often fall short in addressing the evolving tactics of malicious users over time. The key issue lies in effectively tackling the rapid evolution of fraudsters' behaviors to detect these emerging and unforeseen anomalies. In this paper, we propose a solution directly accessing real-time data...

10.1609/aaai.v39i12.33359 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Hunting in the Dark Forest: A Pre-trained Model for On-chain Attack Transaction Detection in Web3

Zhiying Wu Jiajing Wu Hui Zhang Zibin Zheng Weiqiang Wang

10.1145/3696410.3714928 article EN 2025-04-22

Transferable and Forecastable User Targeting Foundation Model

Bin Dou Baokun Wang Yun Zhu Xiang Lin Yike Xu and 9 more

10.1145/3701716.3715266 article EN 2025-05-08

Joint Local Relational Augmentation and Global Nash Equilibrium for Federated Learning with Non-IID Data

Xinting Liao Chaochao Chen Weiming Liu Pengyang Zhou Huabin Zhu and 5 more

Federated learning (FL) is a distributed machine learning paradigm that needs collaboration between a server and a series of clients with decentralized data. To make FL effective in real-world applications, existing work devotes to improving the modeling of decentralized non-IID data. In non-IID settings, there are intra-client inconsistency that comes from the imbalanced data modeling, and inter-client inconsistency among heterogeneous client distributions, which not only hinders sufficient representation of the minority data, but also brings discrepant model...

10.1145/3581783.3612178 preprint EN 2023-10-26

Edge-guided and Class-balanced Active Learning for Semantic Segmentation of Aerial Images

Lianlei Shan Weiqiang Wang Ke Lv Bin Luo

Semantic segmentation requires pixel-level annotation, which is time-consuming. Active Learning (AL) is a promising method for reducing data annotation costs. Due to the gap between aerial and natural images, the previous AL methods are not ideal, mainly caused by unreasonable labeling units and the neglect of class imbalance. Previous labeling units are based on images or regions, which does not consider the characteristics of segmentation tasks and aerial images, i.e., the segmentation network often makes mistakes in the edge region, and the edge of aerial images is often interlaced and irregular. Therefore, an edge-guided labeling unit...

10.48550/arxiv.2405.18078 preprint EN arXiv (Cornell University) 2024-05-28

Quantifying and Defending against Privacy Threats on Federated Knowledge Graph Embedding

Yuke Hu Wei Liang Ruofan Wu Kai Xiao Weiqiang Wang and 3 more

Knowledge Graph Embedding (KGE) is a fundamental technique that extracts expressive representation from knowledge graph (KG) to facilitate diverse downstream tasks. The emerging federated KGE (FKGE) collaboratively trains from distributed KGs held among clients while avoiding exchanging clients' sensitive raw KGs, which can still suffer from privacy threats as evidenced in other federated model trainings (e.g., neural networks). However, quantifying and defending against such privacy threats remain unexplored for FKGE...

10.1145/3543507.3583450 article EN Proceedings of the ACM Web Conference 2023 2023-04-26

Secure Collaborative Learning in Mining Pool via Robust and Efficient Verification

Xiaoli Zhang Zhicheng Xu Hongbing Cheng Tong Che Ke Xu and 4 more

Recently, collaborative learning is proposed to amortize massive computation costs of highly sophisticated artificial intelligence (AI) tasks. To attract lots of participants, researchers investigate blockchains' economic incentives with proof of useful work (PoUW) consensus protocols to motivate substantial numbers of miners in a mining pool to complete AI tasks. However, participants might be untrusted and defraud rewards with as little effort as possible. In the paper, we propose a robust and efficient scheme called RPoL...

10.1109/icdcs57875.2023.00012 article EN 2023-07-01

GUARD: Graph Universal Adversarial Defense

Jintang Li Jie Liao Ruofan Wu Liang Chen Zibin Zheng and 3 more

Graph convolutional networks (GCNs) have been shown to be vulnerable to small adversarial perturbations, which becomes a severe threat and largely limits their applications in security-critical scenarios. To mitigate such a threat, considerable research efforts have been devoted to increasing the robustness of GCNs against adversarial attacks. However, current defense approaches are typically designed to prevent GCNs from untargeted adversarial attacks and focus on overall performance, making it challenging to protect important local nodes from more...

10.1145/3583780.3614903 article EN 2023-10-21

Knowledge-inspired Subdomain Adaptation for Cross-Domain Knowledge Transfer

Liyue Chen Linian Wang Jinyu Xu S.C. Chen Weiqiang Wang and 3 more

Most state-of-the-art deep domain adaptation techniques align source and target samples in a global fashion. That is, after alignment, each source sample is expected to become similar to any target sample. However, global alignment may not always be optimal or necessary in practice. For example, consider cross-domain fraud detection, where there are two types of transactions: credit and non-credit. Aligning credit and non-credit transactions separately may yield better performance than global alignment, as credit transactions are unlikely to exhibit patterns similar to non-credit transactions. To...

10.1145/3583780.3614946 article EN 2023-10-21

Provenance of Training without Training Data: Towards Privacy-Preserving DNN Model Ownership Verification

Yunpeng Liu Kexin Li Zhuotao Liu Bihan Wen Ke Xu and 3 more

In the era of deep learning, it is critical to protect the intellectual property of high-performance deep neural network (DNN) models. Existing proposals, however, are subject to adversarial ownership forgery (e.g., methods based on watermarks or fingerprints) or require full access to the original training dataset for ownership verification (e.g., methods requiring the replay of the learning process). In this paper, we propose a novel Provenance of Training (PoT) scheme, the first empirical study towards verifying DNN model ownership without accessing any training dataset while being...

10.1145/3543507.3583198 article EN cc-by Proceedings of the ACM Web Conference 2023 2023-04-26

Multi-Aspect Heterogeneous Graph Augmentation

Yuchen Zhou Yanan Cao Yongchao Liu Yanmin Shang Peng Zhang and 5 more

Data augmentation has been widely studied as it can be used to improve the generalizability of graph representation learning models. However, existing works focus only on the data augmentation on homogeneous graphs. Data augmentation for heterogeneous graphs remains under-explored. Considering that heterogeneous graphs contain different types of nodes and links, ignoring the type information and directly applying the data augmentation methods of homogeneous graphs to heterogeneous graphs will lead to suboptimal results. In this paper, we propose a novel Multi-Aspect Heterogeneous Graph Augmentation framework named MAHGA....

10.1145/3543507.3583208 article EN cc-by Proceedings of the ACM Web Conference 2023 2023-04-26

Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective

Yunfei Liu Jintang Li Yuehe Chen Ruofan Wu Erfan Wang and 7 more

Graph clustering, a fundamental and challenging task in graph mining, aims to classify nodes into several disjoint clusters. In recent years, graph contrastive learning (GCL) has emerged as a dominant line of research in graph clustering and advances the new state-of-the-art. However, GCL-based methods heavily rely on graph augmentations and contrastive schemes, which may potentially introduce challenges such as semantic drift and scalability issues. Another promising line of research involves the adoption of modularity maximization, a popular and effective measure for...

10.48550/arxiv.2406.14288 preprint EN arXiv (Cornell University) 2024-06-20

Privacy Risks of Federated Knowledge Graph Embedding: New Membership Inference Attacks and Personalized Differential Privacy Defense

Yuke Hu Yang Wang Jian Lou Wei Liang Ruofan Wu and 4 more

10.1109/tdsc.2024.3522025 article EN IEEE Transactions on Dependable and Secure Computing 2024-01-01

E-ANT: A Large-Scale Dataset for Efficient Automatic GUI NavigaTion

Ke Wang Tianyu Xia Zhangxuan Gu Yi Zhao Shuheng Shen and 3 more

Online GUI navigation on mobile devices has driven a lot of attention in recent years since it contributes to many real-world applications. With the rapid development of large language models (LLM), multimodal large language models (MLLM) have tremendous potential on this task. However, existing MLLMs need high quality data to improve its abilities of making correct decisions according to human user inputs. In this paper, we developed a novel and highly valuable dataset, named E-ANT, as the first Chinese GUI navigation dataset that contains real...

10.48550/arxiv.2406.14250 preprint EN arXiv (Cornell University) 2024-06-20

DTFormer: A Transformer-Based Method for Discrete-Time Dynamic Graph Representation Learning

Xi Chen Yun Xiong Siwei Zhang Jiawei Zhang Yao Zhang and 5 more

10.1145/3627673.3679568 article EN 2024-10-20

Multi-Grained Preference Enhanced Transformer for Multi-Behavior Sequential Recommendation

Chuan He Yue Liu Qiang Li Weiqiang Wang Xin Fu and 3 more

Sequential recommendation (SR) aims to predict the next purchasing item according to users' dynamic preference learned from their historical user-item interactions. To improve the performance of recommendation, learning dynamic heterogeneous cross-type behavior dependencies is indispensable for recommender system. However, there still exists some challenges in Multi-Behavior Sequential Recommendation (MBSR). On the one hand, existing methods only model heterogeneous multi-behavior dependencies at behavior-level or item-level, and modelling...

10.48550/arxiv.2411.12179 preprint EN arXiv (Cornell University) 2024-11-18

A Momentum Loss Reweighting Method for Improving Recall

Chenzhi Jiang Yin Jin Ningtao Wang Ruofan Wu Xing Fu and 1 more

In many practical binary classification applications, such as financial fraud detection or medical diagnosis, it is crucial to optimize a model's performance on high-confidence samples whose scores are higher than a specific threshold, which is calculated by a given false positive rate according to practical requirements. However, the proportion of high-confidence samples is typically extremely small, especially in long-tailed datasets, which can lead to poor recall results and an alignment bias between realistic goals and the loss. To address this...

10.1145/3583780.3614764 article EN 2023-10-21

LasTGL: An Industrial Framework for Large-Scale Temporal Graph Learning

Jintang Li Jiawang Dan Ruofan Wu Jing Zhou Sheng Tian and 7 more

Over the past few years, graph neural networks (GNNs) have become powerful and practical tools for learning on (static) graph-structure data. However, many real-world applications, such as social networks and e-commerce, involve temporal graphs where nodes and edges are dynamically evolving. Temporal graph neural networks (TGNNs) have progressively emerged as an extension of GNNs to address time-evolving graphs and have gradually become a trending research topic in both academics and industry. Advancing research and application in such an emerging field necessitates the development of new...

10.48550/arxiv.2311.16605 preprint EN other-oa arXiv (Cornell University) 2023-01-01