NFDI4DS | UHH-SEMS - Publication Details

Junzhou Zhao

ORCID: 0000-0003-3476-8248

Contact & Profiles

A5007402211

Research Areas

Complex Network Analysis Techniques
Advanced Graph Neural Networks
Opinion Dynamics and Social Influence
Topic Modeling
Caching and Content Delivery
Spam and Phishing Detection
Graph Theory and Algorithms
Privacy-Preserving Technologies in Data
Data Stream Mining Techniques
Peer-to-Peer Network Technologies
Network Security and Intrusion Detection
Internet Traffic Analysis and Secure E-voting
Data Quality and Management
Speech and dialogue systems
Natural Language Processing Techniques
Human Mobility and Location-Based Analysis
Cryptography and Data Security
Artificial Intelligence in Law
AI in Service Interactions
Multimodal Machine Learning Applications
Advanced Image and Video Retrieval Techniques
Data Management and Algorithms
Bayesian Modeling and Causal Inference
Anomaly Detection Techniques and Applications
Bioinformatics and Genomic Networks

Xi'an Jiaotong University
2014-2025

Shanghai University of Engineering Science
2023

King Abdullah University of Science and Technology
2018-2019

Chinese University of Hong Kong
2017-2019

Distinguish Confusing Law Articles for Legal Judgment Prediction

OPENALEX - Publications

Nuo Xu Pinghui Wang Long Chen Li Pan Xiaoyan Wang and 1 more

Legal Judgment Prediction (LJP) is the task of automatically predicting a law case’s judgment results given a text describing its facts, which has great prospects in judicial assistance systems and handy services for the public. In practice, confusing charges are often presented, because cases applicable to similar law articles are easily misjudged. To address this issue, existing work relies heavily on domain experts, which hinders its application in different law systems. In this paper, we present an end-to-end model, LADAN,...

10.18653/v1/2020.acl-main.280 article EN 2020-01-01

Robust Visual Question Answering: Datasets, Methods, and Future Challenges

Jie Ma Pinghui Wang Dechen Kong Zewei Wang Jun Liu and 2 more

Visual question answering requires a system to provide an accurate natural language answer given an image and a question. However, it is widely recognized that previous generic VQA methods often tend to memorize biases present in the training data rather than learning proper behaviors, such as grounding images before predicting answers. Therefore, these methods usually achieve high in-distribution but poor out-of-distribution performance. In recent years, various datasets and debiasing methods have been proposed...

10.1109/tpami.2024.3366154 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-02-15

Efficiently Estimating Motif Statistics of Large Networks

Pinghui Wang John C. S. Lui Bruno Ribeiro Don Towsley Junzhou Zhao and 1 more

Exploring statistics of locally connected subgraph patterns (also known as network motifs) has helped researchers better understand the structure and function of biological networks and Online Social Networks (OSNs). Nowadays, the massive size of some critical networks, often stored in already overloaded relational databases, effectively limits the rate at which nodes and edges can be explored, making it a challenge to accurately discover motif statistics. In this work, we propose sampling methods to estimate these statistics from few queried...

10.1145/2629564 article EN ACM Transactions on Knowledge Discovery from Data 2014-09-23

FuAlign: Cross-lingual entity alignment via multi-view representation learning of fused knowledge graphs

Chenxu Wang Zhenhao Huang Yue Wan Junyu Wei Junzhou Zhao and 1 more

10.1016/j.inffus.2022.08.002 article EN Information Fusion 2022-08-09

MOSS-5: A Fast Method of Approximating Counts of 5-Node Graphlets in Large Graphs

Pinghui Wang Junzhou Zhao Xiangliang Zhang Zhenguo Li Jiefeng Cheng and 4 more

Counting 3-, 4-, and 5-node graphlets in graphs is important for graph mining applications such as discovering abnormal/evolution patterns in social and biology networks. In addition, it has recently been widely used for computing similarities between graphs and for graph classification applications such as protein function prediction and malware detection. However, it is challenging to compute these graphlet counts for a large graph or a large set of graphs due to the combinatorial nature of the problem. Despite recent efforts in counting 3-node and 4-node graphlets, little attention has been paid...

10.1109/tkde.2017.2756836 article EN IEEE Transactions on Knowledge and Data Engineering 2017-09-26

Poisoning Attacks and Defenses to Learned Bloom Filters for Malicious URL Detection

Fang-Ming Dong Pinghui Wang Rundong Li Xu Cui Junzhou Zhao and 3 more

10.1109/tdsc.2025.3528993 article EN IEEE Transactions on Dependable and Secure Computing 2025-01-01

How Vital is the Jurisprudential Relevance: Law Article Intervened Legal Case Retrieval and Matching

Nuo Xu Pinghui Wang Zi Liang Junzhou Zhao Xiaohong Guan

Legal case retrieval aims to automatically scour comparable legal cases based on a given query, which is crucial for offering relevant precedents to support the judgment in intelligent systems. Due to similar goals, it is often associated with the matching task. To address them, a daunting challenge is assessing the uniquely defined legal-rational similarity within the judicial domain, which distinctly deviates from the semantic similarities in general text retrieval. Past works either tagged domain-specific factors or...

10.1145/3725729 article EN ACM transactions on office information systems 2025-03-21

Mobile Data Traffic Prediction by Exploiting Time-Evolving User Mobility Patterns

Feiyang Sun Pinghui Wang Junzhou Zhao Nuo Xu Juxiang Zeng and 5 more

Understanding mobile data traffic and forecasting its future trend is beneficial to wireless carriers and service providers who need to perform resource allocation and energy saving management. However, predicting traffic accurately at large scale and fine granularity is particularly challenging due to the following two factors: the spatial correlations between network units (i.e., a cell tower or an access point) introduced by users' arbitrary movements, and the time-evolving nature of user movements, which frequently changes with time. In...

10.1109/tmc.2021.3079117 article EN IEEE Transactions on Mobile Computing 2021-05-11

Federated Learning over Coupled Graphs

Runze Lei Pinghui Wang Junzhou Zhao Lin Lan Jing Tao and 4 more

Graphs are widely used to represent the relations among entities. When one owns the complete data, an entire graph can be easily built, and therefore performing analysis on the graph is straightforward. However, in many scenarios, it is impractical to centralize the data due to privacy concerns. An organization or party only keeps a part of the whole graph, i.e., the graph data is isolated across different parties. Recently, Federated Learning (FL) has been proposed to solve the data isolation issue, mainly for Euclidean data. It is still a challenge to apply FL to graph data because...

10.1109/tpds.2023.3240527 article EN IEEE Transactions on Parallel and Distributed Systems 2023-01-01

MR-GNN: Multi-Resolution and Dual Graph Neural Network for Predicting Structured Entity Interactions

Nuo Xu Pinghui Wang Long Chen Jing Tao Junzhou Zhao

Predicting interactions between structured entities lies at the core of numerous tasks such as drug regimen and new material design. In recent years, graph neural networks have become attractive. They represent structured entities as graphs and then extract features from each individual graph using graph convolution operations. However, these methods have some limitations: i) their networks only extract features from a fix-sized subgraph structure (i.e., a fix-sized receptive field) of each node, and ignore features in substructures of different sizes, and ii) features are extracted by considering each entity...

10.24963/ijcai.2019/551 preprint EN 2019-07-28

Minfer: A method of inferring motif statistics from sampled edges

Pinghui Wang John C. S. Lui Don Towsley Junzhou Zhao

Characterizing motif (i.e., locally connected sub-graph patterns) statistics is important for understanding complex networks such as online social networks and communication networks. Previous work made the strong assumption that the graph topology of interest is known in advance. In practice, sometimes researchers have to deal with the situation where the graph topology is unknown because it is expensive to collect and store all topological and meta information. Hence, typically what is available is only a snapshot of the graph, i.e., a subgraph of the graph....

10.1109/icde.2016.7498312 article EN 2016-05-01

Approximately Counting Butterflies in Large Bipartite Graph Streams

Rundong Li Pinghui Wang Peng Jia Xiangliang Zhang Junzhou Zhao and 3 more

Bipartite graphs widely exist in real-world scenarios and model binary relations like host-website, author-paper, and user-product. In bipartite graphs, a butterfly (i.e., $2\times 2$ bi-clique) is the smallest non-trivial cohesive structure and plays an important role in applications such as anomaly detection. Considerable efforts focus on counting butterflies in static bipartite graphs. However, they suffer from high time and space complexity when...

10.1109/tkde.2021.3062987 article EN IEEE Transactions on Knowledge and Data Engineering 2021-03-02

Empirical analysis of the evolution of follower network: A case study on Douban

Junzhou Zhao John C. S. Lui Don Towsley Xiaohong Guan Yadong Zhou

Follower networks such as Twitter and Digg are becoming a popular form of social information networks. This paper seeks to gain insights into how they evolve and the relationship between their structure and their ability to spread information. By studying the Douban follower network, which is an online network in China, we provide some evidence showing its suitability for information spreading. For example, it exhibits an unbalanced bow-tie structure with a large out-component, which indicates that the majority of users can spread information widely; its effective diameter...

10.1109/infcomw.2011.5928945 article EN 2011-04-01

A Tale of Three Social Networks: User Activity Comparisons across Facebook, Twitter, and Foursquare

Pinghui Wang Wenbo He Junzhou Zhao

Despite recent efforts to characterize online social network (OSN) structures and activities, user behavior across different OSNs has received little attention. Yet such information could provide insight into issues relating to personal privacy protection. For instance, many Foursquare users reveal their Facebook and Twitter accounts to the public. The authors' in-depth measurement study examines users' activities and settings across Facebook, Twitter, and Foursquare. Results show that these activities are highly correlated among...

10.1109/mic.2013.128 article EN IEEE Internet Computing 2014-01-31

Half-Xor: A Fully-Dynamic Sketch for Estimating the Number of Distinct Values in Big Tables

Pinghui Wang Dongdong Xie Junzhou Zhao Jinsong Li LI Zhi-cheng and 3 more

Calculating the number of distinct values (i.e., NDV) in a column of a big table is costly yet fundamental to a variety of database applications such as data compression and profiling. To reduce the high time and space cost, sketch methods (e.g., HyperLogLog) have been proposed, which estimate the NDV from a constructed compact summary of the values. However, these methods fail or are costly to manage in fully-dynamic scenarios where values are often inserted into and deleted from the table. To solve this issue, we propose a novel sketch method, ...

10.1109/tkde.2024.3359710 article EN IEEE Transactions on Knowledge and Data Engineering 2024-01-29

A New Sketch Method for Measuring Host Connection Degree Distribution

Pinghui Wang Xiaohong Guan Junzhou Zhao Jing Tao Tao Qin

The host connection degree distribution (HCDD) is an important metric for network security monitoring. However, it is difficult to accurately obtain the HCDD in real time for high-speed links with a massive amount of traffic data. In this paper, we propose a new sketch method to build a probabilistic summary of a host's flows using a uniform Flajolet-Martin sketch combined with a small bitmap. To study its performance in comparison with previous sampling and sketch methods, we present a general model that encompasses all these methods. With...

10.1109/tifs.2014.2312544 article EN IEEE Transactions on Information Forensics and Security 2014-03-19

Sampling online social networks by random walk with indirect jumps

Junzhou Zhao Pinghui Wang John C. S. Lui Don Towsley Xiaohong Guan

10.1007/s10618-018-0587-5 article EN Data Mining and Knowledge Discovery 2018-08-30

Modeling the Assimilation-Contrast Effects in Online Product Rating Systems

Xiaoying Zhang Junzhou Zhao John C. S. Lui

The unbiasedness of online product ratings, an important property to ensure that users' ratings indeed reflect their true evaluations of products, is vital both in shaping consumer purchase decisions and in providing reliable recommendations. Recent experimental studies showed that distortions from historical ratings would ruin the unbiasedness of subsequent ratings. How to "discover" these distortions in each single rating (or at the micro-level), and how to perform "debiasing operations" in real rating systems, are the main objectives of this work.

10.1145/3109859.3109885 article EN 2017-08-24

LogLog Filter: Filtering Cold Items within a Large Range over High Speed Data Streams

Peng Jia Pinghui Wang Junzhou Zhao Ye Yuan Jing Tao and 1 more

Many real-world datasets are given in the format of data streams, and processing these streams is fundamental for many applications such as anomaly detection. In this paper, we study the problem of computing item frequencies, finding top-k hot items, and detecting heavy changes. However, widely used sketches incur large memory usage, and their performance is easily affected by the unbalanced distribution of data streams. To solve this issue, a novel method, Cold Filter (CF), was proposed to split out cold items and use separate structure...

10.1109/icde51399.2021.00075 article EN 2021 IEEE 37th International Conference on Data Engineering (ICDE) 2021-04-01

A tale of three graphs: Sampling design on hybrid social-affiliation networks

Junzhou Zhao John C. S. Lui Don Towsley Pinghui Wang Xiaohong Guan

Random walk-based graph sampling methods have become increasingly popular and important for characterizing large-scale complex networks. While powerful, they are known to exhibit problems when the graph is loosely connected, which slows down the convergence of a random walk and can result in poor estimation accuracy. In this work, we observe that many graphs under study, called target graphs, usually do not exist in isolation. In many situations, a target graph is often related to an auxiliary graph and an affiliation graph, and becomes better connected...

10.1109/icde.2015.7113346 article EN 2015-04-01

The quantification of ADAMTS expression in an animal model of cerebral ischemia using real‐time PCR

Yi Tian P. B. Zhang Xue Xiao Jiangshe Zhang Junzhou Zhao and 4 more

Background: ADAMTS1 and ADAMTS8 are proteases involved in extracellular matrix proteolysis and antiangiogenesis, but little is known about their expression and function in cerebral ischemia. We investigated the changes in their expression in a rat model of permanent middle cerebral artery occlusion (pMCAO). The expressions of glyceraldehyde‐3‐phosphate dehydrogenase (GAPDH), β‐actin, cyclophilin, and RPL13A were examined in order to validate appropriate housekeeping genes for long duration after inducing cerebral ischemia. Methods: Male Sprague–Dawley rats...

10.1111/j.1399-6576.2006.01161.x article EN Acta Anaesthesiologica Scandinavica 2006-10-31

Measuring and maximizing group closeness centrality over disk-resident graphs

Junzhou Zhao John C. S. Lui Don Towsley Xiaohong Guan

As an important metric in graphs, group closeness centrality measures how close a group of vertices is to all other vertices in a graph, and it is used in numerous graph applications such as measuring the dominance and influence of a node group over the graph. However, when a large-scale graph contains hundreds of millions of nodes/edges which cannot reside entirely in a computer's main memory, measuring and maximizing group closeness become challenging tasks. In this paper, we present a systematic solution for efficiently calculating and maximizing group closeness over disk-resident graphs. Our solution first leverages...

10.1145/2567948.2579356 article EN 2014-04-07

The Knowledge Alignment Problem: Bridging Human and External Knowledge for Large Language Models

Shuo Zhang Liangming Pan Junzhou Zhao William Yang Wang

Large language models often necessitate grounding on external knowledge to generate faithful and reliable answers. Yet even with the correct groundings in the reference, they can ignore them and rely on wrong groundings or their inherent biases to hallucinate when users, being largely unaware of the specifics of the stored information, pose questions that might not directly correlate with the retrieved groundings. In this work, we formulate this knowledge alignment problem and introduce MixAlign, a framework that interacts with both the human user and the knowledge base to obtain...

10.48550/arxiv.2305.13669 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Token-based deep reinforcement learning for Heterogeneous VRP with Service Time Constraints

Yujun Wang Xiaopeng Hong Yabin Wang Junzhou Zhao Guanghui Sun and 1 more

10.1016/j.knosys.2024.112173 article EN Knowledge-Based Systems 2024-07-01

Whom to follow: Efficient followee selection for cascading outbreak detection on online social networks

Junzhou Zhao John C. S. Lui Don Towsley Xiaohong Guan

10.1016/j.comnet.2014.08.024 article EN Computer Networks 2014-10-05
