Junzhou Zhao

ORCID: 0000-0003-3476-8248
Research Areas
  • Complex Network Analysis Techniques
  • Advanced Graph Neural Networks
  • Opinion Dynamics and Social Influence
  • Topic Modeling
  • Caching and Content Delivery
  • Spam and Phishing Detection
  • Graph Theory and Algorithms
  • Privacy-Preserving Technologies in Data
  • Data Stream Mining Techniques
  • Peer-to-Peer Network Technologies
  • Network Security and Intrusion Detection
  • Internet Traffic Analysis and Secure E-voting
  • Data Quality and Management
  • Speech and dialogue systems
  • Natural Language Processing Techniques
  • Human Mobility and Location-Based Analysis
  • Cryptography and Data Security
  • Artificial Intelligence in Law
  • AI in Service Interactions
  • Multimodal Machine Learning Applications
  • Advanced Image and Video Retrieval Techniques
  • Data Management and Algorithms
  • Bayesian Modeling and Causal Inference
  • Anomaly Detection Techniques and Applications
  • Bioinformatics and Genomic Networks

Xi'an Jiaotong University
2014-2025

Shanghai University of Engineering Science
2023

King Abdullah University of Science and Technology
2018-2019

Chinese University of Hong Kong
2017-2019

Legal Judgement Prediction (LJP) is the task of automatically predicting a law case’s judgment results given a text describing its facts, which has great prospects in judicial assistance systems and handy services for the public. In practice, confusing charges are often presented, because cases applicable to similar law articles are easily misjudged. To address this issue, existing work relies heavily on domain experts, which hinders its application in different law systems. In this paper, we present an end-to-end model, LADAN,...

10.18653/v1/2020.acl-main.280 article EN 2020-01-01

Visual question answering requires a system to provide an accurate natural language answer given an image and a question. However, it is widely recognized that previous generic VQA methods often tend to memorize biases present in the training data rather than learning proper behaviors, such as grounding images before predicting answers. Therefore, these methods usually achieve high in-distribution but poor out-of-distribution performance. In recent years, various datasets and debiasing methods have been proposed...

10.1109/tpami.2024.3366154 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-02-15

Exploring statistics of locally connected subgraph patterns (also known as network motifs) has helped researchers better understand the structure and function of biological networks and Online Social Networks (OSNs). Nowadays, the massive size of some critical networks, often stored in already overloaded relational databases, effectively limits the rate at which nodes and edges can be explored, making it a challenge to accurately discover these statistics. In this work, we propose sampling methods to estimate them from a few queried...

10.1145/2629564 article EN ACM Transactions on Knowledge Discovery from Data 2014-09-23

Counting 3-, 4-, and 5-node graphlets in graphs is important for graph mining applications such as discovering abnormal/evolution patterns in social and biology networks. In addition, it has recently been widely used for computing similarities between graphs in classification applications such as protein function prediction and malware detection. However, it is challenging to compute these graphlet counts for a large graph or a large set of graphs due to the combinatorial nature of the problem. Despite recent efforts in counting 3-node and 4-node graphlets, little attention has been paid...

10.1109/tkde.2017.2756836 article EN IEEE Transactions on Knowledge and Data Engineering 2017-09-26

Legal case retrieval aims to automatically scour comparable legal cases based on a given query, which is crucial for offering relevant precedents to support the judgment in intelligent legal systems. Due to similar goals, it is often associated with the case matching task. To address them, a daunting challenge is assessing the uniquely defined legal-rational similarity within the judicial domain, which distinctly deviates from the semantic similarities in general text retrieval. Past works either tagged domain-specific factors or...

10.1145/3725729 article EN ACM Transactions on Office Information Systems 2025-03-21

Understanding mobile data traffic and forecasting its future trend is beneficial to wireless carriers and service providers who need to perform resource allocation and energy saving management. However, predicting traffic accurately at large scale and fine granularity is particularly challenging due to the following two factors: the spatial correlations between network units (i.e., a cell tower or an access point) introduced by arbitrary user movements, and the time-evolving nature of user movements, which frequently change with time. In...

10.1109/tmc.2021.3079117 article EN IEEE Transactions on Mobile Computing 2021-05-11

Graphs are widely used to represent the relations among entities. When one owns the complete data, an entire graph can be easily built, and therefore performing analysis on the graph is straightforward. However, in many scenarios, it is impractical to centralize the data due to privacy concerns. An organization or party only keeps a part of the whole graph, i.e., the graph data is isolated across different parties. Recently, Federated Learning (FL) has been proposed to solve the data isolation issue, mainly for Euclidean data. It is still a challenge to apply FL to graph data because...

10.1109/tpds.2023.3240527 article EN IEEE Transactions on Parallel and Distributed Systems 2023-01-01

Predicting interactions between structured entities lies at the core of numerous tasks such as drug regimen and new material design. In recent years, graph neural networks have become attractive. They represent structured entities as graphs and then extract features from each individual graph using graph convolution operations. However, these methods have some limitations: i) they extract features only from a fixed-size subgraph structure (i.e., receptive field) of each node and ignore features in substructures of different sizes, and ii) features are extracted by considering entity...

10.24963/ijcai.2019/551 preprint EN 2019-07-28

Characterizing motif (i.e., locally connected sub-graph patterns) statistics is important for understanding complex networks such as online social networks and communication networks. Previous work made the strong assumption that the graph topology of interest is known in advance. In practice, sometimes researchers have to deal with the situation where the graph topology is unknown because it is expensive to collect and store all topological and meta information. Hence, typically what is available is only a snapshot of the graph, i.e., a subgraph of the graph....

10.1109/icde.2016.7498312 article EN 2016-05-01

Bipartite graphs widely exist in real-world scenarios and model binary relations like host-website, author-paper, and user-product. In bipartite graphs, a butterfly (i.e., a 2×2 bi-clique) is the smallest non-trivial cohesive structure and plays an important role in applications such as anomaly detection. Considerable efforts focus on counting butterflies in static bipartite graphs. However, they suffer from high time and space complexity when...

10.1109/tkde.2021.3062987 article EN IEEE Transactions on Knowledge and Data Engineering 2021-03-02

Follower networks such as Twitter and Digg are becoming a popular form of social information networks. This paper seeks to gain insights into how they evolve and the relationship between their structure and their ability to spread information. By studying the Douban follower network, which is an online social network in China, we provide some evidence showing its suitability for information spreading. For example, it exhibits an unbalanced bow-tie structure with a large out-component, which indicates that the majority of users can spread information widely; the effective diameter...

10.1109/infcomw.2011.5928945 article EN 2011-04-01

Despite recent efforts to characterize online social network (OSN) structures and activities, user behavior across different OSNs has received little attention. Yet such information could provide insight into issues relating to personal privacy protection. For instance, many Foursquare users reveal their Facebook and Twitter accounts to the public. The authors' in-depth measurement study examines users' activities and settings in Facebook, Twitter, and Foursquare. Results show that users' activities are highly correlated among...

10.1109/mic.2013.128 article EN IEEE Internet Computing 2014-01-31

Calculating the number of distinct values (i.e., NDV) in a column of a big table is costly yet fundamental to a variety of database applications such as data compression and profiling. To reduce the high time and space cost, sketch methods (e.g., HyperLogLog) have been proposed, which estimate the NDV from a compact summary constructed from the values. However, these methods fail or struggle to manage fully-dynamic scenarios where values are often inserted into and deleted from the table. To solve this issue, we propose a novel method,...

10.1109/tkde.2024.3359710 article EN IEEE Transactions on Knowledge and Data Engineering 2024-01-29

The host connection degree distribution (HCDD) is an important metric for network security monitoring. However, it is difficult to accurately obtain the HCDD in real time for high-speed links with a massive amount of traffic data. In this paper, we propose a new sketch method to build a probabilistic summary of a host's flows using a uniform Flajolet-Martin sketch combined with a small bitmap. To study its performance in comparison with previous sampling and sketch methods, we present a general model that encompasses all these methods. With...

10.1109/tifs.2014.2312544 article EN IEEE Transactions on Information Forensics and Security 2014-03-19

The unbiasedness of online product ratings, an important property to ensure that users' ratings indeed reflect their true evaluations of products, is vital both in shaping consumer purchase decisions and in providing reliable recommendations. Recent experimental studies showed that distortions from historical ratings would ruin the unbiasedness of subsequent ratings. How to "discover" the distortions in each single rating (i.e., at the micro-level) and how to perform "debiasing operations" in real rating systems are the main objectives of this work.

10.1145/3109859.3109885 article EN 2017-08-24

Many real-world datasets are given in the format of data streams, and processing these streams is fundamental for many applications such as anomaly detection. In this paper, we study the problems of computing item frequencies, finding top-k hot items, and detecting heavy changes. However, widely used sketches incur large memory usage, and their performance is easily affected by the unbalanced distribution of data streams. To solve this issue, a novel method, Cold Filter (CF), is proposed to split cold items and use a separate structure...

10.1109/icde51399.2021.00075 article EN 2021 IEEE 37th International Conference on Data Engineering (ICDE) 2021-04-01

Random walk-based graph sampling methods have become increasingly popular and important for characterizing large-scale complex networks. While powerful, they are known to exhibit problems when the graph is loosely connected, which slows down the convergence of a random walk and can result in poor estimation accuracy. In this work, we observe that many graphs under study, called target graphs, usually do not exist in isolation. In many situations, a target graph is often related to an auxiliary graph and an affiliation graph, and becomes better connected...

10.1109/icde.2015.7113346 article EN 2015-04-01

Background: ADAMTS1 and ADAMTS8 are proteases involved in extracellular matrix proteolysis and antiangiogenesis, but little is known about their expression and function in cerebral ischemia. We investigated the changes in their expression in a rat model of permanent middle cerebral artery occlusion (pMCAO). The expressions of glyceraldehyde‐3‐phosphate dehydrogenase (GAPDH), β‐actin, cyclophilin, and RPL13A were examined in order to validate appropriate housekeeping genes for long duration after inducing cerebral ischemia. Methods: Male Sprague–Dawley rats...

10.1111/j.1399-6576.2006.01161.x article EN Acta Anaesthesiologica Scandinavica 2006-10-31

As an important metric in graphs, group closeness centrality measures how close a group of vertices is to all other vertices in a graph, and it is used in numerous graph applications such as measuring the dominance and influence of a node group over the graph. However, when a large-scale graph contains hundreds of millions of nodes/edges and cannot reside entirely in a computer's main memory, calculating and maximizing group closeness become challenging tasks. In this paper, we present a systematic solution for efficiently calculating and maximizing group closeness for disk-resident graphs. Our solution first leverages...

10.1145/2567948.2579356 article EN 2014-04-07

Large language models often necessitate grounding on external knowledge to generate faithful and reliable answers. Yet even with the correct groundings in the reference, they can ignore them and rely on wrong groundings or their inherent biases to hallucinate when users, being largely unaware of the specifics of the stored information, pose questions that might not directly correlate with the retrieved groundings. In this work, we formulate this alignment problem and introduce MixAlign, a framework that interacts with both the human user and the knowledge base to obtain...

10.48550/arxiv.2305.13669 preprint EN cc-by arXiv (Cornell University) 2023-01-01