- Advanced Graph Neural Networks
- Complex Network Analysis Techniques
- Topic Modeling
- Natural Language Processing Techniques
- Data Visualization and Analytics
- Multimodal Machine Learning Applications
- Graph Theory and Algorithms
- Bayesian Modeling and Causal Inference
- Opinion Dynamics and Social Influence
- Recommender Systems and Techniques
- Data Quality and Management
- Video Analysis and Summarization
- Bioinformatics and Genomic Networks
- Data Management and Algorithms
- Time Series Analysis and Forecasting
- Data Stream Mining Techniques
- Graph Theory and Applications
- Human Mobility and Location-Based Analysis
- Advanced Text Analysis Techniques
- Advanced Image and Video Retrieval Techniques
- Caching and Content Delivery
- Semantic Web and Ontologies
- Speech and Dialogue Systems
- Domain Adaptation and Few-Shot Learning
- Anomaly Detection Techniques and Applications
Adobe Systems (United States)
2017-2025
Universidade Federal de São Paulo
2024
Northwestern University
2022
Purdue University West Lafayette
2010-2021
Universitat Politècnica de Catalunya
2021
University of Massachusetts Amherst
2021
Southern California University for Professional Studies
2021
University of Southern California
2021
Palo Alto Research Center
2015-2019
California Institute of Technology
2010
NetworkRepository (NR) is the first interactive data repository with a web-based platform for visual analytics. Unlike other repositories (e.g., the UCI ML Data Repository and SNAP), the network repository (networkrepository.com) allows users not only to download data, but also to interactively analyze and visualize such data using our graph analytics platform. Users can analyze, visualize, compare, and explore data in real time along many different dimensions. The aim of NR is to make it easy to discover key insights into the data extremely fast with little...
Networks evolve continuously over time with the addition, deletion, and changing of links and nodes. Although many networks contain this type of temporal information, the majority of research in network representation learning has focused on static snapshots of the graph and has largely ignored the temporal dynamics of the network. In this work, we describe a general framework for incorporating temporal information into network embedding methods. The framework gives rise to methods for learning time-respecting embeddings from continuous-time dynamic networks. Overall, the experiments...
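The idea of a time-respecting walk mentioned in the abstract above can be illustrated with a minimal sketch. It assumes edges are given as `(source, target, timestamp)` triples, that the next edge is chosen uniformly among those whose timestamp is not earlier than the last one used, and that ties in time are allowed; the function name `temporal_walk` and these choices are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict


def temporal_walk(edges, start, max_len, rng=None):
    """Sample one walk whose edge timestamps never decrease.

    Assumption: uniform choice among the valid outgoing edges.
    """
    rng = rng or random.Random(0)
    out = defaultdict(list)
    for u, v, t in edges:
        out[u].append((v, t))

    walk, node, now = [start], start, float("-inf")
    while len(walk) < max_len:
        # Only edges that occur at or after the current time are valid.
        candidates = [(v, t) for v, t in out[node] if t >= now]
        if not candidates:
            break
        node, now = rng.choice(candidates)
        walk.append(node)
    return walk


# Hypothetical toy stream: the edge c -> a happens too early to be followed.
edges = [("a", "b", 1), ("b", "c", 2), ("c", "a", 0), ("b", "d", 0)]
walk = temporal_walk(edges, "a", max_len=5)
print(walk)
```

Walks sampled this way could then be fed to any skip-gram style embedding routine in place of ordinary random walks.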
Graph classification is a problem with practical applications in many different domains. To solve this problem, one usually calculates certain graph statistics (i.e., graph features) that help discriminate between graphs of different classes. When calculating such features, most existing approaches process the entire graph. In a graphlet-based approach, for instance, the entire graph is processed to get the total count of different graphlets or subgraphs. In real-world applications, however, graphs can be noisy, with discriminative patterns confined to certain regions...
From social science to biology, numerous applications often rely on graphlets for intuitive and meaningful characterization of networks at both the global macro-level as well as the local micro-level. While graphlets have witnessed a tremendous success and impact in a variety of domains, there has yet to be a fast and efficient approach for computing the frequencies of these subgraph patterns. However, existing methods are not scalable to large networks with millions of nodes and edges, which impedes the application of graphlets to new problems that require large-scale...
Graph-structured data arise naturally in many different application domains. By representing data as graphs, we can capture entities (i.e., nodes) as well as their relationships (i.e., edges) with each other. Many useful insights can be derived from graph-structured data, as demonstrated by an ever-growing body of work focused on graph mining. However, in the real world, graphs can be both large, with many complex patterns, and noisy, which can pose a problem for effective graph mining. An effective way to deal with this issue is to incorporate "attention" into graph mining...
Graph Neural Networks (GNNs) have proven to be useful for many different practical applications. However, many existing GNN models have implicitly assumed homophily among the nodes connected in the graph, and therefore have largely overlooked the important setting of heterophily, where most connected nodes are from different classes. In this work, we propose a novel framework called CPGNN that generalizes GNNs for graphs with either homophily or heterophily. The proposed framework incorporates an interpretable compatibility matrix for modeling the heterophily or homophily level in the graph, which...
Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this article, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing, defining distinct facets of harm and introducing several...
The "pre-train, prompt, predict" paradigm of large language models (LLMs) has achieved remarkable success in open-domain question answering (OD-QA). However, few works explore this paradigm in multi-document question answering (MD-QA), a task demanding a thorough understanding of the logical associations among the contents and structures of documents. To fill this crucial gap, we propose a Knowledge Graph Prompting (KGP) method to formulate the right context in prompting LLMs for MD-QA, which consists of a graph construction module and a graph traversal module....
Given a large time-evolving graph, how can we model and characterize the temporal behaviors of individual nodes (and network states)? How can we model the behavioral transition patterns of nodes? We propose a temporal behavior model that captures the "roles" of nodes in the graph and how they evolve over time. The proposed dynamic behavioral mixed-membership model (DBMM) is scalable, fully automatic (no user-defined parameters), non-parametric/data-driven (no specific functional form or parameterization), interpretable (identifies explainable patterns), and flexible...
Roles represent node-level connectivity patterns such as star-center nodes, star-edge nodes, near-cliques, or nodes that act as bridges to different regions of the graph. Intuitively, two nodes belong to the same role if they are structurally similar. Roles have been mainly of interest to sociologists, but more recently, roles have become increasingly useful in other domains. Traditionally, the notions of roles were defined based on graph equivalences such as structural, regular, and stochastic equivalences. We briefly revisit these early notions...
This paper describes a general framework for learning Higher-Order Network Embeddings (HONE) from graph data based on network motifs. The HONE framework is highly expressive and flexible with many interchangeable components. The experimental results demonstrate the effectiveness of learning higher-order network representations. In all cases, HONE outperforms recent embedding methods that are unable to capture higher-order structures, with a mean relative gain in AUC of 19% (and up to 75% gain) across a wide variety of networks and embedding methods.
We propose a fast, parallel maximum clique algorithm for large sparse graphs that is designed to exploit characteristics of social and information networks. Despite clique's status as an NP-hard problem with poor approximation guarantees, our method exhibits nearly linear runtime scaling over real-world networks ranging from 1000 to 100 million nodes. In a test on a network with 1.8 billion edges, the algorithm finds the largest clique in about 20 minutes. Key to the efficiency of our algorithm are an initial heuristic procedure that finds a large clique quickly and a parallelized...
We present a fast, parallel maximum clique algorithm for large sparse graphs that is designed to exploit characteristics of social and information networks. The method exhibits a roughly linear runtime scaling over real-world networks ranging from a thousand to a hundred million nodes. In a test on a network with 1.8 billion edges, the algorithm finds the largest clique in about 20 minutes. At its heart, the algorithm employs a branch-and-bound strategy with novel and aggressive pruning techniques. The pruning techniques include the combined use of core numbers...
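To make the role of core numbers in pruning concrete, here is a minimal sketch. It relies on a standard fact: every vertex of a clique of size k has core number at least k - 1, so once a clique of size `best` is known, any vertex with core number below `best` cannot belong to a larger clique. The quadratic peeling loop and the helper names `core_numbers` and `prune_by_core` are illustrative assumptions; this is not the authors' implementation, which the abstract describes only in outline.

```python
def core_numbers(adj):
    """Compute core numbers by repeatedly peeling a minimum-degree vertex.

    adj maps each vertex to the set of its neighbours (undirected graph).
    This simple version is quadratic; it is meant for illustration only.
    """
    deg = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=deg.get)
        k = max(k, deg[v])          # core numbers never decrease while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return core


def prune_by_core(adj, core, best):
    """Keep only vertices that could still lie in a clique larger than best."""
    return {v for v in adj if core[v] >= best}


# Hypothetical toy graph: a triangle {0, 1, 2} with a path 2 - 3 - 4 attached.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
core = core_numbers(adj)
upper_bound = max(core.values()) + 1      # no clique can exceed this size
survivors = prune_by_core(adj, core, best=2)
print(core, upper_bound, survivors)
```

In a branch-and-bound search, a bound like this would typically be re-applied each time a larger clique is found, shrinking the search space as the incumbent improves.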
The success of deep convolutional neural networks in the domains of computer vision and speech recognition has led researchers to investigate generalizations of the said architecture to graph-structured data. A recently proposed method called Graph Convolutional Networks has been able to achieve state-of-the-art results in the task of node classification. However, since the proposed method relies on localized first-order approximations of spectral graph convolutions, it is unable to capture higher-order interactions between nodes in the graph....
Recent advances in contrastive representation learning over paired image-text data have led to models such as CLIP that achieve state-of-the-art performance for zero-shot classification and distributional robustness. Such models typically require joint reasoning in the image and text representation spaces for downstream inference tasks. Contrary to prior beliefs, we demonstrate that the image and text representations learned via a standard contrastive objective are not interchangeable and can lead to inconsistent downstream predictions. To mitigate this issue, we formalize...
Using histological sections, the gonads of samples of yellow and silver eels from two populations were examined. The samples were previously analysed for growth and sex ratio. The structures observed are similar to those described in previous publications for the European eel, Anguilla anguilla, and indicated for the Pacific eel, A. japonica. Well differentiated gonads are present in silver eels. In yellow eels, ranging age from 0 + 2 years a length 20 cm that at which they become silver, undifferentiated both found. Histological evidence is presented that suggests ovary,...
This paper presents a general inductive graph representation learning framework called DeepGL for learning deep node and edge features that generalize across networks. In particular, DeepGL begins by deriving a set of base features from the graph (e.g., graphlet features) and automatically learns a multi-layered hierarchical graph representation where each successive layer leverages the output from the previous layer to learn features of a higher order. Contrary to previous work, DeepGL learns relational...
We propose Graph Priority Sampling (GPS), a new paradigm for order-based reservoir sampling from massive graph streams. GPS provides a general way to weight edge sampling according to auxiliary and/or size variables so as to accomplish various estimation goals of graph properties. In the context of subgraph counting, we show how edge sampling weights can be chosen so as to minimize the estimation variance of counts of specified sets of subgraphs. In distinction with many prior graph sampling schemes, GPS separates the functions of edge sampling and subgraph estimation. We propose two estimation frameworks: (1) Post-Stream estimation,...
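A minimal sketch of order-based (priority) reservoir sampling over an edge stream may help illustrate the abstract above. It follows the generic priority sampling scheme: each arriving edge with weight w receives priority w / u for u drawn uniformly from (0, 1], the m highest-priority edges are retained, and the largest evicted priority serves as a threshold for estimating inclusion probabilities as min(1, w / threshold). The unit default weight and the function name `priority_sample` are assumptions for illustration; the topology-dependent weights and subgraph estimators described in the abstract are not reproduced here.

```python
import heapq
import random


def priority_sample(stream, m, weight=lambda edge: 1.0, rng=None):
    """Keep the m highest-priority edges from a stream.

    Returns (edge, estimated inclusion probability) pairs.
    Assumption: unit weights unless a weight function is supplied.
    """
    rng = rng or random.Random(0)
    heap = []            # min-heap of (priority, edge, weight)
    threshold = 0.0      # largest priority evicted so far
    for edge in stream:
        w = weight(edge)
        u = 1.0 - rng.random()          # uniform on (0, 1], avoids division by zero
        item = (w / u, edge, w)
        if len(heap) < m:
            heapq.heappush(heap, item)
        else:
            evicted = heapq.heappushpop(heap, item)
            threshold = max(threshold, evicted[0])
    return [
        (edge, min(1.0, w / threshold) if threshold > 0 else 1.0)
        for _, edge, w in heap
    ]


# Hypothetical stream of 100 path edges, reservoir of size 10.
stream = [(i, i + 1) for i in range(100)]
sample = priority_sample(stream, m=10)
# Horvitz-Thompson style estimate of the number of edges seen.
estimated_edges = sum(1.0 / p for _, p in sample)
print(len(sample), estimated_edges)
```

Separating the sampling step from the estimate, as done here with the inverse-probability sum, mirrors the separation of sampling and estimation that the abstract highlights.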
Random walks are at the heart of many existing network embedding methods. However, such algorithms have many limitations that arise from the use of random walks, e.g., the features resulting from these methods are unable to transfer to new nodes and graphs as they are tied to vertex identity. In this work, we introduce the Role2Vec framework, which uses the flexible notion of attributed random walks and serves as a basis for generalizing existing methods such as DeepWalk, node2vec, and many others that leverage random walks. Our proposed framework enables these methods to be more widely applicable for both transductive and inductive...
Scientific data repositories have historically made data widely accessible to the scientific community, and have led to better research through comparisons, reproducibility, as well as further discoveries and insights. Despite the growing importance and utilization of data repositories in many scientific disciplines, the design of existing data repositories has not changed for decades. In this paper, we revisit the current design and envision interactive data repositories, which not only make data accessible, but also provide techniques for interactive data exploration, mining, and visualization in an easy, intuitive,...
Random walks are at the heart of many existing node embedding and network representation learning methods. However, such methods have many limitations that arise from the use of traditional random walks, e.g., the embeddings resulting from these methods capture proximity (communities) among the vertices as opposed to structural similarity (roles). Furthermore, the embeddings are unable to transfer to new nodes and graphs as they are tied to node identity. To overcome these limitations, we introduce...
Structural roles define sets of structurally similar nodes that are more similar to nodes inside the set than outside, whereas communities define sets of nodes with more connections inside the set than outside. Roles based on structural similarity and communities based on proximity are fundamentally different but important complementary notions. Recently, the notion of structural roles has become increasingly important and has gained a lot of attention due to the proliferation of work on learning representations (node/edge embeddings) from graphs that preserve the notion of roles. Unfortunately, recent work has sometimes confused roles with communities (based on proximity) leading...
Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of Statistical Relational Learning (SRL) algorithms to these domains. In this article, we examine and categorize techniques for transforming graph-based relational data to improve SRL algorithms. In particular, appropriate transformations of the nodes, links, and/or features of the data can dramatically affect the capabilities...