Weihua Hu

ORCID: 0000-0003-2956-2616
Research Areas
  • Advanced Graph Neural Networks
  • Machine Learning and Data Classification
  • Atmospheric Chemistry and Aerosols
  • Domain Adaptation and Few-Shot Learning
  • Atmospheric Ozone and Climate
  • Atmospheric and Environmental Gas Dynamics
  • Software-Defined Networks and 5G
  • Machine Learning and Algorithms
  • Advanced Image and Video Retrieval Techniques
  • Data Quality and Management
  • Network Traffic and Congestion Control
  • Educational Technology and Assessment
  • Bayesian Modeling and Causal Inference
  • Algorithms and Data Compression
  • Topic Modeling
  • Data Mining Algorithms and Applications
  • Face and Expression Recognition
  • Machine Learning in Materials Science
  • Seismology and Earthquake Studies
  • Error Correcting Code Techniques
  • Earthquake and Tectonic Studies
  • Higher Education and Teaching Methods
  • Anomaly Detection Techniques and Applications
  • Innovative Educational Techniques
  • Mobile Ad Hoc Networks

James Madison University
2025

Xi'an Polytechnic University
2012-2023

Google (United Kingdom)
2023

DeepMind (United Kingdom)
2023

University Town of Shenzhen
2022

Harbin Institute of Technology
2022

Shenzhen University
2022

Stanford University
2016-2021

The University of Tokyo
1996-2018

RIKEN Center for Advanced Intelligence Project
2017

Graph Neural Networks (GNNs) are an effective framework for representation learning of graphs. GNNs follow a neighborhood aggregation scheme, where the representation vector of a node is computed by recursively aggregating and transforming the representation vectors of its neighboring nodes. Many GNN variants have been proposed and have achieved state-of-the-art results on both node and graph classification tasks. However, despite GNNs revolutionizing graph representation learning, there is limited understanding of their representational properties and limitations. Here, we present...

10.48550/arxiv.1810.00826 preprint EN other-oa arXiv (Cornell University) 2018-01-01
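
The neighborhood aggregation scheme described in the abstract above can be illustrated with a small sketch. This is not the paper's model: the sum aggregator, the inclusion of a node's own vector, and the names `aggregate_step`, `run_gnn`, and `transform` are all assumptions chosen for illustration.

```python
def aggregate_step(features, neighbors, transform):
    """One round of neighborhood aggregation (assumed: sum aggregator)."""
    updated = {}
    for node, vec in features.items():
        agg = list(vec)  # assumption: a node's own vector is included
        for nb in neighbors[node]:
            agg = [a + b for a, b in zip(agg, features[nb])]
        updated[node] = transform(agg)  # transform the aggregated vector
    return updated


def run_gnn(features, neighbors, transform, num_layers):
    """Apply aggregation repeatedly; k rounds reach k-hop neighborhoods."""
    for _ in range(num_layers):
        features = aggregate_step(features, neighbors, transform)
    return features


# Hypothetical toy input: a path graph 0 - 1 - 2 with 1-dimensional features.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
features = {0: [1.0], 1: [2.0], 2: [3.0]}
identity = lambda v: v  # stand-in for a learned transformation

one_hop = run_gnn(features, neighbors, identity, 1)
two_hop = run_gnn(features, neighbors, identity, 2)
print(one_hop)
print(two_hop)
```

In a trained network the `transform` step would be a learned function rather than the identity; the identity is used here only to keep the arithmetic easy to follow.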

Deep learning with noisy labels is practically challenging, as the capacity of deep models is so high that they can totally memorize these noisy labels sooner or later during training. Nonetheless, recent studies on the memorization effects of deep neural networks show that they would first memorize training data of clean labels and then those of noisy labels. Therefore in this paper, we propose a new deep learning paradigm called "Co-teaching" for combating noisy labels. Namely, we train two deep neural networks simultaneously, and let them teach each other given every mini-batch: firstly, each network feeds...

10.5555/3327757.3327944 article EN Neural Information Processing Systems 2018-01-01

We present the Open Graph Benchmark (OGB), a diverse set of challenging and realistic benchmark datasets to facilitate scalable, robust, and reproducible graph machine learning (ML) research. OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains, ranging from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs. For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits...

10.48550/arxiv.2005.00687 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Global medium-range weather forecasting is critical to decision-making across many social and economic domains. Traditional numerical weather prediction uses increased compute resources to improve forecast accuracy but does not directly use historical weather data to improve the underlying model. Here, we introduce GraphCast, a machine learning-based method trained directly from reanalysis data. It predicts hundreds of weather variables for the next 10 days at 0.25° resolution globally in under 1 minute. GraphCast significantly outperforms...

10.1126/science.adi2336 article EN cc-by Science 2023-11-14

Mobile carrier networks follow an architecture where network elements and their interfaces are defined in detail through standardization, but provide limited ways to develop new network features once deployed. In recent years we have witnessed rapid growth in over-the-top mobile applications and a 10-fold increase in subscriber traffic, while ground-breaking network innovation took a back seat. We argue that carrier networks can benefit from advances in computer science and pertinent technology trends by incorporating a new way of thinking in their current...

10.1109/mcom.2013.6553677 article EN IEEE Communications Magazine 2013-07-01

Global medium-range weather forecasting is critical to decision-making across many social and economic domains. Traditional numerical weather prediction uses increased compute resources to improve forecast accuracy, but cannot directly use historical weather data to improve the underlying model. We introduce a machine learning-based method called "GraphCast", which can be trained directly from reanalysis data. It predicts hundreds of weather variables, over 10 days at 0.25 degree resolution globally, in under one minute. We show that...

10.48550/arxiv.2212.12794 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Learning discrete representations of data is a central machine learning task because of the compactness of the representations and ease of interpretation. The task includes clustering and hash learning as special cases. Deep neural networks are promising to be used because they can model the non-linearity of data and scale to large datasets. However, their model complexity is huge, and therefore, we need to carefully regularize the networks in order to learn useful representations that exhibit intended invariance for applications of interest. To this end, we propose a method called Information Maximizing Self-Augmented...

10.48550/arxiv.1702.08720 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Distributionally Robust Supervised Learning (DRSL) is necessary for building reliable machine learning systems. When machine learning is deployed in the real world, its performance can be significantly degraded because test data may follow a different distribution from training data. DRSL with f-divergences explicitly considers the worst-case distribution shift by minimizing the adversarially reweighted training loss. In this paper, we analyze this DRSL, focusing on the classification scenario. Since the DRSL is explicitly formulated for a distribution shift scenario, we naturally expect it to give...

10.48550/arxiv.1611.02041 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Enabling effective and efficient machine learning (ML) over large-scale graph data (e.g., graphs with billions of edges) can have a great impact on both industrial and scientific applications. However, existing efforts to advance large-scale graph ML have been largely limited by the lack of a suitable public benchmark. Here we present OGB Large-Scale Challenge (OGB-LSC), a collection of three real-world datasets for facilitating advancements in large-scale graph ML. The OGB-LSC datasets are orders of magnitude larger than existing ones, covering three core graph learning tasks -- link...

10.48550/arxiv.2103.09430 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Answering complex logical queries on large-scale incomplete knowledge graphs (KGs) is a fundamental yet challenging task. Recently, a promising approach to this problem has been to embed KG entities as well as the query into a vector space such that entities that answer the query are embedded close to the query. However, prior work models queries as single points in the vector space, which is problematic because a complex query represents a potentially large set of its answer entities, but it is unclear how such a set can be represented as a single point. Furthermore, prior work can only handle queries that use conjunctions...

10.48550/arxiv.2002.05969 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Collecting labeled data is costly and thus a critical bottleneck in real-world classification tasks. To mitigate this problem, we propose a novel setting, namely learning from complementary labels for multi-class classification. A complementary label specifies a class that a pattern does not belong to. Collecting complementary labels would be less laborious than collecting ordinary labels, since users do not have to carefully choose the correct class from a long list of candidate classes. However, complementary labels are less informative than ordinary labels and thus a suitable approach is needed to better learn from them. In...

10.48550/arxiv.1705.07541 preprint EN other-oa arXiv (Cornell University) 2017-01-01
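
To make the notion of a complementary label in the abstract above concrete, the following minimal sketch shows why such a label carries less information than an ordinary one: it rules out a single class and leaves every other class as a candidate. The function name `candidate_classes` and the example class names are hypothetical and not taken from the paper.

```python
def candidate_classes(all_classes, complementary_label):
    """Return the classes still possible after one class is ruled out."""
    if complementary_label not in all_classes:
        raise ValueError("complementary label must be one of the known classes")
    return [c for c in all_classes if c != complementary_label]


# Hypothetical three-class problem.
classes = ["cat", "dog", "bird"]

# An ordinary label would pin the pattern to exactly one class;
# the complementary label "dog" only says the pattern is not a dog.
remaining = candidate_classes(classes, "dog")
print(remaining)
```

With K classes, one complementary label leaves K - 1 candidates, so the ambiguity grows with the number of classes.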

We present the Temporal Graph Benchmark (TGB), a collection of challenging and diverse benchmark datasets for realistic, reproducible, and robust evaluation of machine learning models on temporal graphs. TGB datasets are of large scale, spanning years in duration, incorporate both node and edge-level prediction tasks, and cover a diverse set of domains including social, trade, transaction, and transportation networks. For both tasks, we design evaluation protocols based on realistic use-cases. We extensively benchmark each dataset and find that the performance of common models can...

10.48550/arxiv.2307.01026 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Network Function Virtualization (NFV) and Software Defined Network (SDN) technologies make it possible for the Telco Operators to assign resources for virtual network functions (VNF) on demand. Provision and orchestration of physical and virtual resources is crucial for both Quality of Service (QoS) guarantee and cost management in cloud computing environment. Auto-scaling mechanism is essential in the lifecycle management of those VNFs. Threshold based policy is always applied in classic IT environments, which can not satisfy carrier grade requirements such as...

10.1109/glocom.2015.7417181 article EN 2015 IEEE Global Communications Conference (GLOBECOM) 2015-12-01

Distribution shifts -- where the training distribution differs from the test distribution -- can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild. Despite their ubiquity in real-world deployments, these distribution shifts are under-represented in the datasets widely used in the ML community today. To address this gap, we present WILDS, a curated benchmark of 10 datasets reflecting a diverse range of distribution shifts that naturally arise in real-world applications, such as shifts across hospitals for tumor identification; across camera traps for wildlife monitoring; and across time...

10.48550/arxiv.2012.07421 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Uncovering thematic structures of SNS and blog posts is a crucial yet challenging task, because of the severe data sparsity induced by the short length of texts and diverse use of vocabulary. This hinders effective topic inference of traditional LDA because it infers topics based on document-level co-occurrence of words. To robustly infer topics in such contexts, we propose a latent concept topic model (LCTM). Unlike LDA, LCTM reveals topics via co-occurrence of latent concepts, which we introduce as latent variables to capture conceptual similarity of words. More specifically,...

10.18653/v1/p16-2062 article EN cc-by 2016-01-01

Measurements of a suite of atmospheric trace constituents made from the NASA DC‐8 aircraft, while it was making vertical profiles during the Pacific Exploratory Mission A (PEM‐West A) to the western Pacific in September–October 1991, have revealed a layered structure in much of the region. Ozone, water vapor, carbon monoxide, and methane were available continuously and are the primary constituents used to define the layers; nonmethane hydrocarbons, carbon dioxide, hydrogen peroxide, and methylhydroperoxide were available less frequently but are also used. From 105 vertical profiles,...

10.1029/95jd02613 article EN Journal of Geophysical Research Atmospheres 1996-01-01

The DC‐8 mission of September 27, 1991, was designed to sample air flowing into Typhoon Mireille in the boundary layer, air in the upper tropospheric eye region, and air emerging from the typhoon ahead of the system, also in the upper troposphere. The objective was to find how a typhoon redistributes trace constituents in the West Pacific region and whether any such redistribution is important on the global scale. The boundary layer air (300 m), in a region to the SE of the eye, contained low mixing ratios of the tracer species O3, CO, C2H6, C2H2, C3H8, C6H6, and CS2 but high values of dimethylsulfide (DMS). The relative...

10.1029/95jd01374 article EN Journal of Geophysical Research Atmospheres 1996-01-01