Wenbing Huang

ORCID: 0000-0002-2566-4159
Research Areas
  • Advanced Graph Neural Networks
  • Multimodal Machine Learning Applications
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Human Pose and Action Recognition
  • Complex Network Analysis Techniques
  • Machine Learning in Materials Science
  • Computational Drug Discovery Methods
  • Anomaly Detection Techniques and Applications
  • Reinforcement Learning in Robotics
  • Topic Modeling
  • Recommender Systems and Techniques
  • Neural Networks and Applications
  • Robot Manipulation and Learning
  • Brain Tumor Detection and Classification
  • Advanced Image and Video Retrieval Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Machine Learning and ELM
  • Protein Structure and Dynamics
  • Adversarial Robustness in Machine Learning
  • Tactile and Sensory Interactions
  • Advanced Image Processing Techniques
  • Bioinformatics and Genomic Networks
  • Advanced Sensor and Energy Harvesting Materials
  • Robotics and Sensor-Based Localization

Chongqing University
2024-2025

Renmin University of China
2022-2025

Hong Kong Polytechnic University
2024

Zhejiang University
2020-2024

Central South University
2024

Alibaba Group (China)
2024

Jinan University
2024

State Key Laboratory of Chemical Engineering
2023-2024

Tsinghua University
2014-2023

Beijing Institute of Big Data Research
2022-2023

Over-fitting and over-smoothing are two main obstacles to developing deep Graph Convolutional Networks (GCNs) for node classification. In particular, over-fitting weakens the generalization ability on small datasets, while over-smoothing impedes model training by isolating output representations from input features as network depth increases. This paper proposes DropEdge, a novel and flexible technique to alleviate both issues. At its core, DropEdge randomly removes certain...

10.48550/arxiv.1907.10903 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Most state-of-the-art action localization systems process each proposal individually, without explicitly exploiting their relations during learning. However, the relations between proposals actually play an important role in action localization, since a meaningful action always consists of multiple proposals in a video. In this paper, we propose to exploit proposal-proposal relations using Graph Convolutional Networks (GCNs). First, we construct a proposal graph, where each proposal is represented as a node and the relation between two proposals as an edge. Here, we use two types of relations, one for capturing...

10.1109/iccv.2019.00719 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Social media has been developing rapidly in public due to its nature of spreading new information, which leads to rumors being circulated. Meanwhile, detecting rumors from such massive information on social media is becoming an arduous challenge. Therefore, some deep learning methods, such as the Recursive Neural Network (RvNN), are applied to discover rumors through the way they spread. However, these methods only take into account the patterns of propagation but ignore the structures of wide dispersion in rumor detection. Actually, two...

10.1609/aaai.v34i01.5393 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

The richness in the content of various information networks, such as social networks and communication networks, provides unprecedented potential for learning high-quality expressive representations without external supervision. This paper investigates how to preserve and extract abundant information from graph-structured data into an embedding space in an unsupervised manner. To this end, we propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations. GMI...

10.1145/3366423.3380112 article EN 2020-04-20

Unsupervised domain adaptation (UDA) transfers knowledge from a label-rich source domain to a fully-unlabeled target domain. To tackle this task, recent approaches resort to discriminative domain transfer by virtue of pseudo-labels to enforce class-level distribution alignment across the source and target domains. These methods, however, are vulnerable to error accumulation and thus incapable of preserving cross-domain category consistency, as the pseudo-labeling accuracy is not guaranteed explicitly. In this paper, we propose Progressive...

10.1109/cvpr.2019.00072 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

We propose a simple, fast, and accurate one-stage approach to visual grounding, inspired by the following insight. The performance of existing propose-and-rank two-stage methods is capped by the quality of the region candidates they propose in the first stage: if none of the candidates covers the ground-truth region, there is no hope for the second stage to rank the right region to the top. To avoid this caveat, we propose a one-stage model that enables end-to-end joint optimization. The main idea is as straightforward as fusing a text query's embedding into the YOLOv3 object detector, augmented...

10.1109/iccv.2019.00478 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

How to obtain informative representations of molecules is a crucial prerequisite in AI-driven drug design and discovery. Recent research abstracts molecules as graphs and employs Graph Neural Networks (GNNs) for molecular representation learning. Nevertheless, two issues impede the usage of GNNs in real scenarios: (1) insufficient labeled molecules for supervised training; (2) poor generalization capability to newly synthesized molecules. To address them both, we propose a novel framework, GROVER, which stands for Graph Representation frOm...

10.48550/arxiv.2007.02835 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Most recent progress on visual question answering is based on recurrent neural networks (RNNs) with attention. Despite their success, these models are often time-consuming and have difficulty modeling long-range dependencies due to the sequential nature of RNNs. We propose a new architecture, Positional Self-Attention with Coattention (PSAC), which does not require RNNs for video question answering. Specifically, inspired by the success of self-attention in the machine translation task, we calculate the response at each...

10.1609/aaai.v33i01.33018658 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

We address the problem of video grounding from natural language queries. The key challenge in this task is that one training video might only contain a few annotated starting/ending frames that can be used as positive examples for model training. Most conventional approaches directly train a binary classifier using such imbalanced data, thus achieving inferior results. The key idea of this paper is to use the distances between a frame within the ground truth and the starting (ending) frame as dense supervision to improve accuracy. Specifically, we...

10.1109/cvpr42600.2020.01030 preprint EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Despite the recent success of end-to-end learned representations, hand-crafted optical flow features are still widely used in video analysis tasks. To fill this gap, we propose TVNet, a novel trainable neural network, to learn optical-flow-like features from data. TVNet subsumes a specific solver, the TV-L1 method, and is initialized by unfolding its optimization iterations as layers. It can therefore be used directly without any extra learning. Moreover, it can be naturally concatenated with other task-specific networks...

10.1109/cvpr.2018.00630 article EN 2018-06-01

Unsupervised image-to-image translation is a central task in computer vision. Current translation frameworks abandon the discriminator once the training process is completed. This paper contends a novel role for the discriminator by reusing it for encoding the images of the target domain. The proposed architecture, termed NICE-GAN, exhibits two advantageous patterns over previous approaches: first, it is more compact since no independent encoding component is required; second, this plug-in encoder is directly trained by the adversary loss, making it more informative...

10.1109/cvpr42600.2020.00819 preprint EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Graph Convolutional Networks (GCNs) have become a crucial tool for learning representations of graph vertices. The main challenge of adapting GCNs to large-scale graphs is scalability: they incur heavy cost in both computation and memory due to uncontrollable neighborhood expansion across layers. In this paper, we accelerate the training of GCNs by developing an adaptive layer-wise sampling method. By constructing the network layer by layer in a top-down passway, we sample the lower layer conditioned on the top one, where...

10.48550/arxiv.1809.05343 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Many adaptations of transformers have emerged to address single-modal vision tasks, where self-attention modules are stacked to handle input sources like images. Intuitively, feeding multiple modalities of data could improve performance, yet the inner-modal attentive weights may be diluted, which could greatly undermine the final performance. In this paper, we propose a multimodal token fusion method (TokenFusion), tailored for transformer-based vision tasks. To effectively fuse multiple modalities, TokenFusion...

10.1109/cvpr52688.2022.01187 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Predicting Click-Through Rate (CTR) in billion-scale recommender systems poses a long-standing challenge for Graph Neural Networks (GNNs) due to the overwhelming computational complexity involved in aggregating billions of neighbors. To tackle this, GNN-based CTR models usually sample hundreds of neighbors out of the billions to facilitate efficient online recommendations. However, sampling only a small portion of neighbors results in severe bias and a failure to encompass the full spectrum of user or item behavioral patterns. To address this...

10.1145/3589334.3645517 article EN Proceedings of the ACM Web Conference 2024 2024-05-08

Node classification and graph classification are two graph learning problems that predict the class label of a node and of a graph, respectively. A node usually represents a real-world entity, e.g., a user in a social network, or a protein in a protein-protein interaction network. In this work, we consider a more challenging but practically useful setting, in which a node itself is a graph instance. This leads to a hierarchical graph perspective, which arises in many domains such as biological networks and document collections. For example, a group of people with shared interests forms a group,...

10.1145/3308558.3313461 article EN 2019-05-13

With the great success of graph embedding models in both academia and industry, robustness against adversarial attack inevitably becomes a central problem in the graph learning domain. Regardless of the fruitful progress, most current works perform the attack in a white-box fashion: they need to access the model predictions and labels to construct their adversarial loss. However, the inaccessibility of model predictions in real systems makes the white-box attack impractical for a real graph learning system. This paper promotes current frameworks in a more general and flexible sense: we demand to attack various kinds of graph embedding models in a black-box manner. To this...

10.1609/aaai.v34i04.5741 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Deep multimodal fusion, which uses multiple sources of data for classification or regression, has exhibited a clear advantage over its unimodal counterpart in various applications. Yet, current methods, including aggregation-based and alignment-based fusion, are still inadequate in balancing the trade-off between inter-modal fusion and intra-modal processing, incurring a bottleneck in performance improvement. To this end, this paper proposes Channel-Exchanging-Network (CEN), a parameter-free framework that dynamically exchanges...

10.48550/arxiv.2011.05005 preprint EN other-oa arXiv (Cornell University) 2020-01-01

We address the challenging problem of weakly supervised temporal action localization from unconstrained web videos, where only video-level labels are available during training. Inspired by the adversarial erasing strategy in semantic segmentation, we propose a novel iterative-winners-out network. Specifically, we make two technical contributions: an iterative training strategy, namely, winners-out, to select the most discriminative instances in each iteration and remove them in the next iteration. This process...

10.1109/tip.2019.2922108 article EN IEEE Transactions on Image Processing 2019-06-17

Robotic grasping has become increasingly important in many application areas such as industrial manufacturing and logistics. Because of the diversity and uncertainty of objects and environments, common grippers with a single grasping mode have difficulty fulfilling all tasks. Hence, we proposed a soft gripper with multiple grasping modes in this study. The gripper consists of four modular fingers integrated with a layer jamming structure and a tendon-driven mechanism. The rotating shaft of each finger's base uses a torsional spring to decouple bending...

10.1089/soro.2020.0065 article EN Soft Robotics 2021-06-10

Temporal action localization, which requires a machine to recognize the location as well as the category of action instances in videos, has long been researched in computer vision. The main challenge of temporal action localization lies in the fact that videos are usually long and untrimmed, with diverse contents involved. Existing state-of-the-art methods divide each video into multiple units (i.e., proposals in two-stage methods and segments in one-stage methods) and then perform recognition/regression on them individually, without explicitly exploiting...

10.1109/tpami.2021.3090167 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-06-17

The crux of molecular property prediction is to generate meaningful representations of the molecules. One promising route is to exploit the molecular graph structure through graph neural networks (GNNs). Both atoms and bonds significantly affect the chemical properties of a molecule, so an expressive model ought to exploit both node (atom) and edge (bond) information simultaneously. Inspired by this observation, we explore multi-view modeling with GNN (MVGNN) to form a novel paralleled framework, which considers both atoms and bonds equally important when learning...

10.1093/bioinformatics/btac039 article EN Bioinformatics 2022-01-25

Recently, the Transformer model, which has achieved great success in many artificial intelligence fields, has demonstrated its potential in modeling graph-structured data. Till now, a variety of Transformers has been proposed to adapt to graph-structured data. However, a comprehensive literature review and systematic evaluation of these Transformer variants for graphs are still unavailable. It is imperative to sort out the existing Transformer models for graphs and systematically investigate their effectiveness on various graph tasks. In this survey, we provide a review of Graph Transformer models from...

10.48550/arxiv.2202.08455 preprint EN other-oa arXiv (Cornell University) 2022-01-01