Jinhua Zhu

ORCID: 0000-0003-2157-9077
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Computational Drug Discovery Methods
  • Radiation Effects in Electronics
  • Protein Structure and Dynamics
  • Multimodal Machine Learning Applications
  • Machine Learning in Materials Science
  • Fusion Materials and Technologies
  • Nuclear Materials and Properties
  • Machine Learning in Bioinformatics
  • Reinforcement Learning in Robotics
  • Satellite Communication Systems
  • Software Engineering Research
  • Genomics and Phylogenetic Studies
  • Interconnection Networks and Systems
  • Vaccines and Immunoinformatics Approaches
  • E-commerce and Technology Innovations
  • Biomedical Text Mining and Ontologies
  • Advanced Bandit Algorithms Research
  • Speech and Audio Processing
  • Ion-Surface Interactions and Analysis
  • Indoor and Outdoor Localization Technologies
  • Low-power high-performance VLSI design
  • Adaptive Dynamic Programming Control
  • Cryptographic Implementations and Security

University of Science and Technology of China
2018-2025

Science and Technology on Surface Physics and Chemistry Laboratory
2023-2024

SP Technology (South Korea)
2023

Tianjin University
2018-2022

State Key Laboratory of Nuclear Physics and Technology
2017-2019

Peking University
2015-2019

Microsoft Research Asia (China)
2019

Xi'an University of Science and Technology
2019

Zhejiang Yuexiu University
2015

Sun Yat-sen University
2009

The recently proposed BERT has shown great power on a variety of natural language understanding tasks, such as text classification, reading comprehension, etc. However, how to effectively apply BERT to neural machine translation (NMT) lacks enough exploration. While BERT is more commonly used for fine-tuning than as a contextual embedding for downstream language understanding tasks, in NMT our preliminary exploration finds that using BERT as a contextual embedding works better than fine-tuning. This motivates us to think about how to better leverage BERT for NMT along this direction. We propose a new algorithm named...

10.48550/arxiv.2002.06823 preprint EN public-domain arXiv (Cornell University) 2020-01-01

While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited. In this paper, we present a novel data augmentation method for neural machine translation. Different from previous augmentation methods that randomly drop, swap or replace words with other words in a sentence, we softly augment a randomly chosen word in a sentence by its contextual mixture of multiple related words. More accurately, we replace the one-hot representation of a word by a distribution (provided by a language model) over...

10.18653/v1/p19-1555 preprint EN cc-by 2019-01-01

Multimodal Sentiment Analysis (MSA) is a challenging research area that studies sentiment expressed from multiple heterogeneous modalities. Given that pre-trained language models such as BERT have shown state-of-the-art (SOTA) performance in NLP disciplines, existing methods tend to integrate these modalities into BERT and treat MSA as a single prediction task. However, we find that simply fusing multimodal features into BERT cannot well establish the power of a strong pre-trained model. Besides, the classification ability of each modality...

10.1109/taslp.2022.3178204 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2022-01-01

While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited. In this paper, we present a novel data augmentation method for neural machine translation. Different from previous augmentation methods that randomly drop, swap or replace words with other words in a sentence, we softly augment a randomly chosen word in a sentence by its contextual mixture of multiple related words. More accurately, we replace the one-hot representation of a word by a distribution (provided by a language model) over...

10.48550/arxiv.1905.10523 preprint EN other-oa arXiv (Cornell University) 2019-01-01
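
The soft replacement described in the abstract above can be illustrated with a minimal sketch. It assumes that the distribution ranges over a vocabulary and that the "soft" word is embedded as the probability-weighted average of word embeddings; the toy vocabulary, embedding table, and language-model scores below are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_embeddings(weights, embedding_table):
    """Weighted sum of embedding rows; a one-hot weight vector selects one row."""
    dim = len(embedding_table[0])
    return [
        sum(w * row[d] for w, row in zip(weights, embedding_table))
        for d in range(dim)
    ]


# Hypothetical toy vocabulary of 3 words with 2-dimensional embeddings.
embedding_table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical language-model scores for the chosen position.
lm_logits = [2.0, 1.0, 0.0]
probs = softmax(lm_logits)

# Hard (one-hot) representation of word 0 versus its soft replacement.
one_hot = [1.0, 0.0, 0.0]
hard = mix_embeddings(one_hot, embedding_table)
soft = mix_embeddings(probs, embedding_table)

print("hard embedding:", hard)
print("soft embedding:", soft)
```

With a one-hot weight vector the function returns the original word's embedding unchanged, so the soft version generalizes the usual lookup rather than replacing it.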

Molecular representation learning has attracted much attention recently. A molecule can be viewed as a 2D graph with nodes/atoms connected by edges/bonds, and can also be represented by a 3D conformation with 3-dimensional coordinates of all atoms. We note that most previous work handles 2D and 3D information separately, while jointly leveraging these two sources may foster a more informative representation. In this work, we explore this appealing idea and propose a new representation learning method based on unified 2D and 3D pre-training. Atom coordinates and interatomic...

10.1145/3534678.3539368 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022-08-12

In pixel-based reinforcement learning (RL), the states are raw video frames, which are mapped into a hidden representation before feeding to a policy network. To improve the sample efficiency of state representation learning, recently, the most prominent work is based on contrastive unsupervised representation learning. Witnessing that consecutive video frames in a game are highly correlated, to further improve data efficiency, we propose a new algorithm, i.e., masked contrastive representation learning for RL (M-CURL), which takes the correlation among consecutive inputs into consideration. In our architecture,...

10.1109/tpami.2022.3176413 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-01-01

Molecular pre-training, which aims to learn an effective representation for molecules from a large amount of data, has attracted substantial attention in cheminformatics and bioinformatics. A molecule can be viewed as either a graph (where atoms are connected by bonds) or a SMILES sequence (where depth-first-search is applied to the molecular graph with specific rules). The Transformer and graph neural networks (GNN) are two representative methods to deal with sequential data and graphic data, which globally and locally model the molecule respectively, and are supposed...

10.1145/3580305.3599317 article EN Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

Accurate prediction of drug-target affinity (DTA) is of vital importance in early-stage drug discovery, facilitating the identification of drugs that can effectively interact with specific targets and regulate their activities. While wet experiments remain the most reliable method, they are time-consuming and resource-intensive, resulting in limited data availability that poses challenges for deep learning approaches. Existing methods have primarily focused on developing techniques based on the available DTA data,...

10.1093/bib/bbad386 article EN Briefings in Bioinformatics 2023-09-22

Tritium (T) is a costly radioactive element that, when retained in plasma-facing materials (PFMs), not only results in fuel loss but also raises issues of contamination. Hydrogen isotope exchange is a potential method for T removal in future fusion devices. However, in the nuclear environment, PFMs will be subjected to low-energy and high-flux helium (He) plasma irradiation, forming a He bubble layer near the material surface. This greatly impacts the diffusion and retention behavior of hydrogen isotopes in PFMs. In this...

10.1016/j.nme.2024.101596 article EN cc-by-nc-nd Nuclear Materials and Energy 2024-01-23

Yingce Xia, Xu Tan, Fei Tian, Fei Gao, Di He, Weicong Chen, Yang Fan, Linyuan Gong, Yichong Leng, Renqian Luo, Yiren Wang, Lijun Wu, Jinhua Zhu, Tao Qin, Tie-Yan Liu. Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1). 2019.

10.18653/v1/w19-5348 article EN cc-by 2019-01-01

Recent research trends in computational biology have increasingly focused on integrating text and bio-entity modeling, especially in the context of molecules and proteins. However, previous efforts like BioT5 faced challenges in generalizing across diverse tasks and lacked a nuanced understanding of molecular structures, particularly in their textual representations (e.g., IUPAC). This paper introduces BioT5+, an extension of the BioT5 framework, tailored to enhance biological research and drug discovery. BioT5+ incorporates several...

10.48550/arxiv.2402.17810 preprint EN arXiv (Cornell University) 2024-02-27

Precisely predicting the drug-drug interaction (DDI) is an important application and a hot research topic in drug discovery, especially for avoiding adverse effects when using drug combination treatment for patients. Nowadays, machine learning and deep learning methods have achieved great success in DDI prediction. However, we notice that most of the works ignore the importance of the relation type when building the DDI prediction models. In this work, we propose a novel R$^2$-DDI framework, which introduces a relation-aware feature refinement module...

10.1093/bib/bbac576 article EN Briefings in Bioinformatics 2022-11-28

The interaction between drugs and targets (DTI) in the human body plays a crucial role in biomedical science and applications. As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from the literature, which usually takes the form of triplets about drugs, targets and their interaction, becomes an urgent demand in the industry. Existing methods of discovering biological knowledge are mainly extractive approaches that often require detailed annotations (e.g. all mentions of entities, relations between every two entity mentions, etc.). However, it...

10.1093/bioinformatics/btac648 article EN Bioinformatics 2022-10-04

Accurately solving the structures of protein complexes is crucial for understanding and further modifying biological activities. Recent success of AlphaFold and its variants shows that deep learning models are capable of accurately predicting protein complex structures, yet with the painstaking effort of homology search and pairing. To bypass this need, we present Uni-Fold MuSSe (Multimer with Single Sequence inputs), which predicts protein complex structures from their primary sequences with the aid of pre-trained protein language models. Specifically, we built...

10.1101/2023.02.14.528571 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-02-15

Antibodies are proteins that effectively protect the human body by binding to pathogens. Recently, deep learning-based computational antibody design has attracted popular attention since it automatically mines antibody patterns from data that could be complementary to human experiences. However, computational methods heavily rely on high-quality antibody structure data, which is quite limited. Besides, the complementarity-determining region (CDR), the key component of an antibody that determines its specificity and binding affinity, is highly variable and hard to predict....

10.1145/3580305.3599468 article EN Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

A Viterbi decoder is used in many communication receivers to efficiently decode the received signal that has been convolutionally encoded at the transmitter. This decoding corrects errors that occur due to noise and other imperfections in the channel and is key to achieving a low bit error rate. If the decoder is implemented on an SRAM-based field-programmable gate array (SRAM-FPGA), radiation-induced soft errors can affect the operation of the decoder by corrupting the configuration memory, which can change the circuit functionality and will not be corrected unless the FPGA...

10.1109/tnano.2019.2925872 article EN IEEE Transactions on Nanotechnology 2019-01-01

Inspired by its success in natural language processing and computer vision, pre-training has attracted substantial attention in cheminformatics and bioinformatics, especially for molecule-based tasks. A molecule can be represented by either a graph (where atoms are connected by bonds) or a SMILES sequence (where depth-first-search is applied to the molecular graph with specific rules). Existing works on molecule pre-training use either graph representations only or SMILES representations only. In this work, we propose to leverage both representations and design a new pre-training algorithm, dual-view molecule pre-training (briefly, DMP), that...

10.48550/arxiv.2106.10234 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Recently, various auxiliary tasks have been proposed to accelerate representation learning and improve sample efficiency in deep reinforcement learning (RL). However, existing auxiliary tasks do not take the characteristics of RL problems into consideration and are unsupervised. By leveraging returns, the most important feedback signals in RL, we propose a novel auxiliary task that forces the learnt representations to discriminate state-action pairs with different returns. Our auxiliary loss is theoretically justified to learn representations that capture the structure of a new form...

10.48550/arxiv.2102.10960 preprint EN other-oa arXiv (Cornell University) 2021-01-01