- Natural Language Processing Techniques
- Topic Modeling
- Computational Drug Discovery Methods
- Radiation Effects in Electronics
- Protein Structure and Dynamics
- Multimodal Machine Learning Applications
- Machine Learning in Materials Science
- Fusion materials and technologies
- Nuclear Materials and Properties
- Machine Learning in Bioinformatics
- Reinforcement Learning in Robotics
- Satellite Communication Systems
- Software Engineering Research
- Genomics and Phylogenetic Studies
- Interconnection Networks and Systems
- vaccines and immunoinformatics approaches
- E-commerce and Technology Innovations
- Biomedical Text Mining and Ontologies
- Advanced Bandit Algorithms Research
- Speech and Audio Processing
- Ion-surface interactions and analysis
- Indoor and Outdoor Localization Technologies
- Low-power high-performance VLSI design
- Adaptive Dynamic Programming Control
- Cryptographic Implementations and Security
University of Science and Technology of China
2018-2025
Science and Technology on Surface Physics and Chemistry Laboratory
2023-2024
SP Technology (South Korea)
2023
Tianjin University
2018-2022
State Key Laboratory of Nuclear Physics and Technology
2017-2019
Peking University
2015-2019
Microsoft Research Asia (China)
2019
Xi'an University of Science and Technology
2019
Zhejiang Yuexiu University
2015
Sun Yat-sen University
2009
The recently proposed BERT has shown great power on a variety of natural language understanding tasks, such as text classification, reading comprehension, etc. However, how to effectively apply BERT to neural machine translation (NMT) lacks enough exploration. While BERT is more commonly used for fine-tuning instead of as a contextual embedding for downstream language understanding tasks, in NMT, our preliminary exploration shows that using BERT as a contextual embedding is better than using it for fine-tuning. This motivates us to think about how to better leverage BERT for NMT along this direction. We propose a new algorithm named...
While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited. In this paper, we present a novel data augmentation method for neural machine translation. Different from previous augmentation methods that randomly drop, swap or replace words with other words in a sentence, we softly augment a randomly chosen word in a sentence by a contextual mixture of multiple related words. More accurately, we replace the one-hot representation of a word by a distribution (provided by a language model) over...
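The soft replacement described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `soft_embedding` and `soft_augment`, the replacement probability `gamma`, and the per-position distributions `lm_dists` are assumptions made for the example. The idea shown is that a chosen word's embedding becomes the expectation of the embedding table under a distribution over the vocabulary, instead of a single one-hot lookup.

```python
import random

def soft_embedding(dist, embedding_table):
    """Expected embedding under a distribution over the vocabulary.

    dist[j] is the probability of vocabulary item j (hypothetically supplied
    by a language model); embedding_table[j] is that item's embedding vector.
    """
    dim = len(embedding_table[0])
    return [
        sum(p * embedding_table[j][d] for j, p in enumerate(dist))
        for d in range(dim)
    ]

def soft_augment(token_ids, lm_dists, embedding_table, gamma, rng):
    """Replace each token's embedding with a soft one with probability gamma.

    gamma and rng are assumptions of this sketch; the abstract only says that
    the augmented word is chosen at random.
    """
    embedded = []
    for pos, tok in enumerate(token_ids):
        if rng.random() < gamma:
            # Soft version: mixture of embeddings weighted by the distribution.
            embedded.append(soft_embedding(lm_dists[pos], embedding_table))
        else:
            # Ordinary one-hot lookup.
            embedded.append(list(embedding_table[tok]))
    return embedded

# Toy usage with a 3-word vocabulary and 2-dimensional embeddings.
toy_table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_dists = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
augmented = soft_augment([2, 0], toy_dists, toy_table, 0.5, random.Random(0))
```

With `gamma` set to 0 the function reduces to a plain embedding lookup, which makes it easy to compare against an unaugmented baseline.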
Multimodal Sentiment Analysis (MSA) is a challenging research area that studies sentiment expressed from multiple heterogeneous modalities. Given that pre-trained language models such as BERT have shown state-of-the-art (SOTA) performance in multiple NLP disciplines, existing models tend to integrate these modalities into BERT and treat the MSA as a single prediction task. However, we find that simply fusing the multimodal features into BERT cannot well establish the power of a strong pre-trained model. Besides, the classification ability of each modality...
Molecular representation learning has attracted much attention recently. A molecule can be viewed as a 2D graph with nodes/atoms connected by edges/bonds, and can also be represented by a 3D conformation with 3-dimensional coordinates of all atoms. We note that most previous work handles 2D and 3D information separately, while jointly leveraging these two sources may foster a more informative representation. In this work, we explore this appealing idea and propose a new representation learning method based on a unified 2D and 3D pre-training. Atom coordinates and interatomic...
In pixel-based reinforcement learning (RL), the states are raw video frames, which are mapped into a hidden representation before feeding to a policy network. To improve sample efficiency of state representation learning, recently, the most prominent work is based on contrastive unsupervised representation learning. Witnessing that consecutive video frames in a game are highly correlated, to further improve data efficiency, we propose a new algorithm, i.e., masked contrastive representation learning for RL (M-CURL), which takes the correlation among consecutive inputs into consideration. In our architecture,...
Molecular pre-training, which aims to learn an effective representation for molecules from a large amount of data, has attracted substantial attention in cheminformatics and bioinformatics. A molecule can be viewed as either a graph (where atoms are connected by bonds) or a SMILES sequence (where depth-first-search is applied to the molecular graph with specific rules). The Transformer and graph neural networks (GNN) are two representative methods to deal with sequential data and graphic data, which can globally and locally model molecules respectively and are supposed...
Accurate prediction of drug-target affinity (DTA) is of vital importance in early-stage drug discovery, facilitating the identification of drugs that can effectively interact with specific targets and regulate their activities. While wet experiments remain the most reliable method, they are time-consuming and resource-intensive, resulting in limited data availability that poses challenges for deep learning approaches. Existing methods have primarily focused on developing techniques based on the available DTA data,...
Tritium (T) is a costly radioactive element that, when retained in plasma-facing materials (PFMs), not only results in fuel loss but also raises issues of contamination. Hydrogen isotope exchange is a potential method for T removal in future fusion devices. However, in the nuclear environment, PFMs will be subjected to low-energy and high-flux helium (He) plasma irradiation, forming a He bubble layer near the material surface. This greatly impacts the diffusion and retention behavior of hydrogen isotopes in PFMs. In this...
Yingce Xia, Xu Tan, Fei Tian, Fei Gao, Di He, Weicong Chen, Yang Fan, Linyuan Gong, Yichong Leng, Renqian Luo, Yiren Wang, Lijun Wu, Jinhua Zhu, Tao Qin, Tie-Yan Liu. Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1). 2019.
Recent research trends in computational biology have increasingly focused on integrating text and bio-entity modeling, especially in the context of molecules and proteins. However, previous efforts like BioT5 faced challenges in generalizing across diverse tasks and lacked a nuanced understanding of molecular structures, particularly in their textual representations (e.g., IUPAC). This paper introduces BioT5+, an extension of the BioT5 framework, tailored to enhance biological research and drug discovery. BioT5+ incorporates several...
Precisely predicting the drug-drug interaction (DDI) is an important application and hot research topic in drug discovery, especially for avoiding adverse effects when using drug combination treatment for patients. Nowadays, machine learning and deep learning methods have achieved great success in DDI prediction. However, we notice that most of the works ignore the importance of the relation type when building the DDI prediction models. In this work, we propose a novel R$^2$-DDI framework, which introduces a relation-aware feature refinement module...
The interaction between drugs and targets (DTI) in the human body plays a crucial role in biomedical science and applications. As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from biomedical literature, which is usually expressed as triplets about drugs, targets and their interaction, becomes an urgent demand in the industry. Existing methods of discovering biological knowledge are mainly extractive approaches that often require detailed annotations (e.g. all mentions of biological entities, relations between every two entity mentions, etc.). However, it...
Accurately solving the structures of protein complexes is crucial for understanding and further modifying biological activities. Recent success of AlphaFold and its variants shows that deep learning models are capable of accurately predicting protein complex structures, yet with the painstaking effort of homology search and pairing. To bypass this need, we present Uni-Fold MuSSe (Multimer with Single Sequence inputs), which predicts protein complex structures from their primary sequences with the aid of pre-trained protein language models. Specifically, we built...
Antibodies are proteins that effectively protect the human body by binding to pathogens. Recently, deep learning-based computational antibody design has attracted popular attention since it automatically mines the antibody patterns from data that could be complementary to human experiences. However, the computational methods heavily rely on high-quality antibody structure data, which is quite limited. Besides, the complementarity-determining region (CDR), which is the key component of an antibody that determines the specificity and binding affinity, is highly variable and hard to predict....
A Viterbi decoder is used in many communication receivers to efficiently decode the received signal that has been convolutionally encoded in the transmitter. This decoding corrects errors that occur due to noise and other imperfections in the channel and is key to achieving a low bit error rate. If the decoder is implemented on an SRAM-based field-programmable gate array (SRAM-FPGA), radiation-induced soft errors can affect the operation of the decoder by corrupting the configuration memory, which can change the circuit functionality and will not be corrected unless the FPGA...
Inspired by its success in natural language processing and computer vision, pre-training has attracted substantial attention in cheminformatics and bioinformatics, especially for molecule based tasks. A molecule can be represented by either a graph (where atoms are connected by bonds) or a SMILES sequence (where depth-first-search is applied to the molecular graph with specific rules). Existing works on molecule pre-training use either graph representations only or SMILES representations only. In this work, we propose to leverage both the representations and design a new pre-training algorithm, dual-view molecule pre-training (briefly, DMP), that...
Recently, various auxiliary tasks have been proposed to accelerate representation learning and improve sample efficiency in deep reinforcement learning (RL). However, existing auxiliary tasks do not take the characteristics of RL problems into consideration and are unsupervised. By leveraging returns, the most important feedback signals in RL, we propose a novel auxiliary task that forces the learnt representations to discriminate state-action pairs with different returns. Our auxiliary loss is theoretically justified to learn representations that capture the structure of a new form...