- Topic Modeling
- Natural Language Processing Techniques
- Advanced Text Analysis Techniques
- Text and Document Classification Technologies
- Sentiment Analysis and Opinion Mining
- Protein Structure and Dynamics
- Multimodal Machine Learning Applications
- Bioinformatics and Genomic Networks
- Image Processing Techniques and Applications
- Computational Drug Discovery Methods
- Complex Network Analysis Techniques
- Advanced Computational Techniques and Applications
- Chinese history and philosophy
- Food Quality and Safety Studies
- Geomechanics and Mining Engineering
- Advanced Sensor and Control Systems
- Digital Marketing and Social Media
- Enzyme Structure and Function
- Environmental and Agricultural Sciences
- Optical measurement and interference techniques
- Quantum Information and Cryptography
- Semantic Web and Ontologies
- Impact of Technology on Adolescents
- Access Control and Trust
- Quantum optics and atomic interactions
- Sichuan University (2015-2025)
- State Key Laboratory of Biotherapy (2023-2024)
- Children's Hospital of Philadelphia (2024)
- University of Auckland (2024)
- Central University of Finance and Economics (2018-2024)
- University of Cambridge (2024)
- Wuhan University (2024)
- Dalian University of Technology (2008-2024)
- Beijing Normal University (2015-2024)
- Tang Hospital (2024)
Thanks to the breakthrough of large-scale pre-trained language model (PLM) technology, prompt-based classification tasks, e.g., sentiment analysis and emotion detection, have attracted increasing attention. Such tasks are formalized as masked prediction, which is in line with the pre-training objectives of most models. Thus, one can use a PLM to infer words for the downstream task, then obtain label predictions through manually defined label-word mapping templates. Prompt-based affective computing takes advantage of both...
The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities, and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large collection of permissively licensed GitHub repositories with inspection tools and an opt-out process. We fine-tuned StarCoderBase on 35B Python...
Zhengbao Jiang, Frank Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan, Graham Neubig. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.
While sentiment analysis systems try to determine the polarities of given targets based on key opinion expressions in input texts, in implicit sentiment analysis (ISA) the opinion cues come in an implicit and obscure manner. Thus, detecting implicit sentiment requires common-sense and multi-hop reasoning ability to infer the latent intent of an opinion. Inspired by the recent chain-of-thought (CoT) idea, in this work we introduce a Three-hop Reasoning (THOR) CoT framework to mimic the human-like reasoning process for ISA. We design a three-step prompting principle for THOR to step-by-step induce the aspect,...
Despite the continuing efforts to improve the engagingness and consistency of chit-chat dialogue systems, the majority of current work simply focuses on mimicking human-like responses, leaving understudied the aspects of modeling understanding between interlocutors. The research in cognitive science, instead, suggests that understanding is an essential signal for a high-quality conversation. Motivated by this, we propose P² Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding. Specifically, P² Bot...
Ovarian cancer is one of the most common gynecologic malignancies. Accurate classification of ovarian cancer types (serous carcinoma, mucous carcinoma, endometrioid carcinoma, transparent cell carcinoma) is an essential part of differential diagnosis. Computer-aided diagnosis (CADx) can provide useful advice for pathologists to determine the diagnosis correctly. In our study, we employed a Deep Convolutional Neural Network (DCNN) based on AlexNet to automatically classify ovarian cancers from cytological images. The DCNN consists of five convolutional...
Distinction between true protein interactions and crystal packing contacts is important for structural bioinformatics studies, to respond to the need for accurate classification of the rapidly increasing number of structures. Many are unannotated, and there also exist false annotations in this expanding volume of data. Previous tools have been proposed to address this problem. However, challenging issues still remain, such as low performance when the training and test data contain mixed interfaces having diverse sizes of contact...
Accurate determination of protein–ligand binding affinity is a fundamental problem in biochemistry, useful for many applications including drug design and docking. A number of scoring functions have been proposed for the prediction of binding affinity. However, accurate prediction is still challenging because poor performance is often seen in evaluation under leave-one-cluster-out cross-validation (LCOCV). We introduce a new scoring function named B2BScore to improve the prediction performance. B2BScore integrates two physicochemical properties for the prediction. One...
Hyperentanglement, the entanglement in several degrees of freedom (DOFs) of a quantum system, has attracted much attention as it can be used to largely increase both the channel capacity of quantum communication and its security. Here, we present the first scheme to completely distinguish the hyperentangled Bell states of two-photon systems in three DOFs with the help of cross-Kerr nonlinearity without destruction, including two longitudinal momentum DOFs and the polarization DOF. We use the cross-Kerr nonlinearity to construct nondemolition detectors which make...
In recent years, the task of incomplete utterance rewriting has attracted considerable attention. Previous works usually shape it as a machine translation task and employ a sequence-to-sequence based architecture with a copy mechanism. In this paper, we present a novel and extensive approach, which formulates it as a semantic segmentation task. Instead of generating from scratch, such a formulation introduces edit operations and shapes the problem as the prediction of a word-level edit matrix. Benefiting from being able to capture both local and global information, our approach...
Shuang Chen, Qian Liu, Zhiwei Yu, Chin-Yew Lin, Jian-Guang Lou, Feng Jiang. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations. 2021.
Concept-level sentiment analysis improves on standard word-level opinion mining by leveraging the power of multiword expressions, linguistic objects formed by two or more words that behave like ‘semantic atoms’, displaying formal or functional idiosyncratic properties with respect to free word combinations. The extraction of meaningful multiword expressions from text, however, is not an easy task, as it goes beyond simple n-gram modeling. In the context of sentiment analysis, such concepts are represented by those with high...
The O-ring theory reveals that the binding hot spot at a protein interface is surrounded by a ring of residues that are energetically less important than the residues in the hot spot. As this ring serves to occlude water molecules from the hot spot, the O-ring theory is also called the 'water exclusion' hypothesis. We propose a 'double water exclusion' hypothesis to refine the O-ring theory by assuming that the hot spot itself is water-free. To computationally model a water-free hot spot, we use a biclique pattern, defined as two maximal groups of residues from two chains in a complex holding the property that every residue contacts with all residues in the other group. Given a chain...
Motivation: A B-cell epitope is a small area on the surface of an antigen that binds to an antibody. Accurately locating epitopes is of critical importance for vaccine development. Compared with wet-lab methods, computational methods have strong potential for efficient and large-scale prediction of epitope candidates at much lower cost. However, it is still not clear which features are good determinants for accurate prediction, leading to the unsatisfactory performance of existing methods. Method and results: We propose a more...
While the recent Chain-of-Thought (CoT) technique enhances the reasoning ability of large language models (LLMs) with the theory of mind, it might still struggle in handling logical reasoning that relies much on symbolic expressions and rigid deducing rules. To strengthen the logical reasoning capability of LLMs, we propose a novel Symbolic Chain-of-Thought, namely SymbCoT, a fully LLM-based framework that integrates symbolic expressions and logic rules with CoT prompting. Technically, building upon an LLM, SymbCoT 1) first translates the natural language context into the symbolic format, and then 2)...