Donghong Ji

ORCID: 0000-0003-4020-983X
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Biomedical Text Mining and Ontologies
  • Advanced Graph Neural Networks
  • Text and Document Classification Technologies
  • Data Quality and Management
  • Speech and dialogue systems
  • Sentiment Analysis and Opinion Mining
  • Multimodal Machine Learning Applications
  • Web Data Mining and Analysis
  • Advanced Image and Video Retrieval Techniques
  • Semantic Web and Ontologies
  • Computational Drug Discovery Methods
  • Information Retrieval and Search Behavior
  • AI in cancer detection
  • Image Retrieval and Classification Techniques
  • Spam and Phishing Detection
  • Rough Sets and Fuzzy Logic
  • Complex Network Analysis Techniques
  • Artificial Intelligence in Law
  • Domain Adaptation and Few-Shot Learning
  • Misinformation and Its Impacts
  • Genomics and Phylogenetic Studies
  • Data Management and Algorithms

Wuhan University
2015-2025

Guangdong University of Foreign Studies
2019

Xiamen University
2008

Institute for Infocomm Research
2004-2006

The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled by expert chemistry literature curators,...

10.1186/1758-2946-7-s1-s2 article EN cc-by Journal of Cheminformatics 2015-01-19

So far, named entity recognition (NER) has been involved with three major types, including flat, overlapped (aka nested), and discontinuous NER, which have mostly been studied individually. Recently, a growing interest has been built for unified NER, tackling the above three jobs concurrently with one single model. Current best-performing methods mainly include span-based and sequence-to-sequence models, where unfortunately the former merely focus on boundary identification and the latter may suffer from exposure bias. In this work, we...

10.1609/aaai.v36i10.21344 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Biomedical information extraction (BioIE) is an important task. The aim is to analyze biomedical texts and extract structured information such as named entities and the semantic relations between them. In recent years, pre-trained language models have largely improved the performance of BioIE. However, they neglect to incorporate external structural knowledge, which can provide rich factual information to support the underlying understanding and reasoning for biomedical information extraction. In this paper, we first evaluate current extraction methods, including vanilla...

10.1093/bib/bbaa110 article EN Briefings in Bioinformatics 2020-05-07

Biomedical named entity recognition (BNER) is a crucial initial step of information extraction in the biomedical domain. The task is typically modeled as a sequence labeling problem. Various machine learning algorithms, such as Conditional Random Fields (CRFs), have been successfully used for this task. However, these state-of-the-art BNER systems largely depend on hand-crafted features. We present a recurrent neural network (RNN) framework based on word embeddings and character representation. On top of the...

10.1186/s12859-017-1868-5 article EN cc-by BMC Bioinformatics 2017-10-30

In this paper, we propose to enhance the pair-wise aspect and opinion terms extraction (PAOTE) task by incorporating rich syntactic knowledge. We first build a syntax fusion encoder for encoding syntactic features, including a label-aware graph convolutional network (LAGCN) for modeling the dependency edges and labels, as well as the POS tags, unifiedly, and a local-attention module for better term boundary detection. During pairing, we then adopt Biaffine and Triaffine scoring for high-order aspect-opinion term pairing, in the meantime re-harnessing...

10.24963/ijcai.2021/545 article EN 2021-08-01

Document-level relation extraction aims to detect the relations within one document, which is challenging since it requires complex reasoning using mentions, entities, local and global contexts. Few previous studies have distinguished local and global reasoning explicitly, which may be problematic because they play different roles in intra- and inter-sentence relations. Moreover, the interactions between local and global contexts should be considered since they could help relation reasoning, based on our observation. In this paper, we propose a novel mention-based reasoning (MRN) module...

10.18653/v1/2021.findings-acl.117 article EN cc-by 2021-01-01

Fei Li, ZhiChao Lin, Meishan Zhang, Donghong Ji. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

10.18653/v1/2021.acl-long.372 article EN cc-by 2021-01-01

The task of event extraction contains subtasks including detections for entity mentions, event triggers and argument roles. Traditional methods solve them as a pipeline, which does not make use of task correlation for their mutual benefits. There have been recent efforts towards building a joint model for all tasks. However, due to technical challenges, there has not been work predicting the joint output structure as a single task. We build a first model to this end using a neural transition-based framework, incrementally predicting complex joint structures in...

10.24963/ijcai.2019/753 article EN 2019-07-28

Disease prediction based on Electronic Health Records (EHR) has become one hot research topic in the biomedical community. Existing work mainly focuses on the prediction of one target disease, and little work is proposed for the prediction of multiple associated diseases. Meanwhile, a piece of EHR usually contains two main types of information: the textual description and physical indicators. However, existing work largely adopts statistical models with discrete features from the numerical physical indicators in EHR, and fails to make full use of the textual description information. In this...

10.1186/s12911-019-0765-4 article EN cc-by BMC Medical Informatics and Decision Making 2019-04-01

We consider retrofitting a structure-aware Transformer language model for facilitating end tasks by proposing to exploit syntactic distance to encode both the phrasal constituency and dependency connection into the language model. A middle-layer structural learning strategy is leveraged for structure integration, accomplished with the main semantic task training under a multi-task learning scheme. Experimental results show that the retrofitted structure-aware Transformer language model achieves improved perplexity, meanwhile inducing accurate syntactic phrases. By performing...

10.18653/v1/2020.emnlp-main.168 article EN cc-by 2020-01-01

A majority of research interests in irregular (e.g., nested or discontinuous) named entity recognition (NER) have been paid on nested entities, while discontinuous entities have received limited attention. Existing work for discontinuous NER, however, either suffers from decoding ambiguity or predicting using token-level local features. In this work, we present an innovative model for discontinuous NER based on pointer networks, where the pointer simultaneously decides whether a token at each decoding frame constitutes an entity mention and where the next constituent token is. Our...

10.1609/aaai.v35i14.17513 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Motivation: Entity relation extraction is one of the fundamental tasks in biomedical text mining, which is usually solved by models from natural language processing. Compared with traditional pipeline methods, joint methods can avoid the error propagation from entity to relation, giving better performances. However, the existing joint models are built upon a sequential scheme, and fail to detect the overlapping entities and relations ubiquitous in biomedical texts. The main reason is that sequential models have relatively weaker power in capturing long-range dependencies, which results...

10.1093/bioinformatics/btaa993 article EN Bioinformatics 2020-11-17

Deep learning approaches have demonstrated significant progress in breast cancer histopathological image diagnosis. Training an interpretable diagnosis model using high-resolution histopathological images is still challenging. To alleviate this problem, a novel multi-view attention-guided multiple instance detection network (MA-MIDN) is proposed. The traditional image classification problem is framed as a weakly supervised multiple instance learning (MIL) problem. We first divide each histopathology image into instances and form a corresponding bag to fully...

10.1109/access.2021.3084360 article EN cc-by-nc-nd IEEE Access 2021-01-01

Chemical compound and drug name recognition plays an important role in chemical text mining, and it is the basis for automatic relation extraction and event identification in chemical information processing. So a high-performance named entity recognition system for chemical compound and drug names is necessary. We developed a CHEMDNER system based on mixed conditional random fields (CRF) with word clustering for chemical compound and drug name recognition. For the word clustering, we used Brown's hierarchical algorithm and the Skip-gram model based on deep learning with massive PubMed articles including titles and abstracts. This system achieved...

10.1186/1758-2946-7-s1-s4 article EN cc-by Journal of Cheminformatics 2015-01-19

10.1016/j.ipm.2015.12.012 article EN Information Processing & Management 2016-01-20

10.1016/j.ipm.2020.102415 article EN Information Processing & Management 2020-11-09