Lin Dai

ORCID: 0000-0002-0093-137X
Research Areas
  • Advanced Text Analysis Techniques
  • Natural Language Processing Techniques
  • Topic Modeling
  • Sentiment Analysis and Opinion Mining
  • Biomedical Text Mining and Ontologies
  • Text and Document Classification Technologies
  • Genetics, Bioinformatics, and Biomedical Research
  • Traffic Prediction and Management Techniques
  • Genomics and Phylogenetic Studies
  • Advanced Steganography and Watermarking Techniques
  • Data Mining Algorithms and Applications
  • Machine Learning and Algorithms
  • Evaluation Methods in Various Fields
  • Hate Speech and Cyberbullying Detection
  • Semantic Web and Ontologies
  • Chaos-based Image/Signal Encryption
  • Transportation Planning and Optimization
  • Human Mobility and Location-Based Analysis
  • Social Media and Politics
  • Machine Learning in Bioinformatics
  • Advanced Control Systems Optimization
  • Machine Learning and Data Classification
  • Data Quality and Management
  • Digital Filter Design and Implementation
  • Advanced Computational Techniques and Applications

Beijing Institute of Technology
2010-2024

Northwestern Polytechnical University
2024

Harbin Institute of Technology
2024

Beijing Institute of Genomics
2012-2013

Chinese Academy of Sciences
2012-2013

Zhejiang Gongshang University
2010

The web is loaded daily with a huge volume of data, mainly unstructured text, which significantly increases the need for information extraction and NLP systems. Named-entity recognition is a key step towards efficiently understanding text data while saving time and effort. As a widely used language globally, English dominates most of the research conducted in this field, especially in the biomedical domain. Unlike other languages, Arabic suffers from a lack of resources. This work presents a BERT-based model to...

10.1155/2021/6633213 article EN cc-by Complexity 2021-01-01

The car-sharing system is a popular rental model for cars in shared use. It has become particularly attractive due to its flexibility; that is, the car can be rented and returned anywhere within one of the authorized parking slots. The main objective of this research work is to predict usage at the stations and to investigate the factors that help improve the prediction. Thus, new strategies can be designed to put more cars on the road and leave fewer in the stations. To achieve that, various machine learning models, namely vector autoregression (VAR), support...

10.1155/2022/8843000 article EN cc-by Complexity 2022-01-01

Rice is the most important staple food for a large part of the world's human population and also a key model organism for biological studies of crops as well as other related plants. Here we present RiceWiki (http://ricewiki.big.ac.cn), a wiki-based, publicly editable and open-content platform for community curation of rice genes. Most existing databases are based on expert curation; with the exponentially exploding volume of knowledge and relevant data, however, expert curation becomes increasingly laborious and time-consuming to keep up-to-date,...

10.1093/nar/gkt926 article EN cc-by-nc Nucleic Acids Research 2013-10-16

Politics is one of the hottest and most commonly mentioned and viewed topics on social media networks nowadays. Microblogging platforms like Twitter and Weibo are widely used by many politicians who have a huge number of followers and supporters on those platforms. It is essential to study the supporters' network of political leaders because it can help in decision making when predicting their futures. This study focuses on three famous political leaders of Pakistan, namely, Imran Khan (IK), Maryam Nawaz Sharif (MNS), and Bilawal Bhutto Zardari (BBZ)....

10.1155/2020/9353120 article EN cc-by Scientific Programming 2020-09-01

Context. Social media platforms such as Facebook and Twitter carry a big load of people's opinions about politics and leaders, which makes them a good source of information for researchers to exploit for different tasks that include election predictions. Objective. Identify, categorize, and present a comprehensive overview of the approaches, techniques, and tools used in election predictions on Twitter. Method. Conducted a systematic mapping study (SMS) and provided empirical evidence for the work published between January 2010 and 2021....

10.1155/2021/5565434 article EN cc-by Complexity 2021-01-01

Sentiment classification is an important data mining task. Previous research tried various machine learning techniques but did not make full use of the differences among features. This paper proposes a novel method for improving sentiment classification by exploring the different contributions of features. The method consists of two parts. First, we highlight sentimental features by increasing their weight. Second, we use bagging to construct multiple classifiers on different feature spaces and combine them into an aggregating classifier. Extensive...

10.1109/icdmw.2011.96 article EN 2011-12-01

Summary: Community curation, that is, harnessing community intelligence in knowledge curation, bears great promise in dealing with the flood of biological knowledge. To exploit the full potential of the scientific community for knowledge curation, multiple wikis (bio-wikis) have been built to date. However, none of them has achieved a substantial impact on knowledge curation. One of the major limitations of bio-wikis is insufficient community participation, which is intrinsically because of the lack of explicit authorship and thus no credit for community curation. To increase community curation in bio-wikis, here we develop...

10.1093/bioinformatics/btt284 article EN cc-by-nc Bioinformatics 2013-06-03

Named Entity Disambiguation (NED) refers to the task of resolving multiple named entity mentions in an input-text sequence to their correct references in a knowledge graph. We tackle the NED problem by leveraging two novel objectives for a pre-training framework, and propose a pre-training model. Specifically, the proposed model consists of: (i) concept-enhanced pre-training, aiming at identifying valid lexical semantic relations with concept constraints derived from the external resource Probase; (ii) masked language model,...

10.1109/access.2020.2994247 article EN cc-by IEEE Access 2020-01-01

Based on an analysis of existing opinion mining techniques and their accuracy and complexity, this paper proposes a system which mines useful information from camera reviews by utilizing Semantic Role Labeling (SRL) and a polarity computing method. A feature lexicon and a sentiment lexicon are constructed to mine features and emotional items. In the end, a positive-versus-negative comparison is presented visually. Our experimental results show that the system is feasible and effective.

10.1109/fskd.2010.5569525 article EN 2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery 2010-08-01

Harnessing community intelligence in knowledge curation bears significant promise in dealing with communication and education in the flood of scientific knowledge. As knowledge is accumulated at ever-faster rates, scientific nomenclature, a particular kind of knowledge, is concurrently generated in all kinds of fields. Since nomenclature is a system of terms used to name things in a particular discipline, accurate translation of scientific nomenclature in different languages is of critical importance, not only for communications and collaborations with English-speaking people, but also...

10.1371/journal.pone.0056961 article EN cc-by PLoS ONE 2013-02-25

With the rapid growth of online news services, users can actively respond to online news by making comments. Users often express subjective emotions in comments, such as sadness, surprise and anger. Such emotions can help understand the preferences and perspectives of individual users, and therefore may help online publishers provide users with more relevant services. This paper tackles the task of predicting emotions for the comments of online news. To the best of our knowledge, this is the first research work addressing the task. In particular, this paper proposes a novel Meta classification approach...

10.1145/2348283.2348468 article EN Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval 2012-08-12

The rapidly growing data in many areas, including the biomedical domain, require the assistance of information extraction systems to acquire much needed knowledge about specific entities such as proteins, drugs, or diseases within a short time. Annotated corpora serve the purpose of facilitating the process of building NLP systems. While colossal work has been done in this area for the English language, other languages like Arabic seem to lack these resources, especially in the healthcare area. Therefore, in this work, we...

10.1155/2020/8896659 article EN cc-by Complexity 2020-10-09

In this paper, we propose a novel method that performs Cross Language Text Categorization (CLTC) from the perspective of Information Retrieval. We present an input document in the target language in the form of a query in the source language. Then we retrieve the training documents and find the K most relevant results. Finally, we use the class labels of these results to predict the class of the input document. The only external resource required by our method is a bilingual dictionary. Experimental results show that our method gives promising performance, which is better than that of the translation-based method.

10.1109/icicee.2012.74 article EN International Conference on Industrial Control and Electronics Engineering 2012-08-01
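
The retrieval-based categorization summarized in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the cosine similarity over raw term counts, and the majority vote over the K retrieved labels are all assumptions chosen for illustration, and the bilingual dictionary is a toy example.

```python
import math
from collections import Counter


def translate_query(tokens, bilingual_dict):
    """Map target-language tokens to source-language terms.

    Tokens missing from the (hypothetical) bilingual dictionary are dropped.
    """
    translated = []
    for token in tokens:
        translated.extend(bilingual_dict.get(token, []))
    return translated


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(doc_tokens, training, bilingual_dict, k=3):
    """Predict a label by majority vote over the k most similar training docs.

    `training` is a list of (tokens, label) pairs in the source language.
    The majority vote is an assumption; the abstract only says the labels
    of the retrieved results are used for prediction.
    """
    query = Counter(translate_query(doc_tokens, bilingual_dict))
    ranked = sorted(
        training,
        key=lambda example: cosine(query, Counter(example[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: source language is English, target language is Spanish.
TRAINING = [
    (["football", "match", "goal"], "sports"),
    (["goal", "team", "match"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "loan", "market"], "finance"),
]
BILINGUAL_DICT = {
    "futbol": ["football"],
    "partido": ["match"],
    "gol": ["goal"],
    "banco": ["bank"],
    "mercado": ["market"],
}

if __name__ == "__main__":
    print(classify(["futbol", "gol", "partido"], TRAINING, BILINGUAL_DICT))
    print(classify(["banco", "mercado"], TRAINING, BILINGUAL_DICT))
```

In this sketch the only cross-lingual resource is the dictionary lookup in `translate_query`, which mirrors the abstract's claim that a bilingual dictionary is the sole external resource; a weighted vote or a TF-IDF ranking could be substituted without changing the overall shape.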

This paper proposes a novel approach for multi-document summarization based on subtopic segmentation. It first detects the subtopics in the topic, and then finds the central sentence of each subtopic. The sentences are scored by their importance in the document set. Two anti-redundancy strategies are used to extract sentences to form the summary. Since our approach is intrinsically incremental, it is effective when new documents are added to the document set. Experimental results indicate that the proposed approach is efficient.

10.1109/icmlc.2009.5212767 article EN International Conference on Machine Learning and Cybernetics 2009-07-01

Car-sharing systems require accurate demand prediction to ensure efficient resource allocation and scheduling decisions. However, developing precise predictive models for vehicle demand remains a challenging problem due to the complex spatio-temporal relationships. This paper introduces USTIN, the Unified Spatio-Temporal Inference Prediction Network, a novel neural network architecture for demand prediction. The model consists of three key components, including a temporal feature unit and a spatial feature unit. The temporal unit utilizes historical data...

10.3390/s24041266 article EN cc-by Sensors 2024-02-16

In recent years, nonlinear science, particularly chaotic systems, has garnered significant research interest. Cross-coupled map lattices (CCML) represent a typical spatiotemporal structure, commonly utilized in data security fields such as hash functions. However, traditional CCMLs have a limited parameter range, and their effects need enhancement, restricting their application in cryptography. This paper proposes an improved signal-enhanced cross-coupled map lattice (SE-CCML),...

10.21203/rs.3.rs-4680904/v1 preprint EN cc-by Research Square (Research Square) 2024-07-30

10.1109/isap62502.2024.10846468 article EN 2024 International Symposium on Antennas and Propagation (ISAP) 2024-11-05

Given a set of labels, multi-label text classification (MLTC) aims to assign multiple relevant labels to a text. Recently, deep learning models have achieved inspiring results in MLTC. Training a high-quality deep MLTC model typically demands large-scale labeled data. Compared with annotating single-label data samples, annotating multi-label samples is more time-consuming and expensive. Active learning can enable a model to achieve optimal prediction performance using fewer labeled samples. Although active learning has been considered for deep learning models, there are few studies...

10.1038/s41598-024-79249-7 article EN cc-by-nc-nd Scientific Reports 2024-11-15