Qi Li

ORCID: 0000-0002-3136-2157
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Mobile Crowdsensing and Crowdsourcing
  • Biomedical Text Mining and Ontologies
  • Semantic Web and Ontologies
  • Advanced Text Analysis Techniques
  • Advanced Graph Neural Networks
  • Web Data Mining and Analysis
  • Data Quality and Management
  • Text and Document Classification Technologies
  • Expert finding and Q&A systems
  • Privacy-Preserving Technologies in Data
  • Data Stream Mining Techniques
  • Sentiment Analysis and Opinion Mining
  • Speech and dialogue systems
  • Recommender Systems and Techniques
  • Gene expression and cancer classification
  • Software Engineering Research
  • Bioinformatics and Genomic Networks
  • Electronic Health Records Systems
  • Advanced Computational Techniques and Applications
  • Domain Adaptation and Few-Shot Learning
  • Forest Biomass Utilization and Management
  • Software Reliability and Analysis Research
  • Complex Network Analysis Techniques

Tsinghua University
2013-2024

Nanning Normal University
2024

Qiqihar Medical University
2024

Zhongyuan University of Technology
2024

Shanghai CASB Biotechnology (China)
2024

Iowa State University
2013-2023

Zhejiang University-University of Edinburgh Institute
2022-2023

Shaoxing University
2020-2023

Shandong Normal University
2022-2023

Xijing University
2023

We present an incremental joint framework to simultaneously extract entity mentions and relations using structured perceptron with efficient beam-search. A segment-based decoder based on the idea of semi-Markov chain is adopted in the new framework, as opposed to traditional token-based tagging. In addition, by virtue of the inexact search, we developed a number of new and effective global features as soft constraints to capture the interdependency among entity mentions and relations. Experiments on Automatic Content Extraction (ACE) corpora demonstrate that...

10.3115/v1/p14-1038 article EN cc-by 2014-01-01

In many real world applications, the same item may be described by multiple sources. As a consequence, conflicts among these sources are inevitable, which leads to an important task: how to identify which piece of information is trustworthy, i.e., the truth discovery task. Intuitively, if a piece of information is from a reliable source, then it is more trustworthy, and the source that provides trustworthy information is more reliable. Based on this principle, truth discovery approaches have been proposed to infer source reliability degrees and the most trustworthy information (i.e., the truth) simultaneously. However, existing...

10.14778/2735496.2735505 article EN Proceedings of the VLDB Endowment 2014-12-01

In the crowdsourced data aggregation task, there exist conflicts in the answers provided by large numbers of sources on the same set of questions. The most important challenge for this task is to estimate source reliability and select answers that are provided by high-quality sources. Existing work solves this problem by simultaneously estimating sources' reliability and inferring questions' true answers (i.e., the truths). However, these methods assume that a source has the same reliability degree on all the questions, but ignore the fact that sources' reliability may vary significantly among different topics. To capture...

10.1145/2783258.2783314 article EN 2015-08-07

In the era of big data, information regarding the same objects can be collected from increasingly more sources. Unfortunately, there usually exist conflicts among the information coming from different sources. To tackle this challenge, truth discovery, i.e., to integrate multi-source noisy information by estimating the reliability of each source, has emerged as a hot topic. In many real world applications, however, the information may come sequentially, and as a consequence, the truths of objects as well as the reliability of sources may be dynamically evolving. Existing truth discovery methods, unfortunately, cannot handle...

10.1145/2783258.2783277 article EN 2015-08-07

We leverage crowd wisdom for multiple-choice question answering, and employ lightweight machine learning techniques to improve the aggregation accuracy of crowdsourced answers to these questions. In order to develop more effective aggregation methods and evaluate them empirically, we developed and deployed a crowdsourced system for playing the “Who wants to be a millionaire?” quiz show. Analyzing our data (which consist of more than 200,000 answers), we find that by just going with the most selected answer in the aggregation, we can answer over 90% of the questions correctly,...

10.1609/aaai.v28i2.19016 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2014-07-27

In this paper, we propose a new framework that unifies the output of three information extraction (IE) tasks (entity mentions, relations and events) as an information network representation, and extracts all of them using one single joint model based on structured prediction. This novel formulation allows different parts of the information network to fully interact with each other. For example, many relations can now be considered as the resultant states of events. Our approach achieves substantial improvements over traditional pipelined approaches,...

10.3115/v1/d14-1198 article EN 2014-01-01

In many applications, one can obtain descriptions about the same objects or events from a variety of sources. As a result, this will inevitably lead to data or information conflicts. One important problem is to identify the true information (i.e., the truths) among conflicting sources of data. It is intuitive to trust reliable sources more when deriving the truths, but it is usually unknown which one is more reliable a...

10.1109/tkde.2016.2559481 article EN IEEE Transactions on Knowledge and Data Engineering 2016-04-28

With the proliferation of sensor-rich mobile devices, crowd sensing has emerged as a new paradigm of collecting information from the physical world. However, the sensory data provided by the participating workers are usually not reliable. In order to identify truthful values from the crowd sensing data, the topic of truth discovery, whose goal is to estimate each worker's reliability and infer the underlying truths through weighted data aggregation, is widely studied. Since truth discovery incorporates workers' reliability into the aggregation procedure, it shows...

10.1145/3209582.3209594 article EN 2018-06-20

Large language models (LLMs) have exhibited powerful capability in various natural language processing tasks. This work focuses on exploring LLM performance on zero-shot information extraction, with a focus on ChatGPT and the named entity recognition (NER) task. Inspired by the remarkable reasoning capability of LLMs on symbolic and arithmetic reasoning, we adapt the prevalent reasoning methods to NER and propose reasoning strategies tailored for NER. First, we explore a decomposed question-answering paradigm by breaking down the NER task into simpler subproblems by labels. Second,...

10.18653/v1/2023.emnlp-main.493 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

Thanks to the information explosion, data for the objects of interest can be collected from increasingly more sources. However, for the same object, there usually exist conflicts among the collected multi-source information. To tackle this challenge, truth discovery, which integrates multi-source noisy information by estimating the reliability of each source, has emerged as a hot topic. Several truth discovery methods have been proposed for various scenarios, and they have been successfully applied in diverse application domains. In this survey, we focus on providing...

10.48550/arxiv.1505.02463 preprint EN other-oa arXiv (Cornell University) 2015-01-01

As an effective way to solicit useful information from the crowd, crowdsourcing has emerged as a popular paradigm to solve challenging tasks. However, the data provided by the participating workers are not always trustworthy. In the real world, there may exist malicious workers in crowdsourcing systems who conduct data poisoning attacks for the purpose of sabotage or financial rewards. Although data aggregation methods such as majority voting are conducted on workers' labels in order to improve data quality, they are vulnerable to data poisoning attacks as they treat all the workers equally. To capture the variety...

10.1145/3178876.3186032 article EN 2018-01-01

Dian Yu, Luheng He, Yuan Zhang, Xinya Du, Panupong Pasupat, Qi Li. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021.

10.18653/v1/2021.naacl-main.59 article EN cc-by Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2021-01-01

This study uses deep-learning models to predict city partition crime counts on specific days. It helps police enhance surveillance, gather intelligence, and proactively prevent crimes. We formulate crime count prediction as a spatiotemporal sequence challenge, where both the input data and the prediction targets are spatiotemporal sequences. In order to improve the accuracy of crime forecasting, we introduce a new model that combines Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks. We conducted a comparative analysis to assess...

10.48550/arxiv.2502.07465 preprint EN arXiv (Cornell University) 2025-02-11

Genotype imputation is a critical preprocessing step in genome-wide association studies (GWAS), enhancing statistical power for detecting associated single nucleotide polymorphisms (SNPs) by increasing marker size. In response to the needs of researchers seeking user-friendly graphical tools for imputation without requiring informatics or computer expertise, we have developed weIMPUTE, a web-based imputation graphical user interface (GUI). Unlike existing genotype imputation software, weIMPUTE supports multiple imputation software, including SHAPEIT, Eagle,...

10.3389/fgene.2025.1532464 article EN cc-by Frontiers in Genetics 2025-03-17

Influenza-like illness (ILI) continues to present significant challenges to global health, highlighting the need for accurate forecasting to guide timely public health responses. Traditional statistical and deep learning models, though widely applied, often face difficulties in capturing complex nonlinear dynamics and addressing data scarcity. This study examines the potential of fine-tuned large language models (LLMs), including Llama2 and GPT2, for multi-step influenza forecasting. A specialized fine-tuning...

10.1101/2025.03.27.25324747 preprint EN cc-by-nc medRxiv (Cold Spring Harbor Laboratory) 2025-03-31

Traditional isolated monolingual name taggers tend to yield inconsistent results across two languages. In this paper, we propose two novel approaches to jointly and consistently extract names from parallel corpora. The first approach uses standard linear-chain Conditional Random Fields (CRFs) as the learning framework, incorporating cross-lingual features propagated between the two languages. The second approach is based on a joint CRFs model to jointly decode sentence pairs, incorporating bilingual factors based on word alignment. Experiments on Chinese-English...

10.1145/2396761.2398506 article EN 2012-10-29

In this paper, we present GDA, a generalized decision aggregation framework that integrates information from distributed sensor nodes for decision making in a resource efficient manner. Traditional approaches that target similar problems only take as input the discrete label information from individual sensors that observe the same events. Different from them, our proposed GDA framework is able to take advantage of the confidence information of each sensor about its decision, and thus achieves higher decision accuracy. Targeting generalized problem domains, our framework can naturally handle the scenarios where...

10.1109/rtss.2014.40 article EN 2014-12-01

Ofer Bronstein, Ido Dagan, Qi Li, Heng Ji, Anette Frank. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2015.

10.3115/v1/p15-2061 article EN cc-by 2015-01-01

Applications of multiphoton processes in lanthanide-doped nanophosphors (NPs) are often limited by relatively weak and narrow absorbance. Here, the concept of an ultimate photosensitization of NPs using aggregation-induced enhanced emission (AIEE) dyes to overcome this limitation is introduced. Because AIEE dyes do not suffer from concentration quenching, they can fully cover the NP surface at high density to maximize absorbance while passivating the surface. This concept is applied to down-conversion by quantum cutting. Specifically,...

10.1021/acs.nanolett.8b01724 article EN Nano Letters 2018-06-24