- Bayesian Methods and Mixture Models
- Artificial Intelligence in Healthcare
- Machine Learning in Healthcare
- Face and Expression Recognition
- AI in cancer detection
- Image Retrieval and Classification Techniques
- Soil Geostatistics and Mapping
- Machine Learning and Data Classification
- Mobile Crowdsensing and Crowdsourcing
- Machine Learning and Algorithms
- Text and Document Classification Technologies
- Soil Moisture and Remote Sensing
- Hydrological Forecasting Using AI
- Radiomics and Machine Learning in Medical Imaging
- Neural Networks and Applications
- Gaussian Processes and Bayesian Inference
- Web Data Mining and Analysis
- Domain Adaptation and Few-Shot Learning
- Topic Modeling
- Sparse and Compressive Sensing Techniques
- Complex Network Analysis Techniques
- Biomedical Text Mining and Ontologies
- Auction Theory and Applications
- Blood Pressure and Hypertension Studies
- Advanced Image and Video Retrieval Techniques
Jiangxi University of Science and Technology
2023
LinkedIn (United States)
2016-2023
Chengdu University of Information Technology
2022
Yunnan University
2021
Siemens Healthcare (United States)
2009-2016
Beijing University of Posts and Telecommunications
2015-2016
Chinese Academy of Sciences
2012-2016
Institute of Soil Science
2012-2016
University of Toledo
2015
Siemens (Germany)
2006-2013
For many supervised learning tasks it may be infeasible (or very expensive) to obtain objective and reliable labels. Instead, we can collect subjective (possibly noisy) labels from multiple experts or annotators. In practice, there is a substantial amount of disagreement among the annotators, and hence it is of great practical interest to address conventional supervised learning problems in this scenario. In this paper we describe a probabilistic approach for supervised learning when we have multiple annotators providing (possibly noisy) labels but no absolute gold standard. The proposed...
We describe a probabilistic approach for supervised learning when we have multiple experts/annotators providing (possibly noisy) labels but no absolute gold standard. The proposed algorithm evaluates the different experts and also gives an estimate of the actual hidden labels. Experimental results indicate that the proposed method is superior to the commonly used majority voting baseline.
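For reference, the majority voting baseline named above can be sketched as follows. This is a minimal illustration of the baseline only, not the proposed probabilistic algorithm; the function names `majority_vote` and `agreement_with_majority` and the toy labels are assumptions introduced here.

```python
from collections import Counter

def majority_vote(labels_per_item):
    """Return the most common label for each item.

    labels_per_item[i][j] is the label annotator j gave to item i.
    Ties go to the label that appears first in the item's list.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in labels_per_item]

def agreement_with_majority(labels_per_item):
    """Fraction of items on which each annotator matches the majority label."""
    consensus = majority_vote(labels_per_item)
    n_annotators = len(labels_per_item[0])
    return [
        sum(labels[j] == c for labels, c in zip(labels_per_item, consensus))
        / len(labels_per_item)
        for j in range(n_annotators)
    ]

# Hypothetical toy data: 4 items, 3 annotators, binary labels.
toy = [[1, 1, 0], [0, 0, 0], [1, 0, 1], [1, 1, 1]]
consensus = majority_vote(toy)
agreement = agreement_with_majority(toy)
```

Majority voting weights every annotator equally, whereas the approach described above evaluates the experts while estimating the hidden labels.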
With the advent of crowdsourcing services it has become quite cheap and reasonably effective to get a data set labeled by multiple annotators in a short amount of time. Various methods have been proposed to estimate the consensus labels by correcting for the bias of annotators with different kinds of expertise. Since we do not have control over the quality of the annotators, very often the annotations can be dominated by spammers, defined as annotators who assign labels randomly without actually looking at the instance. Spammers can make the cost of acquiring labels very expensive and can potentially...
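To make the notion of a spammer concrete for binary labels: an annotator who labels without looking at the instance has, in expectation, sensitivity plus specificity equal to 1. The sketch below computes sensitivity + specificity - 1 against reference labels. It is an illustration under assumptions and is not presented as the method of the paper: the reference labels, the function name `informedness`, and the toy data are hypothetical, and in a real crowdsourcing setting the reference labels are unknown and would have to be estimated.

```python
def informedness(annotator_labels, reference_labels):
    """Sensitivity + specificity - 1 for one annotator on binary labels.

    Values near 0 mean the annotator's labels carry no information about
    the reference labels, which is what random labeling looks like.
    """
    pairs = list(zip(annotator_labels, reference_labels))
    pos = [a for a, r in pairs if r == 1]
    neg = [a for a, r in pairs if r == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative reference label")
    sensitivity = sum(1 for a in pos if a == 1) / len(pos)
    specificity = sum(1 for a in neg if a == 0) / len(neg)
    return sensitivity + specificity - 1.0

# Hypothetical toy data.
reference = [1, 1, 1, 1, 0, 0, 0, 0]
careful = [1, 1, 1, 0, 0, 0, 0, 0]      # mostly agrees with the reference
always_one = [1, 1, 1, 1, 1, 1, 1, 1]   # ignores the instance entirely

careful_score = informedness(careful, reference)
spam_score = informedness(always_one, reference)
```

Here the annotator who always answers 1 scores 0, while the careful annotator scores well above 0.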
In contrast to traditional document retrieval, a web page as a whole is not a good information unit for search because it often contains multiple topics and a lot of irrelevant information from the navigation, decoration, and interaction parts of the page. In this paper, we propose a VIsion-based Page Segmentation (VIPS) algorithm to detect the semantic content structure in a web page. Compared with a simple DOM-based segmentation method, our scheme utilizes useful visual cues to obtain a better partition of a page at the semantic level. By using VIPS to assist the selection of query...
Latent semantic indexing (LSI) is a well-known unsupervised approach for dimensionality reduction in information retrieval. However, if the output information (i.e., category labels) is available, it is often beneficial to derive the indexing not only based on the inputs but also on the target values in the training data set. This is of particular importance in applications with multiple labels, in which each document can belong to several categories simultaneously. In this paper we introduce the multi-label informed latent semantic indexing (MLSI) algorithm, which preserves and...
Multi-label problems arise in various domains such as multi-topic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared...
Multi-label problems arise in various domains such as multi-topic document categorization, protein function prediction, and automatic image annotation. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general...
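The "one binary classifier per label" baseline described above can be sketched as follows. This is a minimal sketch of the independent-classifier baseline, not the shared-structure framework the abstract goes on to propose. The nearest-centroid classifier is a stand-in chosen only to keep the example dependency-free, and all function names and the toy data are assumptions.

```python
def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def fit_binary_relevance(X, Y):
    """Fit one independent nearest-centroid classifier per label.

    X: list of feature vectors; Y: list of 0/1 label vectors.
    Returns one (positive centroid, negative centroid) pair per label.
    Assumes every label has at least one positive and one negative example.
    """
    models = []
    for k in range(len(Y[0])):
        pos = [x for x, y in zip(X, Y) if y[k] == 1]
        neg = [x for x, y in zip(X, Y) if y[k] == 0]
        models.append((centroid(pos), centroid(neg)))
    return models

def predict_binary_relevance(models, x):
    """Predict each label separately; correlations between labels are ignored."""
    return [1 if sq_dist(x, p) < sq_dist(x, n) else 0 for p, n in models]

# Hypothetical toy data: 2 features, 2 labels.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
models = fit_binary_relevance(X, Y)
prediction = predict_binary_relevance(models, [0.9, 0.1])
```

Because each label is fitted in isolation, nothing learned for one label can inform another, which is the limitation the abstract points to.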
We provide an overview of the recent trends toward digitalization and large-scale data analytics in healthcare. It is expected that these trends are instrumental in the dramatic changes in the way healthcare will be organized in the future. We discuss political initiatives designed to shift care delivery processes from paper to electronic, with the goals of more effective treatments and better outcomes; cost pressure is a major driver of innovation. We describe newly developed networks of providers, research organizations, and commercial vendors...
The multiple topics and varying length of web pages are two negative factors that significantly affect the performance of web search. In this paper, we explore the use of page segmentation algorithms to partition web pages into blocks and investigate how to take advantage of block-level evidence to improve retrieval performance in the web context. Because of the special characteristics of web pages, different page segmentation methods will have different impacts on web search performance. We compare four types of methods, including fixed-length page segmentation, DOM-based page segmentation, vision-based page segmentation, and a combined method which...
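Of the four methods listed, fixed-length segmentation is the simplest to illustrate. The sketch below splits page text into word windows. It is a minimal sketch under assumptions: the function name `fixed_length_blocks`, the word-based window, and the `window` and `overlap` defaults are hypothetical and are not taken from the paper.

```python
def fixed_length_blocks(text, window=50, overlap=0):
    """Split text into blocks of at most `window` words.

    Consecutive blocks share `overlap` words. Both parameters are
    illustrative defaults, not values from the paper.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("require window > 0 and 0 <= overlap < window")
    words = text.split()
    step = window - overlap
    blocks = []
    for start in range(0, len(words), step):
        blocks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break  # the last block already reaches the end of the text
    return blocks

# Hypothetical page text after markup has been stripped.
page_text = "w1 w2 w3 w4 w5 w6 w7"
blocks = fixed_length_blocks(page_text, window=3, overlap=1)
```

Each block can then be scored on its own, so that a query can match one passage of a long, multi-topic page rather than the page as a whole.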
Principal component analysis (PCA) has been extensively applied in data mining, pattern recognition, and information retrieval for unsupervised dimensionality reduction. When labels of data are available, e.g., in a classification or regression task, PCA is however not able to use this information. The problem is more interesting if only part of the input data are labeled, i.e., in a semi-supervised setting. In this paper we propose a supervised PCA model called SPPCA and a semi-supervised PCA model called S2PPCA, both of which are extensions of a probabilistic PCA model. The proposed...
Purpose: Classic statistical and machine learning models such as support vector machines (SVMs) can be used to predict cancer outcome, but often only perform well if all the input variables are known, which is unlikely in the medical domain. Bayesian network (BN) models have a natural ability to reason under uncertainty and might handle missing data better. In this study, the authors hypothesize that a BN model can predict two-year survival in non-small cell lung cancer (NSCLC) patients as accurately as an SVM, but will predict survival more accurately when data are missing. Methods: A...
Co-training (or more generally, co-regularization) has been a popular algorithm for semi-supervised learning in data with two feature representations (or views), but the fundamental assumptions underlying this type of model are still unclear. In this paper we propose a Bayesian undirected graphical model for co-training, or more generally for semi-supervised multi-view learning. This makes explicit the previously unstated assumptions of a large class of co-training algorithms, and also clarifies the circumstances under which these assumptions fail. Building upon new...
Most current multi-task learning frameworks ignore the robustness issue, which means that the presence of "outlier" tasks may greatly reduce overall system performance. We introduce a robust framework for Bayesian multi-task learning, t-processes (TP), which are a generalization of Gaussian processes (GP) for multi-task learning. TP allows the system to effectively distinguish good tasks from noisy or outlier tasks. Experiments show that TP not only improves overall system performance, but can also serve as an indicator of the "informativeness" of different tasks.