NFDI4DS | UHH-SEMS - Publication Details

Shipeng Yu

ORCID: 0000-0002-0262-4031

About

Contact & Profiles

A5084034136

Research Areas

Bayesian Methods and Mixture Models
Artificial Intelligence in Healthcare
Machine Learning in Healthcare
Face and Expression Recognition
AI in cancer detection
Image Retrieval and Classification Techniques
Soil Geostatistics and Mapping
Machine Learning and Data Classification
Mobile Crowdsensing and Crowdsourcing
Machine Learning and Algorithms
Text and Document Classification Technologies
Soil Moisture and Remote Sensing
Hydrological Forecasting Using AI
Radiomics and Machine Learning in Medical Imaging
Neural Networks and Applications
Gaussian Processes and Bayesian Inference
Web Data Mining and Analysis
Domain Adaptation and Few-Shot Learning
Topic Modeling
Sparse and Compressive Sensing Techniques
Complex Network Analysis Techniques
Biomedical Text Mining and Ontologies
Auction Theory and Applications
Blood Pressure and Hypertension Studies
Advanced Image and Video Retrieval Techniques

Jiangxi University of Science and Technology
2023

LinkedIn (United States)
2016-2023

Chengdu University of Information Technology
2022

Yunnan University
2021

Siemens Healthcare (United States)
2009-2016

Beijing University of Posts and Telecommunications
2015-2016

Chinese Academy of Sciences
2012-2016

Institute of Soil Science
2012-2016

University of Toledo
2015

Siemens (Germany)
2006-2013

Learning From Crowds

OPENALEX - Publications

Vikas C. Raykar, Shipeng Yu, Linda Zhao, Gerardo Hermosillo Valadez, Charles Florin, and 2 more

For many supervised learning tasks it may be infeasible (or very expensive) to obtain objective and reliable labels. Instead, we can collect subjective (possibly noisy) labels from multiple experts or annotators. In practice, there is a substantial amount of disagreement among the annotators, and hence it is of great practical interest to address conventional supervised learning problems in this scenario. In this paper we describe a probabilistic approach for supervised learning when we have multiple annotators providing (possibly noisy) labels but no absolute gold standard. The proposed...

10.5555/1756006.1859894 article EN Journal of Machine Learning Research 2010-03-01

Supervised learning from multiple experts

Vikas C. Raykar, Shipeng Yu, Linda Zhao, Anna Jerebko, Charles Florin, and 3 more

We describe a probabilistic approach for supervised learning when we have multiple experts/annotators providing (possibly noisy) labels but no absolute gold standard. The proposed algorithm evaluates the different experts and also gives an estimate of the actual hidden labels. Experimental results indicate that the proposed method is superior to the commonly used majority voting baseline.

10.1145/1553374.1553488 article EN 2009-06-14

Eliminating spammers and ranking annotators for crowdsourced labeling tasks

Vikas C. Raykar, Shipeng Yu

With the advent of crowdsourcing services it has become quite cheap and reasonably effective to get a data set labeled by multiple annotators in a short amount of time. Various methods have been proposed to estimate the consensus labels by correcting for the bias of annotators with different kinds of expertise. Since we do not have control over the quality of the annotators, very often the annotations can be dominated by spammers, defined as annotators who assign labels randomly without actually looking at the instance. Spammers can make the cost of acquiring labels very expensive and potentially...

10.5555/2188385.2188401 article EN Journal of Machine Learning Research 2012-01-01

Improving pseudo-relevance feedback in web information retrieval using web page segmentation

Shipeng Yu, Deng Cai, Ji-Rong Wen, Wei‐Ying Ma

In contrast to traditional document retrieval, a web page as a whole is not a good information unit to search because it often contains multiple topics and a lot of irrelevant information from the navigation, decoration, and interaction parts of the page. In this paper, we propose a VIsion-based Page Segmentation (VIPS) algorithm to detect the semantic content structure in a web page. Compared with a simple DOM-based segmentation method, our scheme utilizes useful visual cues to obtain a better partition of a page at the semantic level. By using VIPS to assist the selection of query...

10.1145/775152.775155 article EN 2003-01-01

Multi-label informed latent semantic indexing

Kai Yu, Shipeng Yu, Volker Tresp

Latent semantic indexing (LSI) is a well-known unsupervised approach for dimensionality reduction in information retrieval. However, if the output information (i.e. category labels) is available, it is often beneficial to derive the indexing not only based on the inputs but also on the target values in the training data set. This is of particular importance in applications with multiple labels, in which each document can belong to several categories simultaneously. In this paper we introduce the multi-label informed latent semantic indexing (MLSI) algorithm, which preserves the information of inputs and...

10.1145/1076034.1076080 article EN 2005-08-15

Extracting shared subspace for multi-label classification

Shuiwang Ji, Lei Tang, Shipeng Yu, Jieping Ye

Multi-label problems arise in various domains such as multi-topic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared...

10.1145/1401890.1401939 article EN 2008-08-24

A shared-subspace learning framework for multi-label classification

Shuiwang Ji, Lei Tang, Shipeng Yu, Jieping Ye

Multi-label problems arise in various domains such as multi-topic document categorization, protein function prediction, and automatic image annotation. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general...

10.1145/1754428.1754431 article EN ACM Transactions on Knowledge Discovery from Data 2010-05-01

Going Digital: A Survey on Digitalization and Large-Scale Data Analytics in Healthcare

Volker Tresp, J. Marc Overhage, Markus Bundschus, Shahrooz Rabizadeh, Peter A. Fasching, and 1 more

We provide an overview of the recent trends toward digitalization and large-scale data analytics in healthcare. It is expected that these trends are instrumental in the dramatic changes in the way healthcare will be organized in the future. We discuss the political initiatives designed to shift care delivery processes from paper to electronic, with the goals of more effective treatments with better outcomes; cost pressure is a major driver of innovation. We describe newly developed networks of healthcare providers, research organizations, and commercial vendors...

10.1109/jproc.2016.2615052 article EN Proceedings of the IEEE 2016-10-19

Predicting readmission risk with institution-specific prediction models

Shipeng Yu, Faisal Farooq, Alexander Van Esbroeck, Glenn Fung, Vikram Anand, and 1 more

10.1016/j.artmed.2015.08.005 article EN Artificial Intelligence in Medicine 2015-08-22

Block-based web search

Deng Cai, Shipeng Yu, Ji-Rong Wen, Wei‐Ying Ma

Multiple-topic and varying-length of web pages are two negative factors significantly affecting the performance of web search. In this paper, we explore the use of page segmentation algorithms to partition web pages into blocks and investigate how to take advantage of block-level evidence to improve retrieval performance in the web context. Because of the special characteristics of web pages, different page segmentation methods will have different impact on web search performance. We compare four types of methods, including fixed-length page segmentation, DOM-based page segmentation, vision-based page segmentation, and a combined method which...

10.1145/1008992.1009070 article EN 2004-07-25

Supervised probabilistic principal component analysis

Shipeng Yu, Kai Yu, Volker Tresp, Hans‐Peter Kriegel, Mingrui Wu

Principal component analysis (PCA) has been extensively applied in data mining, pattern recognition and information retrieval for unsupervised dimensionality reduction. When labels of data are available, e.g., in a classification or regression task, PCA is however not able to use this information. The problem is more interesting if only part of the input data are labeled, i.e., in a semi-supervised setting. In this paper we propose a supervised PCA model called SPPCA and a semi-supervised PCA model called S2PPCA, both of which are extensions of a probabilistic PCA model. The proposed...

10.1145/1150402.1150454 article EN 2006-08-20

Comparison of Bayesian network and support vector machine models for two-year survival prediction in lung cancer patients treated with radiotherapy

K. Jayasurya, Glenn Fung, Shipeng Yu, Cary Oberije, Dirk De Ruysscher, and 5 more

Purpose: Classic statistical and machine learning models such as support vector machines (SVMs) can be used to predict cancer outcome, but often only perform well if all the input variables are known, which is unlikely in the medical domain. Bayesian network (BN) models have a natural ability to reason under uncertainty and might handle missing data better. In this study, the authors hypothesize that a BN model can predict two-year survival in non-small cell lung cancer (NSCLC) patients as accurately as an SVM, but will predict survival more accurately when data are missing. Methods: A...

10.1118/1.3352709 article EN Medical Physics 2010-03-09

Bayesian Co-Training

Shipeng Yu, Balaji Krishnapuram, Rómer Rosales, R. Bharat Rao

Co-training (or more generally, co-regularization) has been a popular algorithm for semi-supervised learning in data with two feature representations (or views), but the fundamental assumptions underlying this type of models are still unclear. In this paper we propose a Bayesian undirected graphical model for co-training, or more generally for semi-supervised multi-view learning. This makes explicit the previously unstated assumptions of a large class of co-training type algorithms, and also clarifies the circumstances under which these assumptions fail. Building upon new...

10.5555/1953048.2078190 article EN Journal of Machine Learning Research 2011-02-01

Development and External Validation of Prognostic Model for 2-Year Survival of Non–Small-Cell Lung Cancer Patients Treated With Chemoradiotherapy

Cary Oberije, Shipeng Yu, Dirk De Ruysscher, Sabine Meersschout, Karen Van Beek, and 6 more

10.1016/j.ijrobp.2008.08.052 article EN International Journal of Radiation Oncology*Biology*Physics 2008-12-26

The importance of patient characteristics for the prediction of radiation-induced lung toxicity

Cary Oberije, Dirk De Ruysscher, Angela van Baardwijk, Shipeng Yu, Bharat Rao, and 1 more

10.1016/j.radonc.2008.12.002 article EN Radiotherapy and Oncology 2009-01-14

Robust multi-task learning with t-processes

Shipeng Yu, Volker Tresp, Kai Yu

Most current multi-task learning frameworks ignore the robustness issue, which means that the presence of "outlier" tasks may greatly reduce overall system performance. We introduce a robust framework for Bayesian multitask learning, t-processes (TP), which are a generalization of Gaussian processes (GP) for multi-task learning. TP allows the system to effectively distinguish good tasks from noisy or outlier tasks. Experiments show that TP not only improves overall system performance, but can also serve as an indicator for the "informativeness" of different tasks.

10.1145/1273496.1273635 article EN 2007-06-20

Development and Validation of a Prognostic Model Using Blood Biomarker Information for Prediction of Survival of Non–Small-Cell Lung Cancer Patients Treated With Combined Chemotherapy and Radiation or Radiotherapy Alone (NCT00181519, NCT00573040, and NCT00572325)

Cary Oberije, Hugo J.W.L. Aerts, Shipeng Yu, Dirk De Ruysscher, Paul Menheere, and 4 more

10.1016/j.ijrobp.2010.06.011 article EN International Journal of Radiation Oncology*Biology*Physics 2010-10-02

GIS-mapping spatial distribution of soil salinity for Eco-restoring the Yellow River Delta in combination with Electromagnetic Induction

Guangming Liu, Jinbiao Li, Xuechen Zhang, Xiuping Wang, Zhenzhen Lv, and 3 more

10.1016/j.ecoleng.2016.05.037 article EN Ecological Engineering 2016-06-19
