NFDI4DS | UHH-SEMS - Publication Details

David Källberg

ORCID: 0000-0003-2386-930X

Contact & Profiles

A5066238638

Research Areas

Statistical Mechanics and Entropy
Renal and related cancers
Bayesian Methods and Mixture Models
Renal cell carcinoma treatment
Statistical Methods and Inference
Gene expression and cancer classification
Ferroptosis and cancer prognosis
Bioinformatics and Genomic Networks
Neural Networks and Applications
Gaussian Processes and Bayesian Inference
Advanced Causal Inference Techniques
Data Management and Algorithms
Single-cell and spatial transcriptomics
Statistical Methods and Bayesian Inference
Complex Systems and Time Series Analysis
Epigenetics and DNA Methylation
Advanced Database Systems and Queries
Spectroscopy and Chemometric Analyses
Artificial Immune Systems Applications
Financial Risk and Volatility Modeling
Bayesian Modeling and Causal Inference
Chaos-based Image/Signal Encryption
Molecular Biology Techniques and Applications
Genomics and Phylogenetic Studies
Advanced Image and Video Retrieval Techniques

Umeå University
2011-2023

Cluster analysis on high dimensional RNA-seq data with applications to cancer research - An evaluation study

OPENALEX - Publications

Linda Vidman David Källberg Patrik Rydén

Clustering of gene expression data is widely used to identify novel subtypes of cancer. Plenty of clustering approaches have been proposed, but there is a lack of knowledge regarding their relative merits and how data characteristics influence the performance. We evaluate how cluster analysis choices affect the performance by studying four publicly available human cancer data sets: breast, brain, kidney, and stomach cancer. In particular, we focus on how the sample size, distribution, and heterogeneity affect the performance. In general, increasing the sample size had...

10.1371/journal.pone.0219102 article EN cc-by PLoS ONE 2019-12-05

Comparison of Methods for Feature Selection in Clustering of High-Dimensional RNA-Sequencing Data to Identify Cancer Subtypes

OPENALEX - Publications

David Källberg Linda Vidman Patrik Rydén

Cancer subtype identification is important to facilitate cancer diagnosis and select effective treatments. Clustering of patients based on high-dimensional RNA-sequencing data can be used to detect novel subtypes, but only a subset of the features (e.g., genes) contains information related to the subtype. Therefore, it is reasonable to assume that the clustering should be based on a set of carefully selected features rather than all features. Several feature selection methods have been proposed, but how and when to use these methods are still poorly...

10.3389/fgene.2021.632620 article EN cc-by Frontiers in Genetics 2021-02-24

Estimation of entropy-type integral functionals

OPENALEX - Publications

David Källberg Oleg Seleznjev

Entropy-type integral functionals of densities are widely used in mathematical statistics, information theory, and computer science. Examples include measures of closeness between distributions (e.g., density power divergence) and uncertainty characteristics for a random variable (e.g., Rényi entropy). In this paper, we study U-statistic estimators for a class of such functionals. The estimators are based on ε-close vector observations in the corresponding independent and identically distributed samples. We prove asymptotic properties...

10.1080/03610926.2013.853789 article EN Communications in Statistics - Theory and Methods 2016-02-10

Statistical estimation of quadratic Rényi entropy for a stationary m-dependent sequence

OPENALEX - Publications

David Källberg Nikolai Leonenko Oleg Seleznjev

The Rényi entropy is a generalisation of the Shannon entropy and is widely used in mathematical statistics and applied sciences for quantifying the uncertainty in a probability distribution. We consider estimation of the quadratic Rényi entropy and related functionals for the marginal distribution of a stationary m-dependent sequence. The U-statistic estimators under study are based on the number of ε-close vector observations in the corresponding sample. A variety of asymptotic properties for these estimators are obtained (e.g. consistency, normality, Poisson convergence). The results can be...

10.1080/10485252.2013.854438 article EN Journal of Nonparametric Statistics 2014-02-07
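
As a rough illustration of the kind of estimator this abstract describes, the sketch below counts ε-close pairs in a sample, rescales the count by the number of pairs and the volume of an ε-ball to estimate the quadratic functional ∫ f(x)² dx, and takes the negative logarithm to get a quadratic Rényi entropy estimate. The function names, the Euclidean metric, and the i.i.d. setting are assumptions made for this example; the paper treats m-dependent sequences, and its exact estimator and normalisation may differ.

```python
import math


def ball_volume(d, eps):
    """Volume of a Euclidean ball of radius eps in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * eps ** d


def quadratic_renyi_entropy(sample, eps):
    """Estimate -log(integral of f^2) from the number of eps-close pairs.

    sample: list of equal-length tuples (vector observations), at least two.
    eps:    closeness radius (hypothetical tuning parameter).
    """
    n = len(sample)
    d = len(sample[0])

    # Count unordered pairs whose Euclidean distance is at most eps.
    close_pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(sample[i], sample[j]) <= eps:
                close_pairs += 1

    # U-statistic style estimate of the quadratic functional.
    q_hat = close_pairs / (math.comb(n, 2) * ball_volume(d, eps))
    if q_hat == 0:
        raise ValueError("no eps-close pairs; increase eps or the sample size")
    return -math.log(q_hat)
```

For a large sample from the uniform distribution on [0, 1], where the quadratic Rényi entropy is 0, the estimate should land near 0 when ε is small relative to the interval but large enough that many pairs are ε-close.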

Combining epigenetic and clinicopathological variables improves specificity in prognostic prediction in clear cell renal cell carcinoma

OPENALEX - Publications

Emma Andersson-Evelönn Linda Vidman David Källberg Mattias Landfors Xijia Liu and 4 more

Background: Metastasized clear cell renal cell carcinoma (ccRCC) is associated with a poor prognosis. Almost one-third of patients with non-metastatic tumors at diagnosis will later progress to metastatic disease. These patients need to be identified already at diagnosis, to undertake closer follow-up and/or adjuvant treatment. Today, clinicopathological variables are used to risk classify patients, but molecular biomarkers are needed to improve risk classification and identify the high-risk patients which will benefit most from modern...

10.1186/s12967-020-02608-1 article EN cc-by Journal of Translational Medicine 2020-11-13

Large Sample Properties of Entropy Balancing Estimators of Average Causal Effects

OPENALEX - Publications

David Källberg Ingeborg Waernbaum

Weighting methods are used in observational studies to adjust for covariate imbalances between treatment and control groups. Entropy balancing (EB) is an alternative to inverse probability weighting with an estimated propensity score. The EB weights are constructed to satisfy balance constraints and are optimized towards stability. Large sample properties of EB estimators of the average causal effect, based on the Kullback-Leibler and quadratic Rényi relative entropies, are described. Additionally, their asymptotic variances...

10.1016/j.ecosta.2023.11.004 article EN cc-by Econometrics and Statistics 2023-11-01

Statistical Modeling for Image Matching in Large Image Databases

OPENALEX - Publications

David Källberg Oleg Seleznjev Nikolai Leonenko Haibo Li

Matching a query (reference) image to an image extracted from a database containing (possibly) transformed image copies is an important retrieval task. In this paper we present a general method based on matching densities of the corresponding feature vectors by using Bregman distances. We consider statistical estimators for some quadratic entropy-type characteristics. In particular, the Bregman distances can be evaluated in image matching problems whenever images are modeled by random feature vectors in large image databases. Moreover, the method can be used for average case analysis...

10.1109/ithings/cpscom.2011.117 article EN 2011-10-01

Estimation of entropy-type integral functionals

OPENALEX - Publications

David Källberg Oleg Seleznjev

Entropy-type integral functionals of densities are widely used in mathematical statistics, information theory, and computer science. Examples include measures of closeness between distributions (e.g., density power divergence) and uncertainty characteristics for a random variable (e.g., Rényi entropy). In this paper, we study U-statistic estimators for a class of such functionals. The estimators are based on epsilon-close vector observations in the corresponding independent and identically distributed samples. We prove asymptotic...

10.48550/arxiv.1209.2544 preprint EN other-oa arXiv (Cornell University) 2012-01-01

A moment-distance hybrid method for estimating a mixture of two symmetric densities

OPENALEX - Publications

David Källberg Yuri K. Belyaev Patrik Rydén

In clustering of high-dimensional data, a variable selection is commonly applied to obtain an accurate grouping of the samples. For two-class problems this selection may be carried out by fitting a mixture distribution to each variable. We propose a hybrid method for estimating a parametric mixture of two symmetric densities. The estimator combines the method of moments with the minimum distance approach. An evaluation study including both extensive simulations and gene expression data from acute leukemia patients shows that the hybrid method outperforms...

10.15559/17-vmsta93 article EN cc-by Modern Stochastics Theory and Applications 2018-01-18

Cluster analysis on high dimensional RNA-seq data with applications to cancer research - An evaluation study

OPENALEX - Publications

Linda Vidman David Källberg Patrik Rydén

Abstract: Clustering of gene expression data is widely used to identify novel subtypes of cancer. Plenty of clustering approaches have been proposed, but there is a lack of knowledge regarding their relative merits and how data characteristics influence the performance. We evaluate how cluster analysis choices affect the performance by studying four publicly available human cancer data sets: breast, brain, kidney, and stomach cancer. In particular, we focus on how the sample size, distribution, and heterogeneity affect the performance. In general, increasing the sample size had...

10.1101/675041 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2019-06-18

Estimation of quadratic density functionals under m-dependence

OPENALEX - Publications

David Källberg Oleg Seleznjev

In this paper, we study estimation of certain integral functionals of one or two densities with samples from stationary m-dependent sequences. We consider two types of U-statistic estimators for these functionals that are functions of the number of epsilon-close vector observations in the samples. We show that the estimators are consistent and obtain their rates of convergence under weak distributional assumptions. In particular, we propose estimators based on incomplete U-statistics which have favorable consistency properties even when m-dependence is the only dependence...

10.48550/arxiv.1309.5003 preprint EN other-oa arXiv (Cornell University) 2013-01-01

Statistical Inference for Rényi Entropy Functionals

OPENALEX - Publications

David Källberg Nikolai Leonenko Oleg Seleznjev

Numerous entropy-type characteristics (functionals) generalizing Rényi entropy are widely used in mathematical statistics, physics, information theory, and signal processing for characterizing uncertainty in probability distributions and in distribution identification problems. We consider estimators of some entropy (integral) functionals for discrete and continuous distributions based on the number of epsilon-close vector records in the corresponding independent and identically distributed samples from two distributions. The estimators form a...

10.48550/arxiv.1103.4977 preprint EN other-oa arXiv (Cornell University) 2011-01-01

Statistical estimation of quadratic Rényi entropy for a stationary m-dependent sequence

OPENALEX - Publications

David Källberg Nikolai Leonenko Oleg Seleznjev

The Rényi entropy is a generalization of the Shannon entropy and is widely used in mathematical statistics and applied sciences for quantifying the uncertainty in a probability distribution. We consider estimation of the quadratic Rényi entropy and related functionals for the marginal distribution of a stationary m-dependent sequence. The U-statistic estimators under study are based on the number of epsilon-close vector observations in the corresponding sample. A variety of asymptotic properties for these estimators are obtained (e.g., consistency, normality, Poisson convergence). The results...

10.48550/arxiv.1303.1743 preprint EN other-oa arXiv (Cornell University) 2013-01-01

Abstract 5475: Evaluation of feature selection methods used for cluster analysis in identification of novel cancer subtypes

OPENALEX - Publications

Linda Vidman David Källberg Patrik Rydén

Background: RNA-seq data from tumor samples can be used to identify novel cancer subtypes using cluster analysis. The number of features is often large compared to the number of samples, and different clusters can appear in different subsets of the feature space. Feature selection techniques are therefore commonly used to reduce the dimension and remove redundant and irrelevant features before performing cluster analysis. An abundance of feature selection methods have been proposed in the literature, but it is unclear how the ability of cluster analysis is affected by the choice of method. Method: We evaluated 13 feature selection methods on four...

10.1158/1538-7445.am2020-5475 article EN Cancer Research 2020-08-15
