Danilo Horta

ORCID: 0000-0003-4832-2124
Research Areas
  • Advanced Clustering Algorithms Research
  • Data Management and Algorithms
  • Data Mining Algorithms and Applications
  • Genetic Mapping and Diversity in Plants and Animals
  • Genetic Associations and Epidemiology
  • Bioinformatics and Genomic Networks
  • Single-cell and spatial transcriptomics
  • Bayesian Methods and Mixture Models
  • Gene Regulatory Network Analysis
  • Gene expression and cancer classification
  • RNA Research and Splicing
  • CRISPR and Genetic Engineering
  • Text and Document Classification Technologies
  • Metaheuristic Optimization Algorithms Research
  • Face and Expression Recognition
  • Genomics and Chromatin Dynamics
  • RNA modifications and cancer
  • Genetic and phenotypic traits in livestock
  • Big Data and Business Intelligence

European Bioinformatics Institute
2016-2022

Wellcome Trust
2019-2021

Universidade de São Paulo
2009-2015

Brazilian Society of Computational and Applied Mathematics
2012-2015

Microsoft (United States)
2014

Universidade Federal de São Carlos
2009

Single-cell RNA sequencing (scRNA-seq) enables characterizing the cellular heterogeneity in human tissues. Recent technological advances have enabled the first population-scale scRNA-seq studies in hundreds of individuals, allowing to assay genetic effects with single-cell resolution. However, existing strategies to analyze these data remain based on principles established for the analysis of bulk RNA-seq. In particular, current methods depend on a priori definitions of discrete cell types, and hence cannot assess...

10.15252/msb.202110663 article EN cc-by Molecular Systems Biology 2022-08-01

Motivation: Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compared a standard statistical test, a score test, with a recently developed likelihood ratio (LR) test. Further, when correction for hidden structure is needed, or gene–gene interactions are sought,...

10.1093/bioinformatics/btu504 article EN cc-by Bioinformatics 2014-07-29

The comparison of ordinary partitions of a set of objects is well established in the clustering literature, which comprehends several studies on the analysis of the properties of similarity measures for comparing partitions. However, similarity measures for clusterings are not readily applicable to biclusterings, since each bicluster is a tuple of two sets (of rows and columns), whereas a cluster is only a single set (of rows). Some biclustering similarity measures have been defined as minor contributions in papers that primarily report on proposals and evaluation of biclustering algorithms or comparative...

10.1109/tcbb.2014.2325016 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014-05-16

Joint genetic models for multiple traits have helped to enhance association analyses. Most existing multi-trait models have been designed to increase power for detecting associations, whereas the analysis of interactions has received considerably less attention. Here, we propose iSet, a method based on linear mixed models to test for interactions between sets of variants and environmental states or other contexts. Our model generalizes previous interaction tests and in particular provides a test for local differences in the genetic architecture between contexts. We first use simulations...

10.1371/journal.pgen.1006693 article EN cc-by PLoS Genetics 2017-04-20

10.1016/j.tcs.2011.05.039 article EN publisher-specific-oa Theoretical Computer Science 2011-06-04

Identifying regulatory genetic effects in pluripotent cells provides important insights into disease variants with potentially transient or developmental origins. Combining existing and newly-generated data, we characterized 1,367 iPSC lines from 948 unique donors, collectively analyzed within the “Integrated iPSC QTL” (i2QTL) Consortium. The sample size of our study allowed us to derive the most comprehensive map of quantitative trait loci (QTL) in human pluripotent cells to date. We mapped the effects of nearby common genetic variants on five...

10.1101/784967 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2019-09-28

Different environmental factors, including diet, physical activity, or external conditions, can contribute to genotype-environment interactions (GxE). Although high-dimensional environmental data are increasingly available, and multiple environments have been implicated with GxE at the same loci, multi-environment tests for GxE are not established. Such joint analyses can increase power to detect GxE and improve the interpretation of these effects. Here, we propose the structured linear mixed model (StructLMM), a...

10.1101/270611 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2018-02-25

Similarity measures for comparing clusterings are an important component, e.g., of evaluating clustering algorithms, consensus clustering, and clustering stability assessment. These measures have been studied for over 40 years in the domain of exclusive hard clusterings (exhaustive and mutually exclusive object sets). In the past years, the literature has proposed measures to handle more general clusterings (e.g., fuzzy/probabilistic clusterings). This paper provides an overview of these new measures and discusses their drawbacks. We ultimately develop a corrected-for-chance measure (13AGRI)...

10.5555/2789272.2912095 article EN Journal of Machine Learning Research 2015-01-01

This paper is concerned with the computational efficiency of clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. Two relational versions of an evolutionary algorithm for clustering are derived and compared against two systematic (repetitive) approaches that can also be used to estimate the number of clusters in relational data. Exhaustive experiments involving six artificial data sets as well as real data sets are reported and analyzed.

10.1109/isda.2009.80 article EN 2009-01-01

This paper is concerned with the computational efficiency of clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data.

10.3233/his-2010-0119 article EN International Journal of Hybrid Intelligent Systems 2010-12-13

Single cell RNA sequencing (scRNA-seq) enables characterizing the cellular heterogeneity in human tissues. Technological advances have enabled the first population-scale scRNA-seq studies in hundreds of individuals, allowing to assay genetic effects with single-cell resolution. However, existing strategies to perform analyses using scRNA-seq remain based on principles established for bulk RNA-seq. In particular, current methods depend on a priori definitions of discrete cell types, and hence cannot assess allelic...

10.1101/2021.09.01.458524 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2021-09-02

Joint genetic models for multiple traits have helped to enhance association analyses. Most existing multi-trait models have been designed to increase power for detecting associations, whereas the analysis of interactions has received considerably less attention. Here, we propose iSet, a method based on linear mixed models to test for interactions between sets of variants and environmental states or other contexts. Our model generalizes previous interaction tests and in particular provides a test for local differences in the genetic architecture between contexts. We first use...

10.1101/097477 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2016-12-31

The features describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that performs fuzzy clustering and aspects weighting simultaneously was recently proposed. However, there are several situations where the data set is represented by proximity matrices only (relational data), which renders approaches, including SCAD, inappropriate. To handle this kind of data, the relational CARD, based on the SCAD algorithm, has been...

10.1109/isda.2011.6121709 article EN 2011-11-01