Causal Representation Learning from Multimodal Biological Observations
DOI: 10.48550/arxiv.2411.06518
Publication Date: 2024-11-10
AUTHORS (13)
ABSTRACT
Prevalent in biological applications (e.g., human phenotype measurements), multimodal datasets can provide valuable insights into the underlying biological mechanisms. However, current machine learning models designed to analyze such datasets still lack interpretability and theoretical guarantees, which are essential to biological applications. Recent advances in causal representation learning have shown promise in uncovering interpretable latent causal variables with formal theoretical certificates. Unfortunately, existing works for multimodal distributions either rely on restrictive parametric assumptions or provide rather coarse identification results, limiting their applicability to biological research, which favors a detailed understanding of the mechanisms. In this work, we aim to develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biological datasets. Theoretically, we consider a flexible nonparametric latent distribution (cf. parametric assumptions in prior work) permitting causal relationships across potentially different modalities. We establish identifiability guarantees for each latent component, extending the subspace identification results from prior work. Our key theoretical ingredient is the structural sparsity of the causal connections among distinct modalities, which, as we will discuss, is natural for a large collection of biological systems. Empirically, we propose a practical framework to instantiate our theoretical insights. We demonstrate the effectiveness of our approach through extensive experiments on both numerical and synthetic datasets. Results on a real-world dataset are consistent with established medical research, validating our theoretical and methodological framework.