- Genomics and Chromatin Dynamics
- Rough Sets and Fuzzy Logic
- Gene expression and cancer classification
- Bioinformatics and Genomic Networks
- Cancer Genomics and Diagnostics
- Data Mining Algorithms and Applications
- RNA modifications and cancer
- Growth Hormone and Insulin-like Growth Factors
- Epigenetics and DNA Methylation
- Machine Learning in Bioinformatics
- RNA Research and Splicing
- Logic, programming, and type systems
- RNA and protein synthesis mechanisms
- Pituitary Gland Disorders and Treatments
- Cancer, Hypoxia, and Metabolism
- Logic, Reasoning, and Knowledge
- Computational Drug Discovery Methods
- Genomics and Phylogenetic Studies
- Angiogenesis and VEGF in Cancer
- Cancer-related molecular mechanisms research
- Biomedical Text Mining and Ontologies
- Thyroid Disorders and Treatments
- Semantic Web and Ontologies
- HIV Research and Treatment
- Protease and Inhibitor Mechanisms
Institute of Computer Science
2015-2024
Uppsala University
2014-2023
Linnaeus University
2004-2023
Science for Life Laboratory
2013-2023
Polish Academy of Sciences
2014-2023
Swedish Collegium for Advanced Study
2021-2022
Medical University of Lodz
2011-2021
Umeå University
2008-2021
Oregon National Primate Research Center
2021
Karolinska Institutet
2021
The discovery of drivers of cancer has traditionally focused on protein-coding genes
The genomes of higher organisms are packaged in nucleosomes with functional histone modifications. Until now, genome-wide nucleosome and histone modification studies have focused on transcription start sites (TSSs), where nucleosomes in RNA polymerase II (RNAPII) occupied genes are well positioned and have histone modifications that are characteristic of expression status. Using public data, we here show that there is a nucleosome-positioning signal in internal human exons and that this positioning is independent of expression. We observed similarly strong...
Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires...
Multi-omics datasets represent distinct aspects of the central dogma of molecular biology. Such high-dimensional molecular profiles pose challenges to data interpretation and hypothesis generation. ActivePathways is an integrative method that discovers significantly enriched pathways across multiple datasets using statistical data fusion, rationalizes contributing evidence and highlights associated genes. As part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from...
Purpose: Patients with metastatic adenocarcinoma of unknown origin are a common clinical problem. Knowledge of the primary site is important for their management, but histologically, such tumors appear similar. Better diagnostic markers are needed to enable the assignment of metastases to likely sites of origin on pathologic samples. Experimental Design: Expression profiling of 27 candidate markers was done using tissue microarrays and immunohistochemistry. In the first (training) round, we studied 352 adenocarcinomas, from...
Pre-selection of informative features for supervised classification is a crucial, albeit delicate, task. It is desirable that feature selection provides the features that contribute most to the classification task per se and which should therefore be used by any classifier later used to produce classification rules. In this article, a conceptually simple but computer-intensive approach to this task is proposed. The reliability of the approach rests on multiple construction of a tree classifier for many training sets randomly chosen from the original sample set, where samples in each training set consist of only...
Two major types of genetic variation are known: single nucleotide polymorphisms (SNPs), and a more recently discovered structural variation, involving changes in copy number (CNVs) of kilobase- to megabase-sized chromosomal segments. It is unknown whether CNVs arise in somatic cells, but it is, however, generally assumed that normal cells are genetically identical. We tested 34 tissue samples from three subjects and, having analyzed for each tissue ≤10⁻⁶ of all cells expected in an adult human, we observed at...
Most neurological diseases are associated with chronic inflammation initiated by the activation of microglia, which produce cytotoxic and inflammatory factors. Signal transducers and activators of transcription (STATs) are potent regulators of gene expression, but the contribution of particular STATs to inflammatory gene expression and to the STAT-dependent transcriptional networks underlying brain inflammation needs to be identified. In the present study, we investigated the genomic distribution of Stat binding sites and the role of Stats in gene expression in lipopolysaccharide (LPS)-activated primary...
Butyrate is a histone deacetylase inhibitor (HDACi) with anti-neoplastic properties, which theoretically reactivates epigenetically silenced genes by increasing global histone acetylation. However, recent studies indicate that a similar number or even more genes are down-regulated than up-regulated by this drug. We treated hepatocarcinoma HepG2 cells with butyrate and characterized the levels of acetylation at DNA-bound histones H3 and H4 by ChIP-chip along the ENCODE regions. In contrast to the global increases of histone acetylation, many...
The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the PCAWG...
An operational semantics of the Prolog programming language is introduced. Meta-IV is used to specify the semantics. One purpose of the work is to provide a specification of an implementation of a Prolog interpreter. Another one is an application of this specification to a formal description of program optimization techniques based on the principle of partial evaluation. Transformations which account for pruning, forward data structure propagation and opening (which also provides backward data structure propagation) are formally introduced and proved to preserve the meaning of programs. The...
Transcription factors and histone modifications are crucial regulators of gene expression that mutually influence each other. We present the DNA binding profiles of upstream stimulatory factors 1 and 2 (USF1, USF2) and acetylated histone H3 (H3ac) in a liver cell line for the whole human genome using ChIP-chip at a resolution of 35 base pairs. We determined that these three proteins bind mostly in proximity of the transcription start sites (TSSs) of protein coding genes, and their bindings are positively correlated with gene expression levels. Based on the spatial and functional...
Expression of a large number of yeast genes is repressed by glucose. The zinc finger protein Mig1 is the main effector in glucose repression, but yeast also has two related proteins: Mig2 and Mig3. We have used microarrays to study global gene expression in all possible combinations of mig1, mig2 and mig3 deletion mutants. Mig1 and Mig2 repress a largely overlapping set of genes on 2% glucose. Genes that are upregulated in a mig1 mig2 double mutant were grouped according to the contribution of Mig2. Most of them show partially redundant repression, with Mig1 being the major repressor,...
Relapse is the leading cause of death of adult and pediatric patients with acute myeloid leukemia (AML). Numerous studies have helped to elucidate the complex mutational landscape at diagnosis of AML, leading to improved risk stratification and new therapeutic options. However, multi-whole-genome studies of adult and pediatric AML at relapse are necessary for further advances. To this end, we performed whole-genome and whole-exome sequencing analyses of longitudinal diagnosis, relapse, and/or primary resistant specimens from 48 adult and 25 pediatric patients with AML. We identified...
The aim of the present study was to generate hypotheses on the involvement of uncharacterized genes in biological processes. To this end, supervised learning was used to analyze microarray-derived time-series gene expression data. Our method was objectively evaluated on known genes using cross-validation and provided high-precision Gene Ontology biological process classifications for 211 of the 213 uncharacterized genes in the data set used. In addition, new roles in biological processes were hypothesized for known genes. Our method uses biological knowledge expressed by Gene Ontology and generates a rule model associating this knowledge with minimal...
Microarray technology enables large-scale inference of the participation of genes in biological process from similar expression profiles. Our aim is to induce classificatory models from expression data and biological knowledge that can automatically associate genes with novel hypotheses of biological process. We report a systematic supervised learning approach to predicting biological process from time series of gene expression data and biological knowledge. Biological knowledge is expressed using gene ontology and this knowledge is associated with discriminatory expression-based features to form minimal decision rules. The resulting rule...
Disease-associated SNPs detected in large-scale association studies are frequently located in non-coding genomic regions, suggesting that they may be involved in transcriptional regulation. Here we describe a new strategy for detecting regulatory SNPs (rSNPs), by combining computational and experimental approaches. Whole genome ChIP-chip data for USF1 was analyzed using a novel motif finding algorithm called BCRANK. 1754 binding sites were identified and 140 candidate rSNPs were found in the predicted sites. For...
Gene expression is regulated by combinations of transcription factors, which can be mapped to regulatory elements on a genome-wide scale using ChIP experiments. In a previous ChIP-chip study of USF1 and USF2 we found evidence also of binding of GABP, FOXA2 and HNF4a within the enriched regions. Here, we have applied ChIP-seq for these transcription factors and identified 3064 peaks of enrichment for GABP, 7266 for FOXA2 and 18783 for HNF4a. Distal elements with USF2 signal were frequently bound also by HNF4a and FOXA2. GABP peaks were found at transcription start sites, whereas 94% of FOXA2 and 90% of HNF4a peaks were located at other positions. We...