NFDI4DS | UHH-SEMS - Publication Details

Arif Canakoglu

ORCID: 0000-0003-4528-6586

About

Contact & Profiles

A5023498681

Research Areas

Gene expression and cancer classification
Genomics and Phylogenetic Studies
Biomedical Text Mining and Ontologies
Bioinformatics and Genomic Networks
SARS-CoV-2 and COVID-19 Research
Machine Learning in Bioinformatics
Cancer Genomics and Diagnostics
vaccines and immunoinformatics approaches
Scientific Computing and Data Management
Genetic Associations and Epidemiology
Single-cell and spatial transcriptomics
Algorithms and Data Compression
Bacteriophages and microbial interactions
Epigenetics and DNA Methylation
Semantic Web and Ontologies
Genomics and Chromatin Dynamics
RNA modifications and cancer
Evolutionary Algorithms and Applications
Respiratory Support and Mechanisms
COVID-19 diagnosis using AI
Machine Learning and Data Classification
Genomics and Rare Diseases
Cardiac Arrest and Resuscitation
Data Mining Algorithms and Applications
Liver Disease Diagnosis and Treatment

Politecnico di Milano
2013-2025

Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico
2022-2025

Stanford University
2023

University of Cyprus
2013

Chinese University of Hong Kong
2013

Applied Multilayers (United Kingdom)
2013

Association of COVID-19 Vaccinations With Intensive Care Unit Admissions and Outcome of Critically Ill Patients With COVID-19 Pneumonia in Lombardy, Italy

OPENALEX - Publications

Giacomo Grasselli Alberto Zanella Eleonora Carlesso Gaetano Florio Arif Canakoglu and 95 more

Importance Data on the association of COVID-19 vaccination with intensive care unit (ICU) admission and outcomes patients SARS-CoV-2–related pneumonia are scarce. Objective To evaluate whether is associated preventing ICU for to compare baseline characteristics vaccinated unvaccinated admitted an ICU. Design, Setting, Participants This retrospective cohort study regional data sets reports: (1) daily number administered vaccines (2) all consecutive in Lombardy, Italy, from August 1 December...

10.1001/jamanetworkopen.2022.38871 article EN cc-by-nc-nd JAMA Network Open 2022-10-27

Processing of big heterogeneous genomic datasets for tertiary analysis of Next Generation Sequencing data

Marco Masseroli Arif Canakoglu Pietro Pinoli Abdulrahman Kaitoua Andrea Gulino and 6 more

We previously proposed a paradigm shift in genomic data management, based on the Genomic Data Model (GDM) for mediating existing formats and GenoMetric Query Language (GMQL) supporting, at high level of abstraction, extraction most common data-driven computations required by tertiary analysis Next Generation Sequencing datasets. Here, we present new GMQL-based system with enhanced accessibility, portability, scalability performance. The has well-designed modular architecture featuring: (i) an...

10.1093/bioinformatics/bty688 article EN Bioinformatics 2018-08-06

GenoSurf: metadata driven semantic search system for integrated genomic datasets

Arif Canakoglu Anna Bernasconi Andrea Colombo Marco Masseroli Stefano Ceri

Many valuable resources developed by world-wide research institutions and consortia describe genomic datasets that are both open available for secondary research, but their metadata search interfaces heterogeneous, not interoperable sometimes with very limited capabilities. We implemented GenoSurf, a multi-ontology semantic system providing access to consolidated collection of attributes found in the most relevant datasets; values 10 semantically enriched making use suited ontologies. The...

10.1093/database/baz132 article EN cc-by Database 2019-01-01

ViruSurf: an integrated database to investigate viral sequences

Arif Canakoglu Pietro Pinoli Anna Bernasconi Tommaso Alfonsi Damianos P. Melidis and 1 more

ViruSurf, available at http://gmql.eu/virusurf/, is a large public database of viral sequences and integrated curated metadata from heterogeneous sources (RefSeq, GenBank, COG-UK NMDC); it also exposes computed nucleotide amino acid variants, called original sequences. A GISAID-specific ViruSurf database, http://gmql.eu/virusurf_gisaid/, offers subset these functionalities. Given the current pandemic outbreak, SARS-CoV-2 data are collected four sources; but contains other virus species...

10.1093/nar/gkaa846 article EN cc-by Nucleic Acids Research 2020-09-21

VirusViz: comparative analysis and effective visualization of viral nucleotide and amino acid variants

Anna Bernasconi Andrea Gulino Tommaso Alfonsi Arif Canakoglu Pietro Pinoli and 2 more

Variant visualization plays an important role in supporting the viral evolution analysis, extremely valuable during COVID-19 pandemic. VirusViz is a web-based application for comparing variants of selected populations and their sub-populations; it primarily focused on SARS-CoV-2 variants, although tool also supports other species (SARS-CoV, MERS-CoV, Dengue, Ebola). As input, imports results queries extracting metadata from large database ViruSurf, which integrates information about...

10.1093/nar/gkab478 article EN cc-by Nucleic Acids Research 2021-05-24

The ENCODE Imputation Challenge: a critical assessment of methods for cross-cell type imputation of epigenomic profiles

Jacob Schreiber Carles Boix Jin wook Lee Hongyang Li Yuanfang Guan and 37 more

A promising alternative to comprehensively performing genomics experiments is to, instead, perform a subset of and use computational methods impute the remainder. However, identifying best imputation what measures meaningfully evaluate performance are open questions. We address these questions by analyzing 23 from ENCODE Imputation Challenge. find that evaluations challenging confounded distributional shifts differences in data collection processing over time, amount available data,...

10.1186/s13059-023-02915-y article EN cc-by Genome biology 2023-04-18

Elucidating the causal relationship of mechanical power and lung injury: a dynamic approach to ventilator management

Chao-Ping Wu Arif Canakoglu Jacob Vine Anya Mathur Rahul Nath and 7 more

Background Mechanical power (MP) serves as a crucial predictive indicator for ventilator-induced lung injury and plays pivotal role in tailoring the management of mechanical ventilation. However, its application across different diseases stages remains nuanced. Methods Using AmsterdamUMCdb, we conducted retrospective study to analyze causal relationship between MP outcomes invasive ventilation, specifically SpO2/FiO2 ratio (P/F) ventilator-free days at day 28 (VFD28). We employed...

10.1186/s40635-025-00736-w article EN cc-by Intensive Care Medicine Experimental 2025-02-28

The road towards data integration in human genomics: players, steps and interactions

Anna Bernasconi Arif Canakoglu Marco Masseroli Stefano Ceri

Thousands of new experimental datasets are becoming available every day; in many cases, they produced within the scope large cooperative efforts, involving a variety laboratories spread all over world, and typically open for public use. Although potential collective amount information is huge, effective combination such sources hindered by data heterogeneity, as exhibit wide notations formats, concerning both values metadata. Thus, integration fundamental activity, to be performed prior...

10.1093/bib/bbaa080 article EN Briefings in Bioinformatics 2020-04-22

META-BASE: A Novel Architecture for Large-Scale Genomic Metadata Integration

Anna Bernasconi Arif Canakoglu Marco Masseroli Stefano Ceri

The integration of genomic metadata is, at the same time, an important, difficult, and well-recognized challenge. It is important because a wealth public data repositories available to drive biological clinical research; combining information from various heterogeneous widely dispersed sources paramount number discoveries. difficult domain complex there no agreement among definitions, which refer different vocabularies ontologies. in bioinformatics community because, common practice, are...

10.1109/tcbb.2020.2998954 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020-06-01

Integration and Querying of Genomic and Proteomic Semantic Annotations for Biomedical Knowledge Extraction

Marco Masseroli Arif Canakoglu Stefano Ceri

Understanding complex biological phenomena involves answering biomedical questions on multiple biomolecular information simultaneously, which are expressed through genomic and proteomic semantic annotations scattered in many distributed heterogeneous data sources; such heterogeneity dispersion hamper the biologists' ability of asking global queries performing evaluations. To overcome this problem, we developed a software architecture to create maintain Genomic Proteomic Knowledge Base (GPKB),...

10.1109/tcbb.2015.2453944 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015-07-08

Investigating Deep Learning Based Breast Cancer Subtyping Using Pan-Cancer and Multi-Omic Data

Francisco Cristovao Silvia Cascianelli Arif Canakoglu Mark Carman Luca Nanni and 2 more

Breast Cancer comprises multiple subtypes implicated in prognosis. Existing stratification methods rely on the expression quantification of small gene sets. Next Generation Sequencing promises large amounts omic data next years. In this scenario, we explore potential machine learning and, particularly, deep for breast cancer subtyping. Due to paucity publicly available data, leverage pan-cancer and non-cancer design semi-supervised settings. We make use multi-omic including microRNA...

10.1109/tcbb.2020.3042309 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020-12-03

OpenGDC: Unifying, Modeling, Integrating Cancer Genomic Data and Clinical Metadata

Eleonora Cappelli Fabio Cumbo Anna Bernasconi Arif Canakoglu Stefano Ceri and 2 more

Next Generation Sequencing technologies have produced a substantial increase of publicly available genomic data and related clinical/biospecimen information. New models methods to easily access, integrate search them effectively are needed. An effort was made by the Genomic Data Commons (GDC), which defined strict procedures for harmonizing clinical cancer, created GDC portal with its application programming interface (API). In this work, we enhance harmonization applying state art model...

10.3390/app10186367 article EN cc-by Applied Sciences 2020-09-12

EpiSurf: metadata-driven search server for analyzing amino acid changes within epitopes of SARS-CoV-2 and other viral species

Anna Bernasconi Luca Cilibrasi Ruba Al Khalaf Tommaso Alfonsi Stefano Ceri and 2 more

EpiSurf is a Web application for selecting viral populations of interest and then analyzing how their amino acid changes are distributed along epitopes. Viral sequences searched within ViruSurf, which stores curated metadata imported from the most widely used deposition sources databases (GenBank, COVID-19 Genomics UK (COG-UK) Global initiative on sharing all influenza data (GISAID)). Epitopes open source Immune Epitope Database or directly proposed by users indicating start stop positions...

10.1093/database/baab059 article EN cc-by Database 2021-09-01

ViruClust: direct comparison of SARS-CoV-2 genomes and genetic variants in space and time

Luca Cilibrasi Pietro Pinoli Anna Bernasconi Arif Canakoglu Matteo Chiara and 1 more

The ongoing evolution of SARS-CoV-2 and the rapid emergence variants concern at distinct geographic locations have relevant implications for implementation strategies controlling COVID-19 pandemic. Combining growing body data evidence on potential functional mutations can suggest highly effective methods prioritization novel concern, e.g. increasing in frequency locally and/or globally. However, these analyses may be complex, requiring integration different resources. We claim need a...

10.1093/bioinformatics/btac030 article EN Bioinformatics 2022-01-13

PyGMQL: scalable data extraction and analysis for heterogeneous genomic datasets

Luca Nanni Pietro Pinoli Arif Canakoglu Stefano Ceri

Background With the growth of available sequenced datasets, analysis heterogeneous processed data can answer increasingly relevant biological and clinical questions. Scientists are challenged in performing efficient reproducible extraction pipelines over heterogeneously datasets. Available software packages suitable for analyzing experimental files from such datasets one by one, but do not scale to thousands experiments. Moreover, they lack proper support metadata manipulation....

10.1186/s12859-019-3159-9 article EN cc-by BMC Bioinformatics 2019-11-08

Integrative warehousing of biomolecular information to support complex multi-topic queries for biomedical knowledge discovery

Arif Canakoglu Marco Masseroli Stefano Ceri Luca Tettamanti Giorgio Ghisalberti and 1 more

Biomedical questions are often complex and address multiple topics simultaneously. Answering them requires the comprehensive evaluation of several different types data. They available, but in distributed heterogeneous data sources; this hampers their global evaluation. We developed a software architecture to create maintain updated Genomic Proteomic Data Warehouse (GPDW), which integrates main such dispersed It uses modular multi-level schema based on abstraction generalization integrated...

10.1109/bibe.2013.6701584 article EN 2013-11-01

GeCoAgent: A Conversational Agent for Empowering Genomic Data Extraction and Analysis

Pietro Crovari Sara Pidò Pietro Pinoli Anna Bernasconi Arif Canakoglu and 2 more

With the availability of reliable and low-cost DNA sequencing, human genomics is relevant to a growing number end-users, including biologists clinicians. Typical interactions require applying comparative data analysis huge repositories genomic information for building new knowledge, taking advantage latest findings in applied healthcare. Powerful technology extraction available, but broad use hampered by complexity accessing such methods tools. This work presents GeCoAgent, big-data service...

10.1145/3464383 article EN ACM Transactions on Computing for Healthcare 2021-10-15
