Damla Senol Cali

ORCID: 0000-0002-3665-6285
Research Areas
  • Genomics and Phylogenetic Studies
  • Algorithms and Data Compression
  • Advanced Data Storage Technologies
  • Parallel Computing and Optimization Techniques
  • Machine Learning in Bioinformatics
  • RNA and protein synthesis mechanisms
  • Chromosomal and Genetic Variations
  • DNA and Biological Computing
  • Evolutionary Algorithms and Applications
  • Network Packet Processing and Optimization
  • Cloud Computing and Resource Management
  • Nanopore and Nanochannel Transport Studies
  • Caching and Content Delivery
  • Genetics, Bioinformatics, and Biomedical Research
  • Advanced Image and Video Retrieval Techniques
  • Genomics and Chromatin Dynamics
  • Molecular Biology Techniques and Applications
  • RNA modifications and cancer
  • Advanced Memory and Neural Computing
  • Ferroelectric and Negative Capacitance Devices
  • Natural Language Processing Techniques
  • Gene expression and cancer classification
  • Interconnection Networks and Systems
  • Protist diversity and phylogeny
  • Microbial Community Ecology and Physiology

BioNano Genomics (United States)
2022-2024

Carnegie Mellon University Australia
2024

Carnegie Mellon University
2018-2023

Associazione Medici Diabetologi
2021

Intel (United States)
2020

Seed location filtering is critical in DNA read mapping, a process where billions of DNA fragments (reads) sampled from a donor are mapped onto a reference genome to identify genomic variants of the donor. State-of-the-art read mappers 1) quickly generate possible mapping locations for seeds (i.e., smaller segments) within each read, 2) extract reference sequences at each of the mapping locations, and 3) check similarity between each read and its associated reference sequences with a computationally-expensive algorithm (i.e., sequence alignment) to determine the origin of the read. A seed...

10.1186/s12864-018-4460-0 article EN cc-by BMC Genomics 2018-05-01

Genome sequence analysis has enabled significant advancements in medical and scientific areas such as personalized medicine, outbreak tracing, and the understanding of evolution. To perform genome sequencing, devices extract small random fragments of an organism's DNA sequence (known as reads). The first step of genome sequence analysis is a computational process known as read mapping. In read mapping, each fragment is matched to its potential location in the reference genome with the goal of identifying the original location of each read in the genome. Unfortunately, rapid genome sequencing is currently...

10.1109/micro50266.2020.00081 article EN 2020-10-01

Generating the hash values of short subsequences, called seeds, enables quickly identifying similarities between genomic sequences by matching seeds with a single lookup of their hash values. However, these hash values can be used only for finding exact-matching seeds, as conventional hashing methods assign distinct hash values to different seeds, including highly similar seeds. Finding only exact-matching seeds causes either (i) increasing the use of costly sequence alignment or (ii) limited sensitivity. We introduce...

10.1093/nargab/lqad004 article EN cc-by NAR Genomics and Bioinformatics 2023-01-10

Genome analysis fundamentally starts with a process known as read mapping, where sequenced fragments of an organism's genome are compared against a reference genome. Read mapping is currently a major bottleneck in the entire genome analysis pipeline, because state-of-the-art genome sequencing technologies are able to sequence a genome much faster than the computational techniques employed to analyze the genome. We describe the ongoing journey in significantly improving the performance of read mapping. We explain algorithmic methods and hardware-based acceleration...

10.1109/mm.2020.3013728 article EN IEEE Micro 2020-08-03

It has become increasingly difficult to understand the complex interaction between modern applications and main memory, composed of Dynamic Random Access Memory (DRAM) chips. Manufacturers and researchers are developing many different types of DRAM, with each DRAM type catering to different needs (e.g., high throughput, low power, high memory density). At the same time, the memory access patterns of prevalent and emerging applications are rapidly diverging, as these applications manipulate larger data sets in very different ways. As a result, the combined DRAM-workload behavior is...

10.1145/3309697.3331482 article EN 2019-06-20

Long reads produced by third-generation sequencing technologies are used to construct an assembly (i.e., the subject's genome), which is further used in downstream genome analysis. Unfortunately, long reads have high sequencing error rates, and a large proportion of bps in these long reads is incorrectly identified. These errors propagate to the assembly and affect the accuracy of genome analysis. Assembly polishing algorithms minimize such error propagation by polishing or fixing errors in the assembly using information from alignments between reads and the assembly (i.e., read-to-assembly alignment information). However, assembly polishing algorithms can only polish...

10.1093/bioinformatics/btaa179 article EN Bioinformatics 2020-03-11

Modern data-intensive applications demand high computation capabilities with strict power constraints. Unfortunately, such applications suffer from a significant waste of both execution cycles and energy in current computing systems due to the costly data movement between the computation units and the memory units. Genome analysis and weather prediction are two examples of such applications. Recent FPGAs couple a reconfigurable fabric with high-bandwidth memory (HBM) to enable more efficient data movement and improve overall performance and energy efficiency. This trend is an example...

10.1109/mm.2021.3088396 article EN IEEE Micro 2021-06-10

Read mapping is a fundamental step in many genomics applications. It is used to identify potential matches and differences between fragments (called reads) of a sequenced genome and an already known genome (called a reference genome). Read mapping is costly because it needs to perform approximate string matching (ASM) on large amounts of data. To address the computational challenges in genome analysis, prior works propose various approaches such as accurate filters that select the reads within a sequencing dataset of genomic reads (called a read set) that must undergo expensive...

10.1145/3503222.3507702 article EN 2022-02-22

A critical step of genome sequence analysis is the mapping of sequenced DNA fragments (i.e., reads) collected from an individual to a known linear reference genome sequence (i.e., sequence-to-sequence mapping). Recent works replace the linear reference sequence with a graph-based representation of the reference genome, which captures the genetic variations and diversity across many individuals in a population. Mapping reads to the graph-based reference genome (i.e., sequence-to-graph mapping) results in notable quality improvements in genome analysis. Unfortunately, while sequence-to-sequence mapping is well studied with many available tools and accelerators, sequence-to-graph mapping is a more...

10.1145/3470496.3527436 preprint EN 2022-05-31

AirLift is the first read remapping tool that enables users to quickly and comprehensively map a read set, which had been previously mapped to one reference genome, to another similar reference. Users can then quickly run a downstream analysis of read sets for each latest reference release. Compared to the state-of-the-art method for remapping reads (i.e., full mapping), AirLift reduces the overall execution time to remap read sets between two reference genome versions by up to 27.4×. We validate our remapping results with GATK and find that AirLift provides high accuracy in identifying ground truth SNP/INDEL variants.

10.1109/tcbb.2024.3433378 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024-01-01

Pairwise sequence alignment is a very time-consuming step in common bioinformatics pipelines. Speeding up this step requires heuristics, efficient implementations, and/or hardware acceleration. A promising candidate for all of the above is the recently proposed GenASM algorithm. We identify and address three inefficiencies in the GenASM algorithm: it has a high amount of data movement, a large memory footprint, and does some unnecessary work. We propose Scrooge, a fast and memory-frugal genomic sequence aligner. Scrooge includes novel...

10.1093/bioinformatics/btad151 article EN cc-by Bioinformatics 2023-03-24

Nanopore sequencing is a widely-used high-throughput genome sequencing technology that can sequence long fragments of a genome into raw electrical signals at low cost. Nanopore sequencing requires two computationally-costly processing steps for accurate downstream genome analysis. The first step, basecalling, translates the raw electrical signals into nucleotide bases (i.e., A, C, G, T). The second step, read mapping, finds the correct location of a read in a reference genome. In existing genome analysis pipelines, basecalling and read mapping are executed separately. We observe in this work that such separate...

10.1109/micro56248.2022.00056 article EN 2022-10-01

Read mapping is a fundamental, yet computationally-expensive step in many genomics applications. It is used to identify potential matches and differences between fragments (called reads) of a sequenced genome and an already known genome (called a reference genome). To address the computational challenges in genome analysis, prior works propose various approaches such as filters that select the reads that must undergo expensive computation, efficient heuristics, and hardware acceleration. While effective at reducing the computation overhead,...

10.48550/arxiv.2202.10400 preprint EN cc-by arXiv (Cornell University) 2022-01-01

It has become increasingly difficult to understand the complex interactions between modern applications and main memory, composed of Dynamic Random Access Memory (DRAM) chips. Manufacturers are now selling and proposing many different types of DRAM, with each DRAM type catering to different needs (e.g., high throughput, low power, high memory density). At the same time, the memory access patterns of prevalent and emerging applications are rapidly diverging, as these applications manipulate larger data sets in very different ways. As a result, the combined DRAM-workload behavior is...

10.1145/3366708 article EN Proceedings of the ACM on Measurement and Analysis of Computing Systems 2019-12-17

It has become increasingly difficult to understand the complex interaction between modern applications and main memory, composed of Dynamic Random Access Memory (DRAM) chips. Manufacturers and researchers are developing many different types of DRAM, with each DRAM type catering to different needs (e.g., high throughput, low power, high memory density). At the same time, the memory access patterns of prevalent and emerging applications are rapidly diverging, as these applications manipulate larger data sets in very different ways. As a result, the combined DRAM-workload behavior is...

10.1145/3376930.3376989 article EN ACM SIGMETRICS Performance Evaluation Review 2019-12-17

Generating the hash values of short subsequences, called seeds, enables quickly identifying similarities between genomic sequences by matching seeds with a single lookup of their hash values. However, these hash values can be used only for finding exact-matching seeds, as conventional hashing methods assign distinct hash values to different seeds, including highly similar seeds. Finding only exact-matching seeds causes either 1) increasing the use of costly sequence alignment or 2) limited sensitivity. We introduce BLEND, the first efficient and accurate mechanism that...

10.1101/2022.11.23.517691 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2022-11-25

Profile hidden Markov models (pHMMs) are widely employed in various bioinformatics applications to identify similarities between biological sequences, such as DNA or protein sequences. In pHMMs, sequences are represented as graph structures, where states and edges capture modifications (i.e., insertions, deletions, and substitutions) by assigning probabilities to them. These probabilities are subsequently used to compute the similarity score between a sequence and a pHMM graph. The Baum-Welch algorithm, a prevalent and highly accurate method,...

10.1145/3632950 article EN ACM Transactions on Architecture and Code Optimization 2023-12-28

Motivation: A genome read dataset can be quickly and efficiently remapped from one reference to another similar reference (e.g., between two reference versions or two similar species) using a variety of tools, e.g., the commonly used CrossMap tool. With the explosion of available genomic datasets and references, high-performance remapping tools will be even more important for keeping up with the computational demands of genome assembly and analysis. Results: We provide FastRemap, a fast and efficient tool for remapping reads between genome assemblies. FastRemap provides 7.82×...

10.1093/bioinformatics/btac554 article EN Bioinformatics 2022-08-17

Optical genome maps (OGM) from Bionano enable the detection of genomic structural and copy number variants that cannot be detected by next-generation sequencing (NGS) technologies and are often missed by conventional cytogenetic techniques. Bionano has developed bioinformatics pipelines for calling these variants, including the Solve de novo assembly pipeline for constitutional analysis and the Rare Variant Analysis (RVA) pipeline for low allele-fraction cancer applications.

10.1016/j.gimo.2024.101761 article EN cc-by-nc-nd Genetics in Medicine Open 2024-01-01

Background: Optical genome maps (OGM) from Bionano enable the detection of genomic structural and copy number variants that cannot be detected by next-generation sequencing (NGS) technologies and are often missed by conventional cytogenetic techniques. Bionano has developed bioinformatics pipelines for calling these variants, including the Solve de novo assembly pipeline for constitutional analysis and the Rare Variant Analysis (RVA) pipeline for low-allele-fraction cancer applications. Both pipelines are computationally intensive and currently take 5-10 hours...

10.1158/1538-7445.am2024-2337 article EN Cancer Research 2024-03-22