Daehwan Kim

ORCID: 0000-0003-1182-629X
Research Areas
  • Genomics and Phylogenetic Studies
  • RNA modifications and cancer
  • RNA and protein synthesis mechanisms
  • Gene Regulatory Network Analysis
  • Single-cell and spatial transcriptomics
  • Gut microbiota and health
  • Machine Learning in Bioinformatics
  • Cancer Genomics and Diagnostics
  • Chromosomal and Genetic Variations
  • RNA Research and Splicing
  • Gene expression and cancer classification
  • Molecular Biology Techniques and Applications
  • Scientific Computing and Data Management
  • Biomedical Text Mining and Ontologies
  • Natural Language Processing Techniques
  • Genomics and Chromatin Dynamics
  • Retinoids in leukemia and cellular processes
  • Interferon and immune responses
  • SARS-CoV-2 and COVID-19 Research
  • Cancer-related molecular mechanisms research
  • Advanced biosensing and bioanalysis techniques
  • Viral Infections and Immunology Research
  • Genomic variations and chromosomal abnormalities
  • Algorithms and Data Compression
  • Amino Acid Enzymes and Metabolism

The University of Texas Southwestern Medical Center
2018-2023

Southwestern Medical Center
2021-2023

Johns Hopkins Medicine
2012-2017

Johns Hopkins University
2012-2017

University of Maryland, College Park
2011-2013

Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical high-throughput...

10.1101/gr.210641.116 article EN cc-by-nc Genome Research 2016-10-17

TopHat-Fusion is an algorithm designed to discover transcripts representing fusion gene products, which result from the breakage and re-joining of two different chromosomes, or from rearrangements within a chromosome. TopHat-Fusion is an enhanced version of TopHat, an efficient program that aligns RNA-seq reads without relying on existing annotation. Because it is independent of gene annotation, TopHat-Fusion can discover fusion products deriving from known genes, unknown genes and unannotated splice variants of known genes. Using RNA-seq data from breast and prostate cancer cell lines,...

10.1186/gb-2011-12-8-r72 article EN cc-by Genome biology 2011-08-11

Sequencing technologies using nucleotide conversion techniques, such as cytosine to thymine in bisulfite-seq and thymine to cytosine in SLAM-seq, are powerful tools to explore the chemical intricacies of cellular processes. To date, no one has developed a unified methodology for aligning converted sequences and consolidating alignment of these technologies in one package. In this paper, we describe hierarchical indexing for spliced alignment of transcripts-3 nucleotides (HISAT-3N), which can rapidly and accurately align sequences consisting of any nucleotide conversion by leveraging index repeat...

10.1101/gr.275193.120 article EN cc-by-nc Genome Research 2021-06-08

Centrifuge is a novel microbial classification engine that enables rapid, accurate and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4,078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical...

10.1101/054965 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2016-05-24

Rapid advances in next-generation sequencing technologies have dramatically changed our ability to perform genome-scale analyses of human genomes. The human reference genome used for most genomic analyses represents only a small number of individuals, limiting its usefulness for genotyping. We designed a novel method, HISAT-genotype, for representing and searching an expanded model of the human reference genome, in which a comprehensive catalogue of known genomic variants and haplotypes is incorporated into the data structure used for searching and alignment. This strategy...

10.1101/266197 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2018-02-15

HISAT is a new, highly efficient system for alignment of sequences from RNA sequencing experiments that achieves dramatically faster performance than previous methods. HISAT uses a new indexing scheme, hierarchical indexing, which is based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index. Hierarchical indexing employs two types of indexes for alignment: (1) a whole-genome FM index to anchor each alignment, and (2) numerous local FM indexes for very rapid extensions of these alignments. HISAT’s hierarchical index for the human genome contains...

10.1101/012591 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2014-12-12

Algorithms for classifying chromosomes, like convolutional deep neural networks (CNNs), show promise to augment cytogeneticists' workflows; however, a critical limitation is their inability to accurately classify various structural chromosomal abnormalities. In hematopathology, recurrent cytogenetic abnormalities herald diagnostic, prognostic and therapeutic implications, but are laborious for expert cytogeneticists to identify. Non-recurrent cytogenetic abnormalities also occur frequently in cancerous cells. Here, we demonstrate...

10.1093/bioinformatics/btab822 article EN Bioinformatics 2021-12-01

Objectives: Our study aimed to develop a machine learning (ML) model to accurately classify acute promyelocytic leukemia (APL) from other types of acute myeloid leukemia (other AML) using multicolor flow cytometry (MFC) data. Multicolor flow cytometry is used to determine immunophenotypes that serve as disease signatures for diagnosis. Methods: We used a data set of MFC files from 27 patients with APL and 41 patients with other AML, including those with uncommon immunophenotypes. Our ML pipeline involved training a graph neural network (GNN) to output graph-level...

10.1093/ajcp/aqad145 article EN American Journal of Clinical Pathology 2023-10-25

BACKGROUND: Novel fusion transcripts (FTs) caused by chromosomal rearrangement are common factors in the development of cancers. In the current study, the authors used massively parallel RNA sequencing to identify new FTs in colon cancers. METHODS: Massively parallel RNA sequencing (RNA‐Seq) and TopHat‐Fusion were used to identify new FTs in colon cancers. The authors then investigated whether the novel FT nuclear receptor subfamily 5, group A, member 2 (NR5A2)‐Kelch‐like family member 29 (KLHL29FT) was transcribed from a genomic rearrangement. Next, the expression of NR5A2‐KLHL29FT was measured by quantitative real‐time...

10.1002/cncr.30510 article EN Cancer 2017-01-12

Each novel SARS-CoV-2 variant renews concerns about decreased vaccine efficacy caused by evasion of vaccine-induced neutralizing antibodies. However, accumulating epidemiological data show that while vaccine prevention of infection varies, protection from severe disease and death remains high. Thus, immune responses beyond neutralization could contribute to vaccine efficacy. Polyclonal antibodies function through their Fab domains to neutralize virus directly, and through their Fc domains to induce non-neutralizing host responses via engagement of Fc receptors on...

10.1101/2022.08.12.22278726 preprint EN cc-by-nc-nd medRxiv (Cold Spring Harbor Laboratory) 2022-08-16

With the vast improvements in sequencing technologies and the increased number of protocols, sequencing is being used to answer complex biological problems. Subsequently, analysis pipelines have become more time consuming and complicated, usually requiring highly extensive prevalidation steps. Here, we present SeqWho, a program designed to assess heuristically the quality of sequencing files and reliably classify the organism and protocol type by using Random Forest classifiers trained on biases native in k-mer frequencies and repeat sequence identities.

10.1093/bioinformatics/btac050 article EN Bioinformatics 2022-01-27

Introduction: There has long been a desire to understand, describe, and model gene regulatory networks controlling numerous biologically meaningful processes like differentiation. Despite many notable improvements to models over the years, many models do not accurately capture subtle biological and chemical characteristics of the cell such as high-order chromatin domains of the chromosomes. Methods: Topologically Associated Domains (TAD) are one of these genomic regions that are enriched for contacts...

10.12688/f1000research.110936.1 preprint EN cc-by F1000Research 2022-04-14

There has long been a desire to understand, describe, and model gene regulatory networks controlling numerous biologically meaningful processes like differentiation. Despite many notable improvements to models over the years, many models do not accurately capture subtle biological and chemical characteristics of the cell such as high-order chromatin domains of the chromosomes. Topologically Associated Domains (TAD) are one of these genomic regions that are enriched for contacts within themselves. Here we present...

10.1101/2021.04.27.441672 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2021-04-28