Raymond Wan

ORCID: 0000-0003-3202-7008
Research Areas
  • Algorithms and Data Compression
  • Reproductive tract infections research
  • Muscle Physiology and Disorders
  • Natural Language Processing Techniques
  • Advanced Data Storage Technologies
  • Topic Modeling
  • Genomics and Phylogenetic Studies
  • Urinary Tract Infections Management
  • Bioinformatics and Genomic Networks
  • Gene expression and cancer classification
  • RNA and protein synthesis mechanisms
  • Web Data Mining and Analysis
  • Single-cell and spatial transcriptomics
  • Plant and fungal interactions
  • Peer-to-Peer Network Technologies
  • Diabetes and associated disorders
  • Biomedical Text Mining and Ontologies
  • Machine Learning in Bioinformatics
  • Data Quality and Management
  • Pancreatic function and diabetes
  • Extracellular vesicles in disease
  • Lipoproteins and Cardiovascular Health
  • RNA modifications and cancer
  • RNA Research and Splicing
  • Caching and Content Delivery

Chinese University of Hong Kong
2013-2024

Prince of Wales Hospital
2024

Hong Kong University of Science and Technology
2018-2020

University of Hong Kong
2018-2020

Phillips Exeter Academy
2020

UCSF Benioff Children's Hospital
2017

The University of Melbourne
2001-2011

National Institute of Advanced Industrial Science and Technology
2009-2011

The University of Tokyo
2010-2011

Kyoto University
2005-2011

Szymon M. Kiełbasa (1), Raymond Wan (2), Kengo Sato (3), Paul Horton (2) and Martin C. Frith (2, 4). (1) Department of Computational Biology, Max Planck Institute for Molecular Genetics, Berlin D-14195, Germany; (2) Computational Biology Research Center, Tokyo 135-0064, Japan; (3) Graduate School of Frontier Sciences, University of Tokyo, Chiba 277-8561, Japan

10.1101/gr.113985.110 article EN cc-by-nc Genome Research 2011-01-05

ABSTRACT Chlamydia trachomatis is an obligate intracellular bacterium that causes a diversity of severe and debilitating diseases worldwide. Sporadic and ongoing outbreaks of lymphogranuloma venereum (LGV) strains among men who have sex with men (MSM) support the need for research on virulence factors associated with these organisms. Previous analyses have been limited to single genes or genomes of the laboratory-adapted reference strain L2/434 and the outbreak strain L2b/UCH-1/proctitis. We characterized an unusual LGV strain, termed...

10.1128/mbio.00045-11 article EN cc-by-nc-sa mBio 2011-05-04

Abstract Chlamydia trachomatis is a global cause of blinding trachoma and sexually transmitted infections (STIs). We used comparative genomics of the family Chlamydiaceae to select conserved housekeeping genes for C. trachomatis multilocus sequencing, characterizing 19 reference and 68 clinical isolates from 6 continental/subcontinental regions. There were 44 sequence types (ST). Identical STs for STI isolates were recovered from different regions, whereas STs for trachoma isolates were restricted by continent. Twenty-nine of 52 alleles had nonuniform distributions...

10.3201/eid1509.090272 article EN cc-by Emerging infectious diseases 2009-09-01

Abstract Summary: Insertional mutagenesis from virus infection is an important pathogenic risk for the development of cancer. Despite the advent of high-throughput sequencing, discovery of viral integration sites and expressed fusion events are still limited. Here, we present ViralFusionSeq (VFS), which combines soft-clipping information, read-pair analysis and targeted de novo assembly to discover and annotate viral–human fusions. VFS was used in an RNA-Seq experiment, a simulated DNA-Seq experiment and re-analysis...

10.1093/bioinformatics/btt011 article EN cc-by Bioinformatics 2013-01-12

New DNA sequencing technologies have achieved breakthroughs in throughput, at the expense of higher error rates. The primary way of interpreting biological sequences is via alignment, but standard alignment methods assume the sequences are accurate. Here, we describe how to incorporate the per-base error probabilities reported by sequencers into alignment. Unlike existing tools for DNA read mapping, our method models both sequencer errors and real sequence differences. This approach consistently improves mapping accuracy,...

10.1093/nar/gkq010 article EN cc-by-nc Nucleic Acids Research 2010-01-27

Abstract Motivation: The growth of next-generation sequencing means that more effective and efficient archiving methods are needed to store the generated data for public dissemination and in anticipation of more mature analytical methods later. This article examines compressing the quality score component of the data to partly address this problem. Results: We compare several compression policies for quality scores, in terms of both effectiveness and overall efficiency. The policies employ lossy and lossless transformations with one of several coding schemes. Experiments show...

10.1093/bioinformatics/btr689 article EN Bioinformatics 2011-12-13

Optical mapping is a technique for capturing fluorescent signal patterns of long DNA molecules (in the range of 0.1–1 Mbp). Recently, it has been complementing the widely used short-read sequencing technology by assisting with scaffolding and detecting large and complex structural variations (SVs). Here, we introduce a fast, robust and accurate tool called OMBlast for aligning optical maps, the set of signal locations on the molecules generated from optical mapping. Our method is based on the seed-and-extend approach from sequence alignment, with modifications...

10.1093/bioinformatics/btw620 article EN cc-by-nc Bioinformatics 2016-09-29

The clinical utility of personal genomic information in identifying individuals at increased risks for dyslipidemia and cardiovascular diseases remains unclear. We used data from Biobank Japan (n = 70,657-128,305) and developed novel East Asian-specific genome-wide polygenic risk scores (PRSs) for four lipid traits. We validated (n = 4271) and subsequently tested associations of these scores with 3-year lipid changes in adolescents (n = 620), carotid intima-media thickness (cIMT) in adult women (n = 781), 7723), coronary heart disease (CHD)...

10.1186/s13073-021-00831-z article EN cc-by Genome Medicine 2021-02-19

We designed and implemented a patient-centered, data-driven, holistic care model with evaluation of its impacts on clinical outcomes in patients with young-onset type 2 diabetes (T2D), for which there is a lack of evidence-based practice guidelines. In this 3-year Precision Medicine to Redefine Insulin Secretion and Monogenic Diabetes-Randomized Controlled Trial, we evaluate the effects of a multicomponent care model integrating use of information and communication technology (Joint Asia Diabetes Evaluation (JADE) platform),...

10.1136/bmjdrc-2024-004120 article EN cc-by-nc BMJ Open Diabetes Research & Care 2024-06-01

Chlamydia trachomatis (Ct) is the leading cause of bacterial sexually transmitted diseases (STD) worldwide. The Ct Multi Locus Sequence Typing (MLST) scheme is effective in differentiating strain types (ST), deciphering transmission patterns and treatment failure, and identifying recombinant strains. Here, we analyzed 323 reference and clinical samples, including 58 samples from Russia, an area that has not previously been represented in Ct typing schemes, to expand our knowledge of the global diversification of Ct STs....

10.3389/fmicb.2017.02195 article EN cc-by Frontiers in Microbiology 2017-11-13

Chlamydia trachomatis causes a high number of sexually transmitted infections worldwide, but reproducible and precise strain typing to link partners is lacking. We evaluated multilocus sequence typing (MLST) for this purpose by detecting sequence types (STs) concordant for the ompA genotype, the single-locus typing standard. We tested samples collected during April 2000-October 2003 from members of established heterosexual partnerships (dyads) in the Indianapolis, Indiana, USA, area who self-reported being coital partners within the previous 30...

10.3201/2011.140604 article EN cc-by Emerging infectious diseases 2014-09-29

Abstract Aims/hypothesis Monogenic diabetes is caused by rare mutations in genes usually implicated in beta cell biology. Common variants of monogenic diabetes genes (MDG) may jointly influence the risk of young-onset type 2 diabetes (YOD, diagnosed before the age of 40 years) and cardiovascular and kidney events. Methods Using whole-exome sequencing data, we constructed a weighted polygenic risk score (wPRS) consisting of 135 common variants (minor allele frequency >0.01) of 34 MDG based on r >0.2 for linkage disequilibrium in a discovery...

10.1007/s00125-024-06320-3 article EN cc-by Diabetologia 2024-11-23

Lateral gene transfer (LGT) is essential for generating between-strain genomic recombinants of Chlamydia trachomatis to facilitate the organism's evolution. Because there is no reliable laboratory-based system for C. trachomatis, in vitro generation of recombinants from antibiotic-resistant strains is being used to study LGT. However, selection pressures imposed on these recombinants likely affect the statistical properties of recombination relative to naturally occurring clinical recombinants, including prevalence at particular loci. We examined...

10.1128/jb.06268-11 article EN Journal of Bacteriology 2011-11-29

We describe a software system for managing text files of up to several hundred megabytes that combines a number of useful facilities. First, the text is stored compressed using a variant of the RE-PAIR mechanism described by Larsson and Moffat, with space savings comparable to those obtained by other widely used general-purpose compression systems. Second, we provide, as a byproduct of the compression process, a phrase-based browsing tool that allows users to explore the contents of the source text in a natural manner. And third, once a set of desired phrases has...

10.1109/spire.2001.989752 article EN 2005-08-25

Abstract Adult tissue repair and regeneration require the activation of resident stem/progenitor cells that can self-renew and generate differentiated progeny. The regenerative capacity of skeletal muscle relies on muscle satellite cells (MuSCs) and their interplay with different cell types within the niche. Yet, our understanding of the cells that compose the skeletal muscle tissue is limited and molecular definitions of the principal cell types are lacking. Using a combined approach of single-cell RNA-sequencing and mass cytometry, we precisely mapped the different cell types in adult skeletal muscle and highlighted previously...

10.1101/304683 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2018-04-19

With the increased usage of next generation sequencing, the problem of effectively storing and transmitting such massive amounts of data will need to be addressed. Current repositories such as the Sequence Read Archive (SRA) currently use the FASTQ format and a general-purpose compression system (GZIP) for archiving. In this work, we investigate how GZIP (and BZIP2) can be made more effective for read archiving by pre-sorting the reads. The improvement in compression effectiveness for just the sequences is a reduction of at most 12%, and up to 6% when the original...

10.1109/bibmw.2010.5703863 article EN 2010-12-01