- Genomics and Phylogenetic Studies
- Data Mining Algorithms and Applications
- Chromosomal and Genetic Variations
- RNA and protein synthesis mechanisms
- Epigenetics and DNA Methylation
- RNA modifications and cancer
- Rough Sets and Fuzzy Logic
- Data Management and Algorithms
- Genomics and Chromatin Dynamics
- Genetic Neurodegenerative Diseases
- Advanced Database Systems and Queries
- Neurological diseases and metabolism
- Gene expression and cancer classification
- Neurogenetic and Muscular Disorders Research
- CRISPR and Genetic Engineering
- RNA Research and Splicing
- Hereditary Neurological Disorders
- Mitochondrial Function and Pathology
- Fungal and yeast genetics research
- Amyotrophic Lateral Sclerosis Research
- RNA regulation and disease
- Bioinformatics and Genomic Networks
- Genetics and Neurodevelopmental Disorders
- Genomics and Rare Diseases
- Algorithms and Data Compression
The University of Tokyo
2016-2025
Chiba University Hospital
2019
Tohoku University
2019
Japan Science and Technology Agency
2005-2017
Gunma University
2017
Maebashi Red Cross Hospital
2017
Hirosaki University
2017
Kyorin University
2016
Toranomon Hospital
2016
University of Split
2016
The medaka fish (Oryzias latipes) is a popular pet in Japan and, more recently, a laboratory model organism for developmental genetics and evolutionary biology. Now the medaka's genome has been sequenced and analysed by a large Japanese consortium. Cichlids and stickleback, which are emerging model systems for understanding the genetic basis of vertebrate speciation, are evolutionarily closer to medaka than to zebrafish, so the medaka sequence will yield valuable insights into 400 million years of vertebrate genome evolution...
Although several vertebrate genomes have been sequenced, little is known about the genome evolution of early vertebrates and how large-scale genomic changes such as the two rounds of whole-genome duplications (2R WGD) affected evolutionary complexity and novelty in vertebrates. Reconstructing the ancestral vertebrate genome is highly nontrivial because of the difficulty in identifying traces originating from the 2R WGD. To resolve this problem, we developed a novel method capable of pinning down remains of the 2R WGD in the human and medaka fish genomes using invertebrate...
Whole-genome and -exome resequencing using next-generation sequencers is a powerful approach for identifying genomic variations that are associated with diseases. However, systematic strategies for prioritizing causative variants from many candidates to explain the disease phenotype are still far from being established, because the population-specific frequency spectrum of genetic variation has not been characterized. Here, we have collected exomic genetic variation from 1208 Japanese individuals through a collaborative effort,...
One of the most powerful techniques for attributing functions to genes in uni- and multicellular organisms is comprehensive analysis of mutant traits. In this study, systematic and quantitative analyses of mutant traits are achieved in the budding yeast Saccharomyces cerevisiae by investigating morphological phenotypes. Analysis of fluorescent microscopic images of triple-stained cells makes it possible to treat morphological variations as quantitative traits. Deletion of nearly half of the yeast genes not essential for growth affects these morphological traits. Similar morphological phenotypes are caused by deletions...
siDirect (http://design.RNAi.jp/) is a web-based online software system for computing highly effective small interfering RNA (siRNA) sequences with maximum target-specificity for mammalian RNA interference (RNAi). Highly effective siRNA sequences are selected using novel guidelines that were established through an extensive study of the relationship between siRNA sequences and RNAi activity. Our efficient software avoids off-target gene silencing to enumerate potential cross-hybridization candidates that the widely used BLAST search may overlook. The...
We discuss data mining based on association rules for two numeric attributes and one Boolean attribute. For example, in a database of bank customers, "Age" and "Balance" are two numeric attributes, and "CardLoan" is a Boolean attribute. Taking the pair (Age, Balance) as a point in a two-dimensional space, we consider an association rule of the form ((Age, Balance) ∈ P) ⇒ (CardLoan = Yes), which implies that bank customers whose ages and balances fall in a planar region P tend to use card loan with a high probability. We consider two classes of regions, rectangles and admissible (i.e. connected and x-monotone)...
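As a rough illustration of the rule form described above, the sketch below computes the support and confidence of ((Age, Balance) ∈ P) ⇒ (CardLoan = Yes) when P is an axis-parallel rectangle. The record layout, the function name `rule_stats`, and the rectangle bounds are hypothetical choices made for this example; the abstract's algorithms for finding optimized regions are not reproduced here.

```python
from typing import List, Tuple

# Hypothetical record layout: (age, balance, uses_card_loan).
Customer = Tuple[float, float, bool]


def rule_stats(
    customers: List[Customer],
    age_range: Tuple[float, float],
    balance_range: Tuple[float, float],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule
    ((Age, Balance) in P) => (CardLoan = Yes)
    for the rectangle P = age_range x balance_range (bounds inclusive)."""
    if not customers:
        return 0.0, 0.0
    lo_a, hi_a = age_range
    lo_b, hi_b = balance_range
    # Customers whose (Age, Balance) point falls inside the rectangle P.
    in_region = [
        c for c in customers
        if lo_a <= c[0] <= hi_a and lo_b <= c[1] <= hi_b
    ]
    support = len(in_region) / len(customers)
    if not in_region:
        return support, 0.0
    # Fraction of customers in P that actually use the card loan.
    confidence = sum(1 for c in in_region if c[2]) / len(in_region)
    return support, confidence
```

A region search would call a function like this (or an incremental equivalent) for many candidate regions and keep the one that optimizes the chosen criterion.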
To understand the mechanism of transcriptional regulation, it is essential to identify and characterize the promoter, which is located proximal to the mRNA start site. To identify the promoters from the large volumes of genomic sequences, we used mRNA start sites determined by a large-scale sequencing of the cDNA libraries constructed by the “oligo-capping” method. We aligned the mRNA start sites with the genomic sequences and retrieved adjacent sequences as potential promoter regions (PPRs) for 1031 genes. The PPR sequences were searched to determine the frequencies of major promoter elements. Among the 1031 PPRs, 329 (32%) contained...
We performed a large-scale cDNA analysis to explore the transcriptome of the budding yeast Saccharomyces cerevisiae. We sequenced two cDNA libraries, one from cells exponentially growing in a minimal medium and the other from meiotic cells. Both libraries were generated by using a vector-capping method that allows the accurate mapping of transcription start sites (TSSs). Consequently, we identified 11,575 TSSs associated with 3,638 annotated genomic features, including 3,599 ORFs, to suggest that most yeast genes have two or more TSSs. In...
Mining optimized association rules for numeric attributes. Authors: Takeshi Fukuda (IBM Tokyo Research Laboratory, 1623-14, Shimo-tsuruma, Yamato City, Kanagawa Pref, 242, Japan), Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama. PODS '96: Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, June 1996, pages 182–191. https://doi.org/10.1145/237661.237708. Published: 03 June 1996...
RNA interference (RNAi), mediated by 21-nucleotide (nt)-length small interfering RNAs (siRNAs), is a powerful tool not only for studying gene function but also for therapeutic applications. RNAi, requiring perfect complementarity between the siRNA guide strand and the target mRNA, was believed to be extremely specific. However, a recent growing body of evidence has suggested that siRNA could down-regulate unintended genes whose transcripts possess complementarity to the 7-nt siRNA seed region. This off-target gene silencing may often...
We study how to efficiently compute significant association rules according to common statistical measures such as a chi-squared value or correlation coefficient. For this purpose, one might consider the use of the Apriori algorithm, but the algorithm needs major conversion, because none of these statistical metrics are anti-monotone, and the use of higher support for reducing the search space cannot guarantee solutions in its search space. We here present a method of estimating a tight upper bound on the statistical metric associated with any superset of an itemset,...
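To make the anti-monotonicity remark concrete, the sketch below computes the chi-squared value of the 2×2 contingency table between "transaction contains the itemset" and a Boolean target. The transaction layout and the function name `chi_squared` are assumptions made for this illustration; the abstract's upper-bound estimation method is not reproduced here.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical layout: (items in the transaction, Boolean target value).
Transaction = Tuple[FrozenSet[str], bool]


def chi_squared(transactions: Iterable[Transaction], itemset: FrozenSet[str]) -> float:
    """Chi-squared statistic of the 2x2 table
    (contains itemset / does not) x (target true / false)."""
    a = b = c = d = 0
    for items, target in transactions:
        has = itemset <= items  # subset test: transaction contains the itemset
        if has and target:
            a += 1
        elif has:
            b += 1
        elif target:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: a row or column is empty
    return n * (a * d - b * c) ** 2 / denom


# Toy data (invented): {a} is independent of the target,
# but its superset {a, b} is strongly associated with it.
TOY = (
    [(frozenset("ab"), True)] * 4
    + [(frozenset("a"), False)] * 4
    + [(frozenset(), True)] * 2
    + [(frozenset(), False)] * 2
)
```

On this toy data, `chi_squared(TOY, frozenset("a"))` is 0.0 while `chi_squared(TOY, frozenset("ab"))` is 6.0, so a superset can score higher than its subset and Apriori-style pruning on the metric itself would discard valid solutions.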
Off-target effects are one of the most serious problems in RNA interference (RNAi). Here, we present dsCheck (http://dsCheck.RNAi.jp/), web-based online software for estimating off-target effects caused by the long double-stranded RNA (dsRNA) used in RNAi studies. In the biochemical process of RNAi, the long dsRNA is cleaved by Dicer into short-interfering RNA (siRNA) cocktails. The software simulates this process and investigates individual 19 nt substrings of the long dsRNA. Subsequently, the software promptly enumerates a list of potential off-target gene candidates based on the order...
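The substring step mentioned above can be sketched as a sliding window over the dsRNA sequence. This is only a minimal illustration: the function name `enumerate_substrings` and the T-to-U normalization are assumptions of this example, and the off-target search that dsCheck performs on each substring is not shown.

```python
from typing import Iterator, Tuple


def enumerate_substrings(dsrna: str, length: int = 19) -> Iterator[Tuple[int, str]]:
    """Yield (offset, substring) for every window of `length` nt in `dsrna`."""
    # Assumption for this sketch: accept DNA-style input and treat it as RNA.
    seq = dsrna.upper().replace("T", "U")
    # A sequence shorter than `length` yields nothing (empty range).
    for start in range(len(seq) - length + 1):
        yield start, seq[start:start + length]
```

A sequence of n nucleotides yields n − 18 windows of 19 nt, each of which would then be checked against the transcript set.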
Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For...
Objective: The objective of this study was to identify new causes of Charcot–Marie–Tooth (CMT) disease in patients with autosomal‐recessive (AR) CMT. Methods: To efficiently identify novel causative genes for AR‐CMT, we analyzed 303 unrelated Japanese patients with CMT using whole‐exome sequencing and extracted recessive variants/genes shared among multiple patients. We performed mutation screening of the newly identified membrane metalloendopeptidase (MME) gene in 354 additional patients with CMT. We clinically, genetically, pathologically,...
Centromeres and large-scale structural variants evolve and contribute to genome diversity during vertebrate speciation. Here, we perform de novo long-read genome assembly of three inbred medaka strains that are derived from geographically isolated subpopulations and undergo speciation. Using single-molecule real-time (SMRT) sequencing, we obtain three chromosome-mapped genomes of length ~734, ~678, and ~744 Mbp with a resource of twenty-two centromeric regions of length 20–345 kbp. Centromeres are positionally conserved among the three strains and even between four pairs...
Elucidating the ecological and biological identity of extrachromosomal mobile genetic elements (eMGEs), such as plasmids and bacteriophages, in the human gut remains challenging due to their high complexity and diversity. Here, we show efficient identification of eMGEs as complete circular or linear contigs from PacBio long-read metagenomic data. De novo assembly of PacBio long reads from 12 faecal samples generated 82 eMGE contigs (2.5~666.7-kb), which were classified as 71 plasmids and 11 bacteriophages, including 58 novel plasmids and six bacteriophages, and complete genomes of five diverse crAssphages...
Might DNA sequence variation reflect germline genetic activity and underlying chromatin structure? We investigated this question using medaka (Japanese killifish, Oryzias latipes), by comparing the genomic sequences of two strains (Hd-rR and HNI) and by mapping ∼37.3 million nucleosome cores from Hd-rR blastulae and 11,654 representative transcription start sites from six embryonic stages. We observed a distinctive ∼200–base pair (bp) periodic pattern of genetic variation downstream of the transcription start sites; the rate of insertions and deletions longer than 1 bp...