- Genomics and Phylogenetic Studies
- Biomedical Text Mining and Ontologies
- Semantic Web and Ontologies
- RNA and protein synthesis mechanisms
- Bioinformatics and Genomic Networks
- Graph Theory and Algorithms
- Advanced Graph Neural Networks
- Genomics and Rare Diseases
- Machine Learning in Bioinformatics
- Plant biochemistry and biosynthesis
- Genomics and Chromatin Dynamics
- Gene expression and cancer classification
- Invertebrate Immune Response Mechanisms
- Silk-based biomaterials and applications
- Genetics, Bioinformatics, and Biomedical Research
- Chromosomal and Genetic Variations
- Silkworms and Sericulture Research
- Scientific Computing and Data Management
- Natural product bioactivities and synthesis
- Advanced Database Systems and Queries
- Genetic Associations and Epidemiology
- Environmental DNA in Biodiversity Studies
- Phytochemical compounds biological activities
- Glycosylation and Glycoproteins Research
- Research Data Management Practices
Research Organization of Information and Systems
2018-2025
The University of Tokyo
2008-2022
National Institutes of Natural Sciences
2012-2020
National Institute for Basic Biology
2014-2020
National Institute of Advanced Industrial Science and Technology
2010
Tokyo University of Science
2008
The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing users to incorporate the latest genomic data available into an analysis. Many recently accumulating draft...
The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining...
Accurate determination of the evolutionary relationships between genes is a foundational challenge in biology. Homology, or evolutionary relatedness, is in many cases readily determined based on sequence similarity analysis. By contrast, whether or not two genes directly descended from a common ancestor by a speciation event (orthologs) or a duplication event (paralogs) is more challenging to determine, yet provides critical information on the history of a gene. Since 2009, this task has been the focus of the Quest for Orthologs (QFO) Consortium. The sixth...
The Microbial Genome Database for Comparative Analysis (MBGD) is a database for comparative genomics based on comprehensive orthology analysis of bacteria, archaea and unicellular eukaryotes. MBGD now contains 6318 genomes. To utilize both closely related and distantly related genomes, MBGD previously provided two types of ortholog tables: the standard ortholog table, containing one representative genome from each genus and covering the entire taxonomic range, and the taxon-specific ortholog tables for each taxon. However, this approach has a drawback in that...
TogoID (https://togoid.dbcls.jp/) is an identifier (ID) conversion service designed to link IDs across diverse categories of life science databases. With its ability to obtain IDs related in different semantic relationships, a user-friendly web interface, and a regular automatic data update system, TogoID has been a valuable tool for bioinformatics. We have recently expanded TogoID's ability to represent semantics between datasets, enabling it to handle multiple semantic relationships within dataset pairs. This enhancement...
The microbial genome database for comparative analysis (MBGD, available at http://mbgd.genome.ad.jp/) is a platform for microbial genome comparison based on orthology analysis. As its unique feature, MBGD allows users to conduct orthology analysis among any specified set of organisms; this flexibility allows MBGD to adapt to a variety of microbial genomic studies. Reflecting the huge diversity of the microbial world, the number of microbial genome projects has now reached several thousand. To efficiently explore the entire microbial genomic data, MBGD now provides summary pages for pre-calculated ortholog tables among various taxonomic groups. For...
Summary: The Quest for Orthologs (QfO) is an open collaboration framework for experts in comparative phylogenomics and related research areas who have an interest in highly accurate orthology predictions and their applications. We here report highlights and discussion points from the QfO meeting 2015 held in Barcelona. Achievements in recent years have established a basis to support developments for improved orthology prediction and to explore new approaches. Central to the QfO effort is proper benchmarking of methods and services, as well as design...
Computational comparative analysis of multiple genomes provides valuable opportunities for biomedical research. In particular, orthology analysis can play a central role in comparative genomics; it guides establishing evolutionary relations among genes of organisms and allows functional inference of gene products. However, the wide variations in current orthology databases necessitate research toward the shareability of the content that is generated by different tools and stored in different structures. Exchanging the content with other research communities requires making...
Motivation: Understanding life cannot be accomplished without making full use of biological data, which are scattered across databases of diverse categories in the life sciences. To connect such data seamlessly, identifier (ID) conversion plays a key role. However, existing ID conversion services have disadvantages, such as covering only a limited range of databases, not keeping up with the updates of the original databases, and outputs being hard to interpret in the context of biological relations, especially when converting IDs in multiple steps. Results...
Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we constructed the Ortholog Ontology...
Background: Interspecies sequence comparison is a powerful tool to extract functional or evolutionary information from the genomes of organisms. A number of studies have compared protein sequences or promoter sequences between mammals, which provided many insights into genomics. However, the correlation between protein conservation and promoter conservation remains controversial. Results: We examined promoter conservation as well as protein conservation for 6,901 human and mouse orthologous genes, and observed a very weak correlation between them. We further investigated their relationship by decomposing it based on...
Plants produce structurally diverse triterpenes (triterpenoids and steroids). Their biosynthesis occurs from a common precursor, namely 2,3-oxidosqualene, followed by cyclization catalyzed by oxidosqualene cyclases (OSCs) to yield various triterpene skeletons. Steroids, which are biosynthesized from cycloartenol or lanosterol, are essential primary metabolites in most plant species, along with lineage-specific steroids, such as steroidal glycoalkaloids found in the Solanum species. Other triterpene skeletons...
Identification of ortholog groups is a crucial step in comparative analysis of multiple genomes. Although several computational methods have been developed to create ortholog groups, most of those methods do not evaluate orthology at the sub-gene level. In our method for domain-level ortholog clustering, DomClust, proteins are split into domains on the basis of alignment boundaries identified by all-against-all pairwise comparison, but it often fails to determine appropriate boundaries. We improve the domain-level classification using multiple alignment information....
Toward improved interoperability of distributed biological databases, an increasing number of datasets have been published in the standardized Resource Description Framework (RDF). Although the powerful SPARQL Protocol and RDF Query Language (SPARQL) provides a basis for exploiting RDF databases, writing SPARQL code is burdensome for users, including bioinformaticians. Thus, an easy-to-use interface is necessary. We developed SPANG, a SPARQL client that has unique features for querying RDF datasets. SPANG dynamically generates typical SPARQL queries...
TogoGenome is a genome database that is purely based on the Semantic Web technology, which enables the integration of heterogeneous data and flexible semantic searches. All the information is stored as Resource Description Framework (RDF) data, and the reporting web pages are generated on the fly using SPARQL Protocol and RDF Query Language (SPARQL) queries. TogoGenome provides a semantic-faceted search system by gene functional annotation, taxonomy, phenotypes and environment based on the relevant ontologies. TogoGenome also serves as an interface to conduct...
Increasing amounts of scientific and social data are published in the Resource Description Framework (RDF). Although RDF data can be queried using the SPARQL language, even the SPARQL-based operation has a limitation in implementing traversal or analytical algorithms. Recently, a variety of graph database implementations dedicated to analyses on the property graph model have emerged. However, the RDF model and the property graph model are not interoperable. Here, we developed a framework based on the Graph to Graph Mapping Language (G2GML) for mapping RDF graphs to property graphs to make the most of accumulated graph data....
We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics....
Background: The RIKEN BRC develops and maintains the BioResource MetaDatabase to help users explore appropriate target bioresources for their experiments and prepare precise and high-quality data infrastructures. The Swiss Institute of Bioinformatics develops and maintains two RDF datasets across multiple species for the study of gene expression and orthology: Bgee and Orthologous MAtrix (OMA, an orthology database). Methods: This study integrates the BioResource MetaDatabase knowledge graph with Resource Description Framework (RDF) data from Bgee, a gene expression database, OMA, DisGeNET,...
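Recently, a variety of database implementations adopting the property graph model have emerged. However, interoperable management of graph data on these implementations is challenging due to the differences in data models and formats. Here, we redefine the property graph model, incorporating the existing models, and propose interoperable serialization formats for property graphs. The model is independent of specific implementations and provides a basis for interoperable management of property graph data. The proposed serialization is not only general but also intuitive, and thus it is useful for creating and maintaining graph data. To demonstrate the practical use of our model and serialization, we implemented converters from the proposed formats into...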
In disease model mouse strains used for human disease studies, information on genomic variations is essential for elucidating the relationship between haplotypes and disease susceptibility. To select a mouse strain appropriately, it is crucial to identify variants with the same effect as disease-causing variants in humans. In BioHackathon Japan J2023, we focused on nucleotide variants involved in amino acid substitutions. We developed an API that matches variants from the MoG+ database within gene regions defined by HGNC identifiers or symbols. After the Hackathon, we will map...