- Parallel Computing and Optimization Techniques
- Distributed and Parallel Computing Systems
- Advanced Data Storage Technologies
- Cloud Computing and Resource Management
- Distributed systems and fault tolerance
- Scientific Computing and Data Management
- Interconnection Networks and Systems
- Optical Network Technologies
- Graph Theory and Algorithms
- Advanced Combustion Engine Technologies
- Caching and Content Delivery
- Semiconductor Lasers and Optical Devices
- Embedded Systems Design Techniques
- Photonic and Optical Devices
- Advanced Photonic Communication Systems
- Algorithms and Data Compression
- Solid State Laser Technologies
- Advanced Optical Network Technologies
- Peer-to-Peer Network Technologies
- Laser Design and Applications
- Vehicle emissions and performance
- Advanced Graph Neural Networks
- Electromagnetic Scattering and Analysis
- Advanced Neural Network Applications
- Software System Performance and Reliability
RIKEN Center for Computational Science
2018-2025
Tokyo Institute of Technology
2013-2023
RIKEN
2023
Association for Computing Machinery
2019-2020
University of Tennessee at Knoxville
2011-2018
National Institute of Advanced Industrial Science and Technology
2004-2016
Computing Center
2004-2016
Tokyo University of Technology
2016
Japan Science and Technology Agency
2008-2014
Institut national de recherche en informatique et en automatique
2013
The sustained growth of data traffic volume calls for the introduction of an efficient and scalable transport platform for links of 100 Gb/s and beyond in the future optical network. In this article, after briefly reviewing the existing major technology options, we propose a novel, spectrum-efficient network architecture called SLICE. SLICE enables sub-wavelength, superwavelength, and multiple-rate accommodation in a highly spectrum-efficient manner, thereby providing fractional bandwidth service. Dynamic variation...
Over the last 20 years, the open-source community has provided more and more software on which the world’s high-performance computing systems depend for performance and productivity. The community has invested millions of dollars and years of effort to build key components. However, although the investments in these separate elements have been tremendously valuable, a great deal of productivity has also been lost because of the lack of planning, coordination, and integration of the technologies necessary to make them work together smoothly and efficiently, both within...
Large scientific applications deployed on current petascale systems expend a significant amount of their execution time dumping checkpoint files to remote storage. New fault-tolerant techniques will be critical to efficiently exploit post-petascale systems. In this work, we propose a low-overhead, high-frequency multi-level checkpoint technique in which we integrate a highly reliable topology-aware Reed-Solomon encoding in a three-level checkpoint scheme. We hide the encoding time using one fault-tolerance dedicated thread per node. We implement...
Over the past four years, the Big Data and Exascale Computing (BDEC) project organized a series of five international workshops that aimed to explore the ways in which the new forms of data-centric discovery introduced by the ongoing revolution in high-end data analysis (HDA) might be integrated with the established, simulation-centric paradigm of the high-performance computing (HPC) community. Based on those meetings, we argue that the rapid proliferation of digital data generators, the unprecedented growth in the volume and diversity of the data they generate,...
Recent developments in High Level Synthesis tools have attracted software programmers to accelerate their high-performance computing applications on FPGAs. Even though it has been shown that FPGAs can compete with GPUs in terms of performance for stencil computation, most previous work achieves this by avoiding spatial blocking and restricting input dimensions relative to FPGA on-chip memory. In this work, we create a stencil accelerator using Intel FPGA SDK for OpenCL that achieves high performance without having such restrictions. We...
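As a rough illustration of what spatial blocking means for a stencil, the sketch below applies one step of a 5-point averaging stencil to a 2D grid, once directly and once tile by tile. It is plain Python for readability, not the FPGA/OpenCL design described in the abstract; the function names, the averaging stencil, and the `block` parameter are all assumptions made for this example.

```python
def point_update(grid, i, j):
    # 5-point average: the cell plus its four direct neighbours.
    return (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
            + grid[i][j - 1] + grid[i][j + 1]) / 5.0


def stencil_reference(grid):
    """One stencil step over the whole interior; boundary cells are copied."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = point_update(grid, i, j)
    return out


def stencil_blocked(grid, block=4):
    """Same step, but the interior is visited in block x block tiles.

    Each tile only reads its own cells plus a one-cell halo, so the
    working set per tile is bounded regardless of the grid size.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for bi in range(1, rows - 1, block):
        for bj in range(1, cols - 1, block):
            for i in range(bi, min(bi + block, rows - 1)):
                for j in range(bj, min(bj + block, cols - 1)):
                    out[i][j] = point_update(grid, i, j)
    return out


# Hypothetical 10 x 13 input grid used only to compare the two versions.
sample = [[float((3 * i + 7 * j) % 11) for j in range(13)] for i in range(10)]
same = stencil_blocked(sample, block=3) == stencil_reference(sample)
```

Because both versions read only the input grid and write to a separate output, the tiles are independent and the blocked result matches the reference exactly; the tile size only changes how much data has to be resident at once.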
The Grid Datafarm (Gfarm) architecture is designed for global petascale data-intensive computing. It provides a parallel filesystem with online storage, scalable I/O bandwidth, and scalable parallel processing, and it can exploit local I/O in a grid of clusters with tens of thousands of nodes. Gfarm APIs and commands provide a single filesystem image and manipulate metadata consistently. Fault tolerance and load balancing are automatically managed by file duplication or recomputation using a command history log. Preliminary performance evaluation has shown...
We demonstrated, for the first time, a novel spectrum-efficient elastic optical path network for 100 Gb/s services and beyond, based on flexible-rate transceivers and variable-bandwidth wavelength crossconnects.
The scale of high performance computing (HPC) systems is growing exponentially, potentially causing prohibitive shrinkage of the mean time between failures (MTBF), while the overall increase in the I/O performance of parallel file systems will be far behind the increase in scale. As such, there have been various attempts to decrease checkpoint overhead, one of which is to employ compression techniques on checkpoint files. While most of the existing techniques focus on lossless compression, their compression rates and thus their effectiveness remain rather limited. Instead, we propose a loss...
Word embedding has been well accepted as an important feature in the area of natural language processing (NLP). Specifically, the Word2Vec model learns high-quality word embeddings and is widely used in various NLP tasks. The training of Word2Vec is sequential on a CPU due to strong dependencies between word–context pairs. In this paper, we aim to scale Word2Vec on a GPU cluster. To do this, one main challenge is reducing the dependencies inside a large training batch. We heuristically design a variation of Word2Vec, which ensures that each word–context pair contains...
While there have been several proposals of high-performance global computing systems, scheduling schemes for the systems have not been well investigated. The reason is the difficulty of evaluation by large-scale benchmarks with reproducible results. Our Bricks performance evaluation system allows analysis and comparison of various scheduling schemes in a typical high-performance global computing setting. Bricks can simulate various behaviors of global computing systems, especially the behavior of networks and resource scheduling algorithms. Moreover, Bricks is partitioned into components such that not only its constituents can be replaced to simulate different...
Formation and oxidation of soot particles in a direct-injection diesel engine. Experimental study using the two-color method
MapReduce is a programming model that enables efficient massive data processing in large-scale computing environments such as supercomputers and clouds. Such large-scale computers employ GPUs to enjoy their good peak performance and high memory bandwidth. Since the performance of each job depends on the characteristics of the running application and the underlying computing environments, scheduling MapReduce tasks onto CPU cores and GPU devices for efficient execution is difficult. To address this problem, we have proposed a hybrid scheduling technique for GPU-based computer clusters, which...
General-Purpose computing on Graphics Processing Units (GPGPU) is becoming popular in HPC because of its high peak performance. However, in spite of the potential performance improvements as well as recent promising results in scientific computing applications, its real performance is not necessarily higher than that of current high-performance CPUs, especially with recent trends towards an increasing number of cores on a single die. This is because GPU performance can be severely limited by such restrictions as memory size and bandwidth and programming using graphics-specific...
Over the last 20 years, the open-source community has provided more and more software on which the world’s high-performance computing systems depend for performance and productivity. The community has invested millions of dollars and years of effort to build key components. Although the investments in these separate elements have been tremendously valuable, a great deal of productivity has also been lost because of the lack of planning, coordination, and integration of the technologies necessary to make them work together smoothly and efficiently, both within individual...
As the capability and component count of systems increase, the MTBF decreases. Typically, applications tolerate failures with checkpoint/restart to a parallel file system (PFS). While simple, this approach can suffer from contention for PFS resources. Multi-level checkpointing is a promising solution. However, while multi-level checkpointing is successful on today's machines, it is not expected to be sufficient for exascale-class machines, which are predicted to have orders of magnitude larger memory sizes and failure rates. Our solution...
In high performance computing (HPC), applications are periodically checkpointed to stable storage to increase the success rate of long executions. Nowadays, the overhead imposed by disk-based checkpointing is about 20% of execution time, and in the coming years it will be more than 50% if the checkpoint frequency increases as the fault frequency increases. Diskless checkpointing has been introduced as a solution to avoid the I/O bottleneck of disk-based checkpointing. However, the encoding time, the dedicated resources (the spares), and the memory overhead of diskless checkpointing are significant obstacles against its...
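The encoding step that diskless checkpointing relies on can be sketched with the simplest possible code, a single XOR parity block held by a spare. This is a minimal illustration under stated assumptions, not the encoding used in the paper above: the helper names, the three-node group, and the byte strings standing in for in-memory checkpoints are all hypothetical, and single parity tolerates only one lost checkpoint per group.

```python
from functools import reduce


def xor_bytes(a, b):
    # Byte-wise XOR of two equal-length buffers.
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(checkpoints):
    """XOR all equal-length checkpoints into one parity block (kept on a spare)."""
    return reduce(xor_bytes, checkpoints)


def recover(survivors, parity):
    """Rebuild the single missing checkpoint from the survivors and the parity."""
    return reduce(xor_bytes, survivors, parity)


# Hypothetical in-memory checkpoints of three nodes (all the same length).
nodes = [b"node0-state", b"node1-state", b"node2-state"]
parity = encode_parity(nodes)

# Simulate losing node 1 and rebuilding its checkpoint without touching disk.
lost = 1
survivors = [c for i, c in enumerate(nodes) if i != lost]
restored = recover(survivors, parity)
```

The costs named in the abstract are visible even at this scale: computing `parity` is the encoding time, the buffer holding it is the dedicated spare, and every node must keep its own checkpoint in memory alongside the application state.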
The Computational Grid is a promising platform for the deployment of various high-performance computing applications. A number of projects have addressed the idea of software as a service on the network. These systems usually implement client-server architectures with many servers running on distributed resources and have commonly been referred to as network-enabled servers (NES). An important question is that of scheduling in this multi-client multi-server scenario. Note that in this context most requests are computationally intensive as they...
One of the advantages of virtualized computing clusters compared to traditional shared HPC environments is their ability to accommodate user-specific system customization. However, past attempts at providing virtual clusters are not scalable with an increasing number of VMs, nor do they allow fine-grained customization, assuming that preconfigured VM images are always available on the grid. We propose a new virtual cluster installation technique that achieves efficiency and scalability, yet simultaneously achieves customizability. It allows...
The ability of two strains of bacteria to cooperate in the synthesis of an enzyme complex (a minicellulosome) was examined. Three Bacillus subtilis strains were constructed to express Clostridium cellulovorans genes engB, xynB, and minicbpA. MiniCbpA, EngB, and XynB were synthesized and secreted into the medium by B. subtilis. When the strains with minicbpA and engB or xynB were cocultured, minicellulosomes were synthesized, consisting in one case of miniCbpA and EngB and in the second case of miniCbpA and XynB. Both minicellulosomes showed their respective enzymatic activities. We call this phenomenon...