- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Advanced Memory and Neural Computing
- Cloud Computing and Resource Management
- Distributed and Parallel Computing Systems
- Interconnection Networks and Systems
- Embedded Systems Design Techniques
- Ferroelectric and Negative Capacitance Devices
- Low-power high-performance VLSI design
- Distributed systems and fault tolerance
- Caching and Content Delivery
- Catalytic Processes in Materials Science
- IoT and Edge/Fog Computing
- Magnetic properties of thin films
- Neural Networks and Applications
- Ammonia Synthesis and Nitrogen Reduction
- Catalysis and Hydrodesulfurization Studies
- Optimization and Search Problems
Technical University of Munich
2022-2024
The University of Tokyo
2013-2021
Ion Technology Center (Japan)
2020
Two performance gaps in the memory hierarchy, between CPU cache and main memory and between main memory and mass storage, will become increasingly severe bottlenecks for computing-system performance. Although it is necessary to increase memory capacity to fill these gaps, power also increases when conventional volatile memories are used. A new nonvolatile memory for this purpose has been anticipated. Storage class memory is used to fill the second gap. Many candidates exist: ReRAM, PRAM, and 3D cross-point type with resistive change RAM. However, last level cache (LLC)...
The supercomputer "Fugaku", which recently ranked number one on multiple supercomputing lists, including the Top500 in June 2020, has various power control features, such as (1) an eco mode that utilizes only one of the two floating-point pipelines while decreasing the power supply to the chip; (2) a boost mode that increases the clock frequency; and (3) a core retention function that turns unused cores into a low-power state. By orchestrating these power-performance features while considering the characteristics of currently running applications,...
Future HPC systems, including post-exascale supercomputers, will face severe problems such as the slowing-down of Moore's law and the limitation of power supply. To achieve the desired system performance improvement while counteracting these issues, hardware design optimization is a key factor. In this paper, we investigate future directions of SIMD-based processor architectures by using the A64FX chip and a customized version of power/performance/area simulators, i.e., Gem5 and McPAT. More specifically, based on the A64FX chip,...
This paper presents GreenCourier, a novel scheduling framework that enables the runtime scheduling of serverless functions across geographically distributed regions based on their carbon efficiencies. Our framework incorporates an intelligent scheduling strategy for Kubernetes and supports Knative as the serverless platform. To obtain real-time carbon information for different geographical regions, our framework supports multiple marginal carbon emissions sources such as WattTime and the Carbon-aware SDK. We comprehensively evaluate the performance of our framework using the Google Kubernetes Engine and production serverless function...
Implementing last level caches (LLCs) with STT-MRAM is a promising approach for designing energy efficient microprocessors due to the high density and low leakage power of its memory cells. However, the peripheral circuits of an STT-MRAM cache still suffer from leakage power because large, leaky transistors are required to drive the write current to the STT-MRAM element. To overcome this problem, we propose a new power management scheme called Immediate Sleep (IS). IS immediately turns off an STT-MRAM subarray if the next access is predicted to be not critical in...
CPU-GPU heterogeneous systems are now commonly used in HPC (High-Performance Computing). However, improving the utilization and energy-efficiency of such systems is still one of the most critical issues. As a single program typically cannot fully utilize all resources within a node/chip, co-scheduling (or co-locating) multiple programs with complementary resource requirements is a promising solution. Meanwhile, as power consumption has become the first-class design constraint for HPC systems, such techniques should be...
This paper describes a proposal of a non-volatile cache architecture utilizing a novel DRAM/MRAM cell-level hybrid structured memory (D-MRAM) that enables effective power reduction for high performance mobile SoCs without area overhead. Here, the key point to reduce active power is the intermittent refresh process for the DRAM-mode. D-MRAM has the advantage of lower static power consumption compared to conventional SRAM, because there are no static leakage paths in the D-MRAM cell and it is not necessary to supply voltage to its cells when used in MRAM-mode....
This paper describes state-of-the-art STT-MRAM, which can drastically save the energy consumption dissipated in cache memory systems compared with conventional SRAM-based ones. This paper also presents how to build a cache memory hierarchy with both the state-of-the-art STT-MRAM and SRAM to reduce energy consumption. The key point is "break-even-time aware design" based on normally-off operation. For further power reduction, an intelligent power management technique for STT-MRAM-based cache memory is discussed.
GPU-based heterogeneous architectures are now commonly used in HPC clusters. Due to their architectural simplicity specialized for data-level parallelism, GPUs can offer much higher computational throughput and memory bandwidth than CPUs of the same generation do. However, as the available resources in GPUs have increased exponentially over the past decades, it has become increasingly difficult for a single program to fully utilize them. As a consequence, the industry has started supporting several resource partitioning...
In modern microprocessors, lower level cache memories are usually implemented as unified caches where different classes of cachelines such as data, instructions, and Page Table Entries (PTEs) coexist. Particularly, frequent PTE accesses following TLB misses can happen on modern systems, which is driven by the increasing demands of applications for larger working set sizes, and this trend naturally leads to significant conflicts among these different kinds of cachelines. This paper targets this emerging conflict problem...
Tackling climate change by reducing and eventually eliminating carbon emissions is a significant milestone on the path toward establishing an environmentally sustainable society. As we transition into the exascale era, marked by an increasing demand and scale of HPC resources, the HPC community must embrace the challenge of reducing carbon emissions from designing and operating modern HPC systems. In this position paper, we describe challenges and highlight different opportunities that can aid HPC sites in reducing the carbon footprint of modern HPC systems.