NFDI4DS | UHH-SEMS - Publication Details

Miwako Tsuji

ORCID: 0000-0003-4709-1969

About

Contact & Profiles

A5079973660

Research Areas

Parallel Computing and Optimization Techniques
Distributed and Parallel Computing Systems
Advanced Data Storage Technologies
Metaheuristic Optimization Algorithms Research
Evolutionary Algorithms and Applications
Interconnection Networks and Systems
Advanced Numerical Methods in Computational Mathematics
Matrix Theory and Algorithms
Quantum Computing Algorithms and Architecture
Embedded Systems Design Techniques
Cloud Computing and Resource Management
Nuclear reactor physics and engineering
Model Reduction and Neural Networks
Advanced Multi-Objective Optimization Algorithms
Semiconductor materials and devices
Nuclear Physics and Applications
Data Mining Algorithms and Applications
Electromagnetic Simulation and Numerical Methods
Software System Performance and Reliability
Electromagnetic Scattering and Analysis
Quantum and electron transport phenomena
Numerical methods for differential equations
Genomics and Phylogenetic Studies
Multi-Criteria Decision Making
Particle physics theoretical and experimental studies

RIKEN Center for Computational Science
2013-2024

RIKEN
2022

Hokkaido University
1998-2015

University of Tsukuba
2009-2014

Institut Lavoisier de Versailles
2013

Muroran Institute of Technology
2008

Fujitsu (Japan)
2002

Osaka University
1998

Enamine (Ukraine)
1948

Co-Design for A64FX Manycore Processor and “Fugaku”

OPENALEX - Publications

Mitsuhisa Sato Yutaka Ishikawa Hirofumi Tomita Yuetsu Kodama Tetsuya Odajima and 10 more

We have been carrying out the FLAGSHIP 2020 Project to develop the Japanese next-generation flagship supercomputer, Post-K, recently named "Fugaku". We have designed an original many-core processor based on Armv8 instruction sets with the Scalable Vector Extension (SVE), the A64FX processor, as well as a system including an interconnect and a storage subsystem, with the industry partner, Fujitsu. The "co-design" of the system and applications is a key to making it power efficient and high performance. We determined architectural parameters by reflecting...

10.1109/sc41405.2020.00051 article EN 2020-11-01

First-principles calculations of electron states of a silicon nanowire with 100,000 atoms on the K computer

Yukihiro Hasegawa Junichi Iwata Miwako Tsuji Daisuke Takahashi Atsushi Oshiyama and 8 more

Real space DFT (RSDFT) is a simulation technique most suitable for massively-parallel architectures to perform first-principles electronic-structure calculations based on density functional theory. We here report unprecedented simulations of the electron states of silicon nanowires with up to 107,292 atoms, carried out during the initial performance evaluation phase of the K computer being developed at RIKEN.

10.1145/2063384.2063386 article EN 2011-11-08

Preliminary Performance Evaluation of the Fujitsu A64FX Using HPC Applications

Tetsuya Odajima Yuetsu Kodama Miwako Tsuji Motohiko Matsuda Yutaka Maruyama and 1 more

RIKEN Center for Computational Science has been installing the supercomputer Fugaku. The Fujitsu A64FX, based on the Armv8.2-A+SVE architecture, is used in the system. In this paper, we evaluated seven HPC applications and benchmarks on the A64FX. In a performance comparison with the Marvell (Cavium) ThunderX2 processor and the Intel Xeon Skylake processor, the A64FX achieved higher performance in memory bandwidth-intensive applications thanks to its high memory bandwidth. However, we confirmed that performance decreased from the lack of out-of-order resources. To...

10.1109/cluster49012.2020.00075 article EN 2020-09-01

Performance evaluation of ultra-large-scale first-principles electronic structure calculation code on the K computer

Yukihiro Hasegawa Junichi Iwata Miwako Tsuji Daisuke Takahashi Atsushi Oshiyama and 6 more

Silicon nanowires are potentially useful in next-generation field-effect transistors, and it is important to clarify the electron states of silicon nanowires to know the behavior of new devices. Computer simulations are promising tools for calculating electron states. The real-space density functional theory (RSDFT) code performs first-principles electronic structure calculations. To obtain higher performance, we applied various optimization techniques to the code: multi-level parallelization, load balance management, sub-mesh/torus...

10.1177/1094342013508163 article EN The International Journal of High Performance Computing Applications 2013-10-17

Quantum circuit synthesis via a random combinatorial search

Sahel Ashhab Fumiki Yoshihara Miwako Tsuji Mitsuhisa Sato Kouichi Semba

We use a random search technique to find quantum gate sequences that implement perfect quantum state preparation or unitary operator synthesis with arbitrary targets. This approach is based on the recent discovery that there is a large multiplicity of quantum circuits that achieve unit fidelity in performing a given target operation, even at the minimum number of single-qubit and two-qubit gates needed to achieve unit fidelity. We show that the fraction of perfect-fidelity circuits increases rapidly as soon as the circuit size exceeds the minimum size required for achieving unit fidelity. This result implies...

10.1103/physreva.109.052605 article EN Physical Review A 2024-05-06

Preliminary Performance Evaluation of Application Kernels Using ARM SVE with Multiple Vector Lengths

Yuetsu Kodama Tetsuya Odajima Motohiko Matsuda Miwako Tsuji Jinpil Lee and 1 more

Modern high performance processors are equipped with very wide SIMD instruction sets. SVE (Scalable Vector Extension) is an ARM® SIMD technology that supports vector lengths from 128 bits to 2048 bits. One of its promising features is to offer "vector-length agnostic" programming, which allows the same code to run on hardware of any vector length without modification of the code. This feature would be useful to explore the most appropriate resources in the space of various combinations of parameters, in order to make more efficient use of resources, since...

10.1109/cluster.2017.93 article EN 2017-09-01

Co-Design and System for the Supercomputer “Fugaku”

Mitsuhisa Sato Yuetsu Kodama Miwako Tsuji Tetsuya Odajima

The supercomputer "Fugaku" is an exascale manycore-based parallel system developed as a Japanese national flagship supercomputer in the FLAGSHIP 2020 Project. While Fugaku was ranked first for several benchmarks such as TOP500, HPCG, HPL-AI, and Graph500 in 2020, the major design concept is application-first, by co-design, for power efficiency and high performance. We have designed an original manycore processor based on Armv8 instruction sets with the scalable vector extension, the A64FX processor, with Fujitsu, our industry partner. Fugaku consists of...

10.1109/mm.2021.3136882 article EN cc-by IEEE Micro 2021-12-21

Power/Performance/Area Evaluations for Next-Generation HPC Processors using the A64FX Chip

Eishi Arima Yuetsu Kodama Tetsuya Odajima Miwako Tsuji Mitsuhisa Sato

Future HPC systems, including post-exascale supercomputers, will face severe problems such as the slowing-down of Moore's law and the limitation of power supply. To achieve the desired system performance improvement while counteracting these issues, hardware design optimization is a key factor. In this paper, we investigate future directions of SIMD-based processor architectures by using the A64FX chip and a customized version of power/performance/area simulators, i.e., Gem5 and McPAT. More specifically, based on the A64FX chip,...

10.1109/coolchips52128.2021.9410320 article EN 2021-04-14

Multiple-SPMD Programming Environment Based on PGAS and Workflow toward Post-petascale Computing

Miwako Tsuji Mitsuhisa Sato Maxime Hugues Serge G. Petiton

In this paper, we propose a new development and execution environment based on workflow and PGAS methodologies for parallel programming in post-petascale systems. It is expected that post-petascale systems will have a huge and highly hierarchical architecture with nodes of many-core processors and accelerators. For current parallel programs, such as MPI, MPI/OpenMP hybrid, and so on, it would sometimes be difficult to exploit the systems efficiently. The proposed environment, called FP2C (Framework for Post-Petascale Computing), supports multi-program...

10.1109/icpp.2013.58 article EN 2013-10-01

102 PFLOPS lattice QCD quark solver on Fugaku

Ken-Ichi Ishikawa Issaku Kanamori Hideo Matsufuru Ikuo Miyoshi Yuta Mukai and 3 more

10.1016/j.cpc.2022.108510 article EN Computer Physics Communications 2022-09-05

152K-computer-node parallel scalable implicit solver for dynamic nonlinear earthquake simulation

Tsuyoshi Ichimura Kohei Fujita Kentaro Koyama Ryota Kusakabe Yuma Kikuchi and 10 more

We have used data learning and low-precision computation to develop an implicit solver that demonstrates high performance on up to 152,352 computer nodes (609,408 MPI processes × 12 OpenMP threads = 7,312,896 parallel computation) and conducted an unprecedented ultra-large-scale analysis of ultra-high-fidelity fault-structure systems using nonlinear dynamic finite element analysis on three-dimensional low-order unstructured elements. The developed solver achieved a 25.45-fold speedup from the state of the art on Fugaku and attained...

10.1145/3492805.3492814 article EN 2022-01-07

Linkage Identification by Fitness Difference Clustering

Miwako Tsuji Masaharu Munetomo Kiyoshi Akama

Genetic Algorithms perform crossovers effectively when linkage sets - sets of variables tightly linked to form building blocks - are identified. Several methods have been proposed to detect the linkage sets. Perturbation methods (PMs) investigate fitness differences by perturbations of gene values, and Estimation of distribution algorithms (EDAs) estimate the distribution of promising strings. In this paper, we propose a novel approach combining both of them, which detects dependencies of variables by estimating the distribution of strings clustered according to fitness differences. The...

10.1162/evco.2006.14.4.383 article EN Evolutionary Computation 2006-11-16

An Overview on Mixing MPI and OpenMP Dependent Tasking on A64FX

Romain Pereira Adrien Roussel Miwako Tsuji Patrick Carribault Mitsuhisa Sato and 2 more

The adoption of ARM processor architectures is on the rise in the HPC ecosystem. The Fugaku supercomputer is a homogeneous ARM-based machine, and one among the most powerful machines in the world. In the programming world, dependent task-based programming models are gaining traction due to their many advantages: dynamic load balancing, implicit expression of communication/computation overlap, early-bird communication posting,... MPI and OpenMP are two widespread standards that make task-based programming possible at the distributed memory level. Despite its...

10.1145/3636480.3637094 preprint EN 2024-01-08

A crossover for complex building blocks overlapping

Miwako Tsuji Masaharu Munetomo Kiyoshi Akama

We propose a crossover method to combine complexly overlapping building blocks (BBs). Although there have been several techniques to identify linkage sets of loci to form a BB [4, 6, 7, 10, 11], the way to realize effective crossover from such information has not been studied enough. Especially for problems with overlapping BBs, the crossover method proposed by Yu et al. [13] is the first and only known research; however, it cannot perform well for problems with complexly overlapping BBs due to insufficient variety of crossover sites. In this paper, we propose a crossover method which examines values of given parental strings minutely...

10.1145/1143997.1144204 article EN 2006-07-08

The Transient Dynamics of a Small Bubble Rising in a Low Morton Number Regime

Mitsuhiro Ohta Miwako Tsuji Yutaka Yoshida Mark Sussman

The effect of initial bubble conditions on the transient dynamics of a small (d = 1.5 mm) bubble rising in water is computationally considered by fully three-dimensional direct numerical simulation. The algorithm is based on the coupled level set/volume-of-fluid (CLSVOF) method for representing and updating the air-water interface, and the sharp interface approach is used to treat the interfacial boundary conditions. The transient dynamics are investigated using bubbles with five different kinds of shapes as initial conditions. It is shown that the states of shape, trajectory, terminal...

10.1002/ceat.200700507 article EN Chemical Engineering & Technology 2008-07-30

Biochemical Study of Microorganisms. XIV

Yukio Kameda Etsuko Toyoura Shoichi Ohshima Miwako Tsuji Chiyoko Iriye

In previous reports, the authors showed the results of experiments on growth inhibition tests of 12 kinds of acylamino acids against Staphylococci (Terashima strain); in the present experiment, the same tests were made with 17 acylamino acids. Results are as follows: (1) Growth inhibition increases in the following order. Lauryl-dl-alanine (1, 000-2, 000); lauryl-dl-α amino-n-butyric (2, lauryl-dl valine (4, lauryl-dl-phenylalanine (16, 000-32, 000). (2) Laurination of amino acids is more effective than caprinylation. Benzoylation is ineffective. (3)...

10.1248/yakushi1947.68.5-6_143 article EN YAKUGAKU ZASSHI 1948-01-01

High-fidelity nonlinear low-order unstructured implicit finite-element seismic simulation of important structures by accelerated element-by-element method

Kohei Fujita Kentaro Koyama Kazuo Minami Hikaru Inoue Seiya Nishizawa and 5 more

10.1016/j.jocs.2020.101277 article EN Journal of Computational Science 2020-12-10

Performance Evaluation of OpenMP and MPI Hybrid Programs on a Large Scale Multi-core Multi-socket Cluster, T2K Open Supercomputer

Miwako Tsuji Mitsuhisa Sato

Non-uniform memory access (NUMA) systems, where each processor has its own memory, have been a popular platform in high-end computing. While some early studies had reported that a flat-MPI programming model outperformed an OpenMP/MPI hybrid programming model on SMP clusters, the hybrid of shared-memory, thread-based programming and distributed-memory, message passing programming is considered to be a promising programming model on multi-core multi-socket NUMA clusters. We explore the performance of the OpenMP/MPI hybrid programming model on a large scale multi-core multi-socket cluster called T2K Open Supercomputer. Both benchmark (NPB,...

10.1109/icppw.2009.73 article EN 2009-09-01

Sensitivity Analysis for Reactor Period Induced by Positive Reactivity Using One-point Adjoint Kinetic Equation

Go Chiba Miwako Tsuji Tadashi Narabayashi

10.1016/j.nds.2014.04.091 article EN Nuclear Data Sheets 2014-04-01

A Hierarchical Domain Decomposition Boundary Element Method Applied to the Multiregion Problems of Neutron Diffusion Equations

Mohammad Dhandhang Purwadi Miwako Tsuji M. Narita Masafumi Itagaki

A technique is presented for solving neutron diffusion equations with the boundary element method (BEM) based on a hierarchical domain decomposition technique. In this method, the reactor domain is decomposed into homogeneous regions, and the boundary condition on the common boundary of the regions is initially assumed. The neutron diffusion equation is solved iteratively at two levels of the hierarchical structure: First, the BEM is applied to solve the equation for each region under the given assumed boundary conditions and an assumed multiplication factor. Then, these assumed values are modified to satisfy the continuity of the neutron flux and current. The proposed...

10.13182/nse98-a1966 article EN Nuclear Science and Engineering 1998-05-01

A Performance Projection of Mini-Applications onto Benchmarks Toward the Performance Projection of Real-Applications

Miwako Tsuji William Kramer Mitsuhisa Sato

Widely used benchmarks, such as High Performance Linpack (HPL), do not always provide direct insights into, and are notoriously poor indicators of, the actual application performance of systems, and there have been criticisms indicating that simplified benchmarks such as HPL no longer strongly correlate to real application performance. In contrast, evaluations based on real applications or mini-applications may give a direct estimation of performance. The Sustained System Performance (SSP) metric, which is used to evaluate systems at scale based on the performance of various applications, has...

10.1109/cluster.2017.123 article EN 2017-09-01

Distributed and Parallel Programming Paradigms on the K computer and a Cluster

Jérôme Gurhem Miwako Tsuji Serge G. Petiton Mitsuhisa Sato

In this paper, we focus on a distributed and parallel programming paradigm for massively multicore supercomputers. We introduce YML, a development and execution environment for parallel and distributed applications based on a graph of task components scheduled at runtime and optimized for several middlewares. Then we show why YML may be well adapted to applications running on a lot of cores. The tasks are developed with the PGAS language XMP, based on directives. We use YML/XMP to implement the block-wise Gaussian elimination to solve linear systems. We also implemented it with MPI without...

10.1145/3293320.3293330 preprint EN 2019-01-14
