Hiroyuki Takizawa

ORCID: 0000-0003-2858-3140
Research Areas
  • Parallel Computing and Optimization Techniques
  • Distributed and Parallel Computing Systems
  • Advanced Data Storage Technologies
  • Cloud Computing and Resource Management
  • Interconnection Networks and Systems
  • Embedded Systems Design Techniques
  • Distributed systems and fault tolerance
  • Peer-to-Peer Network Technologies
  • Computer Graphics and Visualization Techniques
  • Advanced Data Compression Techniques
  • Software System Performance and Reliability
  • Real-Time Systems Scheduling
  • Caching and Content Delivery
  • Advanced Vision and Imaging
  • Radiation Effects in Electronics
  • Scientific Computing and Data Management
  • Neural Networks and Applications
  • Lattice Boltzmann Simulation Studies
  • Software Engineering Research
  • Machine Learning and Data Classification
  • Low-power high-performance VLSI design
  • Algorithms and Data Compression
  • Quantum Computing Algorithms and Architecture
  • 3D IC and TSV technologies
  • Neural Networks and Reservoir Computing

Tohoku University
2016-2025

Tohoku University Hospital
2016-2025

NEC (Japan)
2016-2024

Japan Science and Technology Agency
2010-2015

Centre for Research in Engineering Surface Technology
2012-2014

University of Electro-Communications
2013

University of Amsterdam
2013

Karlsruhe Institute of Technology
2013

Carnegie Mellon University
2013

Hanyang University
2013

In this paper, a tool named CheCUDA is designed to checkpoint CUDA applications that use GPUs as accelerators. As existing checkpoint/restart implementations do not support checkpointing the GPU status, CheCUDA hooks a part of basic CUDA driver API calls in order to record the GPU status changes on the main memory. At checkpointing, CheCUDA stores the status changes in a file after copying all necessary data in the video memory to the main memory and then disabling the CUDA runtime. At restarting, CheCUDA reads the file, re-initializes the CUDA runtime, and recovers the resources on GPUs so as to restart from the stored status. This...

10.1109/pdcat.2009.78 article EN 2009-12-01

In this paper, we propose a new transparent checkpoint/restart (CPR) tool, named CheCL, for high-performance and dependable GPU computing. CheCL can perform CPR on an OpenCL application program without any modification and recompilation of its code. A conventional checkpointing system fails to checkpoint a process if the process uses OpenCL. Therefore, in CheCL, every API call is forwarded to another process called an API proxy, and the API proxy invokes the API function; two processes, an application process and an API proxy, are launched for an OpenCL application. In this case, as the application process is not an OpenCL process but a standard process,...

10.1109/ipdps.2011.85 article EN 2011-05-01

Today, CUDA is the de facto standard programming framework to exploit the computational power of graphics processing units (GPUs) to accelerate various kinds of applications. For efficient use of a large GPU-accelerated system, one important mechanism is checkpoint-restart, which can be used not only to improve fault tolerance but also to optimize node/slot allocation by suspending a job on one node and migrating the job to another node. Although several checkpoint-restart implementations have been developed so far, they do not support CUDA applications or...

10.1109/ipdps.2011.131 article EN 2011-05-01

Achieving a high sustained simulation performance is the most important concern in the HPC community. To this end, many kinds of HPC system architectures have been proposed, and the diversity of HPC systems grows rapidly. Under this circumstance, the vector-parallel supercomputer SX-ACE has been designed to achieve a high sustained performance of memory-intensive applications by providing a high memory bandwidth commensurate with its high computational capability. This paper examines the potential of the modern vector-parallel supercomputer through the performance evaluation of SX-ACE using practical engineering and scientific...

10.1007/s11227-017-1993-y article EN cc-by The Journal of Supercomputing 2017-03-07

This study aimed to evaluate the impact of climate change on flood damage and the effects of mitigation measures and combinations of multiple adaptation measures in reducing flood damage. The inundation depth was calculated using a two-dimensional unsteady flow model. The flood damage cost was estimated from the unit evaluation value set for each land use and prefecture and the inundation depth distribution. To estimate the flood damage in the near future and the late twenty-first century, five global climate models were used. These models provided daily precipitation, and the change in extreme precipitation was calculated. In...

10.1007/s10584-021-03081-5 article EN cc-by Climatic Change 2021-04-01

The growing amount of data and advances in data science have created a need for a new kind of cloud platform that provides users with flexibility, strong security, and the ability to couple with supercomputers and edge devices through high-performance networks. We have built such a nation-wide cloud platform, called "mdx", to meet this need. The mdx platform's virtualization service, jointly operated by 9 national universities and 2 national research institutes in Japan, launched in 2021, and more features are in development. Currently mdx is used by researchers in a wide...

10.1109/dasc/picom/cbdcom/cy55231.2022.9927975 article EN 2021 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech) 2022-09-12

This paper proposes an extensible programming framework to separate platform-specific optimizations from application codes. The framework allows programmers to define their own code translation rules for special demands of individual systems, compilers, libraries, and applications. Code translation rules associated with user-defined compiler directives are defined in an external file, and the application code is just annotated by the directives. For code transformations based on the rules, the framework exposes the abstract syntax tree (AST) of an application code as an XML document to expert...

10.1109/hipc.2014.7116902 article EN 2014-12-01

As the first experiment at BL10U in NanoTerasu, tender X-ray ptychographic coherent diffraction imaging (PCDI) was conducted using a photon energy of 3.5 keV. The diffraction patterns from a 200 nm thick Ta test chart and a micrometer-sized particle of sulfurized polymer were collected. Subsequently, phase images were reconstructed with resolutions of sub-20 nm and sub-50 nm, respectively. In the near future, PCDI with sub-10 nm resolution is anticipated to potentially revolutionize the visualization of nanoscale structures and chemical...

10.35848/1882-0786/ad4846 article EN cc-by Applied Physics Express 2024-05-01

Objective: Computational uncertainty and variability of power absorption and temperature rise in humans for radiofrequency (RF) exposure is a critical factor in ensuring human protection. This aspect has been emphasized as a priority. However, accurately modeling head tissue composition and assigning tissue dielectric and thermal properties remains a challenging task. This study investigated the impact of segmentation-based versus segmentation-free models for assessing localized RF exposure.
Approach: Two...

10.1088/1361-6560/adb935 article EN Physics in Medicine and Biology 2025-02-21

This paper describes a new-generation vector parallel supercomputer, the NEC SX-9 system. The SX-9 processor has an outstanding core to achieve over 100 Gflop/s, and a software-controllable on-chip cache to keep the high ratio of the memory bandwidth to the floating-point operation rate. Moreover, its large SMP nodes of 16 vector processors with 1.6 Tflop/s performance and 1 TB memory are connected with dedicated network switches, which can achieve inter-node communication at 128 GB/s per direction. The sustained performance of the SX-9 processor is evaluated using six practical...

10.1145/1654059.1654088 article EN 2009-11-14

A commodity personal computer (PC) can be seen as a hybrid computing system equipped with two different kinds of processors, i.e., a CPU and a graphics processing unit (GPU). Since the superiorities of GPUs in performance and power efficiency strongly depend on the system configuration and the data size determined at runtime, a programmer cannot always know which processor should be used to execute a certain kernel. Therefore, this paper presents a runtime environment that dynamically selects an appropriate processor so as to improve the energy...

10.1109/clustr.2008.4663799 article EN 2008-09-01

Recently, the high-performance computing world has moved to more heterogeneous architectures. Thus, it has become a standard practice to offload a part of application execution to dedicated accelerators. However, the disadvantage in productivity is still a problem in programming for accelerators. This paper proposes neoSYCL: a SYCL implementation for SX-Aurora TSUBASA, aiming to improve productivity and achieve comparable performance with native implementations. Unlike other implementations, neoSYCL can identify and separate the kernel code at the source...

10.1145/3432261.3432268 article EN 2021-01-14

NEC SX-series vector supercomputers have provided outstanding memory bandwidths to meet the strong demands for efficient execution of memory-intensive scientific applications in practice. Inheriting this advantage, the 2nd generation of SX-Aurora TSUBASA, Type 20B, provides an extremely high memory bandwidth of 1.53 TB/s per processor. Unlike conventional vector systems, SX-Aurora TSUBASA also offers various execution modes to execute a diversity of emerging workloads efficiently. As a result, application developers need to understand their and...

10.1109/pmbs51919.2020.00010 article EN 2020-11-01

The number of patients with heat illness transported by ambulance has been gradually increasing due to global warming. In intense heat waves, it is crucial to accurately estimate the number of cases with heat illness for the management of medical resources. Ambient temperature is an essential factor with respect to the number of patients with heat illness, although thermophysiological response is a more relevant factor with respect to causing symptoms. In this study, we computed the daily maximum core temperature increase and the daily total amount of sweating in a test subject using a large-scale, integrated computational method...

10.3389/fpubh.2023.1061135 article EN cc-by Frontiers in Public Health 2023-02-17

This paper proposes a distributed and cooperative scheduling mechanism for dynamic load-balancing on a large-scale computing environment. In the proposed mechanism, scheduling processes are performed by independent schedulers of individual resources. Decentralized scheduling mechanisms are more suitable for scheduling of a large-scale environment than centralized mechanisms in terms of scalability and fault tolerance. Experimental results show that the proposed mechanism has high efficiency, without any excessive concentration of processing even if the number of resources increases.

10.1109/saint-w.2006.2 article EN 2006-02-15

In August 2023, we released the latest version of our ABINIT-MP program, Open Version 2 Revision 8. In this version, the most commonly used FMO-MP2 calculations are even faster than in the previous Revision 4. It is now also possible to calculate excitation and ionization energies for regions of interest. Improved interaction analysis is also available. In addition, we have started GPU-oriented modifications. In this preliminary report, we present the current status of ABINIT-MP.

10.2477/jccj.2024-0001 article EN Journal of Computer Chemistry Japan 2024-01-01

Peer-to-peer (P2P) systems are extremely vulnerable to Sybil attacks, in which a malicious user controls a large number of Sybil peers that collude to break the system laws. This paper proposes a distributed algorithm, named Sybil Resisting Network Clustering (SRNC), to resist the Sybil attack by preventing honest peers from communicating with Sybil peers. SRNC is based on a social network model. In this model, honest peers and Sybil peers can be largely classified into two clusters, connected by a small number of edges, called attack edges. SRNC tries to explicitly detect attack edges and then prohibits...

10.1109/saint.2010.32 article EN 2010-07-01

Recently, chip multiprocessors (CMPs) that can simultaneously execute multiple workloads using multiple cores have become a key to achieving high-performance processing. To improve CMP performance, various shared resource management mechanisms have been proposed. In particular, cache partitioning is significantly effective in avoiding conflicts at the shared cache memory. As most cache partitioning methods need to predict the changes in cache access characteristics of each workload when the partition moves, it is important for cache partitioning to establish an accurate prediction model.

10.1145/1509084.1509086 article EN 2008-10-26

HPC scientific codes are less readable and manageable because of complex hand optimization, which is often platform-dependent. We are developing a toolset that hopefully mitigates the maintainability problem by user-defined, easy-to-use code transformation: the code is written in a simpler form, and the coding technique for high performance is introduced by transformations. In this paper, we present xevtgen, a transformation generator of our toolset. Transformation rules are defined using dummy Fortran codes with some directives, and we expect...

10.1109/candar.2015.63 article EN 2015-12-01