NFDI4DS | UHH-SEMS - Publication Details

Tao Liu

ORCID: 0000-0002-9653-4108

About

Contact & Profiles

A5100735051

Research Areas

Parallel Computing and Optimization Techniques
Advanced Data Storage Technologies
Interconnection Networks and Systems
Distributed and Parallel Computing Systems
Cloud Computing and Resource Management
Matrix Theory and Algorithms
Stochastic Gradient Optimization Techniques
Embedded Systems Design Techniques
Embedded Systems and FPGA Design
Advanced Memory and Neural Computing
Network Packet Processing and Optimization
Software-Defined Networks and 5G
Nuclear reactor physics and engineering
Advanced Neural Network Applications
Complexity and Algorithms in Graphs
Advanced Sensor and Energy Harvesting Materials
Wireless Sensor Networks and IoT
Generative Adversarial Networks and Image Synthesis
Real-Time Systems Scheduling
Dielectric materials and actuators
Aluminum Alloy Microstructure Properties
Advanced machining processes and optimization
Powder Metallurgy Techniques and Materials
Radiation Therapy and Dosimetry
Advanced Welding Techniques Analysis

Qilu University of Technology
2019-2024

Shandong Academy of Sciences
2019-2024

Jilin University
2024

Florida International University
2019

Institute of Software
2013-2017

Beihang University
2013-2017

University of Science and Technology of China
2016

Southwest University of Science and Technology
2016

Harbin Institute of Technology
2009-2012

Shenyang University of Technology
2012

Shangri-La

OPENALEX - Publications

Michael K. Chen Xiaofeng Li Ruiqi Lian Jason H. Lin Lixia Liu and 2 more

Programming network processors is challenging. To sustain high line rates, network processors have extremely tight memory access and instruction budgets. Achieving desired performance has traditionally required hand-coded assembly. Researchers recently proposed high-level programming languages for packet processing, but the challenges of compiling these languages into code that is competitive with hand-tuned assembly remain unanswered. This paper describes the Shangri-La compiler, which accepts a program written in a C-like...

10.1145/1064978.1065038 article EN ACM SIGPLAN Notices 2005-06-12

Shangri-La

OPENALEX - Publications

Michael K. Chen Xiaofeng Li Ruiqi Lian Jason H. Lin Lixia Liu and 2 more

10.1145/1065010.1065038 article EN Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation 2005-06-12

SW-DDFT: Parallel Optimization of the Dynamical Density Functional Theory Algorithm Based on Sunway Bluelight II Supercomputer

OPENALEX - Publications

X. Lv Tao Liu Han Qin Ying Guo Jingshan Pan and 3 more

10.32604/cmc.2025.063852 article EN Computers, Materials & Continua 2025-01-01

Parallel optimization of Monte Carlo neutron transport method based on Sunway Bluelight II supercomputer

OPENALEX - Publications

Zhong-Liang Zhang Tao Liu Chengzhi Wang Ying Guo Jingshan Pan and 3 more

10.1007/s11227-025-07190-1 article EN The Journal of Supercomputing 2025-04-07

HSAS: Efficient task scheduling for large scale heterogeneous systolic array accelerator cluster

OPENALEX - Publications

Kaige Yan Yanshuang Song Tao Liu Jingweijia Tan Xiaohui Wei and 1 more

10.1016/j.future.2024.01.023 article EN Future Generation Computer Systems 2024-01-23

swSuperLU: A highly scalable sparse direct solver on Sunway manycore architecture

OPENALEX - Publications

Min Tian Junjie Wang Zanjun Zhang Wei Du Jingshan Pan and 1 more

10.1007/s11227-021-04270-w article EN The Journal of Supercomputing 2022-02-11

Parallel Implementation and Optimization of Regional Ocean Modeling System (ROMS) Based on Sunway SW26010 Many-Core Processor

OPENALEX - Publications

Tao Liu Yuan Zhuang Min Tian Jingshan Pan Yunhui Zeng and 2 more

Nowadays, ocean numerical models are gradually developing towards multi-physical processes and high resolution, with the increase in measured data and more in-depth research in the field. Therefore, general computing capability is no longer able to meet these models' needs. It is necessary to utilize powerful hardware and parallel software for the model programs. China has made great progress in the development of homegrown high-performance processors, and the Sunway SW26010 many-core processor is the most outstanding representative. This paper focuses lag...

10.1109/access.2019.2944922 article EN cc-by IEEE Access 2019-01-01

Pipeline-Based Parallel Framework for Mass File Processing

OPENALEX - Publications

Tao Liu Yi Liu Qingquan Wang Xiangrong Wang Fei Gao and 1 more

Currently, there are billions of files on the Internet, such as pictures, web pages, audio and video files, etc., and the number is still growing rapidly. This huge amount of files needs to be processed by some applications as quickly as possible with parallel processing. With the increasing number of cores in processors, parallel programming becomes more complex. The behavior that multiple processes/threads access files simultaneously may interfere with each other and cause extra performance loss. Consequently, this paper proposes a pipeline-based...

10.1109/csc.2013.15 article EN 2013-11-01

Research on Subsystem Hybrid Scheduling and Priority Inversion Based uCOS-II

OPENALEX - Publications

Xibo Wang Tao Liu

uC/OS-II is an open-code real-time kernel based on a preemptive priority scheduling strategy. It assigns a unique priority to each task and does not support scheduling tasks of the same priority. In practical applications, assigning different priorities to tasks that realize the same function is not a very good logical design. Moreover, it can only create a maximum of 64 tasks, which cannot meet the needs of increasingly complex applications. Aiming at these problems, in this paper the real-time kernel uC/OS-II is modified. The new kernel creatively gives an approach of layered hybrid...

10.1109/icinis.2012.69 article EN 2012-11-01

Parallel optimization of method of characteristics based on Sunway Bluelight II supercomputer

OPENALEX - Publications

Renjiang Chen Tao Liu Zhaoyuan Liu Li Wang Min Tian and 4 more

10.1007/s11227-023-05313-0 article EN The Journal of Supercomputing 2023-04-26

Parallel optimization of plasma single particle simulation program based on Sunway Bluelight II supercomputer

OPENALEX - Publications

Zihe Wang Tao Liu Ranran Tao Zenghui Ren Yuhui Li and 2 more

10.1145/3675018.3675774 article EN 2024-06-07

A Set of Resource Scheduling Methods Based on New Sunway Many-core Processor

OPENALEX - Publications

Yuhui Li Tao Liu Zenghui Ren Zihe Wang Y. Q. Guo and 1 more

10.1145/3675018.3675777 article EN 2024-06-07

Research on the Microwave Sharpening Process for Coarse-Grained Metal-Bonded Diamond Grinding Wheels and Their Wear Performance

OPENALEX - Publications

Jiaying Yan Shichun Li Zhi Bo Yang Tao Liu Chen Chen

10.2139/ssrn.4984534 preprint EN 2024-01-01

Leveraging Graph Analysis to Pinpoint Root Causes of Scalability Issues for Parallel Applications

OPENALEX - Publications

Yuyang Jin Haojie Wang Xiongchao Tang Zhenhua Guo Yaqian Zhao and 4 more

10.1109/tpds.2024.3485789 article EN IEEE Transactions on Parallel and Distributed Systems 2024-01-01

Insights into multi-scale structural evolution and dielectric response of poly(methyl acrylate) under pre-strain: A simulation study

OPENALEX - Publications

Han Qin Tao Liu Zhaoyuan Liu Meng Guo Ying Guo and 1 more

The structural evolution of dielectric elastomer induced by pre-strain is a complex, multi-scale process that poses a significant challenge to a deep understanding of the effect of pre-strain. Through simulation results, we identify the variation in dielectric constant and the multi-scale structural (electronic structure, molecular chain conformation, aggregation structure) response of poly(methyl acrylate). As pre-strain increases, the dielectric constant initially rises (below 200% pre-strain) and then declines (above 200% pre-strain). Analysis of charge distribution, surface electrostatic...

10.1063/5.0238343 article EN The Journal of Chemical Physics 2024-12-09

Ultrasonic vibration enhanced friction stir welding process of aluminum/steel dissimilar metals

OPENALEX - Publications

Wu Chenghao Tao Liu Song Gao Lei Shi Hongtao Liu

10.11868/j.issn.1001-4381.2021.000338 article EN DOAJ (Directory of Open Access Journals) 2022-01-01

Dynamic Cache Reservation to Maximize Efficiency in Shared Cache Multicores

OPENALEX - Publications

Qing Wang Zhenzhou Ji Tao Liu Zhu Suxia

Extracting performance from modern multicore architectures requires that parallel sections be divided into many threads of execution. In order to fully utilize these effectively, load balancing has become one of the most important factors that affect the performance of applications on multicores. In this paper, we have shown that threads that belong to a single, multithreaded application can exhibit poor performance. We propose a dynamic cache reservation scheme which redistributes the reserved cache space to the critical thread for speeding up during...

10.1109/imccc.2011.61 article EN 2011-10-01

Design and Implementation of Intelligent Window Control System Based on Multi-sensor Fusion

OPENALEX - Publications

Tao Liu Hongqian Lu Zihao Wei

This paper proposes a new intelligent window based on multi-sensor fusion. The window is controlled by an Arduino UNO development board. It has the functions of "Automatic Control", "Manual Control" and "Close". In automatic control mode, the window will be controlled by parameters such as humidity, temperature, light intensity, wind speed and air quality. The project is based on the Arduino MCU, PM2.5 detection, and temperature and humidity detection technology to design, mainly in "safety, intelligent, practical, market-oriented" four unity objective concept,...

10.1109/ddcls.2019.8908967 article EN 2022 IEEE 11th Data Driven Control and Learning Systems Conference (DDCLS) 2019-05-01

Thread Batching for High-performance Energy-efficient GPU Memory Design

OPENALEX - Publications

Bing Li Mengjie Mao Xiaoxiao Liu Tao Liu Zihao Liu and 3 more

Massive multi-threading in GPU imposes tremendous pressure on memory subsystems. Due to the rapid growth in thread-level parallelism of GPU and slowly improved peak memory bandwidth, memory becomes a bottleneck of GPU’s performance and energy efficiency. In this article, we propose an integrated architectural scheme to optimize the memory accesses and therefore boost the performance and energy efficiency of GPU. First, we propose a thread batch enabled memory partitioning (TEMP) to improve GPU memory access parallelism. In particular, TEMP groups multiple thread blocks that share the same set of pages into a thread batch and applies...

10.1145/3330152 article EN ACM Journal on Emerging Technologies in Computing Systems 2019-10-31

iBalancer: Load-Aware in-Server Flow Scheduling for Sub-Millisecond Tail Latency

OPENALEX - Publications

Qi Zhang Yi Liu Tao Liu

Achieving microsecond-scale tail latency poses an extreme challenge to the conventional architecture of “NIC-OS-Application” in the face of high concurrent requests. Existing kernel-bypass network systems improve this situation significantly. Still, they cannot achieve load-aware in-server requests distribution, which in turn not only harms resource efficiency but, more importantly, beats the goal of squeezing tail latency. This paper proposes iBalancer, a proactive load balancer for the system, which aggressively handles...

10.1109/tpds.2021.3120021 article EN IEEE Transactions on Parallel and Distributed Systems 2021-10-15

Heterogeneous multi-core optimization of MUMPS solver and its application

OPENALEX - Publications

Jingshan Pan Lei Xiao Min Tian Tao Liu Li Wang

With the development of electromagnetic simulation technology and the increasing demand for electromagnetic simulation, verification based on numerical simulation has received extensive attention from various research fields at home and abroad. Solving the linear sparse matrix equation generated in the simulation process is the biggest bottleneck restricting the running time of the program. Parallel computing, as an effective means to improve the calculation speed and processing capacity of computer systems, can further expand the scale of problem solving and shorten the solving time. Next,...

10.1145/3491396.3506501 article EN 2021-12-28

Parallelizing Back Propagation Neural Network on Speculative Multicores

OPENALEX - Publications

Yaobin Wang Hong An Zhiqin Liu Tao Liu Dongmei Zhao

Applications typically exhibit extremely different performance characteristics depending on the accelerator. Back propagation neural network (BPNN) has been parallelized into different platforms. However, it has not yet been explored on speculative multicore architecture thoroughly. This paper presents a study of parallelizing BPNN on a speculative multicore architecture, including its speculative execution model, hardware design and programming model. The implementation was analyzed with seven well-known benchmark data sets. Furthermore, trades off...

10.1109/icpads.2016.0121 article EN 2016-12-01
