- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Interconnection Networks and Systems
- Distributed and Parallel Computing Systems
- Cloud Computing and Resource Management
- Matrix Theory and Algorithms
- Stochastic Gradient Optimization Techniques
- Embedded Systems Design Techniques
- Embedded Systems and FPGA Design
- Advanced Memory and Neural Computing
- Network Packet Processing and Optimization
- Software-Defined Networks and 5G
- Nuclear reactor physics and engineering
- Advanced Neural Network Applications
- Complexity and Algorithms in Graphs
- Advanced Sensor and Energy Harvesting Materials
- Wireless Sensor Networks and IoT
- Generative Adversarial Networks and Image Synthesis
- Real-Time Systems Scheduling
- Dielectric materials and actuators
- Aluminum Alloy Microstructure Properties
- Advanced machining processes and optimization
- Powder Metallurgy Techniques and Materials
- Radiation Therapy and Dosimetry
- Advanced Welding Techniques Analysis
Qilu University of Technology
2019-2024
Shandong Academy of Sciences
2019-2024
Jilin University
2024
Florida International University
2019
Institute of Software
2013-2017
Beihang University
2013-2017
University of Science and Technology of China
2016
Southwest University of Science and Technology
2016
Harbin Institute of Technology
2009-2012
Shenyang University of Technology
2012
Programming network processors is challenging. To sustain high line rates, network processors have extremely tight memory access and instruction budgets. Achieving desired performance has traditionally required hand-coded assembly. Researchers have recently proposed high-level programming languages for packet processing, but the challenges of compiling these languages into code that is competitive with hand-tuned assembly remain unanswered. This paper describes the Shangri-La compiler, which accepts a program written in a C-like...
Ocean numerical models are steadily developing towards multi-physical processes and high resolution, driven by the growth of measured data and more in-depth research in the field. General computing capability is therefore no longer able to meet these models' needs. It is necessary to utilize powerful hardware and parallel software for the model programs. China has made great progress in developing homegrown high-performance processors, of which the Sunway SW26010 many-core processor is the most outstanding representative. This paper focuses on lag...
Currently, there are billions of files on the Internet, such as pictures, web pages, audio and video files, and the number is still growing rapidly. This huge amount of files needs to be processed by some applications as quickly as possible with parallel processing. With the increasing number of cores in processors, parallel programming becomes more complex. Multiple processes/threads that access files simultaneously may interfere with each other and cause extra performance loss. Consequently, this paper proposes a pipeline-based...
uC/OS-II is an open-source real-time kernel based on a preemptive priority scheduling strategy. It assigns a unique priority to each task and does not support scheduling tasks of the same priority. In practical applications, assigning different priorities to tasks which realize the same function is not a very good logical design. Moreover, it can only create a maximum of 64 tasks, which cannot meet the needs of increasingly complex applications. Aiming at these problems, in this paper the real-time kernel uC/OS-II is modified. The new kernel creatively gives an approach of layered hybrid...
The structural evolution of dielectric elastomers induced by pre-strain is a complex, multi-scale process that poses a significant challenge to a deep understanding of the effect of pre-strain. Through simulation results, we identify the variation in dielectric constant and structural evolution (electronic structure, molecular chain conformation, aggregation structure) in response to pre-strain of poly(methyl acrylate). As pre-strain increases, the dielectric constant initially rises (below 200% pre-strain) and then declines (above 200% pre-strain). Analysis of charge distribution, surface electrostatic...
Extracting performance from modern multicore architectures requires that parallel sections be divided into many threads of execution. To use them fully and effectively, load balancing has become one of the most important factors that affect the performance of applications on multicores. In this paper, we show that threads belonging to a single multithreaded application can exhibit poor performance. We propose a dynamic cache reservation scheme which redistributes the reserved cache space to the critical thread for speeding up during...
This paper proposes a new intelligent window based on multi-sensor fusion. The window is controlled by an Arduino UNO development board. It has the functions of "Automatic Control", "Manual Control" and "Close". In automatic control mode, the window will be controlled by parameters such as humidity, temperature, light intensity, wind speed and air quality. The project is based on the Arduino MCU, PM2.5 detection, and temperature and humidity detection technology, and its design is guided mainly by the objective concept of unifying four aims, "safety, intelligent, practical, market-oriented",...
Massive multi-threading in GPU imposes tremendous pressure on memory subsystems. Due to the rapid growth in thread-level parallelism of GPU and the slowly improved peak memory bandwidth, memory becomes a bottleneck of GPU’s performance and energy efficiency. In this article, we propose an integrated architectural scheme to optimize the memory accesses and therefore boost the performance and energy efficiency of GPU. First, we propose a thread batch enabled memory partitioning (TEMP) to improve GPU memory access parallelism. In particular, TEMP groups multiple thread blocks that share the same set of pages into a thread batch and applies...
Achieving microsecond-scale tail latency poses an extreme challenge to the conventional architecture of “NIC-OS-Application” in the face of high concurrent requests. Existing kernel-bypass network systems improve this situation significantly. Still, they cannot achieve load-aware in-server request distribution, which in turn not only harms resource efficiency but, more importantly, defeats the goal of squeezing tail latency. This paper proposes iBalancer, a proactive load balancer for the kernel-bypass network system, which aggressively handles...
With the development of electromagnetic simulation technology and the increasing demand for simulation, verification based on numerical simulation has received extensive attention from various research fields at home and abroad. Solving the linear sparse matrix equation generated in the simulation process is the biggest bottleneck restricting the running time of the program. Parallel computing, as an effective means to improve the calculation speed and processing capacity of computer systems, can further expand the scale of problem solving and shorten the time. Next,...
Applications typically exhibit extremely different performance characteristics depending on the accelerator. The back propagation neural network (BPNN) has been parallelized on different platforms. However, it has not yet been explored thoroughly on a speculative multicore architecture. This paper presents a study of parallelizing BPNN on a speculative multicore architecture, including its execution model, hardware design and programming model. The implementation was analyzed with seven well-known benchmark data sets. Furthermore, it trades off...