- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Cloud Computing and Resource Management
- Advanced Image Processing Techniques
- Image Processing Techniques and Applications
- Advanced Vision and Imaging
- Algorithms and Data Compression
- Cryptography and Data Security
- Cryptography and Residue Arithmetic
- Software System Performance and Reliability
- Privacy-Preserving Technologies in Data
- Error Correcting Code Techniques
- Distributed and Parallel Computing Systems
- Coding theory and cryptography
- Internet Traffic Analysis and Secure E-voting
- Low-power high-performance VLSI design
- Machine Learning in Materials Science
- Genomics and Phylogenetic Studies
- Interconnection Networks and Systems
- Embedded Systems Design Techniques
- Network Security and Intrusion Detection
- Graph Theory and Algorithms
- DNA and Biological Computing
- Advanced Memory and Neural Computing
- Advanced Computing and Algorithms
National University of Defense Technology
2013-2024
Changsha University
2021
PLA Army Engineering University
2009
With the fast development of Internet of Things (IoT) technology, ordinary people and organizations can produce massive amounts of data every day. Due to a lack of data mining expertise and computation resources, most of them choose to use data mining services. Unfortunately, directly sending the query data to the cloud may violate their privacy. In this work, we mainly consider designing a scheme that enables the cloud to provide an efficient privacy-preserving decision tree evaluation service for resource-constrained clients in the IoT. To design such a scheme, new...
Deep learning has achieved outstanding results in various machine learning tasks against the background of a rapid increase in the computing capacity of equipment. However, while achieving higher performance and better effects, the model size is larger, the training and inference time is longer, the memory and storage occupancy is increasing, the computing efficiency is shrinking, and the energy consumption is rising. Consequently, it is difficult to run these models on edge devices such as micro and mobile devices. Model compression technology is gradually emerging...
Motif searching, i.e., identifying meaningful patterns from biological data, has been studied extensively due to its importance in the biomedical sciences. In this work, we seek to improve the performance of Weeder, a widely used tool for automatic de novo motif searching. Weeder consists of several functions, among which we find that the function oligo_scan, which handles the pattern matching, is the bottleneck, especially when dealing with large datasets. Motivated by this observation, we adopt the Micron Automata Processor (AP)...
With the development of Deep Learning (DL), Deep Neural Network (DNN) models have become more complex. At the same time, the Internet makes it easy to obtain large data sets for DL training. Large-scale model parameters and training data enhance the level of AI by improving the accuracy of DNN models. On the other hand, they also present severe challenges to the hardware platform, because a large model needs a lot of computing and memory resources that can easily exceed the capacity of a single processor. In this context, integrating processors hierarchical...
As computer networks continue to grow in size and complexity, fault management in today's high-speed telecommunications networks is becoming ever more difficult. As a kernel aspect of network fault management, fault diagnosis by performance data is the process of deducing the exact source of a failure from a set of symptoms. In this paper, some existing approaches for fault diagnosis are first discussed. The paper concentrates on analyzing alarm propagation in wide area networks, and a fault diagnosis model is then proposed. The model is composed of two parts: Self-organizing maps training historical...
As a popular research field of computer vision, super resolution (SR) has received more and more attention in recent years. Although deep learning methods have achieved good results in SR, there are still some problems. For example, previous models are often based on a single depth mechanism. This means that the SR reconstruction problem of all images is regarded as having equal complexity. We found that some details are suitable for recovery with complex models, while other details with less texture information are suitable for simple models. At the same time,...
Recently, GPUs have come to be used across a broad range of domains. To support virtual memory, which is required by most applications at present, the address translation process is introduced on the GPU side. However, many applications demonstrate that an irregular memory access pattern, in which accesses are poorly structured and often data dependent, makes performance worse, especially with virtual-to-physical address translations. The modern GPU memory management unit (MMU) employs caching, e.g. the page walk buffer (PWB) and the page walk cache (PWC), and scheduling...
GPUs are of increasing interest in the multi-core era due to their high computing power. However, the power consumption caused by the rising performance of GPUs has been a general concern. As a consequence, it is becoming imperative to optimize GPU power consumption, for which power consumption estimation is one of the important and useful solutions. In this work, we give a survey of power modeling for GPUs. We first introduce the current development of heterogeneous architectures and then summarize the existing techniques for modeling GPU power consumption. The main two types...
The GPU enables shared virtual memory (SVM) to eliminate complex data transfer in programming for programmers. However, SVM brings the expensive overhead of address translation due to the large number of requests that are generated simultaneously by the GPU. The Memory Management Unit (MMU) is designed to handle address translation. The page walk buffer (PWB) and the page walk cache (PWC) hold a lot of redundant information and give limited improvement in performance. To address these problems, we propose a unified PWB and PWC. The unified PWB and PWC abandon the traditional linear table...
Stencil code is widely used in the field of scientific computing. Currently, researchers are focusing on performance optimization for stencil applications by data-level parallelism or thread-level parallelism. Using vector/SIMD instructions, which are commonly used to achieve data-level parallelism, could effectively improve the performance of computation with a large number of repetitive operations, but the improvement is usually limited by memory access bandwidth and by data and control dependencies. The Scalable Vector Extension (SVE), Vector-Length...