- Advanced Data Storage Technologies
- Distributed and Parallel Computing Systems
- Scientific Computing and Data Management
- Parallel Computing and Optimization Techniques
- Cryptography and Data Security
- Data Management and Algorithms
- Interconnection Networks and Systems
- Advanced Database Systems and Queries
- Privacy-Preserving Technologies in Data
- Cloud Computing and Resource Management
- Cloud Data Security Solutions
- Embedded Systems Design Techniques
- Satellite Communication Systems
- Telecommunications and Broadcasting Technologies
- Software System Performance and Reliability
- Advanced Graph Neural Networks
- Advanced Wireless Communication Techniques
- Energy Efficient Wireless Sensor Networks
- Computational Drug Discovery Methods
- Bioinformatics and Genomic Networks
- Algorithms and Data Compression
- Deception detection and forensic psychology
- Cryptography and Residue Arithmetic
- Fluoride Effects and Removal
- Internet Traffic Analysis and Secure E-voting
Jiangxi University of Finance and Economics
2024
Peng Cheng Laboratory
2024
People's Bank of China
2024
Sichuan University
2013-2024
Minzu University of China
2023
Wuhan Business University
2023
Shanxi University of Finance and Economics
2019-2023
East China University of Science and Technology
2023
Huazhong Agricultural University
2007-2021
Ocean University of China
2021
Can artificial intelligence (AI) assist human employees in increasing employee creativity? Drawing on research on AI–human collaboration, job design, and employee creativity, we examine AI assistance in the form of a sequential division of labor within organizations: in a task, AI handles the initial portion, which is well-codified and repetitive, and employees focus on the subsequent portion, involving higher-level problem-solving. First, we provide causal evidence from a field experiment conducted at a telemarketing company. We find that AI assistance in generating sales...
Since IO performance on HPC machines strongly depends on machine characteristics and configuration, it is important to carefully tune IO libraries and make good use of appropriate library APIs. For instance, on current petascale machines, independent IO tends to outperform collective IO, in part due to bottlenecks at the metadata server. The problem is exacerbated by scaling issues, since each IO library scales differently on each machine, and typically, operates efficiently to different levels of scaling on different machines. With scientific codes being run on a...
Peta-scale scientific applications running on High End Computing (HEC) platforms can generate large volumes of data. For high performance storage and in order to be useful to science end users, such data must be organized in its layout, indexed, sorted, and otherwise manipulated for subsequent presentation, visualization, and detailed analysis. In addition, scientists desire to gain insights into selected characteristics 'hidden' or 'latent' in these massive datasets while data is being produced by simulations....
Significant challenges exist for achieving peak or even consistent levels of performance when using IO systems at scale. They stem from sharing IO system resources across the processes of single large-scale applications and/or multiple simultaneous programs causing internal and external interference, which in turn, causes substantial reductions in IO performance. This paper presents interference effects measurements for two different file systems at supercomputing sites. These measurements motivate developing a 'managed' approach...
Known challenges for petascale machines are that (1) the costs of I/O for high performance applications can be substantial, especially for output tasks like checkpointing, and (2) noise from I/O actions can inject undesirable delays into the runtimes of such codes on individual compute nodes. This paper introduces the flexible 'DataStager' framework for data staging and alternative services within that framework that jointly address (1) and (2). Data staging services moving output data from compute nodes to staging or I/O nodes prior to storage are used to reduce I/O overheads on applications' total processing times, and explicit...
Increasingly severe I/O bottlenecks on High-End Computing machines are prompting scientists to process simulation output data online while simulations are running and before storing data on disk. There are several options to place data analytics along the I/O path: on compute nodes, on separate nodes dedicated to analytics, or after data is stored on persistent storage. Since different placements have different impact on performance and cost, there is a consequent need for flexibility in the location of data analytics. The FlexIO middleware described in this paper...
Severe I/O bottlenecks on High End Computing platforms call for running data analytics in situ. Demonstrating that there exist considerable resources on compute nodes un-used by typical high end scientific simulations, we leverage this fact by creating an agile runtime, termed GoldRush, that can harvest those otherwise wasted, idle resources to efficiently run in situ data analytics. GoldRush uses fine-grained scheduling to "steal" idle resources, in ways that minimize interference between the simulation and in situ analytics. This involves recognizing...
With data explosion in scale and variety, OLAP databases play an increasingly important role in serving real-time analysis with low latency (e.g., hundreds of milliseconds), especially when incoming analytical queries are complex and ad hoc in nature. Moreover, these systems are expected to provide high query concurrency and write throughput, and support queries over structured and complex data types (e.g., JSON, vector and texts). In this paper, we introduce AnalyticDB, a real-time OLAP database system developed at Alibaba. AnalyticDB maintains all-column indexes...
The complexity of HPC systems has increased the burden on the developer as applications scale to hundreds of thousands of processing cores. Moreover, additional efforts are required to achieve acceptable I/O performance, where it is important how I/O is performed, which resources are used, and where functionality is deployed. Specifically, by scheduling data movement and by effectively placing operators affecting data volumes or information about the data, tremendous gains can be achieved both in the performance of simulation output and in the usability...
Despite the implementation of safety alignment strategies, large language models (LLMs) remain vulnerable to jailbreak attacks, which undermine these safety guardrails and pose significant security threats. Some defenses have been proposed to detect or mitigate jailbreaks, but they are unable to withstand the test of time due to an insufficient understanding of jailbreak mechanisms. In this work, we investigate the mechanisms behind jailbreaks based on the Linear Representation Hypothesis (LRH), which states that neural networks encode...
The remote visual exploration of live data generated by scientific simulations is useful for scientific discovery, performance monitoring, and online validation of the simulation results. Online visualization methods are challenged, however, by the continued growth in the volume of simulation output data that has to be transferred from its source, the simulation running on the high end machine, to where it is analyzed, visualized, and displayed. A specific challenge in this context is the limits in communication bandwidth between data source(s) and sinks. Previous work places queries...
The skyline query can help identify the “best” objects in a multi-attribute dataset. During the past decade, this query has received considerable attention in the database research community. Most research focused on computing the “skyline” of a dataset, or the set of “skyline objects” that are not dominated by any other object. Such algorithms are not appropriate in an online system, which should respond in real time to skyline query requests with arbitrary subsets of the attributes (also called subspaces). To guarantee real-time response, an online system should precompute...
Lack of I/O scalability is known to cause measurable slowdowns for large-scale scientific applications running on high end machines. This is prompting researchers to devise 'I/O staging' methods in which outputs are processed via online analysis and visualization methods to support desired science outcomes. Organized as online workflows and carried out in I/O pipelines, these analysis components run concurrently with science simulations, often using a smaller set of nodes on the high end machine termed 'staging areas'. This paper presents a new approach to dealing...
In order to understand the complex physics of mother nature, physicists often use many approximations to understand one area of physics and then write a simulation to reduce these equations to ones that can be solved on a computer. Different approximations lead to different models of the physics, which can lead to completely different code. As computers become more powerful, scientists can either write codes that model all of the physics or they can produce several codes, each for different portions of the physics, and then 'couple' them together. In this paper, we concentrate on the latter, where we look at our code coupling approach for modeling a full device fusion...
Increasingly severe I/O bottlenecks on High-End Computing machines are prompting scientists to process output data during simulation time, "in-situ", and before placing data on disks. This paper argues for flexibility in the implementation of such in-situ data analytics, using measurements and a performance model that demonstrate the potential advantages and limitations of performing analytics at different levels of the I/O hierarchy, including on a machine's compute nodes vs. on separate "staging" nodes dedicated to analysis tasks. Model...
Increasingly larger scale simulations are generating an unprecedented amount of output data, causing researchers to explore new 'data staging' methods that buffer, use, and/or reduce such data online rather than simply pushing it to disk. Leveraging the capabilities of data staging, this study explores the potential for data reduction via online data compression, first using general compression techniques and then proposing use-specific methods that permit users to define simple data queries that cause only the data identified by those queries to be emitted. Using...
Scientific Data Management has become essential to the productivity of scientists using ever larger machines and running applications that produce ever more data. There are several specific issues when running on petascale (and beyond) machines. One is the need for massively parallel data output, which in part, depends on the data formats and semantics being used. Here, the inhibition of parallelism by file system notions of strict and immediate consistency can be addressed with 'delayed consistency' methods. Such methods can also be used...
Data Stream Processing is an important class of data intensive applications in the "Big Data" era. Chip Multi-Processors (CMPs) are the standard hosting platforms in modern data centers. Gaining high performance for stream processing applications on CMPs is therefore of great interest. Since the performance of these applications largely depends on their effective use of the complex cache structure present on CMPs, this paper proposes the StreamMap approach for tuning streaming applications' use of cache. Our major idea is to map application threads to CPU cores to facilitate data sharing AND mitigate...
In-situ analysis on the output data of scientific simulations has been made necessary by ever-growing output data volumes and increasing costs of data movement as supercomputing is moving towards exascale. With hardware accelerators like GPUs becoming increasingly common in high end machines, new opportunities arise to co-locate scientific simulations and online analysis performed on the data generated by the simulations. However, the asynchronous nature of GPGPU programming models and the limited context-switching capabilities of the GPU pose challenges to co-locating the simulation and analysis on the same GPU....