- Data Management and Algorithms
- Advanced Database Systems and Queries
- Advanced Data Storage Technologies
- Cryptography and Data Security
- Cloud Computing and Resource Management
- Distributed Systems and Fault Tolerance
- Privacy-Preserving Technologies in Data
- Vehicle Dynamics and Control Systems
- Power Systems and Technologies
- Power Systems and Renewable Energy
- Complexity and Algorithms in Graphs
- Stochastic Gradient Optimization Techniques
- Smart Grid and Power Systems
- Electric and Hybrid Vehicle Technologies
- Internet Traffic Analysis and Secure E-voting
- Sparse and Compressive Sensing Techniques
- Advanced Manufacturing and Logistics Optimization
- Power Systems Fault Detection
- Energy Load and Power Forecasting
- Bayesian Methods and Mixture Models
- Microgrid Control and Optimization
- Distributed and Parallel Computing Systems
- Anomaly Detection Techniques and Applications
- Advanced Image and Video Retrieval Techniques
- High-Voltage Power Transmission Systems
Pennsylvania State University
2021-2024
Shanghai Jiao Tong University
2008-2022
University of North Texas
2021-2022
University of Utah
2016-2020
Microsoft (United States)
2019
Henan Polytechnic University
2013
Hunan First Normal University
2007
Large spatial data has become ubiquitous. As a result, it is critical to provide fast, scalable, and high-throughput spatial queries and analytics for numerous applications in location-based services (LBS). Traditional spatial database systems are disk-based and optimized for I/O efficiency. Increasingly, however, data is stored and processed in memory to achieve low latency, and CPU time becomes the new bottleneck. We present the Simba (Spatial In-Memory Big Data Analytics) system, which offers scalable and efficient in-memory spatial query processing and analytics for big spatial data. Simba is based on Spark...
Mobile and sensing devices have already become ubiquitous. They have made tracking moving objects an easy task. As a result, mobile applications like Uber and many IoT projects have generated massive amounts of trajectory data that can no longer be processed efficiently by a single machine. Among the typical query operations over trajectories, similarity search is a common yet expensive operator in querying trajectory data. It is useful for applications in different domains such as traffic and transportation optimizations, weather forecast...
Serverless computing has gained attention due to its fine-grained provisioning, large-scale multi-tenancy, and on-demand scaling. However, it also forces applications to externalize state in remote storage, adding substantial overheads. To fix this "data shipping problem" we built Shredder, a low-latency multi-tenant cloud store that allows small units of computation to be performed directly within storage nodes. Storage tenants provide Shredder with JavaScript functions (or WebAssembly programs),...
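Jiaodong is the only known giant late Mesozoic gold province in the world hosted in a Precambrian metamorphic terrane, and its metallogenic system is distinctive in the following respects: (1) it lies in an intracontinental composite tectonic domain that experienced multiple major tectono-thermal events, and large-scale gold mineralization was controlled by the change in subduction direction of the Paleo-Pacific plate at 120±2 Ma and the resulting asthenospheric upwelling, lithospheric modification, alternating extensional and compressional deformation, and the transpression-to-transtension transition of the ore-controlling faults; (2) multiple ore-controlling structures and diverse ore-hosting formations jointly controlled the development of gold deposits of different sizes and types, forming six NE-trending gold belts (Sanshandao, Jiaojia, Zhaoping, Qixia, Guoji, and Muru) and the EW-trending Sanshandao-Qixia gold-rich corridor, and producing diversity in gold mineralization types (Jiaojia-type/fracture-zone altered-rock type, Linglong-type/quartz-vein type, Pengjiakuang-type/altered conglomerate ± breccia type, Liaoshang-type/pyrite-carbonate vein type) and in their geological and geochemical characteristics; (3) the main ore elements Au, Ag, Cu, Pb, and Zn all meet the requirements for industrial use, and several associated critical metals are anomalously enriched; (4) the Pb isotopic compositions of sulfides in the different gold belts correlate linearly with proven gold resources and with distance to the Tan-Lu fault zone, indicating that the closer a belt lies to the main conduit of mantle-derived fluids, the higher the radiogenic Pb content and the proportion of mantle-derived components in the metal sulfides, and the greater the intensity of gold mineralization; (5) the regionally rather uniform Δ<sup>199</sup>Hg (mean ~0.012‰) and Δ<sup>199<...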
Many companies choose the cloud as their data and IT infrastructure platform. The remote access of the data brings the issue of trust. Despite the use of strong encryption schemes, adversaries can still learn valuable information regarding encrypted data by observing the data access patterns. To that end, one can hide the access patterns, which may leak sensitive information, using Oblivious RAMs (ORAMs). Numerous works have proposed different ORAM constructions, but they have never been thoroughly compared against each other and tested on large databases. There are...
As location-based services (LBSs) become popular, location-dependent queries have raised serious privacy concerns since they may disclose sensitive information in query processing. Among the typical queries supported by LBSs, shortest path queries may reveal information about not only the current locations of the clients, but also their potential destinations and travel plans. Unfortunately, existing methods for private shortest path computation suffer from issues of weak privacy property, low performance, or poor scalability. In this paper, we aim at a...
Many individuals and companies choose the public cloud as their data and IT infrastructure platform. But remote accesses over the data inevitably bring the issue of trust. Despite strong encryption schemes, adversaries can still learn sensitive information from encrypted data by observing data access patterns. Oblivious RAMs (ORAMs) are proposed to protect against access pattern attacks. However, directly deploying ORAM constructions in an encrypted database brings large computational overhead.
We present the Simba (<u>S</u>patial <u>I</u>n-<u>M</u>emory <u>B</u>ig data <u>A</u>nalytics) system, which offers scalable and efficient in-memory spatial query processing and analytics for big spatial data. Simba natively extends the Spark SQL engine to support rich spatial queries and analytics through both SQL and the DataFrame API. It enables the construction of indexes over RDDs inside the engine in order to work with big spatial data and complex spatial operations. Simba also comes with an effective query optimizer, which leverages its indexes and novel spatial-aware optimizations, to achieve both low latency and high throughput...
Increasingly, individuals and companies adopt a cloud service provider as a primary data and IT infrastructure platform. The remote access of the data inevitably brings the issue of trust. Data encryption is necessary to keep sensitive information secure and private on the cloud. Yet adversaries can still learn valuable information regarding encrypted data by observing data access patterns. To solve such a problem, Oblivious RAMs (ORAMs) are proposed to completely hide access patterns. However, most ORAM constructions are expensive and not suitable to deploy in a database for...
The last decade has witnessed a huge increase in data being ingested into the cloud, in forms such as JSON, CSV, and binary formats. Traditionally, data is either kept in storage in raw form, indexed ad-hoc using range indices, or cooked into analytics-friendly columnar formats. None of these solutions is able to handle modern requirements on storage: making the data available immediately for streaming queries while ingesting at extremely high throughputs. This paper builds on recent advances in parsing and indexing techniques to propose FishStore,...
In cloud computing, remote accesses over the cloud data inevitably bring the issue of trust. Despite strong encryption schemes, adversaries can still learn sensitive information from encrypted data by observing data access patterns. Oblivious RAMs (ORAMs) are proposed to protect against access pattern attacks. However, directly deploying ORAM constructions in an encrypted database brings large computational overhead. In this work, we focus on oblivious joins over a cloud database. Existing studies in the literature are restricted to either primary-foreign...
When an energy storage system participates in power grid frequency regulation (FR), it can give full play to its advantages of fast return speed and high adjustment precision. Based on the optimal response of the energy storage station to the FR scheduling instruction, and using the K-means clustering method, the comprehensive FR performance index (adjustment speed, response time, and adjustment precision) is analyzed. The different power flow states of the energy storage unit are summarized. Their impact on the performance indicators is examined, and a battery cell control strategy is explored to achieve a win-win outcome for the grid and the energy storage. By...
Efficient transaction processing over large databases is a key requirement for many mission-critical applications. Although modern databases have achieved good performance through horizontal partitioning, their performance deteriorates when cross-partition distributed transactions have to be executed. This article presents SolarDB, a relational database system that has been successfully tested at a commercial bank. The features of SolarDB include (1) a shared-everything architecture based on a two-layer log-structured...
Thanks to the wide adoption of GPS-equipped devices, the volume of collected spatial data is exploding. To achieve interactive exploration and analysis over big spatial data, people are willing to trade off accuracy for performance through approximation. As a foundation in many approximate algorithms, data sampling now requires more flexibility and better performance. In this paper, we study the spatial independent range sampling (SIRS) problem, aiming at retrieving random samples with independence over points residing in a query region...
The total transfer capability (TTC) of a flowgate is an important concern for the operator during power system operation. To provide fast and accurate TTC evaluation, this paper presents a TTC evaluation method using stacking ensemble learning. Firstly, repeated power flow is applied to calculate the TTC value under different scenarios. Then, steady-state variables, including active and reactive power, voltage angles of generators, and loads on transmission lines, are used as input features. Finally, based on XGBoost, RF,...
To meet the requirements of ultra-high-speed protection on UHV transmission lines, this paper proposes a fault phase selection algorithm based on wavelet analysis theory, which can detect the time and size of mutations at singular points in a signal. Firstly, the algorithm transforms the transient current into modal components to eliminate the impact of coupling among the phases. Then the modulus maximum is extracted from each modal signal using the discrete wavelet transform (DWT) and, after normalization, treated as a criterion factor. Finally, the relationship among these factors is summarized...
With the rapid growth of the penetration rate of renewable energy, the construction of an integrated electricity-natural gas system (IEGS) has important economic and environmental significance. Power-to-gas (P2G) technology, as a new energy conversion and storage method, provides an important way for wind power consumption. Due to the limited local consumption of hydrogen and the difficulty in exporting it, this paper proposes an IEGS optimal scheduling model that considers hydrogen mixing in the natural gas pipeline and aims at the lowest operating cost. After...
There has been an increasing demand for real-time data analytics. Approximate Query Processing (AQP) is a popular option for that because it can use random sampling to trade some accuracy for lower query latency. However, the state-of-the-art AQP system either relies on scan-based sampling algorithms to draw samples, which still incur a non-trivial cost of table scan, or creates samples of the database in a preprocessing step, which are hard to update. The alternative is to use aggregate B-tree indexes to support both random sampling and updates with...
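The accuracy-for-latency trade that AQP makes with random sampling can be illustrated with a minimal sketch. It assumes a plain in-memory list standing in for a table column and estimates a SUM by scaling the sum of a uniform sample; the function name `approximate_sum` is hypothetical, and the sketch does not use the aggregate B-tree indexes mentioned in the abstract.

```python
import random
from typing import Sequence


def approximate_sum(values: Sequence[float], sample_size: int, rng: random.Random) -> float:
    """Estimate sum(values) from a uniform random sample drawn without replacement.

    Illustrative only: `values` is an in-memory stand-in for a table column.
    """
    if not 0 < sample_size <= len(values):
        raise ValueError("sample_size must be between 1 and len(values)")
    sample = rng.sample(values, sample_size)
    # Scale the sample sum by N / n to obtain an unbiased estimate of the total.
    return sum(sample) * len(values) / sample_size


if __name__ == "__main__":
    column = list(range(1, 10001))
    estimate = approximate_sum(column, 1000, random.Random(7))
    print(f"exact={sum(column)} estimate={estimate:.0f}")
```

A larger `sample_size` tightens the estimate at the cost of reading more rows; when the sample covers the whole column the estimate equals the exact sum.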
We define several new models for how to define anomalous regions among enormous sets of trajectories. These are based on spatial scan statistics, and identify a geometric region which captures a subset of trajectories which are significantly different in a measured characteristic from the background population. The model definition depends on how much a geometric region is contributed to by some overlapping trajectory. This contribution can be the full trajectory, proportional to the length within the spatial region, or dependent on the flux across the boundary of that spatial region...
This paper investigates the stochastic distributed nonconvex optimization problem of minimizing a global cost function formed by the summation of $n$ local cost functions. We solve such a problem by involving zeroth-order (ZO) information exchange. In this paper, we propose a ZO distributed primal–dual coordinate method (ZODIAC) to solve the stochastic optimization problem. Agents approximate their own local stochastic ZO oracle along with coordinates under an adaptive smoothing parameter. We show that the proposed algorithm achieves the convergence rate of $\mathcal{O}(\sqrt{p}/\sqrt{T})$ for...
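To make the idea of a coordinate-wise zeroth-order oracle concrete, the sketch below approximates a gradient from function values only, using a central difference along each coordinate. It is a deterministic, single-agent illustration with a fixed smoothing parameter `delta`; the names `zo_coordinate_gradient` and `quadratic` are hypothetical, and this is not the ZODIAC algorithm itself, which is distributed, stochastic, and uses an adaptive smoothing parameter.

```python
from typing import Callable, List, Sequence


def zo_coordinate_gradient(
    f: Callable[[Sequence[float]], float],
    x: Sequence[float],
    delta: float,
) -> List[float]:
    """Approximate the gradient of f at x using only function evaluations.

    Each coordinate i uses the central difference
    (f(x + delta * e_i) - f(x - delta * e_i)) / (2 * delta).
    """
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += delta   # perturb coordinate i upward
        backward[i] -= delta  # perturb coordinate i downward
        grad.append((f(forward) - f(backward)) / (2.0 * delta))
    return grad


def quadratic(x: Sequence[float]) -> float:
    """Toy cost function (an assumption for illustration): sum of squares."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(zo_coordinate_gradient(quadratic, [1.0, -2.0, 0.5], 1e-3))
```

For a cost with $p$ coordinates this estimator needs $2p$ function evaluations per gradient, which is one reason the dimension $p$ shows up in convergence rates for zeroth-order methods.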