Hao Jiang

ORCID: 0000-0003-3627-8672
Research Areas
  • Advanced Data Storage Technologies
  • Parallel Computing and Optimization Techniques
  • Algorithms and Data Compression
  • Distributed and Parallel Computing Systems
  • Music and Audio Processing
  • Advanced Database Systems and Queries
  • Speech and Audio Processing
  • Multimodal Machine Learning Applications
  • Advanced Vision and Imaging
  • Language, Metaphor, and Cognition
  • Natural Language Processing Techniques
  • Subtitles and Audiovisual Media
  • Multisensory perception and integration
  • Cloud Computing and Resource Management
  • Distributed systems and fault tolerance
  • Face recognition and analysis
  • Image and Signal Denoising Methods
  • Video Analysis and Summarization
  • Digital Games and Media
  • Internet Traffic Analysis and Secure E-voting
  • Network Security and Intrusion Detection
  • Human Pose and Action Recognition
  • Web Data Mining and Analysis
  • Indoor and Outdoor Localization Technologies
  • Simulation Techniques and Applications

META Health
2022-2024

Peking University
2022-2024

University of Illinois Chicago
2021

University of Chicago
2016-2021

Florida International University
2014

Clarkson University
2013

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards, with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume...

10.1109/cvpr52688.2022.01842 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Virtual machine (VM) live migration is a critical feature for managing virtualized environments, enabling dynamic load balancing, consolidation for power management, preparation for planned maintenance, and other management features. However, not all virtual machine live migration is created equal. Variants include memory migration, which relies on shared backend storage between the source and destination of the migration, and storage migration, which migrates storage state as well as memory state. We have developed an automated testing framework that measures important performance...

10.1145/2494621.2494622 article EN 2013-08-09

Augmented reality devices have the potential to enhance human perception and enable other assistive functionalities in complex conversational environments. Effectively capturing the audio-visual context necessary for understanding these social interactions first requires detecting and localizing the voice activities of the device wearer and the surrounding people. These tasks are challenging due to their egocentric nature: the wearer's head motion may cause blur, surrounding people may appear at difficult viewing angles, and there may be...

10.1109/cvpr52688.2022.01029 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Video summarization has recently attracted increasing attention in the computer vision community. However, the scarcity of annotated data has been a key obstacle in this task. To address it, this work explores a new solution for video summarization by transferring samples from a correlated task (i.e., video moment localization) equipped with abundant training data. Our main insight is that the annotated video moments also indicate the semantic highlights of a video, essentially similar to a video summary. Approximately, the video summary can be treated as a sparse,...

10.1109/cvpr52688.2022.01590 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Modern data-intensive applications often generate large amounts of low-precision float data with a limited range of values. Despite the prevalence of such data, there is a lack of an effective solution to ingest, store, and analyze bounded, low-precision, numeric data. To address this gap, we propose Buff, a new compression technique that uses decomposed columnar storage and encoding methods to provide effective compression, fast ingestion, and high-speed in-situ adaptive query operators with SIMD support.

10.14778/3476249.3476305 article EN Proceedings of the VLDB Endowment 2021-07-01
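
The idea of decomposed storage for bounded, low-precision floats can be illustrated with a minimal sketch. It is not the Buff format: the function names, the fixed number of decimal digits, and the byte-plane layout are all assumptions made for this example, and it uses plain Python rather than SIMD. Values are scaled to integers, offset by the column minimum, and split into byte planes so that an equality filter can discard rows plane by plane without rebuilding the floats.

```python
# Hypothetical sketch of byte-decomposed fixed-point storage.
# NOT the Buff format; names and layout are illustrative assumptions.

def encode_column(values, decimals):
    """Scale floats to integers, subtract the minimum, split into byte planes."""
    scale = 10 ** decimals
    ints = [round(v * scale) for v in values]
    base = min(ints)
    offsets = [i - base for i in ints]
    width = max(1, (max(offsets).bit_length() + 7) // 8)  # bytes per value
    # planes[0] holds the most significant byte of every value, and so on.
    planes = [
        bytes((o >> (8 * (width - 1 - p))) & 0xFF for o in offsets)
        for p in range(width)
    ]
    return {"base": base, "scale": scale, "planes": planes}


def decode_column(col):
    """Rebuild the floats by reassembling the byte planes."""
    out = []
    for i in range(len(col["planes"][0])):
        o = 0
        for plane in col["planes"]:
            o = (o << 8) | plane[i]
        out.append((o + col["base"]) / col["scale"])
    return out


def equals(col, target):
    """Row indices equal to target, filtering plane by plane without decoding."""
    width = len(col["planes"])
    t = round(target * col["scale"]) - col["base"]
    if t < 0 or t >= 1 << (8 * width):
        return []  # target lies outside the encoded range
    candidates = list(range(len(col["planes"][0])))
    for p, plane in enumerate(col["planes"]):
        want = (t >> (8 * (width - 1 - p))) & 0xFF
        candidates = [i for i in candidates if plane[i] == want]
        if not candidates:
            break  # later planes never need to be read
    return candidates


col = encode_column([20.5, 21.25, 20.5, 98.75], decimals=2)
matches = equals(col, 20.5)
```

Because the most significant plane is checked first, a filter that matches few rows touches only a fraction of the stored bytes, which is the property this sketch is meant to show.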

Columnar databases rely on specialized encoding schemes to reduce storage requirements. These encodings also enable efficient in-situ data processing. Nevertheless, many existing columnar databases are encoding-oblivious. When storing the data, these systems rely on a global understanding of the dataset or the data types to derive simple rules for encoding selection. Such rule-based selection leads to unsatisfactory performance. Specifically, when performing queries, the systems always decode data into memory, ignoring the possibility of optimizing access...

10.1145/3448016.3457283 article EN Proceedings of the 2021 International Conference on Management of Data 2021-06-09

Augmented Reality (AR) as a platform has the potential to facilitate the reduction of the cocktail party effect. Future AR headsets could potentially leverage information from an array of sensors spanning many different modalities. Training and testing signal processing and machine learning algorithms on tasks such as beam-forming and speech enhancement require high-quality representative data. To the best of the author's knowledge, as of publication there are no available datasets that contain synchronized egocentric...

10.48550/arxiv.2107.04174 preprint EN other-oa arXiv (Cornell University) 2021-01-01

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards, with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume...

10.1109/tpami.2024.3381075 article EN cc-by IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-01-01

The ability to accurately determine the geographic location of an arbitrary IP address has potential in many applications. Previous methods based on observing the relationship between network delay and physical distance are inaccurate. Methods based on similarity are more accurate, but inefficient because they need information on a large number of landmark nodes near the destination to be collected and maintained. We propose a method that can overcome both problems. Our method maintains a stable collection of observers that covers the target area....

10.1109/infcomw.2016.7562066 article EN 2016-04-01

We propose PIDS, Pattern Inference Decomposed Storage, an innovative storage method for decomposing string attributes in columnar stores. Using an unsupervised approach, PIDS identifies common patterns in string attributes from relational databases, and uses the discovered pattern to split each attribute into sub-attributes. First, by storing and encoding each sub-attribute individually, PIDS can achieve a compression ratio comparable to Snappy and Gzip. Second, by decomposing the attribute, PIDS can push down many query operators to sub-attributes, thereby...

10.14778/3380750.3380761 article EN Proceedings of the VLDB Endowment 2020-02-01
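
Splitting a string attribute into sub-attributes and pushing a predicate down to one of them can be sketched as follows. PIDS infers its patterns automatically; here the pattern is written by hand, and the identifier format, the function names, and the outlier handling are all assumptions made for illustration.

```python
import re

# Hypothetical hand-written pattern for identifiers such as "INV-2019-17".
# PIDS discovers patterns automatically; this fixed regex is only a stand-in.
PATTERN = re.compile(r"^([A-Z]+)-(\d{4})-(\d+)$")


def decompose(column):
    """Split each string into three sub-attribute columns.

    Rows that do not match the pattern are kept verbatim in `outliers`
    and hold None in every sub-attribute column.
    """
    subs = [[], [], []]
    outliers = {}
    for row, value in enumerate(column):
        m = PATTERN.match(value)
        if m is None:
            outliers[row] = value
            for s in subs:
                s.append(None)
        else:
            subs[0].append(m.group(1))       # prefix, e.g. "INV"
            subs[1].append(int(m.group(2)))  # year, stored as an integer
            subs[2].append(int(m.group(3)))  # sequence number
    return subs, outliers


def rows_with_year(subs, year):
    """A predicate on the whole string, pushed down to a single sub-attribute."""
    return [row for row, y in enumerate(subs[1]) if y == year]


column = ["INV-2019-17", "INV-2020-3", "bad id", "ORD-2019-250"]
subs, outliers = decompose(column)
hits = rows_with_year(subs, 2019)
```

The year filter reads one integer column instead of scanning and parsing every string, and each sub-attribute column is more uniform than the original strings, which is what makes per-column encoding attractive.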

Dictionary encoding, or domain encoding, is an important form of compression that uses a bijective mapping to replace attributes from a large domain (e.g., strings) with a finite domain (e.g., 32-bit integers). This encoding both reduces data storage and allows for more efficient query execution. Traditional dictionary encoding only supports equality queries, while range queries require that encoded values are decoded before evaluating the predicates. An order-preserving encoding allows range queries without decoding by ensuring the encoded keys follow the same order as the values in the dictionary. While this...

10.1109/icde.2019.00111 article EN 2019 IEEE 35th International Conference on Data Engineering (ICDE) 2019-04-01
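
Order-preserving dictionary encoding, as described in the abstract above, can be sketched in a few lines. This is a minimal static version under assumed names (`build_dictionary`, `range_query`), not the paper's data structure: codes are simply ranks in the sorted list of distinct values, so a range predicate on values becomes a range predicate on integer codes.

```python
from bisect import bisect_left, bisect_right


def build_dictionary(values):
    """Assign each distinct value its rank in sorted order as its code."""
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]


def range_query(dictionary, codes, low, high):
    """Rows with low <= value <= high, evaluated on the integer codes only.

    The bounds are translated to codes once; no row is decoded.
    """
    lo_code = bisect_left(dictionary, low)
    hi_code = bisect_right(dictionary, high) - 1
    return [row for row, c in enumerate(codes) if lo_code <= c <= hi_code]


values = ["pear", "apple", "fig", "apple", "kiwi"]
dictionary, codes = build_dictionary(values)
rows = range_query(dictionary, codes, "b", "l")
```

In this sketch the dictionary is built once from all values. Adding a value that sorts between two existing entries would force codes to be reassigned, which is the cost of using ranks directly as codes.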

In columnar databases, data is generally stored in an encoded format to save storage space and reduce I/O. Popular encoding schemes include dictionary encoding, delta encoding, run-length encoding, and bit-packed encoding. In many open-source columnar formats, performing queries on encoded data requires the data to be first decoded into memory, which is time-consuming. In this paper, we design several novel SIMD-based algorithms to speed up query execution on encoded data. Our algorithms use SIMD to vectorize and skip unnecessary decoding for higher efficiency, achieving a throughput of...

10.1145/3211922.3211932 article EN 2018-06-05
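
Querying encoded data without decoding it first can be illustrated with run-length encoding, one of the schemes listed above. The sketch below is scalar Python with assumed function names; it does not reproduce the paper's SIMD algorithms and only shows why a predicate can be evaluated once per run rather than once per row.

```python
def rle_encode(values):
    """Collapse consecutive equal values into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def count_equal(runs, target):
    """Count matching rows by summing run lengths; no row is materialized."""
    return sum(length for value, length in runs if value == target)


def ranges_greater_than(runs, threshold):
    """Row ranges (start, end exclusive) whose value exceeds the threshold."""
    out, start = [], 0
    for value, length in runs:
        if value > threshold:
            out.append((start, start + length))
        start += length
    return out


runs = rle_encode([5, 5, 5, 2, 2, 9, 9, 9, 9, 5])
n_fives = count_equal(runs, 5)
big_ranges = ranges_greater_than(runs, 4)
```

Ten rows are answered with four comparisons here, one per run. The saving grows with run length, so it depends on how well the column compresses.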

Scalable Simulation Framework (SSF), a parallel simulation application programming interface (API) for large-scale discrete-event models, has been widely adopted in many areas. This paper presents a simplified and yet more streamlined implementation, called MiniSSF. MiniSSF maintains the core design concept of SSF, while removing some of the complex but rarely used features, for the sake of efficiency. It also introduces several new features that can greatly simplify model development efforts and/or improve...

10.1145/2616498.2616512 article EN 2014-07-11
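
For readers unfamiliar with discrete-event models, the core loop can be sketched as below. This is a generic sequential simulator written for illustration; the `Simulator` class and its methods are assumptions and do not correspond to the SSF or MiniSSF API, which targets parallel execution.

```python
import heapq
import itertools


class Simulator:
    """Minimal sequential discrete-event loop (illustrative, not the SSF API)."""

    def __init__(self):
        self.now = 0.0
        self._queue = []               # heap of (time, sequence, action)
        self._seq = itertools.count()  # tie-breaker for events at equal times

    def schedule(self, delay, action):
        """Schedule action(sim) to run `delay` time units from the current time."""
        if delay < 0:
            raise ValueError("delay must be non-negative")
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), action))

    def run(self, until):
        """Process events in timestamp order up to and including `until`."""
        while self._queue and self._queue[0][0] <= until:
            self.now, _, action = heapq.heappop(self._queue)
            action(self)
        self.now = until


log = []


def tick(sim):
    # Record the firing time and reschedule until three ticks have fired.
    log.append(sim.now)
    if len(log) < 3:
        sim.schedule(1.5, tick)


sim = Simulator()
sim.schedule(1.0, tick)
sim.run(until=10.0)
```

The simulation clock jumps from one event timestamp to the next instead of advancing in fixed steps, and the sequence counter keeps events with equal timestamps in the order they were scheduled.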

10.1109/cvpr52733.2024.02599 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16