NFDI4DS | UHH-SEMS - Publication Details

Hao Jiang

ORCID: 0000-0003-3627-8672

About

Contact & Profiles

A5082485594

Research Areas

Advanced Data Storage Technologies
Parallel Computing and Optimization Techniques
Algorithms and Data Compression
Distributed and Parallel Computing Systems
Music and Audio Processing
Advanced Database Systems and Queries
Speech and Audio Processing
Multimodal Machine Learning Applications
Advanced Vision and Imaging
Language, Metaphor, and Cognition
Natural Language Processing Techniques
Subtitles and Audiovisual Media
Multisensory perception and integration
Cloud Computing and Resource Management
Distributed systems and fault tolerance
Face recognition and analysis
Image and Signal Denoising Methods
Video Analysis and Summarization
Digital Games and Media
Internet Traffic Analysis and Secure E-voting
Network Security and Intrusion Detection
Human Pose and Action Recognition
Web Data Mining and Analysis
Indoor and Outdoor Localization Technologies
Simulation Techniques and Applications

META Health
2022-2024

Peking University
2022-2024

University of Illinois Chicago
2021

University of Chicago
2016-2021

Florida International University
2014

Clarkson University
2013

Ego4D: Around the World in 3,000 Hours of Egocentric Video

OPENALEX - Publications

Kristen Grauman Andrew Westbury Eugene H. Byrne Zachary Chavis Antonino Furnari and 80 more

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards, with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume...

10.1109/cvpr52688.2022.01842 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

A quantitative study of virtual machine live migration

Wenjin Hu Andrew A. Hicks Long Zhang Eli M. Dow Vinay Kumar Soni and 3 more

Virtual machine (VM) live migration is a critical feature for managing virtualized environments, enabling dynamic load balancing, consolidation for power management, preparation for planned maintenance, and other management features. However, not all virtual machine live migration is created equal. Variants include memory migration, which relies on shared backend storage between the source and destination of the migration, and storage migration, which migrates storage state as well as memory state. We have developed an automated testing framework that measures important performance...

10.1145/2494621.2494622 article EN 2013-08-09

Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization

Hao Jiang Calvin Murdock Vamsi Krishna Ithapu

Augmented reality devices have the potential to enhance human perception and enable other assistive functionalities in complex conversational environments. Effectively capturing the audio-visual context necessary for understanding these social interactions first requires detecting and localizing the voice activities of the device wearer and the surrounding people. These tasks are challenging due to their egocentric nature: the wearer's head motion may cause blur, surrounding people may appear at difficult viewing angles, and there may be...

10.1109/cvpr52688.2022.01029 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Joint Video Summarization and Moment Localization by Cross-Task Sample Transfer

Hao Jiang Yadong Mu

Video summarization has recently engaged increasing attention in computer vision communities. However, the scarcity of annotated data has been a key obstacle in this task. To address it, this work explores a new solution for video summarization by transferring samples from a correlated task (i.e., video moment localization) equipped with abundant training data. Our main insight is that the annotated video moments also indicate the semantic highlights of a video, essentially similar to a video summary. Approximately, the video summary can be treated as a sparse,...

10.1109/cvpr52688.2022.01590 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Decomposed bounded floats for fast compression and queries

Chunwei Liu Hao Jiang John Paparrizos Aaron J. Elmore

Modern data-intensive applications often generate large amounts of low-precision float data with a limited range of values. Despite the prevalence of such data, there is a lack of an effective solution to ingest, store, and analyze bounded, low-precision, numeric data. To address this gap, we propose Buff, a new compression technique that uses decomposed columnar storage and encoding methods to provide effective compression, fast ingestion, and high-speed in-situ adaptive query operators with SIMD support.

10.14778/3476249.3476305 article EN Proceedings of the VLDB Endowment 2021-07-01

Good to the Last Bit: Data-Driven Encoding with CodecDB

Hao Jiang Chunwei Liu John Paparrizos Andrew A. Chien Jihong Ma and 1 more

Columnar databases rely on specialized encoding schemes to reduce storage requirements. These encodings also enable efficient in-situ data processing. Nevertheless, many existing columnar databases are encoding-oblivious. When storing the data, these systems rely on a global understanding of the dataset or the data types to derive simple rules for encoding selection. Such rule-based selection leads to unsatisfactory performance. Specifically, when performing queries, the systems always decode data into memory, ignoring the possibility of optimizing access...

10.1145/3448016.3457283 article EN Proceedings of the 2021 International Conference on Management of Data 2021-06-09

EasyCom: An Augmented Reality Dataset to Support Algorithms for Easy Communication in Noisy Environments

Jacob Donley Vladimir Tourbabin Jung‐Suk Lee Mark Broyles Hao Jiang and 4 more

Augmented Reality (AR) as a platform has the potential to facilitate the reduction of the cocktail party effect. Future AR headsets could potentially leverage information from an array of sensors spanning many different modalities. Training and testing signal processing and machine learning algorithms on tasks such as beam-forming and speech enhancement require high-quality representative data. To the best of the author's knowledge, as of publication there are no available datasets that contain synchronized egocentric...

10.48550/arxiv.2107.04174 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Ego4D: Around the World in 3,000 Hours of Egocentric Video

Kristen Grauman Andrew Westbury Eugene H. Byrne Vincent Cartillier Zachary Chavis and 81 more

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards, with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume...

10.1109/tpami.2024.3381075 article EN cc-by IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-01-01

The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective

Wenqi Jia Miao Liu Hao Jiang Ishwarya Ananthabhotla James M. Rehg and 2 more

10.1109/cvpr52733.2024.02493 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

IP geolocation estimation using neural networks with stable landmarks

Hao Jiang Yaoqing Liu Jeanna Matthews

The ability to accurately determine the geographic location of an arbitrary IP address has potential in many applications. Previous methods based on observing the relationship between network delay and physical distance are inaccurate. Methods based on similarity are more accurate, but inefficient because they need information on a large number of landmark nodes near the destination to be collected and maintained. We propose a method that can overcome both problems. Our method maintains a stable collection of observers that covers the target area....

10.1109/infcomw.2016.7562066 article EN 2016-04-01

PIDS

Hao Jiang Chunwei Liu Jin Qi John Paparrizos Aaron J. Elmore

We propose PIDS, Pattern Inference Decomposed Storage, an innovative storage method for decomposing string attributes in columnar stores. Using an unsupervised approach, PIDS identifies common patterns in string attributes from relational databases, and uses the discovered pattern to split each attribute into sub-attributes. First, by storing and encoding each sub-attribute individually, PIDS can achieve a compression ratio comparable to Snappy and Gzip. Second, by decomposing the attribute, PIDS can push down many query operators to sub-attributes, thereby...

10.14778/3380750.3380761 article EN Proceedings of the VLDB Endowment 2020-02-01

Mostly Order Preserving Dictionaries

Chunwei Liu McKade Umbenhower Hao Jiang Pranav Subramaniam Jihong Ma and 1 more

Dictionary encoding, or domain encoding, is an important form of compression that uses a bijective mapping to replace attributes from a large domain (i.e. strings) with a finite domain (i.e. 32 bit integers). This encoding both reduces data storage and allows for more efficient query execution. Traditional dictionary encoding only supports equality queries, while range queries require that encoded values are decoded before evaluating the predicates. An order preserving dictionary allows range queries without decoding by ensuring that encoded keys follow the same order as the values in the dictionary. While this...

10.1109/icde.2019.00111 article EN 2019 IEEE 35th International Conference on Data Engineering (ICDE) 2019-04-01
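
The abstract above contrasts traditional dictionary encoding with an order preserving dictionary. The following is a minimal sketch of the fully order preserving case only, not the "mostly" order preserving design from the paper; the function names and the sample column are hypothetical and chosen for illustration. It shows how a range predicate on strings can be translated once into a range of integer codes and then checked against the encoded column without decoding any row.

```python
from bisect import bisect_left, bisect_right


def build_order_preserving_dictionary(values):
    """Assign integer codes so that code order matches value order."""
    distinct = sorted(set(values))
    return distinct, {value: code for code, value in enumerate(distinct)}


def encode(values, code_of):
    """Replace each value with its integer code."""
    return [code_of[value] for value in values]


def range_filter(encoded, sorted_values, low, high):
    """Return row positions whose value lies in [low, high], comparing codes only."""
    # Translate the value bounds into code bounds with two binary searches.
    low_code = bisect_left(sorted_values, low)
    high_code = bisect_right(sorted_values, high) - 1
    return [row for row, code in enumerate(encoded) if low_code <= code <= high_code]


# Hypothetical sample column.
column = ["pear", "apple", "fig", "kiwi", "apple", "plum"]
sorted_values, code_of = build_order_preserving_dictionary(column)
encoded = encode(column, code_of)
matches = range_filter(encoded, sorted_values, "b", "l")
print(encoded)
print(matches)
```

Because the dictionary here is built from a sorted snapshot of the column, inserting a new value that falls between two existing ones would require re-assigning codes; this sketch does not address that case.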

Boosting data filtering on columnar encoding with SIMD

Hao Jiang Aaron J. Elmore

In columnar databases, data is generally stored in an encoded format to save storage space and reduce I/O. Popular encoding schemes include dictionary encoding, delta encoding, run-length encoding, and bit-packed encoding. In many open-source columnar formats, performing queries on encoded data requires the data to be first decoded into memory, which is time-consuming. In this paper, we design several novel SIMD-based algorithms to speed up query execution on encoded data. Our algorithms use SIMD to vectorize and skip unnecessary decoding for higher efficiency, achieving a throughput of...

10.1145/3211922.3211932 article EN 2018-06-05
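
As a rough illustration of the general idea of filtering encoded data without decoding it first, the sketch below evaluates a predicate on a run-length encoded column by testing each run once rather than each row. It is plain Python with no SIMD and is not the paper's algorithm; the function names and sample data are hypothetical.

```python
def rle_encode(values):
    """Run-length encode a list into [value, run_length] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def count_matching(runs, predicate):
    """Count rows satisfying the predicate, testing each run once instead of each row."""
    return sum(length for value, length in runs if predicate(value))


# Hypothetical sample column.
column = [7, 7, 7, 2, 2, 9, 9, 9, 9, 2]
runs = rle_encode(column)
matching_rows = count_matching(runs, lambda value: value > 5)
print(runs)
print(matching_rows)
```

The predicate is evaluated four times here (once per run) instead of ten times (once per row), which is the saving that operating directly on the encoded form is meant to capture.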

Performance Study of a Minimalistic Simulator on XSEDE Massively Parallel Systems

Rong Rong Hao Jiang Jason Liu

Scalable Simulation Framework (SSF), a parallel simulation application programming interface (API) for large-scale discrete-event models, has been widely adopted in many areas. This paper presents a simplified and yet more streamlined implementation, called MiniSSF. MiniSSF maintains the core design concept of SSF, while removing some complex but rarely used features, for the sake of efficiency. It also introduces several new features that can greatly simplify model development efforts and/or improve...

10.1145/2616498.2616512 article EN 2014-07-11

Ink Dot-Oriented Differentiable Optimization for Neural Image Halftoning

Hao Jiang Bingfeng Zhou Yadong Mu

10.1109/cvpr52733.2024.02599 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16
