Hao Zhou

ORCID: 0000-0001-9764-1012
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Human Pose and Action Recognition
  • Hand Gesture Recognition Systems
  • Multimodal Machine Learning Applications
  • Particle physics theoretical and experimental studies
  • Hearing Impairment and Communication
  • Computational Drug Discovery Methods
  • High-Energy Particle Collisions Research
  • Advanced Vision and Imaging
  • Text Readability and Simplification
  • Computational Physics and Python Applications
  • Advanced Image Processing Techniques
  • Speech and Audio Processing
  • Speech and dialogue systems
  • Machine Learning in Materials Science
  • Sustainable Building Design and Assessment
  • Particle Detector Development and Performance
  • Environmental Impact and Sustainability
  • Gait Recognition and Analysis
  • Generative Adversarial Networks and Image Synthesis
  • Genomics and Phylogenetic Studies
  • Full-Duplex Wireless Communications
  • Software-Defined Networks and 5G
  • Human Motion and Animation

Tsinghua University
2023-2024

Soochow University
2024

First Affiliated Hospital of Soochow University
2024

Jiangsu Normal University
2024

Nanjing University of Information Science and Technology
2024

University of Science and Technology of China
2015-2024

Shanghai Jiao Tong University
2022-2024

Huaqiao University
2024

Beijing Information Science & Technology University
2024

Tencent (China)
2023

Despite the recent success of deep learning in continuous sign language recognition (CSLR), models typically focus on the most discriminative features, ignoring other potentially non-trivial and informative content. This characteristic heavily constrains their capability to learn the implicit visual grammars behind the collaboration of different cues (i.e., hand shape, facial expression, and body posture). By injecting multi-cue learning into neural network design, we propose a spatial-temporal multi-cue (STMC) network to solve...

10.1609/aaai.v34i07.7001 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Despite existing pioneering works on sign language translation (SLT), there is a non-trivial obstacle, i.e., the limited quantity of parallel sign-text data. To tackle this data bottleneck, we propose a sign back-translation (SignBT) approach, which incorporates massive spoken-language texts into SLT training. With a text-to-gloss model, we first back-translate monolingual text to its gloss sequence. Then, a paired sign sequence is generated by splicing pieces from an estimated gloss-to-sign bank at the feature level....

10.1109/cvpr46437.2021.00137 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

The life-cycle assessment method, which originates from general products and services, has gradually come to be applied to investigations of the life-cycle carbon emissions (LCCE) of buildings. A literature review was conducted to clarify LCCE implications, calculations, and reductions in the context of buildings. A total of 826 global building carbon emission calculation cases were obtained from 161 studies, based on the framework of stage division stipulated by ISO 21930 and the basic principles of the emission factor (EF) approach. The methods and results are discussed herein, modules...

10.1016/j.eng.2023.08.019 article EN cc-by-nc-nd Engineering 2024-01-17

Great progress has been made in face sketch synthesis in recent years. State-of-the-art methods commonly apply a Markov Random Fields (MRF) model to select local patches from a set of training data. Such methods, however, have two major drawbacks. Firstly, the MRF model used cannot synthesize new patches. Secondly, the optimization problem in solving the MRF is NP-hard. In this paper, we propose a novel Markov Weight Fields (MWF) model that is capable of synthesizing new patches. We formulate our model into a convex quadratic programming (QP) problem, to which the optimal solution...

10.1109/cvpr.2012.6247788 article EN 2012 IEEE Conference on Computer Vision and Pattern Recognition 2012-06-01

Despite the recent success of deep learning in video-related tasks, models typically focus on the most discriminative features, ignoring other potentially non-trivial and informative content. This characteristic heavily constrains their capability to learn the implicit visual grammars in sign videos behind the collaboration of different cues (i.e., hand shape, facial expression, and body posture). To this end, we approach video-based sign language understanding with multi-cue learning and propose a spatial-temporal multi-cue (STMC) network...

10.1109/tmm.2021.3059098 article EN IEEE Transactions on Multimedia 2021-02-17

Continuous sign language recognition is a weakly supervised problem of translating a video sequence into a gloss sequence, where the temporal boundary of each gloss is not annotated. The CNN-RNN-CTC framework shows effectiveness in this task by estimating a pseudo label for each clip and retraining the feature extractor alternately. The quality of the pseudo labels greatly impacts the final performance. In contrast to existing methods, which select the label with maximum posterior probability, we propose a dynamic decoding method to find a reasonable alignment path via...

10.1109/icme.2019.00223 article EN 2019 IEEE International Conference on Multimedia and Expo (ICME) 2019-07-01

This paper does not aim at introducing a novel model for document-level neural machine translation. Instead, we head back to the original Transformer model and hope to answer the following question: Is the capacity of current models strong enough for document-level translation? Interestingly, we observe that the original Transformer with appropriate training techniques can achieve strong results for document translation, even with a length of 2000 words. We evaluate this model and several recent approaches on nine document-level datasets and two sentence-level datasets across six languages. Experiments show...

10.18653/v1/2022.findings-acl.279 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01

This paper presents WordRecorder, an efficient and accurate handwriting recognition system that identifies words using acoustic signals generated by pens and paper, thus enabling ubiquitous handwriting recognition. To achieve this, we carefully craft a new deep-learning-based sensing framework with three major components, i.e., segmentation, classification, and word suggestion. First, we design a dual-window approach to segment the raw signal into a series of letters by exploiting subtle features of handwriting. Then...

10.1109/infocom.2018.8486285 article EN IEEE INFOCOM 2018 - IEEE Conference on Computer Communications 2018-04-01

Yu Bao, Hao Zhou, Shujian Huang, Dongqi Wang, Lihua Qian, Xinyu Dai, Jiajun Chen, Lei Li. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.575 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Currently, the route planning functions in the 2D/3D campus navigation systems on the market are unable to process indoor and outdoor localization information simultaneously, and the UI experiences are not optimal because they are limited by the service platforms. An ARCore-based augmented reality system is designed in this paper in order to solve the relevant problems. Firstly, the proposed system uses ARCore to enhance the presentation of 3D real scenes. Secondly, a visual inertial ranging algorithm is proposed for real-time locating and map generating on mobile devices....

10.3390/app11167515 article EN cc-by Applied Sciences 2021-08-16

10.1016/j.jclepro.2023.137047 article EN publisher-specific-oa Journal of Cleaner Production 2023-04-05

In this paper, we introduce FROSTER, an effective framework for open-vocabulary action recognition. The CLIP model has achieved remarkable success in a range of image-based tasks, benefiting from its strong generalization capability stemming from pretraining on massive image-text pairs. However, applying CLIP directly to the open-vocabulary action recognition task is challenging due to the absence of temporal information in CLIP's pretraining. Further, fine-tuning CLIP on action recognition datasets may lead to overfitting and hinder its generalizability, resulting...

10.48550/arxiv.2402.03241 preprint EN arXiv (Cornell University) 2024-02-05

In the field of skeleton-based action recognition, current top-performing graph convolutional networks (GCNs) exploit intra-sequence context to construct adaptive graphs for feature aggregation. However, we argue that such context is still local, since the rich cross-sequence relations have not been explicitly investigated. In this paper, we propose a contrastive learning framework for skeleton-based action recognition (SkeletonGCL) to explore the global context across all sequences. Specifically, SkeletonGCL associates...

10.48550/arxiv.2301.10900 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Smart wearable devices are becoming smaller, cheaper, and more popular. The smartwatch is one of the most popular wearable devices. It offers rich applications such as messages, email, and voice by connecting to a smartphone via Bluetooth. It is hard to interact with a smartwatch due to its small screen and the way it is worn. Since the smartwatch is usually equipped with sensors such as an accelerometer and a gyroscope and is worn on the wrist, it is possible to identify the user's gestures by tracking the movement of the finger, hand, and arm. Furthermore, the user can control other nearby smart devices if they can be...

10.1109/bigcom.2018.00018 article EN 2018-08-01

To further improve the convenience and effectiveness of human-computer interaction (HCI) with smart devices, human activity recognition (HAR) has been widely studied from various aspects. Unfortunately, deep-learning-based methods often suffer from either expensive labeling efforts or weak generalization ability. Inspired by recently developed domain adaptation strategies, we propose XHAR, a novel adversarial framework for HAR, providing better device and user adaptation. XHAR first selects the most...

10.1109/secon48991.2020.9158431 article EN 2020-06-01

An efficient resource management scheme is critical to enable network slicing in 5G networks and in envisioned 6G networks, and artificial intelligence (AI) techniques offer promising solutions. Considering the rapidly emerging new machine learning techniques, such as graph learning, federated learning, and transfer learning, a timely survey is needed to provide an overview of AI-enabled wireless networks. This article provides such a survey, along with an application of knowledge in radio access network (RAN) slicing. In particular, we first provide some background...

10.1109/mwc.004.2200025 article EN IEEE Wireless Communications 2022-12-26

This paper presents IP-SLT, a simple yet effective framework for sign language translation (SLT). Our IP-SLT adopts a recurrent structure and enhances the semantic representation (prototype) of the input video via iterative refinement. Our idea mimics the behavior of human reading, where a sentence can be digested repeatedly until reaching an accurate understanding. Technically, IP-SLT consists of feature extraction, prototype initialization, and refinement. The initialization module generates the initial prototype based on the visual...

10.1109/iccv51070.2023.01429 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Our study presents the assembly of a high-quality Taihu goose genome at the Telomere-to-Telomere (T2T) level. By employing advanced sequencing technologies, including Pacific Biosciences HiFi reads, Oxford Nanopore long reads, Illumina short reads, and chromatin conformation capture (Hi-C), we achieved an exceptional assembly. The T2T assembly encompasses a total length of 1,197,991,206 bp, with contig N50 reaching 33,928,929 bp and scaffold N50 attaining 81,007,908 bp. It consists of 73 scaffolds, including 38 autosomes and one pair of Z/W...

10.1038/s41597-024-03567-8 article EN cc-by Scientific Data 2024-07-07
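
The contig and scaffold N50 values quoted in the abstract above follow the standard assembly-statistics definition: the largest length L such that sequences of length at least L together cover at least half of the total assembly. The sketch below illustrates only that general definition; the function name `n50` and the toy lengths are assumptions for illustration and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Hypothetical helper for illustration: sort lengths from longest to
    shortest and return the length at which the running sum first
    reaches at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Toy example (made-up lengths, not data from the assembly):
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

A larger N50 generally indicates a more contiguous assembly, which is why the metric is reported separately for contigs and scaffolds.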

Currently, masked language modeling (e.g., BERT) is the prime choice for learning contextualized representations. Due to its pervasiveness, it naturally raises an interesting question: how do masked language models (MLMs) learn contextual representations? In this work, we analyze the learning dynamics of MLMs and find that they adopt sampled embeddings as anchors to estimate and inject semantics into representations, which limits the efficiency and effectiveness of MLMs. To address these problems, we propose TACO, a simple yet effective representation...

10.18653/v1/2022.acl-long.193 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Recent advancements indicate that scaling up Multimodal Large Language Models (MLLMs) effectively enhances performance on downstream multimodal tasks. The prevailing MLLM paradigm, e.g., LLaVA, transforms visual features into text-like tokens using a static vision-language mapper, thereby enabling LLMs to develop the capability to comprehend visual information through instruction tuning. Although promising, the static tuning strategy (static refers to the trained model with static parameters)...

10.48550/arxiv.2403.13447 preprint EN arXiv (Cornell University) 2024-03-20

Drug design is a crucial step in the drug discovery cycle. Recently, various deep-learning-based methods design drugs by generating novel molecules from scratch, avoiding traversing large-scale libraries. However, they depend on scarce experimental data or time-consuming docking simulation, leading to overfitting issues with limited training data and slow generation speed. In this study, we propose the zero-shot drug design method DESERT (Drug dEsign by SkEtching and geneRaTing). Specifically, DESERT splits the design process into two stages:...

10.48550/arxiv.2209.13865 preprint EN cc-by-sa arXiv (Cornell University) 2022-01-01

The Alectoris chukar (chukar) is the most geographically widespread partridge species in the world, demonstrating exceptional adaptability to diverse ecological environments. However, the scarcity of genetic resources for the chukar has hindered research into its adaptive evolution and molecular breeding. In this study, we have sequenced and assembled a high-quality, phased chukar genome that consists of 31 pairs of relatively complete diploid chromosomes. Our BUSCO analysis reported a high completeness score...

10.1038/s41597-024-02991-0 article EN cc-by Scientific Data 2024-02-02

Dynamic spectrum reallocation, under which spectrum owners temporarily share underutilized spectrum with secondary users for economic profit, is an important approach to improving the spectrum utilization ratio. Auction is believed to be a natural marketing tool to incentivize spectrum owners, and thus redistribute idle spectrum efficiently. Extensive research has been done on the problem of truthful auctions, in which bidders bid based on their true valuations of the spectrum. The valuation of an individual bidder, however, is private information that should be protected against...

10.1109/glocom.2015.7417163 article EN 2015 IEEE Global Communications Conference (GLOBECOM) 2015-12-01
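
To illustrate what "truthful" means in the abstract above, the following is a minimal sketch of the classic sealed-bid second-price (Vickrey) auction for a single item, in which bidding one's true valuation is a dominant strategy. This is a textbook mechanism chosen for illustration, not the spectrum auction or the privacy-preserving scheme from the paper; the function name `second_price_auction` and the bidder names are hypothetical.

```python
def second_price_auction(bids):
    """Run a single-item sealed-bid second-price auction.

    `bids` maps a bidder name to a bid amount. The highest bidder wins
    and pays the second-highest bid, so no bidder can gain by
    misreporting a valuation. Ties go to the bidder listed first.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    # sorted() is stable, so equal bids keep their insertion order.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


# Toy example with made-up valuations:
winner, price = second_price_auction({"alice": 5.0, "bob": 8.0, "carol": 3.0})
```

Because the winner's payment depends only on the other bids, raising or lowering one's own bid cannot reduce the price paid, only change whether one wins. Note that this sketch reveals every bid to the auctioneer, which is the kind of exposure the abstract says should be protected against.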