Hang Shao

ORCID: 0000-0002-1322-4789
Research Areas
  • Speech Recognition and Synthesis
  • Speech and Audio Processing
  • Neural Networks and Applications
  • Generative Adversarial Networks and Image Synthesis
  • Machine Learning and ELM
  • Advanced Vision and Imaging
  • Music and Audio Processing
  • Face and Expression Recognition
  • Advanced Image Processing Techniques
  • Domain Adaptation and Few-Shot Learning
  • Non-Invasive Vital Sign Monitoring
  • Visual Attention and Saliency Detection
  • Anomaly Detection Techniques and Applications
  • Water Systems and Optimization
  • ECG Monitoring and Analysis
  • Infrastructure Maintenance and Monitoring
  • Natural Language Processing Techniques
  • Urban Stormwater Management Solutions
  • Image Processing Techniques and Applications
  • Heart Rate Variability and Autonomic Control
  • Radiomics and Machine Learning in Medical Imaging
  • Face Recognition and Analysis
  • Advanced Image and Video Retrieval Techniques
  • Advanced Neural Network Applications
  • Image Processing and 3D Reconstruction

Nanjing University of Science and Technology
2023-2024

Shanghai Jiao Tong University
2023-2024

Tsinghua University
2009-2024

Ministry of Public Security of the People's Republic of China
2024

University of Shanghai for Science and Technology
2019-2021

University of Utah
2018

University of Ottawa
2012-2014

Deep generative models learn a mapping from a low-dimensional latent space to a high-dimensional data space. Under certain regularity conditions, these models parameterize nonlinear manifolds in the data space. In this paper, we investigate the Riemannian geometry of these generated manifolds. First, we develop efficient algorithms for computing geodesic curves, which provide an intrinsic notion of distance between points on the manifold. Second, we develop an algorithm for parallel translation of a tangent vector along a path on the manifold. We show how parallel translation can be used...

10.1109/cvprw.2018.00071 article EN 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2018-06-01

Automatic vision-based sewer inspection plays a key role in the sewage system of a modern city. Recent advances focus on utilizing deep learning models to realize the inspection system, benefiting from the capability of data-driven feature extraction. However, the ambiguity of sewer defects in the feature space is ignored, deteriorating the performance of sewer inspection. There are two reasons for such ambiguity. First, the defect-irrelevant region interferes with the feature extraction of the model. Second, the multilabel setting is an inherent challenge for extracting discriminative...

10.1109/tim.2023.3250306 article EN IEEE Transactions on Instrumentation and Measurement 2023-01-01

Various Large Language Models (LLMs) from the Generative Pretrained Transformer (GPT) family have achieved outstanding performance in a wide range of text generation tasks. However, their enormous model sizes have hindered their practical use in real-world applications due to high inference latency. Therefore, improving the efficiency of LLMs through quantization, pruning, and other means has been a key issue in LLM studies. In this work, we propose a method based on Hessian sensitivity-aware mixed sparsity pruning...

10.1109/icassp48485.2024.10445737 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

Background and purpose: We investigated the baseline demographics of patients with severe unilateral atherosclerotic stenosis of the middle cerebral artery (MCA) using multimodal MRI and evaluated the haemodynamic impairments and plaque characteristics of patients who had a recurrent stroke. Materials and methods: We retrospectively recruited consecutive patients with severe unilateral atherosclerotic MCA stenosis who underwent arterial spin labelling (ASL) with postlabelling delay (PLD) of 1.5 and 2.5 s, and vessel wall MRI. For each PLD, cerebral blood flow (CBF) maps were generated. Hypoperfusion volume ratio...

10.1136/svn-2018-000228 article EN cc-by-nc Stroke and Vascular Neurology 2019-06-21

Subtle variations in human physiological signals that are invisible to the naked eye can reflect important biological and health indicators. Although numerous computer vision methods have been proposed to recover and magnify these changes, most of them either focus only on identifying and recognizing explicit features such as shapes and textures, or are weak in long-term temporal modeling and spatiotemporal interactive perception of implicit biometrics. Therefore, it is difficult for them to robustly overcome various disturbances...

10.1109/tcsvt.2023.3307700 article EN IEEE Transactions on Circuits and Systems for Video Technology 2023-08-23

In this paper, a novel objective evaluation of depth image based rendering (DIBR) is proposed for 3D video in the format of monocular video augmented by a gray-scale image. The metric is composed of the Color and Sharpness of Edge Distortion (CSED) measure. The color distortion measures the luminance loss of the rendered image compared with the reference, and the sharpness of edge distortion calculates the depth-weighted proportion of the remaining edge to the original edge. Compared with conventional quality metrics such as MSE and PSNR, our metric represents not only the color artifact but also the synthesis...

10.1109/3dtv.2009.5069619 article EN 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video 2009-05-01

Background: Precise diagnosis and early appropriate treatment are of importance to reduce neuromyelitis optica spectrum disorder (NMOSD) and multiple sclerosis (MS) morbidity. Distinguishing NMOSD from MS based on clinical manifestations and neuroimaging remains challenging. Purpose: To investigate radiomic signatures as potential imaging biomarkers for distinguishing NMOSD from MS, and to develop and validate a diagnostic radiomic‐signature‐based nomogram for individualized disease discrimination. Study Type: Retrospective,...

10.1002/jmri.26287 article EN Journal of Magnetic Resonance Imaging 2018-11-08

Automatic vision-based sewer inspection plays a key role in the sewage system of a modern city. Recent advances focus on utilizing deep learning models to realize the sewer inspection system, benefiting from the capability of data-driven feature representation. However, the inherent uncertainty of sewer defects is ignored, resulting in the missed detection of serious unknown sewer defect categories. In this paper, we propose a trustworthy multi-label sewer defect classification (TMSDC) method, which can quantify the uncertainty of sewer defect prediction via evidential deep learning. Meanwhile,...

10.1109/icassp49357.2023.10096569 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

We studied the correlation of central macular fluid volume (CMFV) and central subfield thickness (CST) with best-corrected visual acuity (BCVA) in treatment-naïve eyes with diabetic macular edema (DME) 1 month after anti-vascular endothelial growth factor (VEGF) therapy. This retrospective cohort study investigated eyes that received anti-VEGF therapy. All participants underwent comprehensive examinations and optical coherence tomography (OCT) scans at baseline (M0) and 1 month after the first treatment (M1). Two deep learning models were...

10.1007/s40123-023-00746-5 article EN cc-by-nc Ophthalmology and Therapy 2023-06-15

Deep generative models learn a mapping from a low-dimensional latent space to a high-dimensional data space. Under certain regularity conditions, these models parameterize nonlinear manifolds in the data space. In this paper, we investigate the Riemannian geometry of these generated manifolds. First, we develop efficient algorithms for computing geodesic curves, which provide an intrinsic notion of distance between points on the manifold. Second, we develop an algorithm for parallel translation of a tangent vector along a path on the manifold. We show how parallel translation can be used...

10.48550/arxiv.1711.08014 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Detecting the anomalous information in multimedia is valuable to many computer vision applications. Recently, pixel-wise methods modeling by deep learning model have been presented, which can be divided reconstruction-based and distance-based methods. However, suffer from low precision of pixel reconstructions. Distance-based extract hierarchical features a pre-trained model, order estimate anomalies distances between normal features. Nevertheless, multi-level are ignored these methods,...

10.1109/icme51207.2021.9428370 article EN 2021 IEEE International Conference on Multimedia and Expo (ICME) 2021-06-09

Real-world data mining applications such as Mine Countermeasure Missions (MCM) involve learning from imbalanced data sets, which contain very few instances of the minority classes and many instances of the majority class. For instance, the number of naturally occurring clutter objects (such as rocks) that are detected typically far outweighs the relatively rare event of detecting a mine. In this paper, we propose a support vector machine with adaptive asymmetric misclassification costs (instances weighted) to solve skewed spaces...

10.1109/icmla.2012.227 article EN 2012-12-01

Tracking moving objects is a task of the utmost importance to the defence community. As this task requires high accuracy, rather than employing a single detector, it has become common to use multiple ones. In such cases, the tracks produced by these detectors need to be correlated (if they belong to the same sensing modality) or associated (if they were produced by different sensing modalities). In this work, we introduce Computational-Intelligence-based methods for correlating and associating various contacts pertaining to maritime vessels in an area...

10.1109/cec.2014.6900231 article EN 2014 IEEE Congress on Evolutionary Computation (CEC) 2014-07-01

10.1016/j.jvcir.2021.103231 article EN Journal of Visual Communication and Image Representation 2021-07-16

End-to-end (E2E) automatic speech recognition (ASR) systems have gained popularity given their simplified architecture and promising results. However, text-only domain adaptation remains a big challenge for E2E systems. Text-to-speech (TTS) based approaches fine-tune ASR models with speech synthesized by an auxiliary TTS model, and thus increase deployment costs. Language model (LM) fusion based approaches can achieve good performance but are sensitive to interpolation parameters. In order to factorize out the language...

10.1109/icassp49357.2023.10095937 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

In casting manufacturing, dependable automation of classifying casting types on digital radiography (DR) images is a crucial technology for automating downstream tasks, such as defect detection. Generally, DR images are constructed from single gray-scale information, which constricts the feature representations of casting images. Meanwhile, the complicated background in DR image acquisition is an undesirable issue for classification performance. Recently, the neural network, especially the convolutional neural network (CNN), has great...

10.1109/cac48633.2019.8996501 article EN 2019-11-01

How to distinguish the low-contrast area near boundaries is a basic challenge in salient object detection. Most of recent state-of-the-art methods can achieve good performance but still can't work well boundaries. In this paper, we propose novel network based on multi-level feature fusion with boundary information solve problem. Our model includes two separate decoding sub-networks, one sub-network detect objects and another which outputs error maps get by maps. Moreover, design connection...

10.1109/icme46284.2020.9102715 article EN 2020 IEEE International Conference on Multimedia and Expo (ICME) 2020-06-09

Representing the spatial properties of facial attributes is a vital challenge for facial attribute recognition (FAR). Recent advances have achieved reliable performance for FAR, benefiting from the description of spatial properties via extra prior information. However, the extra prior information might not always be available, resulting in a restricted application scenario for prior-based methods. Meanwhile, the spatial ambiguity of facial attributes caused by the inherent spatial diversities of facial parts is ignored. To address these issues, we propose a prior-free method for attribute spatial decomposition (ASD), mitigating...

10.2139/ssrn.4473318 preprint EN 2023-01-01

To evaluate the diagnostic efficacy of traditional radiomics, deep learning, and deep learning radiomics in differentiating normal inner ears and inner ear malformations on temporal bone computed tomography (CT).

10.13201/j.issn.2096-7993.2024.06.017 article EN PubMed 2024-06-01

Self-supervised speech representation learning has shown remarkable capability in automatic speech recognition. However, it requires substantial computation and storage capacity. Pruning is an effective method for model compression. In this work, we propose SparseWAV, a fast and accurate unstructured pruning framework designed for large speech foundation models, which can efficiently remove unimportant parameters without sacrificing performance. It adaptively determines the sparsity ratio of each weight matrix...

10.21437/interspeech.2024-607 article EN Interspeech 2024 2024-09-01

10.1109/icme57554.2024.10688212 article EN 2024 IEEE International Conference on Multimedia and Expo (ICME) 2024-07-15