NFDI4DS | UHH-SEMS - Publication Details

Bing Yang

ORCID: 0000-0002-8978-2322

About

Contact & Profiles

A5086513946

Research Areas

Speech and Audio Processing
Music and Audio Processing
Parallel Computing and Optimization Techniques
Advancements in Semiconductor Devices and Circuit Design
Indoor and Outdoor Localization Technologies
Hearing Loss and Rehabilitation
Speech Recognition and Synthesis
Underwater Acoustics Research
Advanced Algorithms and Applications
Analog and Mixed-Signal Circuit Design
Acoustic Wave Phenomena Research
Advanced Data Storage Technologies
Advancements in PLL and VCO Technologies
Distributed systems and fault tolerance
Semiconductor materials and devices
Advanced Adaptive Filtering Techniques
Silicon Carbide Semiconductor Technologies
Hydrological Forecasting Using AI
Flood Risk Assessment and Management
Advanced Sensor and Control Systems
Digital Filter Design and Implementation
Interconnection Networks and Systems
Video Surveillance and Tracking Methods
Smart Grid and Power Systems
Advanced Wireless Communication Techniques

Beijing Microelectronics Technology Institute
1998-2024

Westlake University
2021-2024

North China University of Technology
2008-2024

Southwestern University of Finance and Economics
2024

Hubei Zhongshan Hospital
2024

Wuhan University
2024

Sichuan Agricultural University
2024

Hangzhou Dianzi University
2023

Peking University
2002-2022

Shandong University of Technology
2022

Multi-Scale Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition

OPENALEX - Publications

Zhan Chen Sicheng Li Bing Yang Qinghan Li Hong Liu

Graph convolutional networks have been widely used for skeleton-based action recognition due to their excellent modeling ability of non-Euclidean data. As the graph convolution is a local operation, it can only utilize short-range joint dependencies and short-term trajectory but fails to directly model the distant joint relations and long-range temporal information that are vital to distinguishing various actions. To solve this problem, we present a multi-scale spatial graph convolution (MS-GC) module and a multi-scale temporal graph convolution (MT-GC) module to enrich...

10.1609/aaai.v35i2.16197 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Development of a Calibration System for Passive Infrared Imaging Gas Leakage Detectors

OPENALEX - Publications

Po Ma Bing Yang C. Chao Xiaohu Shen Huaiqian Yi and 7 more

10.2139/ssrn.5101040 preprint EN 2025-01-01

Enhancing English Language Proficiency for Primary School Students Through the Implementation of Online Interactive Multimedia Learning

OPENALEX - Publications

Bing Yang Piyanan Pannim Vipahasna

10.22492/issn.2186-5892.2025.114 article EN Asian Conference on Education official conference proceedings/ACE 2025-03-13

An Adaptive Method Based on Multiscale Dilated Convolutional Network for Binaural Speech Source Localization

OPENALEX - Publications

Lulu Wu Hong Liu Bing Yang Runwei Ding

Most binaural speech source localization models perform poorly in unprecedentedly noisy and reverberant situations. Here, this issue is approached by modelling a multiscale dilated convolutional neural network (CNN). The time-related cross-correlation function (CCF) and the energy-related interaural level differences (ILD) are preprocessed in separate branches of the network. The CNN can encode discriminative representations for CCF and ILD, respectively. After encoding, the individual representations are fused to map to the direction....

10.1155/2020/5819624 article EN cc-by Complexity 2020-12-30

Enhancing direct‐path relative transfer function using deep neural network for robust sound source localization

OPENALEX - Publications

Bing Yang Runwei Ding Yutong Ban Xiaofei Li Hong Liu

This article proposes a deep neural network (DNN)-based direct-path relative transfer function (DP-RTF) enhancement method for robust direction of arrival (DOA) estimation in noisy and reverberant environments. The DP-RTF refers to the ratio between the direct-path acoustic transfer functions of the two microphone channels. First, the complex-valued DP-RTF is decomposed into the inter-channel intensity difference and sinusoidal functions of the phase difference in the time-frequency domain. Then, the features from a series of temporal context frames are utilized to train a DNN...

10.1049/cit2.12024 article EN cc-by-nc-nd CAAI Transactions on Intelligence Technology 2021-04-14
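
As a rough illustration of the decomposition step described in the abstract above, the sketch below splits complex-valued relative transfer function values into a level difference in dB and the sine and cosine of the phase difference. The function name `decompose_rtf`, the dB scaling, the use of both sine and cosine, and the toy attenuation and delay values are assumptions made for illustration; they are not taken from the article.

```python
import cmath
import math


def decompose_rtf(rtf_values, eps=1e-12):
    """Split complex RTF values into (level difference in dB, sin(phase), cos(phase)).

    Hypothetical helper: one tuple is returned per frequency bin.
    """
    features = []
    for h in rtf_values:
        level_db = 20.0 * math.log10(abs(h) + eps)  # eps guards against log10(0)
        phase = cmath.phase(h)                       # inter-channel phase difference in radians
        features.append((level_db, math.sin(phase), math.cos(phase)))
    return features


# Toy case (assumed values): channel 2 equals channel 1 scaled by 0.5 and delayed by 0.5 ms.
freqs_hz = [250.0, 500.0, 1000.0]
delay_s = 0.0005
rtf = [0.5 * cmath.exp(-2j * math.pi * f * delay_s) for f in freqs_hz]
features = decompose_rtf(rtf)
```

Using sine and cosine rather than the raw angle keeps the phase feature continuous across the wrap-around at plus or minus pi, which is one common reason for such a representation.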

Multiple Sound Source Counting and Localization Based on TF-Wise Spatial Spectrum Clustering

OPENALEX - Publications

Bing Yang Hong Liu Cheng Pang Xiaofei Li

This paper addresses the problem of multiple sound source counting and localization in adverse acoustic environments, using microphone array recordings. The proposed time-frequency (TF) wise spatial spectrum clustering based method contains two stages. First, given the received sensor signals, the correlation matrix is computed and denoised in the TF domain. The TF-wise spatial spectrum is estimated based on the signal subspace information, and further enhanced by an exponential transform, which can increase the reliability of the presence possibility...

10.1109/taslp.2019.2915785 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2019-05-10

Learning Deep Direct-Path Relative Transfer Function for Binaural Sound Source Localization

OPENALEX - Publications

Bing Yang Hong Liu Xiaofei Li

Direct-path relative transfer function (DP-RTF) refers to the ratio between the direct-path acoustic transfer functions of two microphone channels. Though DP-RTF fully encodes the sound spatial cues and serves as a reliable localization feature, it is often erroneously estimated in the presence of noise and reverberation. This paper proposes to learn DP-RTF with deep neural networks for robust binaural sound source localization. A DP-RTF learning network is designed to regress the binaural sensor signals to a real-valued representation of DP-RTF. It consists of a branched...

10.1109/taslp.2021.3120641 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01

Effects of an intelligent web‐based English instruction system on students' academic performance

OPENALEX - Publications

Jinzhu Jia Ying Chen Zijian Ding Yun Bai Bing Yang and 2 more

This research conducted quasi‐experiments in four middle schools to evaluate the long‐term effects of an intelligent web‐based English instruction system, Computer Simulation in Educational Communication (CSIEC), on students' academic attainment. The analysis of regular examination scores and the vocabulary test validates the positive impact of CSIEC, which, in most cases, is statistically significant. The reliability is ensured by the spectrum of students, from Grade 1 to 3 in three junior high schools and Grade 2 in one senior high school, and teachers with...

10.1111/jcal.12016 article EN Journal of Computer Assisted Learning 2013-05-13

SRP-DNN: Learning Direct-Path Phase Difference for Multiple Moving Sound Source Localization

OPENALEX - Publications

Bing Yang Hong Liu Xiaofei Li

Multiple moving sound source localization in real-world scenarios remains a challenging issue due to interaction between sources, time-varying trajectories, distorted spatial cues, etc. In this work, we propose to use deep learning techniques to learn the competing and time-varying direct-path phase differences for localizing multiple moving sound sources. A causal convolutional recurrent neural network is designed to extract the direct-path phase difference sequence from the signals of each microphone pair. To avoid the assignment ambiguity problem...

10.1109/icassp43922.2022.9746624 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

A novel approach based on TCN-LSTM network for predicting waterlogging depth with waterlogging monitoring station

OPENALEX - Publications

Jinliang Yao Zhipeng Cai Zheng Qian Bing Yang

As a result of climate change and rapid urbanization, urban waterlogging, commonly caused by rainstorms, is becoming more frequent and severe in developing countries. Urban waterlogging sometimes results in significant financial losses as well as human casualties. Accurate waterlogging depth prediction is critical for early warning systems and emergency response. However, the existing hydrological models need to obtain abundant data, and the model construction is complicated. The technologies based on object detection are highly dependent on image...

10.1371/journal.pone.0286821 article EN cc-by PLoS ONE 2023-10-12

Lip Graph Assisted Audio-Visual Speech Recognition Using Bidirectional Synchronous Fusion

OPENALEX - Publications

Hong Liu Zhan Chen Bing Yang

10.21437/interspeech.2020-3146 article EN Interspeech 2020 2020-10-25

Multiple sound source localization based on TDOA clustering and multi-path matching pursuit

OPENALEX - Publications

Hong Liu Bing Yang Cheng Pang

Multiple sound source localization in wireless acoustic sensor networks (WASNs) is a challenging problem. Although compressive sensing based methods have shown effectiveness in uncorrelated sources localization, their performance degrades significantly when they are used to locate multiple speech sources. To this end, we propose a method based on the time difference of arrival (TDOA) clustering and multi-path matching pursuit algorithm. First, the TDOAs are calculated locally in time-frequency (TF) bins...

10.1109/icassp.2017.7952755 article EN 2017-03-01
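
For readers unfamiliar with TDOA, the sketch below estimates the delay between two channels by picking the peak of a plain time-domain cross-correlation. This is a generic textbook estimator, not the TF-bin clustering and multi-path matching pursuit method of the paper; the function name `estimate_tdoa`, the sampling rate, and the toy signals are hypothetical.

```python
def estimate_tdoa(x1, x2, fs, max_lag):
    """Return the delay of x2 relative to x1 in seconds (positive means x2 lags x1).

    Brute-force cross-correlation over integer lags in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(x1)):
            j = i + lag
            if 0 <= j < len(x2):
                score += x1[i] * x2[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs


# Toy signals (assumed values): x2 is x1 delayed by 3 samples at 16 kHz.
fs = 16000
x1 = [0.0] * 64
x1[10], x1[11], x1[12] = 1.0, -0.5, 0.25
shift = 3
x2 = [0.0] * shift + x1[:-shift]
tdoa = estimate_tdoa(x1, x2, fs, max_lag=8)
```

The brute-force loop is quadratic in the lag range and is chosen here only for readability; practical systems typically compute the correlation in the frequency domain.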

Research on small objects detection algorithm of UAV photography based on improved YOLOv7

OPENALEX - Publications

XuLiang Duan BuYuan Zhang QinWen Deng HongYang Ma Bing Yang

Unmanned Aerial Vehicles (UAVs) capture aerial photographs with a wide viewing angle, variable backgrounds, and high-speed motion imaging. Object detection in UAV images is challenging due to significant changes in object scale, small and mutually occluded objects, and a lack of feature information. Conventional algorithms have poor real-time performance and accuracy in this field. The YOLO algorithm is prone to high false omission rates for objects in complex scenes, leading to poor accuracy. To address...

10.21203/rs.3.rs-4302780/v1 preprint EN cc-by Research Square (Research Square) 2024-04-25

3D Audio-Visual Speaker Tracking with A Two-Layer Particle Filter

OPENALEX - Publications

Hong Liu Yidi Li Bing Yang

Audio-visual speaker tracking in 3D space is a challenging problem. Although the classical particle filter based methods have shown effectiveness in audio-visual tracking, their performance degrades considerably when the measurements are disturbed by noise. To this end, a novel two-layer particle filter is proposed for 3D audio-visual speaker tracking. Firstly, two groups of particles, which are generated from the audio and video streams respectively, are propagated independently in the audio layer and the visual layer. Then, the likelihoods are combined by an adaptive sigmoid function, which can...

10.1109/icip.2019.8803117 article EN 2019 IEEE International Conference on Image Processing (ICIP) 2019-08-26

Defect Detection of Composite Material Terahertz Image Based on Faster Region-Convolutional Neural Networks

OPENALEX - Publications

Xiuwei Yang Pingan Liu Shujie Wang Biyuan Wu Kaihua Zhang and 2 more

Terahertz (THz) nondestructive testing (NDT) technology has been increasingly applied to the internal defect detection of composite materials. However, the THz image is affected by background noise and power limitation, leading to poor image quality. The recognition rate based on traditional machine vision algorithms is not high. The above methods are usually unable to determine surface defects in a timely and accurate manner. In this paper, we propose a method to detect defects in composite materials using terahertz images based on faster...

10.3390/ma16010317 article EN Materials 2022-12-29

Head‐related transfer function–reserved time‐frequency masking for robust binaural sound source localization

OPENALEX - Publications

Hong Liu Peipei Yuan Bing Yang Yang Ge Yang Chen

Various time-frequency (T-F) masks are being applied to sound source localization tasks. Moreover, deep learning has dramatically advanced T-F mask estimation. However, existing masks are usually designed for speech separation tasks and are suitable only for single-channel signals. A novel complex-valued T-F mask is proposed that reserves the head-related transfer function (HRTF) and is customized for binaural sound source localization. In addition, because the convolutional neural network exploited to estimate the mask takes spectral information as input...

10.1049/cit2.12010 article EN cc-by-nc-nd CAAI Transactions on Intelligence Technology 2021-03-02

Focal Auxiliary Classifier Generative Adversarial Network for Defective Wafer Pattern Recognition with Imbalanced Data

OPENALEX - Publications

Jiahao Liu Fuzuo Zhang Bing Yang Fuquan Zhang Ying Gao and 1 more

Defective wafer pattern recognition is important for quality control and yield enhancement in semiconductor fabrication systems. The collected wafer maps are usually imbalanced, which may degrade the performance of the classifier. In this paper, a focal auxiliary classifier generative adversarial network (FAC-GAN) for defective wafer pattern recognition with imbalanced data is proposed. The FAC-GAN is composed of an AC-GAN with a modified loss for generation and a deep neural network. The proposed method is measured on the real-world wafer map dataset "WM-811k" and it outperforms SVM and CNN.

10.1109/edtm50988.2021.9421037 article EN 2021 5th IEEE Electron Devices Technology & Manufacturing Conference (EDTM) 2021-04-08

Combined Physical Process and Deep Learning for Daily Water Level Simulations across Multiple Sites in the Three Gorges Reservoir, China

OPENALEX - Publications

Mingjiang Xie Kun Shan Sidong Zeng Lan Wang Zhigang Gong and 3 more

Water level prediction in large dammed rivers is an important task for flood control, hydropower generation, and ecological protection. The variations of water levels are traditionally simulated based on hydrological models. Recently, most studies have begun applying deep learning (DL) models as an alternative method for forecasting the dynamics of water levels. However, it is still challenging to directly apply DL models for simultaneous prediction across multiple sites. This study attempts to develop a hybrid framework by combining...

10.3390/w15183191 article EN Water 2023-09-07

Improved Deep Learning Predictions for Chlorophyll Fluorescence Based on Decomposition Algorithms: The Importance of Data Preprocessing

OPENALEX - Publications

Lan Wang Mingjiang Xie Min Pan Feng He Bing Yang and 4 more

Harmful algal blooms (HABs) have been deteriorating global water bodies, and the accurate prediction of their dynamics using modelling methods is a challenging research area. High-frequency monitoring and deep learning technology have opened up new horizons for HAB forecasting. However, the non-stationary stochastic process behind HABs largely limits the performance of early warning of blooms. Through an analysis of the published literature, we found that decomposition methods are widely used in time-series prediction of hydrological processes....

10.3390/w15234104 article EN Water 2023-11-27

Audio-Visual Speech Recognition Using A Two-Step Feature Fusion Strategy

OPENALEX - Publications

Hong Liu Wanlu Xu Bing Yang

Lip-reading methods and fusion strategy are crucial for audio-visual speech recognition. In recent years, most approaches involve two separate audio and visual streams with early or late fusion strategies. Such a single-stage fusion method may fail to guarantee the integrity and representativeness of the fusion information simultaneously. This paper extends the traditional network to a two-step feature fusion network by adding an AV-EFF stream to the baseline model. It can learn from different fusion stages, preserving the original features as much as possible and ensuring...

10.1109/icpr48806.2021.9412454 article EN 2020 25th International Conference on Pattern Recognition (ICPR) 2021-01-10

Robust Audio-Visual Speech Recognition Based on Hybrid Fusion

OPENALEX - Publications

Hong Liu Wenhao Li Bing Yang

The fusion of audio and visual modalities is an important stage of audio-visual speech recognition (AVSR), which is generally approached through feature fusion or decision fusion. Feature fusion can exploit the covariations between features from different modalities effectively, whereas decision fusion shows robustness in capturing an optimal combination of multimodality. In this work, to take full advantage of the complementarity of the two fusion strategies and address the challenge of inherent ambiguity in noisy environments, we propose a novel hybrid fusion based AVSR method with...

10.1109/icpr48806.2021.9412817 article EN 2020 25th International Conference on Pattern Recognition (ICPR) 2021-01-10

Part-Based Lipreading for Audio-Visual Speech Recognition

OPENALEX - Publications

Ziling Miao Hong Liu Bing Yang

Lipreading is an important component of audio-visual speech recognition. However, lips are usually modeled as a whole in lipreading, which ignores that each part of the lips focuses on different characteristics of the mouth, so the overall model cannot fit perfectly. Besides, features based on the lips vary a lot according to speakers, which means training databases need to contain as many speakers as possible. In this paper, a part-based lipreading (PBL) method is proposed to deal with the mismatch between separate parts of the lips, and also the excessive...

10.1109/smc42975.2020.9283044 article EN 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2020-10-11

Research on a New Fault Diagnosis Method Based on WT, Improved PSO and SVM for Motor

OPENALEX - Publications

Huimin Zhao Wu Deng Guangyu Li Lifeng Yin Bing Yang

Background: A motor is a device that converts electrical energy into mechanical energy. It is one of the most widely used types of equipment. Its running state directly affects the performance of machinery. Keywords: Alternating current motor, fault diagnosis, feature extraction, improved particle swarm optimization, support vector machine, wavelet packet.

10.2174/2212797609666161018164249 article EN Recent Patents on Mechanical Engineering 2017-01-02

Robust Interaural Time Difference Estimation Based on Convolutional Neural Network

OPENALEX - Publications

Hong Liu Peipei Yuan Bing Yang Lulu Wu

This paper proposes a novel cross correlation function (CCF) extraction method based on a convolutional neural network for time difference of arrival (TDOA) estimation or further direction of arrival (DOA) estimation. The CNN is utilized to learn the relationship between the localization features and the pre-processed waveform signal, which may include not only the source signal but also background noise and reverberation. In contrast to many previous sound localization approaches, the proposed method focuses on spatial feature extraction. Two kinds of outputs,...

10.1109/robio49542.2019.8961817 article EN 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO) 2019-12-01

Multiple Sound Source Counting and Localization Based on Spatial Principal Eigenvector

OPENALEX - Publications

Bing Yang Hong Liu Cheng Pang

10.21437/interspeech.2017-940 article EN Interspeech 2017 2017-08-16
