Bing Yang

ORCID: 0000-0002-8978-2322
Research Areas
  • Speech and Audio Processing
  • Music and Audio Processing
  • Parallel Computing and Optimization Techniques
  • Advancements in Semiconductor Devices and Circuit Design
  • Indoor and Outdoor Localization Technologies
  • Hearing Loss and Rehabilitation
  • Speech Recognition and Synthesis
  • Underwater Acoustics Research
  • Advanced Algorithms and Applications
  • Analog and Mixed-Signal Circuit Design
  • Acoustic Wave Phenomena Research
  • Advanced Data Storage Technologies
  • Advancements in PLL and VCO Technologies
  • Distributed systems and fault tolerance
  • Semiconductor materials and devices
  • Advanced Adaptive Filtering Techniques
  • Silicon Carbide Semiconductor Technologies
  • Hydrological Forecasting Using AI
  • Flood Risk Assessment and Management
  • Advanced Sensor and Control Systems
  • Digital Filter Design and Implementation
  • Interconnection Networks and Systems
  • Video Surveillance and Tracking Methods
  • Smart Grid and Power Systems
  • Advanced Wireless Communication Techniques

Beijing Microelectronics Technology Institute
1998-2024

Westlake University
2021-2024

North China University of Technology
2008-2024

Southwestern University of Finance and Economics
2024

Hubei Zhongshan Hospital
2024

Wuhan University
2024

Sichuan Agricultural University
2024

Hangzhou Dianzi University
2023

Peking University
2002-2022

Shandong University of Technology
2022

Graph convolutional networks have been widely used for skeleton-based action recognition due to their excellent modeling ability of non-Euclidean data. As the graph convolution is a local operation, it can only utilize short-range joint dependencies and short-term trajectories, but fails to directly model distant joint relations and long-range temporal information that are vital to distinguishing various actions. To solve this problem, we present a multi-scale spatial graph convolution (MS-GC) module and a multi-scale temporal graph convolution (MT-GC) module to enrich...

10.1609/aaai.v35i2.16197 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Most binaural speech source localization models perform poorly in unprecedentedly noisy and reverberant situations. Here, this issue is approached by modelling a multiscale dilated convolutional neural network (CNN). The time-related cross-correlation function (CCF) and the energy-related interaural level differences (ILD) are preprocessed in separate branches of the network. The CNN can encode discriminative representations for the CCF and ILD, respectively. After encoding, the individual representations are fused to map the direction....

10.1155/2020/5819624 article EN cc-by Complexity 2020-12-30

This article proposes a deep neural network (DNN)-based direct-path relative transfer function (DP-RTF) enhancement method for robust direction of arrival (DOA) estimation in noisy and reverberant environments. The DP-RTF refers to the ratio between the direct-path acoustic transfer functions of the two microphone channels. First, the complex-valued DP-RTF is decomposed into the inter-channel intensity difference and sinusoidal functions of the inter-channel phase difference in the time-frequency domain. Then, the features from a series of temporal context frames are utilized to train a DNN...

10.1049/cit2.12024 article EN cc-by-nc-nd CAAI Transactions on Intelligence Technology 2021-04-14
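The decomposition described in the abstract above can be illustrated with a minimal sketch. It assumes a single time-frequency bin, expresses the intensity difference in decibels, and represents the phase difference by its sine and cosine. The function name `decompose_rtf` and these representation choices are hypothetical and are not taken from the article.

```python
import cmath
import math


def decompose_rtf(h_ref: complex, h_other: complex):
    """Split the complex ratio h_other / h_ref into real-valued features.

    Hypothetical helper: the dB scaling and the sine/cosine pair are
    illustrative assumptions, not the article's exact feature definition.
    """
    ratio = h_other / h_ref                      # relative transfer function for one bin
    level_db = 20.0 * math.log10(abs(ratio))     # inter-channel intensity difference in dB
    phase = cmath.phase(ratio)                   # inter-channel phase difference in radians
    return level_db, math.sin(phase), math.cos(phase)


# Example bin: the second channel is attenuated by half and leads by 90 degrees.
level_db, sin_ipd, cos_ipd = decompose_rtf(1.0 + 0.0j, 0.0 + 0.5j)
```

Using the sine and cosine rather than the raw angle avoids the discontinuity at ±π, which is one common reason to prefer sinusoidal features for phase.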

This paper addresses the problem of multiple sound source counting and localization in adverse acoustic environments, using microphone array recordings. The proposed time-frequency (TF) wise spatial spectrum clustering based method contains two stages. First, given the received sensor signals, the correlation matrix is computed and denoised in the TF domain. The TF-wise spatial spectrum is estimated based on the signal subspace information, and further enhanced by an exponential transform, which can increase the reliability of the presence possibility...

10.1109/taslp.2019.2915785 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2019-05-10

Direct-path relative transfer function (DP-RTF) refers to the ratio between the direct-path acoustic transfer functions of two microphone channels. Though the DP-RTF fully encodes the sound spatial cues and serves as a reliable localization feature, it is often erroneously estimated in the presence of noise and reverberation. This paper proposes to learn the DP-RTF with deep neural networks for robust binaural sound source localization. A DP-RTF learning network is designed to regress the binaural sensor signals to a real-valued representation of the DP-RTF. It consists of a branched...

10.1109/taslp.2021.3120641 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01

This research conducted quasi-experiments in four middle schools to evaluate the long-term effects of an intelligent web-based English instruction system, Computer Simulation in Educational Communication (CSIEC), on students' academic attainment. The analysis of regular examination scores and vocabulary test scores validates the positive impact of CSIEC, which, in most cases, is statistically significant. The reliability is ensured by the spectrum of students from Grade 1 to 3 in three junior high schools and Grade 2 in one senior high school, and teachers with...

10.1111/jcal.12016 article EN Journal of Computer Assisted Learning 2013-05-13

Multiple moving sound source localization in real-world scenarios remains a challenging issue due to interaction between sources, time-varying trajectories, distorted spatial cues, etc. In this work, we propose to use deep learning techniques to learn competing and time-varying direct-path phase differences for localizing multiple moving sound sources. A causal convolutional recurrent neural network is designed to extract the direct-path phase difference sequence from the signals of each microphone pair. To avoid the assignment ambiguity problem...

10.1109/icassp43922.2022.9746624 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

As a result of climate change and rapid urbanization, urban waterlogging, commonly caused by rainstorms, is becoming more frequent and severe in developing countries. Urban waterlogging sometimes results in significant financial losses as well as human casualties. Accurate waterlogging depth prediction is critical for early warning systems and emergency response. However, the existing hydrological models need to obtain abundant data, and the model construction is complicated. The technologies based on object detection are highly dependent on image...

10.1371/journal.pone.0286821 article EN cc-by PLoS ONE 2023-10-12

Multiple sound source localization in wireless acoustic sensor networks (WASNs) is a challenging problem. Although compressive sensing based methods have shown effectiveness in uncorrelated source localization, their performance degrades significantly when they are used to locate multiple speech sources. To this end, we propose a method based on time difference of arrival (TDOA) clustering and the multi-path matching pursuit algorithm. First, the TDOAs are calculated locally in time-frequency (TF) bins...

10.1109/icassp.2017.7952755 article EN 2017-03-01

Unmanned Aerial Vehicles (UAVs) capture aerial photographs with a wide viewing angle, variable backgrounds, and high-speed motion imaging. Object detection in UAV images is challenging due to significant changes in object scale, small and mutually occluded objects, and a lack of feature information. Conventional object detection algorithms have poor real-time performance and accuracy in this field. The YOLO algorithm is prone to high false-detection and omission rates for small objects in complex scenes, leading to low accuracy. To address...

10.21203/rs.3.rs-4302780/v1 preprint EN cc-by Research Square (Research Square) 2024-04-25

Audio-visual speaker tracking in 3D space is a challenging problem. Although the classical particle filter based methods have shown effectiveness in audio-visual speaker tracking, their performance degrades considerably when the measurements are disturbed by noise. To this end, a novel two-layer particle filter is proposed for audio-visual speaker tracking. Firstly, two groups of particles, which are generated from the audio and video streams respectively, are propagated independently in the audio layer and the visual layer. Then, the audio and visual likelihoods are combined by an adaptive sigmoid function, which can...

10.1109/icip.2019.8803117 article EN 2019 IEEE International Conference on Image Processing (ICIP) 2019-08-26

Terahertz (THz) nondestructive testing (NDT) technology has been increasingly applied to the internal defect detection of composite materials. However, the THz image is affected by background noise and power limitation, leading to poor image quality. The recognition rate based on traditional machine vision algorithms is not high. The above methods are usually unable to determine surface defects in a timely and accurate manner. In this paper, we propose a method to detect defects in composite materials using terahertz images based on a faster...

10.3390/ma16010317 article EN Materials 2022-12-29

Various time-frequency (T-F) masks are being applied to sound source localization tasks. Moreover, deep learning has dramatically advanced T-F mask estimation. However, existing masks are usually designed for speech separation tasks and are suitable only for single-channel signals. A novel complex-valued T-F mask is proposed that reserves the head-related transfer function (HRTF) and is customized for binaural sound source localization. In addition, because the convolutional neural network exploited to estimate the mask takes spectral information as input...

10.1049/cit2.12010 article EN cc-by-nc-nd CAAI Transactions on Intelligence Technology 2021-03-02

Defective wafer pattern recognition is important for quality control and yield enhancement in semiconductor fabrication systems. The collected wafer maps are usually imbalanced, which may degrade the performance of the classifier. In this paper, a focal auxiliary classifier generative adversarial network (FAC-GAN) for defective wafer pattern recognition with imbalanced data is proposed. The FAC-GAN is composed of an AC-GAN with a modified loss for generation and a deep neural network. The proposed method is measured on the real-world wafer map dataset "WM-811k", and it outperforms SVM and CNN.

10.1109/edtm50988.2021.9421037 article EN 2021 5th IEEE Electron Devices Technology & Manufacturing Conference (EDTM) 2021-04-08

Water level prediction in large dammed rivers is an important task for flood control, hydropower generation, and ecological protection. The variations of water levels are traditionally simulated based on hydrological models. Recently, most studies have begun applying deep learning (DL) models as an alternative method for forecasting the dynamics of water levels. However, it is still challenging to directly apply DL models for simultaneous prediction across multiple sites. This study attempts to develop a hybrid framework by combining...

10.3390/w15183191 article EN Water 2023-09-07

Harmful algal blooms (HABs) have been deteriorating global water bodies, and the accurate prediction of algal dynamics using modelling methods is a challenging research area. High-frequency monitoring and deep learning technology have opened up new horizons for HAB forecasting. However, the non-stationary and stochastic process behind algal dynamics largely limits the performance of forecasting and early warning of blooms. Through an analysis of published literature, we found that decomposition methods are widely used in the time-series prediction of hydrological processes....

10.3390/w15234104 article EN Water 2023-11-27

Lip-reading methods and fusion strategy are crucial for audio-visual speech recognition. In recent years, most approaches involve two separate audio and visual streams with early or late fusion strategies. Such a single-stage fusion method may fail to guarantee the integrity and representativeness of the fusion information simultaneously. This paper extends the traditional single-stage fusion network to a two-step feature fusion network by adding an audio-visual early feature fusion (AV-EFF) stream to the baseline model. This method can learn the fusion information of different stages, preserving the original features as much as possible and ensuring...

10.1109/icpr48806.2021.9412454 article EN 2020 25th International Conference on Pattern Recognition (ICPR) 2021-01-10

The fusion of audio and visual modalities is an important stage of audio-visual speech recognition (AVSR), which is generally approached through feature fusion or decision fusion. Feature fusion can exploit the covariations between features from different modalities effectively, whereas decision fusion shows robustness in capturing an optimal combination of multimodality. In this work, to take full advantage of the complementarity of the two fusion strategies and address the challenge of inherent ambiguity in noisy environments, we propose a novel hybrid fusion based AVSR method with...

10.1109/icpr48806.2021.9412817 article EN 2020 25th International Conference on Pattern Recognition (ICPR) 2021-01-10

Lipreading is an important component of audio-visual speech recognition. However, lips are usually modeled as a whole in lipreading, which ignores that each part of the lips focuses on different characteristics of the mouth and that the overall model cannot fit every part perfectly. Besides, features based on lips vary a lot according to speakers, which means training databases need to contain as many speakers as possible. In this paper, a part-based lipreading (PBL) method is proposed to deal with the mismatch between separate parts of lips, and also the excessive...

10.1109/smc42975.2020.9283044 article EN 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2020-10-11

Background: A motor is a device that converts electrical energy into mechanical energy. It is one of the most widely used pieces of equipment. Its running state directly affects the performance of machinery. Keywords: Alternating current motor, fault diagnosis, feature extraction, improved particle swarm optimization, support vector machine, wavelet packet.

10.2174/2212797609666161018164249 article EN Recent Patents on Mechanical Engineering 2017-01-02

This paper proposes a novel cross correlation function (CCF) extraction method based on a convolutional neural network for time difference of arrival (TDOA) estimation or further direction of arrival (DOA) estimation. A CNN is utilized to learn the relationship between localization features and the pre-processed waveform signal, which may include not only the source signal but also background noise and reverberation. In contrast to many previous sound source localization approaches, the proposed method focuses on spatial feature extraction. Two kinds of outputs,...

10.1109/robio49542.2019.8961817 article EN 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO) 2019-12-01
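For context on the CCF and TDOA terms in the abstract above, here is a minimal sketch of the classical approach: compute the cross-correlation of two microphone signals directly and take the lag of its peak as the TDOA. This is the textbook baseline, not the CNN-based extraction the paper proposes. The function names, the toy signals, and the 16 kHz sample rate are illustrative assumptions.

```python
def cross_correlation(x, y, max_lag):
    """Return {lag: sum over n of x[n] * y[n + lag]} for lags in [-max_lag, max_lag]."""
    ccf = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):          # skip samples that fall outside y
                total += xn * y[m]
        ccf[lag] = total
    return ccf


def estimate_tdoa(x, y, max_lag, sample_rate):
    """Estimate the delay of y relative to x, in seconds, from the CCF peak."""
    ccf = cross_correlation(x, y, max_lag)
    best_lag = max(ccf, key=ccf.get)     # lag with the largest correlation
    return best_lag / sample_rate


# Toy signals: y is x delayed by 3 samples (assumed 16 kHz sampling).
x = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
tdoa = estimate_tdoa(x, y, max_lag=5, sample_rate=16000)
```

In noisy or reverberant recordings the peak of this plain cross-correlation becomes unreliable, which is the limitation that learned CCF extraction aims to address.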