Yang Gao

ORCID: 0000-0003-0348-2546
Research Areas
  • Speech and Audio Processing
  • Music and Audio Processing
  • Quantum Information and Cryptography
  • Mechanical and Optical Resonators
  • Speech Recognition and Synthesis
  • Advanced SAR Imaging Techniques
  • Photonic and Optical Devices
  • Generative Adversarial Networks and Image Synthesis
  • Quantum Mechanics and Applications
  • Video Surveillance and Tracking Methods
  • Multimodal Machine Learning Applications
  • Digital Media Forensic Detection
  • Advanced Vision and Imaging
  • Advanced Image and Video Retrieval Techniques
  • Domain Adaptation and Few-Shot Learning
  • 3D Shape Modeling and Analysis
  • Force Microscopy Techniques and Applications
  • Neural Networks and Applications
  • Advanced Image Processing Techniques
  • Gait Recognition and Analysis
  • Video Coding and Compression Technologies
  • Robotics and Sensor-Based Localization
  • Data Stream Mining Techniques
  • IoT-based Smart Home Systems
  • Video Analysis and Summarization

Zhejiang University
2024

Guilin University of Electronic Technology
2024

Institute of Oceanographic Instrumentation
2024

Shandong Academy of Sciences
2024

Qilu University of Technology
2024

Laoshan Laboratory
2024

Nanjing University of Posts and Telecommunications
2022-2023

North China University of Technology
2021-2023

Tianjin University of Science and Technology
2015-2023

ShangHai JiAi Genetics & IVF Institute
2022

The observation of quantized nanomechanical oscillations by detecting femtometer-scale displacements is a significant challenge for experimentalists. We propose that phonon blockade can serve as a signature of quantum behavior in nanomechanical resonators. In analogy to photon blockade and Coulomb blockade for electrons, the main idea of phonon blockade is that a second phonon cannot be excited when there is one phonon in the nonlinear oscillator. To realize phonon blockade, a superconducting quantum two-level system is coupled to the nanomechanical resonator and is used to induce the phonon self-interaction. Using Monte Carlo simulations,...

10.1103/physreva.82.032101 article EN Physical Review A 2010-09-01

Voice impersonation is not the same as voice transformation, although the latter is an essential element of it. In voice impersonation, the resultant voice must convincingly convey the impression of having been naturally produced by the target speaker, mimicking not only the pitch and other perceivable signal qualities, but also the style of the target speaker. In this paper, we propose a novel neural-network based speech quality- and style-mimicry framework for the synthesis of impersonated voices. The framework is built upon a fast and accurate generative adversarial network...

10.1109/icassp.2018.8462018 preprint EN 2018-04-01

Surface modification of silicone tubing could significantly improve its medical applications. Nevertheless, customizing the surface for better interfacial adhesion and multifunctionality through simple methods is still a...

10.1039/d4ta07212j article EN Journal of Materials Chemistry A 2025-01-01

Understanding the photon number statistics of a quantum emitter (QE) interacting with complex photonic environments is fundamental to advances in quantum optics and nanophotonics. We introduce a general theoretical framework for calculating the modal photon number density spectrum (MPNDS) of arbitrary dielectric structures with an embedded two-level QE. We validate our approach by investigating a system composed of a QE and a photonic crystal (PhC) slab with an L3 cavity and a waveguide, finding that the MPNDS exhibits significant changes in both the waveguide...

10.1088/1674-1056/adb262 article EN Chinese Physics B 2025-02-05

Air pollution has been widely recognized as a risk factor for neurological disorders, and the gut microbiome may play a mediating role. However, current evidence remains limited. In this study, a mouse model was employed with continuous exposure to real-time air pollution from conception to late adolescence. Effects of growth-stage air pollution exposure on the gut microbiome, host metabolites, and brain tissue were assessed. Pathological damage in the hippocampus and cortex was observed. Fecal metagenomic sequencing revealed alterations in both...

10.3390/toxics13040260 article EN cc-by Toxics 2025-03-29

Three-dimensional coordinate measurement of feature points on the surface of a large-scale workpiece is important and difficult. Various relative measuring methods have been presented in recent years, and the machine vision method has been paid more attention by researchers. The application of machine vision in 3-D coordinate measurement is discussed in this paper, and an accurate, simple, new method is proposed. The design of the system mainly considers the following aspects: 1) the principle and composition of the system; 2) the monocular vision algorithm for camera locating; 3) the calibration of the charge-coupled...

10.1109/tim.2009.2030875 article EN IEEE Transactions on Instrumentation and Measurement 2009-10-13

Automatic speaker verification (ASV) systems utilize the biometric information in human speech to verify the speaker's identity. The techniques used for performing speaker verification are often vulnerable to malicious attacks that attempt to induce the ASV system to return wrong results, allowing an impostor to bypass the system and gain access. Attackers use a multitude of spoofing techniques for this, such as voice conversion, audio replay, speech synthesis, etc. In recent years, easily available tools to generate deepfaked audio have increased the potential threat to ASV systems....

10.1109/slt48900.2021.9383558 article EN 2021 IEEE Spoken Language Technology Workshop (SLT) 2021-01-19

State-of-the-art methods for audio generation suffer from fingerprint artifacts and repeated inconsistencies across the temporal and spectral domains. Such artifacts could be well captured by frequency domain analysis over the spectrogram. Thus, we propose a novel use of a long-range spectro-temporal modulation feature, the 2D DCT over the log-Mel spectrogram, for audio deepfake detection. We show that this feature works better than the log-Mel spectrogram, CQCC, and MFCC as a suitable candidate to capture such artifacts. We employ spectrum augmentation...

10.21437/interspeech.2021-1705 article EN Interspeech 2021 2021-08-27

Traffic violations and offences are becoming more serious as the traffic volume is increasing, which may bring property damage and threaten personal safety. Existing systems lack the capability to analyze the high-throughput traffic monitoring stream and detect various types of violations in real-time. Thus, a real-time vehicular traffic violation detection system is in real demand. In this paper, we design and implement a real-time violation detection system. Our system proposes a detection algorithm that can discover violations taking place on roadways as well as in parking lots. In order to achieve real-time analysis, parallel...

10.1109/wi-iat.2012.91 article EN 2012 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology 2012-12-01

This paper proposes a 77GHz millimeter wave radar-based rear occupant detection system in the vehicle. The major focus is detecting forgotten children in the vehicle's rear seats. The radar TX chirp signal configuration is selected to detect slight movements of people (like breathing). The RX IF signal is processed through a 2D-FFT to build a two-dimension heat map. Then, the Constant False Alarm Detection (CFAR) algorithm is used for object detection, and a 2D cloud points image is obtained which can visualize the presence of people. Moreover, a simulated vehicle...

10.1109/ccisp55629.2022.9974471 article EN 2022 7th International Conference on Communication, Image and Signal Processing (CCISP) 2022-11-01

10.1140/epjd/e2007-00319-x article EN The European Physical Journal D 2007-11-22

Gesture recognition technology is an effective way of natural and intuitive communication between humans and machines. Among the many existing gesture recognition devices, millimeter wave radar has become a new approach to gesture recognition because of its strong target detection capability, insensitivity to changes in factors such as light and environment, low operating energy consumption, and protection of user privacy. In this paper, we propose an over-the-air handwritten digit recognition method based on millimeter wave radar. Unlike the direct use of spectrograms for...

10.1109/icspcc52875.2021.9564707 article EN 2021 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC) 2021-08-17

The fluctuation-dissipation relation is well known for a quantum open system with energy dissipation. In this paper a similar underlying relation is found between the bath fluctuation and the dephasing of the quantum open system, for which the energy is conserved, but the information is leaking into the bath. To obtain this relation we revisit the universal, simple model of quantum nondemolition interaction between the system and the bath. Then we show that the decoherence factor describing the dephasing process is factorized into two parts, to indicate two sources of dephasing, the vacuum quantum fluctuation and the thermal excitations defined in the initial state...

10.1103/physreve.75.011105 article EN Physical Review E 2007-01-10

While modern TTS technologies have made significant advancements in audio quality, there is still a lack of behavior naturalness compared to conversing with people. We propose a style-embedded TTS system that generates styled responses based on the speech query style. To achieve this, the system includes a style extraction model that extracts a style embedding from the speech query, which is then used by the TTS to produce a matching response. We faced two main challenges: 1) only a small portion of the TTS training dataset has style labels, which is needed to train a multi-style...

10.21437/interspeech.2020-3069 article EN Interspeech 2020 2020-10-25