Jiayao Sun

ORCID: 0009-0007-1002-6879
Research Areas
  • Speech and Audio Processing
  • Speech Recognition and Synthesis
  • Music and Audio Processing
  • Advanced Adaptive Filtering Techniques
  • Advanced Fiber Optic Sensors
  • Luminescence Properties of Advanced Materials
  • Luminescence and Fluorescent Materials
  • Microfluidic and Bio-sensing Technologies
  • Orbital Angular Momentum in Optics
  • Optical Coherence Tomography Applications
  • Photonic and Optical Devices
  • Video Analysis and Summarization
  • Molecular Sensors and Ion Detection
  • Near-Field Optical Microscopy
  • X-ray Diffraction in Crystallography
  • Organic Light-Emitting Diodes Research
  • Optical and Acousto-Optic Technologies
  • Electronic and Structural Properties of Oxides
  • Nanoplatforms for cancer theranostics
  • Mechanical and Optical Resonators
  • Mechanical stress and fatigue analysis
  • Advanced Data Compression Techniques
  • Aerodynamics and Fluid Dynamics Research
  • Structural Load-Bearing Analysis
  • Advanced Optical Network Technologies

Northwestern Polytechnical University
2022-2024

Northeast Petroleum University
2023-2024

Beijing University of Chemical Technology
2021-2022

Soochow University
2021

Zhangjiagang First People's Hospital
2021

Jiangsu University
2012-2014

Lanzhou University
2012

In speech enhancement, complex neural networks have shown promising performance due to their effectiveness in processing the complex-valued spectrum. Most of the recent speech enhancement approaches mainly focus on wide-band signals with a sampling rate of 16 kHz. However, research on super wide band (e.g., 32 kHz) or even full-band (48 kHz) denoising using deep learning is still in its infancy due to the difficulty of modeling more frequency bands and particularly high frequency components. In this paper, we extend our previous deep complex convolution...

10.1109/icassp43922.2022.9747029 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

The same molecule synthesized from different carbazoles may show various properties, which originate from the trace isomer in the purchased carbazole. By changing the content of isomers, the phosphorescence lifetime can be quantitatively adjusted.

10.1039/d1tc03020e article EN Journal of Materials Chemistry C 2021-01-01

This paper introduces the NWPU Team's entry to the ICASSP 2022 AEC Challenge. We take a hybrid approach that cascades a linear AEC with a neural post-filter. The former is used to deal with the linear echo components while the latter suppresses the residual non-linear echo components. We use a gated convolutional F-T-LSTM neural network (GFTNN) as the backbone and shape the post-filter by a multi-task learning (MTL) framework, where a voice activity detection (VAD) module is adopted as an auxiliary task along with echo suppression, with the aim to avoid over suppression that may cause...

10.1109/icassp43922.2022.9746733 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Bridging optical tweezers and microfluidics can form a multifunctional platform, which can overcome the difficulties of precise manipulation in hydrodynamic flow in a noninvasive method. However, when integrated into a microfluidic chip, the fiber optic tweezer loses its flexibility. Here, we propose a compact single fiber optic tweezer–micropipette system. It can sort particles by differences in shape and refractive index in a completely noninvasive way while retaining the flexibility, high selectivity, and precision of the fiber optic tweezer. Compact channels are formed...

10.1063/5.0139071 article EN Applied Physics Letters 2023-06-05

The new-type stainless steel–concrete–carbon steel double-skin tubular (SCCDST) members, characterized by their exceptional corrosion resistance and mechanical bearing capacity, have promising applications in ocean engineering, particularly in deep-water engineering. The external hydraulic pressure and the interfacial action of various materials intensify the complexity of the composite performance of SCCDST members. This paper describes an analytical investigation on the concentric compressive performance of SCCDST members under external hydraulic pressure....

10.3390/jmse12030406 article EN cc-by Journal of Marine Science and Engineering 2024-02-26

This paper introduces the SWANT team’s entry to the ICASSP 2023 AEC Challenge. We submit a system that cascades a linear filter with a neural post-filter. Particularly, we adopt sub-band processing to handle full-band signals and shape the network with multi-task learning, where dual signal voice activity detection (DSVAD) and echo estimation are adopted as auxiliary tasks. Moreover, we particularly improve the time frequency convolution module (TFCM) to increase the receptive field using small kernels. Finally, our system has ranked...

10.1109/icassp49357.2023.10095137 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Multimodal biometric sensing and processing systems can significantly improve the success rate of identification and authentication compared to traditional unimodal techniques. We propose a flexible micro-nano fiber (MNF) multimodal sensor for fingerprint recognition. We used polydimethylsiloxane (PDMS) as the substrate and placed the MNF in an s-shape on the PDMS. The surface of the PDMS is covered with a film with a thickness of only 2 ...

10.1109/jsen.2023.3347201 article EN IEEE Sensors Journal 2024-01-03

We report the first observation of up-conversion photostimulated luminescence in non-doped Mg2SnO4. Stimulated by a 980 nm infrared laser (reading) after ultraviolet irradiation (writing), the phosphor shows an emission band covering 470–550 nm, which is due to the recombination of F centers with holes. After ceasing ultraviolet irradiation, the storage intensity would rapidly decrease to 59% of its original intensity in 2.5 h and then not degrade anymore. It is suggested that Mg2SnO4 has potential applications for optical storage. Accordingly,...

10.1088/0256-307x/28/2/027802 article EN Chinese Physics Letters 2011-02-01

In speech enhancement, complex neural networks have shown promising performance due to their effectiveness in processing the complex-valued spectrum. Most of the recent speech enhancement approaches mainly focus on wide-band signals with a sampling rate of 16 kHz. However, research on super wide band (e.g., 32 kHz) or even full-band (48 kHz) denoising is still lacking due to the difficulty of modeling more frequency bands and particularly high frequency components. In this paper, we extend our previous deep complex convolution recurrent neural network (DCCRN)...

10.48550/arxiv.2111.08387 preprint EN cc-by arXiv (Cornell University) 2021-01-01

To promote speech processing and recognition research in driving scenarios, we build on the success of the Intelligent Cockpit Speech Recognition Challenge (ICSRC) held at ISCSLP 2022 and launch the ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge. This challenge collects over 100 hours of multi-channel speech data recorded inside a new energy vehicle and 40 hours of noise for data augmentation. Two tracks, including automatic speech recognition (ASR) and automatic speech diarization and recognition (ASDR), are set up, using character error rate (CER) and concatenated minimum...

10.48550/arxiv.2401.03473 preprint EN other-oa arXiv (Cornell University) 2024-01-01

This paper describes our audio-quality-based multi-strategy approach for the audio-visual target speaker extraction (AVTSE) task in the Multi-modal Information based Speech Processing (MISP) 2023 Challenge. Specifically, our approach adopts different strategies based on the audio quality, striking a balance between interference removal and speech preservation, which benefits the back-end automatic speech recognition (ASR) systems. Experiments show that our approach achieves a character error rate (CER) of 24.2% and 33.2% on the Dev and Eval set,...

10.48550/arxiv.2401.03697 preprint EN other-oa arXiv (Cornell University) 2024-01-01
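
The character error rate (CER) reported in the entry above is conventionally the character-level edit distance between a reference transcript and the recognizer output, divided by the reference length. A minimal sketch of that standard definition follows; the function names `edit_distance` and `cer` are hypothetical, and the scoring tool actually used in the challenge is not stated in this excerpt.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] is the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1      # character of ref missing from hyp
            insertion = curr[j - 1] + 1  # extra character in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.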

Packet loss is a common and unavoidable problem in voice over internet phone (VoIP) systems. To deal with the problem, we propose a band-split packet loss concealment network (BS-PLCNet). Specifically, we split the full-band signal into wide-band (0-8kHz) and high-band (8-24kHz). The wide-band signals are processed by a gated convolutional recurrent network (GCRN), while the high-band counterpart is processed by a simple GRU network. To ensure high speech quality and automatic speech recognition (ASR) compatibility, a multi-task learning (MTL) framework including fundamental...

10.48550/arxiv.2401.03687 preprint EN other-oa arXiv (Cornell University) 2024-01-01
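
The band split described in the entry above (0-8 kHz and 8-24 kHz from a full-band signal) can be illustrated with simple FFT masking. This is only a sketch under assumptions: the excerpt does not say how the paper performs the split, and the function name `band_split`, the 48 kHz sample rate default, and the masking approach are all hypothetical choices made here for illustration.

```python
import numpy as np


def band_split(signal, sample_rate=48000, cutoff_hz=8000.0):
    """Split a real signal into a band below cutoff_hz and a band at or above it.

    Assumption: plain FFT masking, not the method of any specific paper.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    low_mask = freqs < cutoff_hz
    # Zero the bins outside each band, then return to the time domain.
    wide_band = np.fft.irfft(np.where(low_mask, spectrum, 0), n=len(signal))
    high_band = np.fft.irfft(np.where(low_mask, 0, spectrum), n=len(signal))
    return wide_band, high_band
```

Since the two masks are complementary, the two outputs sum back to the input signal, so each band could in principle be processed by a separate model and recombined by addition.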

Conventional single-fiber optical tweezers usually capture particles at the front or side end of the tip of a fiber probe, thus enabling manipulation in a limited range. In this paper, we design and fabricate a novel single-fiber optical tweezers, which is prepared by integrating a common single-mode fiber (SMF) and a silica capillary microtubular fiber (COF). With the excitation of LP21 higher-order modes in the fiber-optic probes, its output light field has multiple...

10.1109/jsen.2023.3345372 article EN IEEE Sensors Journal 2024-01-17

Advancements in deep learning and voice-activated technologies have driven the development of human-vehicle interaction. Distributed microphone arrays are widely used in in-car scenarios because they can accurately capture the voices of passengers from different speech zones. However, the increase in the number of audio channels, coupled with the limited computational resources and low latency requirements of in-car systems, presents challenges for in-car multi-channel speech separation. To mitigate these problems, we propose a lightweight framework...

10.48550/arxiv.2409.08610 preprint EN arXiv (Cornell University) 2024-09-13