- Image and Video Quality Assessment
- Infrared Target Detection Methodologies
- Air Quality Monitoring and Forecasting
- Antioxidant Activity and Oxidative Stress
- Congenital heart defects research
- Phase Change Materials Research
- Solar-Powered Water Purification Methods
- Speech Recognition and Synthesis
- Petri Nets in System Modeling
- Speech and Audio Processing
- Advanced Optical Imaging Technologies
- Hearing, Cochlea, Tinnitus, Genetics
- Advanced Wireless Communication Techniques
- Plant biochemistry and biosynthesis
- Power Line Communications and Noise
- Visual Attention and Saliency Detection
- Optical Systems and Laser Technology
- Network Time Synchronization Technologies
- Adaptive optics and wavefront sensing
- Elevator Systems and Control
- Hearing Loss and Rehabilitation
- Power Systems and Technologies
- Vehicle emissions and performance
- Industrial Vision Systems and Defect Detection
- Advanced Vision and Imaging
Beijing Academy of Artificial Intelligence
2024-2025
Beijing University of Technology
2024-2025
Neusoft (China)
2025
Chengdu Neusoft University
2025
Zhejiang University
2009-2024
Beijing Forestry University
2024
Nanjing University of Aeronautics and Astronautics
2024
University of Hong Kong
2024
Zhejiang Energy Research Institute
2022
We introduce Seed-TTS, a family of large-scale autoregressive text-to-speech (TTS) models capable of generating speech that is virtually indistinguishable from human speech. Seed-TTS serves as a foundation model for speech generation and excels in in-context learning, achieving performance in speaker similarity and naturalness that matches ground-truth human speech in both objective and subjective evaluations. With fine-tuning, we achieve even higher scores across these metrics. Seed-TTS offers superior controllability over various attributes...
The visual quality of 3D-synthesized videos is closely related to the development and broadcasting of immersive media such as free-viewpoint videos and six-degrees-of-freedom navigation. Therefore, studying 3D-synthesized video quality assessment is helpful to promote the popularity of immersive media applications. Inspired by the observation that texture compression, depth compression, and virtual view synthesis pollute the visual quality at the pixel, structure, and content levels, this paper proposes a Multi-Level 3D-Synthesized Video Quality Assessment algorithm, namely ML-SVQA, which consists of feature...
The mammalian inner ear houses the vestibular and cochlear sensory organs dedicated to sensing balance and sound, respectively. These distinct sensory organs arise from a common prosensory region, but the mechanisms underlying their divergence remain elusive. Here, we showed that two evolutionarily conserved homeobox genes, Irx3 and Irx5, are required for the patterning and segregation of the saccular and cochlear domains, as well as for the formation of auditory cells. Irx3/5 were highly expressed in the cochlea, and their deletion resulted in significantly...
Single-image super-resolution (SR) methods often encounter difficulties when applied to real-world images because of deviations between the degradation model and the real-world degradation distribution. Recent studies have attempted to address this issue by adopting more complex degradation models to simulate real-world distributions. However, these methods tend to produce over-smoothed results lacking fine-grained details. Furthermore, they neglect the disparity between synthetic and real-world images in the frequency domain. This study proposes a...
This paper presents a novel diagnosis method combining element-oriented Petri nets with temporal order information. With the use of time-stamps, the proposed method models the protection system philosophy of crucial elements in the power system, such as transmission lines, buses, and transformers. The advantages of the proposed approach involve the following aspects: Firstly, the model makes it easy to recognize the causality between data. Secondly, element-oriented Petri nets are employed to deal with the problem of combination explosion, and the procedure of modeling is easier. Thirdly, information...