- Coal Properties and Utilization
- Fire dynamics and safety research
- Combustion and Detonation Processes
- Fire effects on ecosystems
- Speech and Audio Processing
- Wind and Air Flow Studies
- Neural dynamics and brain function
- Speech Recognition and Synthesis
- Fire Detection and Safety Systems
- Advanced Memory and Neural Computing
- Risk and Safety Analysis
- Music and Audio Processing
- Combustion and flame dynamics
- Advanced Text Analysis Techniques
- Thermochemical Biomass Conversion Processes
- Geophysical Methods and Applications
- Rock Mechanics and Modeling
- Coal and Its By-products
- IoT and GPS-based Vehicle Safety Systems
- Underground infrastructure and sustainability
- Generative Adversarial Networks and Image Synthesis
- Fluid Dynamics and Heat Transfer
- Fluid Dynamics and Mixing
- Advanced Neural Network Applications
- Safety Warnings and Signage
Xi'an University of Science and Technology
2020-2024
National University of Singapore
2022-2024
Hefei Institutes of Physical Science
2022
Chinese Academy of Sciences
2022
Nanjing Tech University
2015-2021
Sichuan University
2019-2020
China University of Mining and Technology
2011-2020
Chengdu University
2020
Shenzhen University
2016
Deutsches Zentrum für Luft- und Raumfahrt e. V. (DLR)
2014-2015
Audio-visual speaker diarization aims at detecting "who spoke when" using both auditory and visual signals. Existing audio-visual datasets mainly focus on indoor environments such as meeting rooms or news studios, which are quite different from in-the-wild videos in many scenarios such as movies, documentaries, and audience sitcoms. To develop methods for these challenging videos, we create the AVA Audio-Visual Diarization (AVA-AVD) dataset. Our experiments demonstrate that adding AVA-AVD into...