- Video Analysis and Summarization
- Multimodal Machine Learning Applications
- Generative Adversarial Networks and Image Synthesis
- Electrochemical sensors and biosensors
- Natural Language Processing Techniques
- Graphene research and applications
- Human Motion and Animation
- Topic Modeling
- Advanced biosensing and bioanalysis techniques
- Music and Audio Processing
- Visual Attention and Saliency Detection
- Advanced Image and Video Retrieval Techniques
Jinan University
2024
Southern University of Science and Technology
2023
In clinical practice, the concurrent use of anticancer medications can enhance their overall therapeutic efficacy. However, it is crucial to acknowledge that interactions among these drugs can potentially yield detrimental consequences for the intended outcomes. Consequently, the assessment of both potency and potential toxic side effects is greatly refined when multiple drugs are simultaneously detected and evaluated. Here, we designed a wearable electrochemical aptasensor array for monitoring these drugs in sweat....
Launchpad is a musical instrument that allows users to create and perform music by pressing illuminated buttons. To assist and inspire the design of Launchpad light effects, and to provide a more accessible approach to music visualization for beginners with this instrument, we propose the LaunchpadGPT model to generate music visualization designs on Launchpad automatically. Building on the excellent generation ability of language models, our model takes an audio piece as input and outputs the lighting effects of Launchpad playing in the form of a video (Launchpad-playing video). We collect Launchpad-playing videos and process...
With the burgeoning growth of online video platforms and the escalating volume of video content, the demand for proficient video understanding tools has intensified markedly. Given the remarkable capabilities of Large Language Models (LLMs) in language and multimodal tasks, this survey provides a detailed overview of recent advancements in video understanding that harness the power of LLMs (Vid-LLMs). The emergent capabilities of Vid-LLMs are surprisingly advanced, particularly their ability for open-ended spatial-temporal reasoning combined with commonsense knowledge,...
Advertisement video editing aims to automatically edit advertising videos into shorter videos while retaining coherent content and the crucial information conveyed by advertisers. It mainly contains two stages: video segmentation and segment assemblage. The existing method performs well at the video segmentation stage but suffers from dependencies on extra cumbersome models and poor performance at the segment assemblage stage. To address these problems, we propose M-SAN (Multi-modal Segment Assemblage Network), which can perform...