- Human Pose and Action Recognition
- Anomaly Detection Techniques and Applications
- Video Analysis and Summarization
- Multimodal Machine Learning Applications
- Generative Adversarial Networks and Image Synthesis
- E-commerce and Technology Innovations
- Video Surveillance and Tracking Methods
- Multimedia Communication and Technology
- Human Motion and Animation
- Control and Dynamics of Mobile Robots
- Digital Humanities and Scholarship
- Machine Learning and Algorithms
- Image and Video Quality Assessment
- Water Systems and Optimization
- Advanced Data and IoT Technologies
- Advancements in Photolithography Techniques
- VLSI and Analog Circuit Testing
- Advanced Image and Video Retrieval Techniques
- Adhesion, Friction, and Surface Interactions
- Cellular Automata and Applications
- Domain Adaptation and Few-Shot Learning
- Gait Recognition and Analysis
- Advanced Neural Network Applications
- Connexins and lens biology
- Advanced Steganography and Watermarking Techniques
South China University of Technology
2024-2025
Zhuhai Institute of Advanced Technology
2025
Xi'an Jiaotong University
2024
Alibaba Group (United States)
2022-2024
Siemens (China)
2024
Hebei University
2024
Hohai University
2021
Huazhong University of Science and Technology
2007-2019
Wuhan University
2007
Hubei Zhongshan Hospital
2007
Current few-shot action recognition methods reach impressive performance by learning discriminative features for each video via episodic training and designing various temporal alignment strategies. Nevertheless, they are limited in that (a) learning individual features without considering the entire task may lose the most relevant information in the current episode, and (b) these alignment strategies may fail in misaligned instances. To overcome the two limitations, we propose a novel Hybrid Relation guided Set Matching (HyRSM) approach...
Current state-of-the-art approaches for few-shot action recognition achieve promising performance by conducting frame-level matching on learned visual features. However, they generally suffer from two limitations: i) the matching procedure between local frames tends to be inaccurate due to a lack of guidance to force long-range temporal perception; ii) explicit motion learning is usually ignored, leading to partial information loss. To address these issues, we develop a Motion-augmented Long-short Contrastive...
Current state-of-the-art approaches for spatio-temporal action detection have achieved impressive results but remain unsatisfactory for temporal extent detection. The main reason is that some ambiguous states resemble the real actions and may be treated as target actions even by a well-trained network. In this paper, we define these ambiguous samples as "transitional states" and propose a Transition-Aware Context Network (TACNet) to distinguish transitional states. The proposed TACNet includes two...
The evolution of 5G technology necessitates effective thermal management strategies for compact, high-power devices. The potential of aluminum-based vapor chambers (VCs) as solutions is recognized, yet their heat transfer performance is limited by the capillary constraints of the wick structures. This study proposes a laser-sintered composite wick to address this limitation. Experimental evaluations were conducted on microgroove wicks (MW) and groove–spiral woven mesh composite wicks (GSCW), utilizing ethanol and acetone as working fluids....
Text-to-video diffusion models have made remarkable advancements. Driven by their ability to generate temporally coherent videos, research on zero-shot video editing using these fundamental models has expanded rapidly. To enhance editing quality, structural controls are frequently employed in video editing. Among these techniques, cross-attention mask control stands out for its effectiveness and efficiency. However, when cross-attention masks are naively applied to video editing, they can introduce artifacts such as blurring and flickering. Our...
Semantic parts have shown a powerful discriminative capacity for action recognition. However, many existing methods select parts according to predefined heuristic rules, which may cause the correlation among parts to be lost, or do not appropriately consider the cluttered candidate part space, which can result in weak generalizability of the resulting labels. Therefore, better consideration of the correlation and refinement of the candidate space will lead to a more discriminative representation. This paper achieves improved performance by elegantly addressing these two factors....
Mid-level parts have been shown to be effective for human action recognition in videos. Typically, these semantic parts are first mined with some heuristic rules, and then videos are represented via the volumetric max-pooling (VMP) method. However, these methods have two issues: 1) the VMP strategy divides videos by static grids. In this case, a part may occur at different locations in different videos, which means the VMP loses space-time invariance. To solve this problem, we propose to apply a saliency-driven scheme to represent a video. We extract video cues from the saliency map,...
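The static-grid limitation mentioned in the abstract above can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a part detector's responses are stored as a nested `T x H x W` list, and the names `volumetric_max_pool` and `make_volume` are hypothetical. It max-pools the responses inside each cell of a fixed space-time grid, so the same part firing at a different location lands in a different cell and yields a different descriptor.

```python
def volumetric_max_pool(responses, grid=(2, 2, 2)):
    """Max-pool a T x H x W response volume over a static grid of space-time cells.

    Assumes each dimension of the volume is at least as large as the
    corresponding grid size, so that no cell is empty.
    """
    T, H, W = len(responses), len(responses[0]), len(responses[0][0])
    gt, gh, gw = grid
    pooled = []
    for ct in range(gt):
        t0, t1 = ct * T // gt, (ct + 1) * T // gt
        for ch in range(gh):
            y0, y1 = ch * H // gh, (ch + 1) * H // gh
            for cw in range(gw):
                x0, x1 = cw * W // gw, (cw + 1) * W // gw
                # Keep only the strongest response inside this cell.
                pooled.append(max(
                    responses[t][y][x]
                    for t in range(t0, t1)
                    for y in range(y0, y1)
                    for x in range(x0, x1)
                ))
    return pooled


def make_volume(size, peak):
    """Build a toy size^3 volume that is zero except for one detection at `peak`."""
    vol = [[[0.0] * size for _ in range(size)] for _ in range(size)]
    t, y, x = peak
    vol[t][y][x] = 1.0
    return vol


early_corner = volumetric_max_pool(make_volume(4, (0, 0, 0)))
late_corner = volumetric_max_pool(make_volume(4, (3, 3, 3)))
print(early_corner)
print(late_corner)
```

With a `(1, 1, 1)` grid the two toy videos would pool to the same value, which shows that the location sensitivity comes from the fixed cells rather than from the max operation itself.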
Self-supervised learning has shown remarkable performance in utilizing unlabeled data for various video tasks. In this paper, we focus on applying the power of self-supervised methods to improve semi-supervised action proposal generation. Particularly, we design an effective Semi-supervised Temporal Action Proposal (SSTAP) framework. The SSTAP contains two crucial branches, i.e., a temporal-aware branch and a relation-aware branch. The temporal-aware branch improves the model by introducing temporal perturbations, i.e., feature shift...
This technical report presents our first-place winning solution for the temporal action detection task in the CVPR-2022 ActivityNet Challenge. The task aims to localize the temporal boundaries of action instances with specific classes in long untrimmed videos. Recent mainstream attempts are based on dense boundary matching and enumerate all possible combinations to produce proposals. We argue that the generated proposals contain rich contextual information, which may benefit confidence prediction. To this end, our method mainly...
Recent advancements in generation models have showcased remarkable capabilities in generating fantastic content. However, most of them are trained on proprietary high-quality data, and some models withhold their parameters and only provide accessible application programming interfaces (APIs), limiting their benefits for downstream tasks. To explore the feasibility of training a text-to-image generation model comparable to advanced models using publicly available resources, we introduce EvolveDirector. This framework interacts with...
Recent advances in customized video generation have enabled users to create videos tailored to both specific subjects and motion trajectories. However, existing methods often require complicated test-time fine-tuning and struggle with balancing subject learning and motion control, limiting their real-world applications. In this paper, we present DreamVideo-2, a zero-shot video customization framework capable of generating videos with a specific subject and motion trajectory, guided by a single image and a bounding box sequence, respectively, and without the need for...
To address the problems of weak search ability, a tendency to fall into local optimal solutions, and poor path quality of the sparrow search algorithm in AGV path planning, a multi-strategy improved sparrow search algorithm (MISSA) is proposed in this paper. MISSA improves the global search ability by improving the discoverer position update operator and introducing the sine cosine algorithm; adopts an adaptive number of vigilantes and an adaptive adjustment step size to improve the convergence speed; introduces the Levy flight variation strategy to reduce the probability of falling into a local optimal solution; optimizes...
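As a rough illustration of the Levy flight variation mentioned in the abstract above, the sketch below draws heavy-tailed steps with Mantegna's algorithm, a common way to approximate Levy-distributed step lengths. The abstract does not state MISSA's actual update operator, so the mutation rule, the function names `levy_step` and `levy_mutate`, and the parameters `beta` and `scale` are all assumptions made for this example.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step using Mantegna's algorithm (0 < beta <= 2 assumed)."""
    # Standard deviation of the numerator Gaussian in Mantegna's method.
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against a division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_mutate(position, best, scale=0.01, beta=1.5, rng=random):
    """Perturb a candidate with Levy steps scaled by its offset from the best solution.

    This update form is a hypothetical choice for illustration only.
    """
    return [
        x + scale * levy_step(beta, rng) * (x - b)
        for x, b in zip(position, best)
    ]


rng = random.Random(0)
candidate = [2.0, -1.0, 0.5]
best_so_far = [1.0, 0.0, 0.5]
print(levy_mutate(candidate, best_so_far, rng=rng))
```

Most steps drawn this way are small, but occasional large jumps let a candidate leave the neighborhood of a local optimum, which is the motivation the abstract gives for the strategy.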
With the rapid advancement of electronic integration technology, the requirements for the working environment and stability of heat dissipation equipment have become increasingly stringent. Consequently, studying a high-efficiency gas–liquid two-phase heat transfer surface holds significant importance. Aiming at the limited liquid transport performance caused by the temperature gradient in the heat transfer process, this paper combines a wetting gradient with a shape gradient and proposes a gradient-wettable multiwedge patterned surface, where droplets can be...
Optical proximity correction (OPC) plays a critical role in the entire semiconductor manufacturing process. The consistency of identical patterns within the same context becomes increasingly crucial to ensure high performance during OPC processing, especially in areas like SRAM regions. Consistency checking essentially involves the classification of repeated patterns and the comparison of pattern layers (i.e., OPC results). While mini-array designs can often be easily identified manually, there are still instances where...
This paper develops a method to learn very few discriminative part detectors directly from training videos for action recognition. We hold the opinion that being discriminative for classification, not just being intuitive, is of primary importance in selecting part detectors. For this purpose, detector selection based on feature selection is proposed, employing the SVM method. Firstly, a large number of candidate detectors are trained using k-means and Exemplar-LDA techniques in whitened space. Secondly, each detector is regarded as a visual feature, so detector selection can be achieved...
Blockchain technology is a new type of distributed database solution, which has unique advantages in terms of decentralization, security, and transparency. These characteristics of the blockchain can solve some typical technical problems in the construction of the current power distribution IoT cloud master station. In this context, the architecture design and implementation of the power distribution IoT cloud master station based on blockchain are first discussed; then, it is integrated from three aspects: performance, state estimation algorithm, and deep search engine. The...