Kai Chen

ORCID: 0000-0003-4160-1024
Research Areas
  • Crystallization and Solubility Studies
  • X-ray Diffraction in Crystallography
  • Face recognition and analysis
  • Advanced Image and Video Retrieval Techniques
  • Topic Modeling
  • Domain Adaptation and Few-Shot Learning
  • Face and Expression Recognition
  • Multimodal Machine Learning Applications
  • Human Pose and Action Recognition
  • Advanced Graph Neural Networks
  • Anomaly Detection Techniques and Applications
  • Crystallography and molecular interactions
  • Video Surveillance and Tracking Methods
  • Robotic Path Planning Algorithms
  • Recommender Systems and Techniques
  • Image Retrieval and Classification Techniques
  • Machine Learning and ELM
  • Graph Theory and Algorithms
  • Biometric Identification and Security
  • Hand Gesture Recognition Systems
  • Advanced Neural Network Applications
  • Fault Detection and Control Systems
  • Complex Network Analysis Techniques
  • Robotics and Sensor-Based Localization
  • Medical Image Segmentation Techniques

Zhejiang Institute of Communications
2025

National University of Defense Technology
2016-2024

Guangdong University of Technology
2024

Shanghai Artificial Intelligence Laboratory
2022-2023

Group Sense (China)
2023

ShangHai JiAi Genetics & IVF Institute
2022

Liaoning Technical University
2021

Shanghai Maritime University
2021

University of Shanghai for Science and Technology
2021

Tianjin Normal University
2019

We present PYSKL: an open-source toolbox for skeleton-based action recognition based on PyTorch. The toolbox supports a wide variety of skeleton action recognition algorithms, including approaches based on GCN and CNN. In contrast to existing open-source skeleton action recognition projects that include only one or two algorithms, PYSKL implements six different algorithms under a unified framework with both the latest and original good practices to ease the comparison of efficacy and efficiency. We also provide an original GCN-based skeleton action recognition model named ST-GCN++, which achieves competitive recognition performance without any...

10.1145/3503161.3548546 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10

In previous deep-learning-based methods, semantic segmentation has been regarded as a static or dynamic per-pixel classification task, i.e., classifying each pixel representation into a specific category. However, these methods only focus on learning better pixel representations or classification kernels while ignoring the structural information of objects, which is critical to the human decision-making mechanism. In this paper, we present a new paradigm for semantic segmentation, named structure-aware extraction. Specifically, it generates...

10.1109/tcsvt.2023.3252807 article EN IEEE Transactions on Circuits and Systems for Video Technology 2023-03-06

Evolutionary algorithms exhibit flexibility and global search advantages in multi-UAV path planning, effectively addressing complex constraints. However, when there are numerous obstacles in the environment, especially narrow passageways, the algorithm often struggles to quickly find a viable path. Additionally, collaborative constraints among multiple UAVs complicate the search space, making convergence challenging. To address these issues, we propose a novel hybrid particle swarm optimization algorithm called PPSwarm....

10.3390/drones8050192 article EN cc-by Drones 2024-05-11

When dealing with UAV path planning problems, evolutionary algorithms demonstrate strong flexibility and global search capabilities. However, as the number of UAVs increases, the scale of the problem grows exponentially, leading to a significant rise in computational complexity. The Cooperative Co-Evolutionary Algorithm (CCEA) effectively addresses this issue through its divide-and-conquer strategy. Nonetheless, the CCEA needs to find a balance between efficiency and algorithmic performance while also resolving...

10.3390/drones8090435 article EN cc-by Drones 2024-08-26

We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days....

10.48550/arxiv.1112.6209 preprint EN other-oa arXiv (Cornell University) 2011-01-01

Predicting potential facts in the future, Temporal Knowledge Graph (TKG) extrapolation remains challenging because of the deep dependence between the temporal association and semantic patterns of facts. Intuitively, facts (events) that happened at different timestamps have different influences on future events, which can be attributed to a hierarchy among not only facts but also relevant entities. Therefore, it is crucial to pay more attention to important entities and events when forecasting the future. However, most existing...

10.1049/cit2.12186 article EN CAAI Transactions on Intelligence Technology 2023-01-26

The multi-UAV path planning method based on artificial potential field (APF) has the advantage of rapid processing speed and the ability to deal with dynamic obstacles, though some problems remain, such as a lack of consideration of the initial heading constraint of the UAVs, making it easy to fall into a local minimum trap, and the path not being sufficiently smooth. Consequently, a fixed-wing UAV formation path planning method based on piecewise potential field (PPF) is proposed, where the problem of formation flight path planning in different states can be solved by suitable design of the PPF function. Firstly,...

10.1038/s41598-023-28087-0 article EN cc-by Scientific Reports 2023-02-08

Multilevel image segmentation is time-consuming and involves large computation. The firefly algorithm has been applied to enhancing the efficiency of multilevel image segmentation. However, in some cases, the firefly algorithm is easily trapped into local optima. In this paper, an improved firefly algorithm (IFA) is proposed to search multilevel thresholds. In IFA, in order to help fireflies escape from local optima and accelerate convergence, two strategies (i.e., a diversity strategy with Cauchy mutation and a neighborhood strategy) are adaptively chosen according to different...

10.1155/2016/1578056 article EN cc-by Mathematical Problems in Engineering 2016-01-01

End-to-end face quality assessment based on deep learning can directly predict the overall quantitative score of face quality, thus helping to control the risk of the face recognition system. Thanks to the development of automatic pseudo-label generation, most recent methods use large-scale datasets to learn the model. However, existing regression models fit the pseudo-labels, which lack attention to samples that are easy to be misidentified, and require large datasets for training. The paper treats face quality assessment as a classification problem, focusing on difficult...

10.1109/lsp.2021.3109781 article EN IEEE Signal Processing Letters 2021-01-01

Videos incorporate rich semantics as well as redundant information. Seeking a compact yet effective video representation, e.g., sampling informative frames from the entire video, is critical to efficient video recognition. There have been works that formulate frame sampling as a sequential decision task by selecting frames one by one according to their importance. In this paper, we present a more efficient framework named OCSampler, which explores such a representation with one short clip. OCSampler designs a new paradigm of learning...

10.1109/cvpr52688.2022.01352 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

With the rise of artificial intelligence, machine learning (ML) is increasingly integrated into daily life. Facial expressions, lasting about 1/20th of a second and difficult to conceal, convey nuanced emotions beyond words. They are categorized as macro expressions, displayed under normal circumstances, and micro expressions, which are fleeting and subconscious. However, recognizing expressions in photos is challenging due to varied backgrounds, appearances, age, and race, impacting accuracy. Addressing this, our research focuses on...

10.4018/ijec.368068 article EN cc-by International Journal of e-Collaboration 2025-02-01

Traditional recommendation systems focus more on the correlations between users and items (user-item relationships), while research on user-user relationships, also known as social recommendation, has received significant attention in recent years. Graph-based models have achieved great success in this task by utilizing the complex topological information of social networks. However, these models still face insufficient expressiveness and overfitting problems. Counterfactual approaches are proven effective as augmentation...

10.1609/aaai.v39i1.32011 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Recognizing transformation types applied to a video clip (RecogTrans) is a long-established paradigm for self-supervised video representation learning, which achieves much inferior performance compared to instance discrimination approaches (InstDisc) in recent works. However, based on a thorough comparison of representative RecogTrans and InstDisc methods, we observe the great potential of RecogTrans on both semantic-related and temporal-related downstream tasks. Based on hard-label classification, existing RecogTrans approaches suffer...

10.1109/cvpr52688.2022.00301 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Multilevel image thresholding is a powerful and commonly used technique in image analysis. Conventional segmentation methods suffer from a large amount of computation time and unstable results. In this paper, we present a multilevel thresholding method based on fuzzy entropy and a modified gravitational search algorithm. Fuzzy entropy is extended to multilevel thresholding, and a modified gravitational search algorithm (mGSA) is proposed to accelerate the maximization process. Experimental results show that the proposed method can obtain optimal thresholds and that mGSA has more accurate and stable results compared with the firefly algorithm (FA)...

10.1109/icit.2016.7474845 article EN 2016 IEEE International Conference on Industrial Technology (ICIT) 2016-03-01

Though it has been easier to build large face datasets by collecting images from the Internet in this Big Data era, the time-consuming manual annotation process prevents researchers from constructing larger ones, which makes the automatic cleaning of noisy labels highly desirable. However, identifying mislabeled faces by machine is quite challenging because the diversity of a person’s faces that are captured wildly at all ages is extraordinarily rich. In view of this, we propose a graph-based cleaning method that mainly employs the community...

10.1155/2018/4512473 article EN cc-by Computational Intelligence and Neuroscience 2018-01-01

Image encryption algorithms based on chaos theory have rapidly developed in recent years, with many achieving encryption by confusion-diffusion structures. However, the security performance of these algorithms needs to be improved. This paper proposes a holographic encryption algorithm based on a new integrated chaotic system and a chaotic mask. The improved Gerchberg-Saxton algorithm transforms plaintext images into pure-phase holograms. The chaotic masks generated decompose the holograms into sub-images. The sub-images undergo pixel-wise heterogeneous operations and finally...

10.1088/1402-4896/ad3adb article EN cc-by-nc-nd Physica Scripta 2024-04-17

Predicting the popularity of online content is an important task for content recommendation, social influence prediction and so on. Recent deep learning models generally utilize graph neural networks to model the complex relationship between the information cascade graph and future popularity, and have shown better prediction results compared with traditional methods. However, existing models adopt simple graph pooling strategies, e.g., summation or average, which are prone to generate inefficient cascade graph representations and lead to unsatisfactory prediction results. Meanwhile,...

10.3390/axioms10030159 article EN cc-by Axioms 2021-07-23

Recently, there has been an increasing interest in developing diffusion-based text-to-image generative models capable of generating coherent and well-formed visual text. In this paper, we propose a novel and efficient approach called GlyphControl to address this task. Unlike existing methods that rely on character-aware text encoders like ByT5 and require retraining of text-to-image models, our approach leverages additional glyph conditional information to enhance the performance of the off-the-shelf Stable-Diffusion model in generating accurate visual text. By...

10.48550/arxiv.2305.18259 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01