- Natural Language Processing Techniques
- Topic Modeling
- Human Pose and Action Recognition
- Remote-Sensing Image Classification
- Image Processing Techniques and Applications
- Advanced Vision and Imaging
- Advanced Image Fusion Techniques
- Domain Adaptation and Few-Shot Learning
- Anomaly Detection Techniques and Applications
- Liver Disease Diagnosis and Treatment
- Video Surveillance and Tracking Methods
- Image Enhancement Techniques
- Cancer-related molecular mechanisms research
- Advanced Image Processing Techniques
- Multimodal Machine Learning Applications
- Smart Agriculture and AI
- RNA modifications and cancer
- Target Tracking and Data Fusion in Sensor Networks
- Color Science and Applications
- Text Readability and Simplification
- COVID-19 diagnosis using AI
- Spectroscopy and Chemometric Analyses
- Advanced SAR Imaging Techniques
- Healthcare Technology and Patient Monitoring
- Metabolism, Diabetes, and Cancer
University of Amsterdam, 2023-2025
Beijing Academy of Science and Technology, 2025
Chinese Academy of Sciences, 2022-2024
Beijing Institute of Technology, 2022-2024
Henan University of Science and Technology, 2024
Shanghai Jiao Tong University, 2024
The First Affiliated Hospital, Sun Yat-sen University, 2024
Sun Yat-sen University, 2024
Shenzhen Institutes of Advanced Technology, 2024
Northwestern Polytechnical University, 2023
With the development of deep learning theory, the application of Yolov3 in fruit detection has been widely studied. Aiming at the problem that Yolov3 loses information during network transmission and that the semantic feature extraction of small targets is not rich, this article proposes an improved Yolov3 cherry tomato detection algorithm. First, the algorithm uses a dual path network as a feature extraction network to extract richer small target features. Second, four feature layers with different scales are established for multiscale prediction. Finally, K-means++ clustering...
Temporal action localization research aims to discover action instances from untrimmed videos, representing a fundamental step in the field of intelligent video understanding. With the advent of deep learning, backbone networks have been instrumental in providing representative spatiotemporal features, while the end-to-end learning paradigm has enabled the development of high-quality models through data-driven training. Both supervised and weakly supervised approaches have contributed to the rapid progress of temporal action localization, resulting...
Peng Xu, Hamidreza Saghir, Jin Sung Kang, Teng Long, Avishek Joey Bose, Yanshuai Cao, Jackie Chi Kit Cheung. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
In this paper, we introduce hierarchical action search. Starting from the observation that hierarchies are mostly ignored in the action literature, we retrieve not only individual actions but also relevant and related actions, given an action name or video example as input. We propose a hyperbolic action network, which is centered around a hyperbolic space shared by action hierarchies and videos. Our discriminative hyperbolic embedding projects actions on the shared hyperbolic space while jointly optimizing hypernym-hyponym relations between action pairs and a large margin separation between all actions. The projected...
Zero-shot learning (ZSL) aims to recognize unseen classes that are excluded from the training classes. ZSL suffers from 1) bias (Z-Bias): the model is biased towards seen classes because unseen data is inaccessible for training; 2) variance (Z-Variance): associating different images to the same semantic embedding yields a large error. To reduce Z-Bias, we propose a pseudo transfer mechanism, where we first synthesize the distribution of unseen data using semantic embeddings, then minimize the mismatch between the seen distribution and the synthesized unseen distribution. To reduce Z-Variance,...
Land cover classification is a popular research field in remote sensing applications, which must consider both pixel-level classification and boundary mapping comprehensively. Although multi-scale features in a deep learning (DL) network have a powerful feature description ability, how to use the multi-scale feature description to produce an accurate land cover classification from a very high resolution (VHR) optical remote sensing image is still a challenging task because of the large intraclass difference or small interclass difference of land covers. Therefore, aiming at achieving more accurate land cover classification, we proposed...
Superimposing visible watermarks on images provides a powerful weapon to cope with the copyright issue. Watermark removal techniques, which can strengthen the robustness of visible watermarks in an adversarial way, have attracted increasing research interest. Modern watermark removal methods perform watermark localization and background restoration simultaneously, which could be viewed as a multi-task learning problem. However, existing approaches suffer from incompletely detected watermarks and degraded texture quality in the restored background. Therefore, we...
Background: Previous studies have verified that metabolic dysfunction-associated steatotic liver disease (MASLD) confers a higher risk of coronary atherosclerosis development. However, whether MASLD influences prognosis after drug-eluting stent (DES) implantation treatment remains unknown. Methods: This retrospective observational study included 301 cardiovascular disease (CVD) patients who underwent re-coronary angiography after the first successful DES-based percutaneous coronary intervention. All patients received...
Nicotine, the principal alkaloid in tobacco, exhibits significant central nervous system activity and induces a wide array of physiological effects. In addition to its well-documented role in tobacco dependence, previous studies have suggested that nicotine also has diverse pharmacological properties. These include alleviating symptoms associated with Parkinson's disease, potentially reducing the risk of Alzheimer's disease, mitigating oxidative stress, as well as anti-inflammatory and anxiolytic effects. Neuroscientists...
Humans interpret texts with respect to some background information, or world knowledge, and we would like to develop automatic reading comprehension systems that can do the same. In this paper, we introduce a task and several models to drive progress towards this goal. In particular, we propose the task of rare entity prediction: given a web document with several entities removed, models are tasked with predicting the correct missing entities conditioned on the document context and the lexical resources. This task is challenging due to the diversity of language styles and the extremely large number...
Recent work in learning vector-space embeddings for multi-relational data has focused on combining relational information derived from knowledge bases with distributional information derived from large text corpora. We propose a simple approach that leverages the descriptions of entities or phrases available in lexical resources, in conjunction with distributional semantics, in order to derive a better initialization for training relational models. Applying this initialization to the TransE model results in significant new state-of-the-art performances on the WordNet dataset, decreasing the mean...
Object detection is challenging in high spatial resolution (HSR) remote sensing images that have a complex background and irregular object locations. To minimize the manual annotation cost of supervised learning methods and achieve advanced performance, we propose a point-based weakly supervised learning method to address the challenge of object detection in HSR remote sensing images. In this study, point labels are introduced to guide candidate bounding box mining and generate pseudobounding boxes for objects. Then, the pseudobounding boxes are applied to train the detection model. A progressive strategy is proposed to refine...
With the development of machine learning, many researchers have used machine learning models for building extraction in high resolution remote sensing images. Recently proposed deep learning models in particular have been widely used for building detection and urban monitoring. In this study, the performances of building extraction by remote sensing image segmentation based on a Fully Convolutional Networks (FCN) model and on shallow models are qualitatively and quantitatively compared. First, a public aerial dataset, the Massachusetts building dataset[1], is preprocessed to extract features. Then, we...
To interpret deep neural networks, one main approach is to dissect the visual input and find the prototypical parts responsible for the classification. However, existing methods often ignore the hierarchical relationship between these prototypes, and thus cannot explain semantic concepts at both a higher level (e.g., water sports) and a lower level (e.g., swimming). In this paper, inspired by the human cognition system, we leverage hierarchical information to deal with uncertainty. To this end, we propose HIerarchical Prototype Explainer (HIPE)...
Given a composite image, image harmonization aims to adjust the foreground illumination to be consistent with the background. Previous methods have explored transforming foreground features to achieve competitive performance. In this work, we show that using global information to guide foreground feature transformation could achieve significant improvement. Besides, we propose to transfer the foreground-background relation from real images to composite images, which can provide intermediate supervision for the transformed encoder features. Additionally,...
The growth and aging of the human population have accelerated the increase in surgical procedures. Yet the demand for increasing operations can hardly be met, since the training of anesthesiologists is usually a long-term process. A closed-loop artificial intelligence (AI) model provides the possibility to solve the intelligent decision-making for anesthesia auxiliary control and, as such, has allowed breakthroughs in closed-loop clinical practices in intensive care units (ICUs). However, applying an open-loop algorithm...
Transformers have shown significant effectiveness for various vision tasks, including both high-level vision and low-level vision. Recently, masked autoencoders (MAE) for feature pre-training have further unleashed the potential of Transformers, leading to state-of-the-art performances on various high-level vision tasks. However, the significance of MAE pre-training on low-level vision tasks has not been sufficiently explored. In this paper, we show that masked autoencoders are also scalable self-supervised learners for image processing tasks. We first present an efficient Transformer model considering...
Hierarchical clustering is a natural approach to discover ontologies from data. Yet, existing approaches are hampered by their inability to scale to large datasets and the discrete encoding of the hierarchy. We introduce scalable Hyperbolic Hierarchical Clustering (sHHC), which overcomes these limitations by learning continuous hierarchies in hyperbolic space. Our hierarchical clustering is of high quality and can be obtained in a fraction of the runtime. Additionally, we demonstrate the strength of sHHC on a downstream cross-modal self-supervision task. By...