- Human Pose and Action Recognition
- Advanced Vision and Imaging
- Video Surveillance and Tracking Methods
- Advanced Image and Video Retrieval Techniques
- Anomaly Detection Techniques and Applications
- Multimodal Machine Learning Applications
- Advanced Image Processing Techniques
- AI in cancer detection
- Advanced Neural Network Applications
- Image Enhancement Techniques
- Face recognition and analysis
- Gait Recognition and Analysis
- Biometric Identification and Security
- Radiomics and Machine Learning in Medical Imaging
- Face and Expression Recognition
- Artificial Intelligence in Games
- Neural Networks and Applications
- Sports Analytics and Performance
- Image Retrieval and Classification Techniques
- Domain Adaptation and Few-Shot Learning
- Image Processing Techniques and Applications
- Artificial Intelligence in Healthcare
- Video Analysis and Summarization
- COVID-19 diagnosis using AI
- Biomedical Text Mining and Ontologies
Peking University, 2015-2025
University of Oxford, 2020-2024
Center for Life Sciences, 2024
King University, 2024
Beijing Forestry University, 2019
Peking University Shenzhen Hospital, 2017
Implantable sensors can directly interface with various organs for precise evaluation of health status. However, extracting signals from such sensors mainly requires transcutaneous wires, integrated circuit chips, or cumbersome readout equipment, which increases the risks of infection, reduces biocompatibility, or limits portability. Here, we develop a set of millimeter-scale, chip-less, and battery-less magnetic implants paired with a fully wearable device for measuring biophysical and biochemical signals. The induce...
Cell-laden bioprinting is a promising biofabrication strategy for regenerating bioactive transplants to address organ donor shortages. However, there has been little success in reproducing transplantable artificial organs with multiple distinctive cell types and physiologically relevant architecture. In this study, an omnidirectional printing embedded network (OPEN) is presented as a support medium for 3D printing. The medium is state-of-the-art due to its one-step preparation, fast removal, versatile ink...
Aging in an individual refers to the temporal change, mostly decline, in the body's ability to meet physiological demands. Biological age (BA) is a biomarker of chronological aging and can be used to stratify populations to predict certain age-related chronic diseases. BA can be predicted from biomedical features such as brain MRI, retinal, or facial images, but the inherent heterogeneity in the aging process limits the usefulness of BA predicted from individual body systems. In this paper, we developed a multimodal Transformer-based architecture with cross-attention...
In this work, we focus on a challenging task: synthesizing multiple imaginary videos given a single image. Major problems come from the high dimensionality of pixel space and the ambiguity of potential motions. To overcome those problems, we propose a new framework that produces videos by transformation generation. The generated transformations are applied to the original image in a novel volumetric merge network to reconstruct the frames of the video. Through sampling different latent variables, our method can output different video samples....
Motivation: Accurate prediction of drug-target interactions (DTIs), especially for novel targets or drugs, is crucial for accelerating drug discovery. Recent advances in pretrained language models (PLMs) and multi-modal learning present new opportunities to enhance DTI prediction by leveraging vast unlabeled molecular data and integrating complementary information from multiple modalities. Results: We introduce DrugLAMP (PLM-Assisted Multi-modal Prediction), a PLM-based framework for accurate and transferable...
Biomedical knowledge graphs (BKGs) have emerged as powerful tools for organizing and leveraging the vast and complex data found across the biomedical field. Yet, current reviews of BKGs often limit their scope to specific domains or methods, overlooking the broader landscape and the rapid technological progress reshaping it. In this survey, we address this gap by offering a systematic review of BKGs from three core perspectives: domains, tasks, and applications. We begin by examining how BKGs are constructed from diverse sources, including...
Maintaining comprehensive and up-to-date knowledge graphs (KGs) is critical for modern AI systems, but manual curation struggles to scale with the rapid growth of scientific literature. This paper presents KARMA, a novel framework employing multi-agent large language models (LLMs) to automate KG enrichment through structured analysis of unstructured text. Our approach employs nine collaborative agents, spanning entity discovery, relation extraction, schema alignment, and conflict resolution that...
The Platonic Representation Hypothesis suggests a universal, modality-independent reality representation behind different data modalities. Inspired by this, we view each neuron as a system and detect its multi-segment activity under various peripheral conditions. We assume there's a time-invariant representation for the same neuron, reflecting its intrinsic properties like molecular profiles, location, and morphology. The goal of obtaining these neuronal representations has two criteria: (I) segments from the same neuron should have more...
With the rapid increase in the amount of multimedia data, video classification has become a demanding and challenging research topic. Compared with image classification, video classification requires mapping a video that contains hundreds of frames to semantic tags, which poses many challenges to the direct use of advanced models originally designed for image-oriented tasks. On the other hand, continuous frames also give us more visual clues that we can leverage to achieve better classification. One of the most important clues is the context in the spatiotemporal domain. In this...
Whole slide image (WSI) analysis presents significant computational challenges due to the massive number of patches in gigapixel images. While transformer architectures excel at modeling long-range correlations through self-attention, their quadratic complexity makes them impractical for pathology applications. Existing solutions like local-global or linear self-attention reduce costs but compromise the strong capabilities of full self-attention. In this work, we propose Querent, i.e., query-aware...
This paper proposes an adaptive approach to learn class-specific pooling shapes (CSPS) for image classification. Prevalent methods for spatial pooling are often conducted on predefined grids of images, which is an ad-hoc method and, thus, lacks generalization power across different categories. In contrast, our CSPS is designed in a data-driven fashion by generating plenty of candidates and selecting the optimal subset for each class. Specifically, we establish an overcomplete shape set that preserves as many geometric...
Artificial intelligence based diagnosis systems have emerged as powerful tools to reform traditional medical care. Each clinician now wants his own intelligent diagnostic partner to expand the range of services he can provide. When reading a clinical note, experts make inferences with relevant knowledge. However, knowledge appears to be heterogeneous, including structured and unstructured knowledge. Existing approaches are incapable of unifying them well. Besides, the descriptions of findings in clinical notes, which...
Artificial intelligence (AI) approaches in cancer analysis typically utilize a 'one-size-fits-all' methodology characterizing average patient responses. This manner neglects the diverse conditions in pancancer and the subtypes of individual patients, resulting in suboptimal outcomes in diagnosis and treatment. To overcome this limitation, we shift from the blanket application of statistics to a focus on the explicit recognition of patient-specific abnormalities. Our objective is to use multiomics data to empower clinicians with...
A key factor that makes action detection in videos different from general video classification is human-guided clues, especially motion signals. Since not all the pixels in a video are informative for recognition, irrelevant and redundant parts can lead to a lot of noise and be burdensome for both feature extraction and classifier training. This encourages researchers to seek out and design an attentive model to dynamically focus computations on spatiotemporal volumes. In this paper, we propose motion-centric attention which...
Future frame prediction for video sequences is a challenging task and a problem worth exploring in computer vision. Existing methods often learn motion information for the entire image to predict the next frames. However, different objects in the same scene intuitively move and deform in different ways. Considering the human visual system, one pays attention to the key objects that contain crucial motion signals, rather than compressing an entire image into a static representation. Motivated by this property of human perception, in this work, we develop a novel object-centric model...
This paper considers the challenging task of long-term video interpolation. Unlike most existing methods that only generate a few intermediate frames between adjacent ones, we attempt to speculate or imagine the procedure of an episode and further generate multiple frames between two non-consecutive frames in videos. In this paper, we present a novel deep architecture called bidirectional predictive network (BiPN) that predicts intermediate frames from two opposite directions. The bidirectional architecture allows the model to learn scene transformation with time as well as to generate longer sequences....
Knowledge distillation (KD) is a powerful technique that enables a well-trained large model to assist a small model. However, KD is constrained in a teacher-student manner. Thus, this method may not be appropriate in general situations, where the learning abilities of two models are uncertain or significantly different. In this paper, we propose a collaborative learning (CL) method, which is a flexible strategy to achieve bidirectional assistance for two models using a mutual knowledge base (MKB). The MKB is used to collect information and...
Video prediction is the challenging task of generating the future frames of a video given a sequence of previously observed frames. This task involves the construction of an internal representation that accurately models the frame evolutions, including contents and dynamics. Video prediction is considered difficult due to the inherent compounding of errors in recursive pixel-level prediction. In this paper, we present a novel system that focuses on regions of interest (ROIs) rather than on the entire frames and learns frame evolutions at the transformation level. We provide two...