- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Obstructive Sleep Apnea Research
- Soil Geostatistics and Mapping
- Neuroscience of respiration and sleep
- Generative Adversarial Networks and Image Synthesis
- Advanced Image Processing Techniques
- Human Pose and Action Recognition
- Machine Learning and ELM
- Cardiovascular Syncope and Autonomic Disorders
- Geochemistry and Geologic Mapping
- Land Use and Ecosystem Services
- Information Technology Governance and Strategy
- Statistical Methods and Inference
- Speech and dialogue systems
- Cleft Lip and Palate Research
- Handwritten Text Recognition Techniques
- Impact of Light on Environment and Health
- Advanced Technologies in Various Fields
- Advanced Image and Video Retrieval Techniques
- ERP Systems Implementation and Impact
- Open Education and E-Learning
- Sparse and Compressive Sensing Techniques
- Infant Nutrition and Health
Shanghai Artificial Intelligence Laboratory
2024
Beijing Academy of Artificial Intelligence
2024
First Affiliated Hospital of Guangzhou Medical University
2023
State Key Laboratory of Respiratory Disease
2023
Guangzhou Medical University
2023
University of Hong Kong
2023
The University of Tokyo
2023
Hong Kong University of Science and Technology
2023
Jilin Maternity and Child Health Care Hospital
2022
Tsinghua–Berkeley Shenzhen Institute
2021
Problem: Autism spectrum disorders (ASD) are pervasive neurodevelopmental disorders, generally accompanied by social disorders, verbal or nonverbal communication defects, inability to concentrate, and other negative symptoms that affect the autistic person's normal life. However, traditional screening methods are time-consuming and public health resources are limited. Methods: This study proposed a novel technique that combined eye-movement data and machine learning algorithms for predicting traits. We converted raw...
Training models with longer in-context lengths is a significant challenge for multimodal models due to substantial GPU memory and computational costs. This exploratory study does not present state-of-the-art models; rather, it introduces an innovative method designed to increase in-context text length in multi-modality large language models (MLLMs) efficiently. We present Visualized In-Context Text Processing (VisInContext), which processes long in-context text using visual tokens. This technique significantly reduces GPU memory usage and floating point...
Learning from a few training samples is a desirable ability of an object detector, inspiring the exploration of Few-Shot Object Detection (FSOD). Most existing approaches employ a pretrain-transfer paradigm. The model is first pre-trained on base classes with abundant data and then transferred to novel classes with a few annotated samples. Despite substantial progress, FSOD performance is still far from satisfactory. During pre-training, due to the co-occurrence between base and novel classes, the model learns to treat the co-occurring novel classes as backgrounds....
Recent advances have shown promise in merging neural radiance fields (NeRFs) with pre-trained diffusion models for text-to-3D object generation. However, one enduring challenge is their inadequate capability to accurately parse and regenerate consistent multi-object environments. Specifically, these models encounter difficulties in representing the quantity and style prompted by the texts, often resulting in a collapse of the rendering fidelity that fails to match the semantic intricacies. Moreover, amalgamating elements...
The explicit neural radiance field (NeRF) has gained considerable interest for its efficient training and fast inference capabilities, making it a promising direction for applications such as virtual reality and gaming. In particular, PlenOctree (POT) [43], an explicit hierarchical multi-scale octree representation, has emerged as a structural and influential framework. However, POT's fixed structure for direct optimization is sub-optimal, as the scene complexity evolves continuously with updates to cached color and density, necessitating...
Enterprise Resource Planning (ERP) systems currently used by American businesses are unsuitable for adoption by small and medium-sized Chinese businesses. Cross-cultural issues include not only the obvious localization issues of interface and language differences but also bearable implementation costs, culturally-specific management styles, and financial report format discrepancies. This paper discusses the cross-cultural issues that must be addressed to bridge the gap between the two, and presents a case study of an ERP system designed...
Generating and editing a 3D scene guided by natural language poses a challenge, primarily due to the complexity of specifying the positional relations and volumetric changes within 3D space. Recent advancements in Large Language Models (LLMs) have demonstrated impressive reasoning, conversational, and zero-shot generation abilities across various domains. Surprisingly, these models also show great potential in realizing and interpreting 3D space. In light of this, we propose a novel language-guided interactive system, dubbed...
Recently, the self-supervised pre-training paradigm has shown great potential in leveraging large-scale unlabeled data to improve downstream task performance. However, increasing the scale of unlabeled pre-training data in real-world scenarios requires prohibitive computational costs and faces the challenge of uncurated samples. To address these issues, we build a task-specific self-supervised pre-training framework from a data selection perspective, based on a simple hypothesis that pre-training on unlabeled samples with a similar distribution to the target task can bring substantial performance gains....
Existing NAS (Neural Architecture Search) algorithms achieve a low error rate on vision tasks such as image classification by training each child network with equal resources during the search. However, it is not necessary to train each child network with equal resources or to use a fully converged score to obtain the relative performance of a child network, and there is computational redundancy in training all child networks with equal resources. In this paper, we propose Bandit-NAS to automatically compute the required data slicing and training time for each child network. i): We first model the search...
Recent image matting studies are developing towards proposing trimap-free or interactive methods to complete the complex image matting task. Although avoiding the extensive labor of trimap interaction, existing methods still suffer from two limitations: (1) for a single image with multiple objects, it is essential to provide extra interaction information to help determine the matting target; (2) for transparent objects, the accurate regression of the alpha matte from the RGB image is much more difficult compared with opaque ones. In this work, we propose UIM, a Unified Interactive...
Image restoration and enhancement is a process of improving image quality by removing degradations such as noise, blur, and resolution degradation. Deep learning (DL) has recently been applied to image restoration and enhancement. Due to its ill-posed property, plenty of works have explored priors to facilitate training deep neural networks (DNNs). However, the importance of priors has not been systematically studied and analyzed so far in the research community. Therefore, this paper serves as the first study that provides a comprehensive overview of recent...
There is a soaring demand for up-to-date and spatially-explicit soil information to address various environmental challenges. One of the most basic pieces of soil information, essential for research and decision-making in multiple disciplines, is soil classification. Conventional soil maps are often low in spatial resolution and lack the complexity to be practical for hands-on use. Digital Soil Mapping (DSM) has emerged as an efficient alternative for its reproducibility, updatability, accuracy, and cost-effectiveness, as well as its ability to quantify...
The explicit neural radiance field (NeRF) has gained considerable interest for its efficient training and fast inference capabilities, making it a promising direction for applications such as virtual reality and gaming. In particular, PlenOctree (POT) [1], an explicit hierarchical multi-scale octree representation, has emerged as a structural and influential framework. However, POT's fixed structure for direct optimization is sub-optimal, as the scene complexity evolves continuously with updates to cached color and density, necessitating refining...
Background: Plant diseases and pests are natural disasters that seriously threaten agricultural production. Leaf diseases pose a significant threat to the overall productivity and quality of apple orchards. Currently, the diagnosis of plant diseases in orchards mainly relies on manual labor, which is both time-consuming and costly. It takes a considerable amount of time to train an expert capable of accurately identifying diseases and pests, and these vary across crops. However, with the use of AI technology, training could be completed in just a few hours or a day,...
Despite CLIP being the foundation model in numerous vision-language applications, it suffers from a severe text spotting bias. Such bias causes CLIP models to 'parrot' the visual text embedded within images while disregarding the authentic visual semantics. We uncover that in the most popular image-text dataset, LAION-2B, the captions also densely parrot (spell) the text embedded in images. Our analysis shows that around 50% of images are embedded with visual text content, and around 30% of caption words appear in this embedded content. Based on this observation, we thoroughly inspect the different released versions of CLIP models and verify...
We discuss the fundamental issue of identification in linear instrumental variable (IV) models with unknown IV validity. With the assumption of the "sparsest rule", which is equivalent to the plurality rule but becomes operational in computation algorithms, we investigate and prove the advantages of non-convex penalized approaches over other IV estimators based on two-step selections, in terms of selection consistency and accommodation for individually weak IVs. Furthermore, we propose a surrogate sparsest penalty that aligns...
Although oral probiotics can improve breast microecology and alleviate the inflammatory response, there are no data regarding cases with existing abscesses. We aimed to investigate the effect of Lactobacillus fermentum CECT5716 during needle aspiration in patients with lactational abscesses. Patients (aged 20–41 years) with single-cavity abscesses (diameter 3–6 cm) from 12 hospitals were randomly assigned to experimental (n = 51) and control (n = 50) groups. Outcome measures included the abscess cure rate on treatment...