Hao Tan

ORCID: 0000-0003-1774-4375
Research Areas
  • Multimodal Machine Learning Applications
  • Neurological disorders and treatments
  • Epilepsy research and treatment
  • Computer Graphics and Visualization Techniques
  • Pain Management and Treatment
  • Advanced Vision and Imaging
  • Domain Adaptation and Few-Shot Learning
  • Speech Recognition and Synthesis
  • EEG and Brain-Computer Interfaces
  • Glioma Diagnosis and Treatment
  • Topic Modeling
  • Meningioma and schwannoma management
  • Neuroscience and Neural Engineering
  • Musculoskeletal pain and rehabilitation
  • Trigeminal Neuralgia and Treatments
  • Advanced Image and Video Retrieval Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Adversarial Robustness in Machine Learning
  • Spine and Intervertebral Disc Pathology
  • Neural dynamics and brain function
  • Surgical Simulation and Training
  • Cognitive and developmental aspects of mathematical skills
  • Advanced Neural Network Applications
  • Botulinum Toxin and Related Neurological Disorders
  • Human Pose and Action Recognition

Anhui University of Science and Technology
2025

Beijing Information Science & Technology University
2022-2025

Hubei University of Science and Technology
2025

Oregon Health & Science University
2021-2024

Neurological Surgery
2022-2024

Yunnan University
2022-2024

Chongqing Medical University
2022-2024

Peng Cheng Laboratory
2023-2024

Chongqing University
2024

China Mobile (China)
2023

We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds. In contrast to many previous methods that are trained on small-scale datasets such as ShapeNet in a category-specific fashion, LRM adopts a highly scalable transformer-based architecture with 500 million learnable parameters to directly predict a neural radiance field (NeRF) from the input image. We train our model in an end-to-end manner on massive multi-view data containing around 1 million objects,...

10.48550/arxiv.2311.04400 preprint EN other-oa arXiv (Cornell University) 2023-01-01

This study addresses the prevalent challenges of inefficiency and suboptimal quality in indoor 3D scene generation and rendering by proposing a parameter-tuning strategy for 3D Gaussian Splatting (3DGS). Through systematic quantitative analysis of various performance indicators under differing resolution conditions, threshold settings for the average magnitude of the spatial position gradients, and adjustments to the scaling learning rate, the optimal parameter configuration for the 3DGS model, specifically tailored for indoor modeling scenarios,...

10.3390/ijgi14010021 article EN cc-by ISPRS International Journal of Geo-Information 2025-01-07

With the continuous progress of computer and medical imaging technology, image segmentation has gradually become a hot topic in imaging technology research, playing an essential role in the field. Magnetic resonance imaging (MRI) can sensitively detect changes in the water content of tissue components, display physiological and biochemical information such as function and metabolic processes, and provide a diagnostic basis for some early lesions; it is often more effective at detecting lesions than CT, and it does not produce ionizing radiation...

10.1016/j.jrras.2023.100627 article EN cc-by-nc-nd Journal of Radiation Research and Applied Sciences 2023-07-11

Image-text matching has become a challenging task in the multimedia analysis field. Many advanced methods have been used to explore local and global cross-modal correspondence in matching. However, most methods ignore the importance of eliminating potentially irrelevant features in the original features of each modality and in the cross-modal common feature. Moreover, the features extracted from regions in images and words in sentences contain cluttered background noise and different occlusion noise, which negatively affects alignment. Different from these methods, we propose...

10.1109/tmm.2023.3243665 article EN IEEE Transactions on Multimedia 2023-01-01

Text-to-3D with diffusion models has achieved remarkable progress in recent years. However, existing methods either rely on score distillation-based optimization, which suffers from slow inference, low diversity, and Janus problems, or are feed-forward methods that generate low-quality results due to the scarcity of 3D training data. In this paper, we propose Instant3D, a novel method that generates high-quality and diverse 3D assets from text prompts in a feed-forward manner. We adopt a two-stage paradigm, which first generates a sparse set of four...

10.48550/arxiv.2311.06214 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Idiopathic inflammatory myopathies (IIMs) are a clinically heterogeneous group of immune-mediated muscle diseases characterized by muscle weakness and multisystem involvement. Early diagnosis of IIMs can contribute to better prognosis and clinical management. A classifier based on a deep learning framework has proven to be an effective noninvasive technique for classifying IIMs from ultrasound images. However, as the available data are always scarce, data augmentation is indispensable to improve classification...

10.1109/icet58434.2023.10211926 article EN 2023-05-12

We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion. Our reconstruction model incorporates a triplane NeRF representation and can denoise noisy multi-view images via NeRF reconstruction and rendering, achieving single-stage 3D generation in ~30s on a single A100 GPU. We train DMV3D on large-scale multi-view image datasets of highly diverse objects using only image reconstruction losses, without accessing 3D assets. We demonstrate state-of-the-art results for the single-image reconstruction problem where probabilistic...

10.48550/arxiv.2311.09217 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Vision-language pretraining models have achieved great success in supporting multimedia applications by understanding the alignments between images and text. While existing vision-language pretraining models primarily focus on understanding a single image associated with a single piece of text, they often ignore the alignment at the intra-document level, consisting of multiple sentences with multiple images. In this work, we propose DocumentCLIP, a salience-aware contrastive learning framework to enforce vision-language pretraining models to comprehend the interaction between images and longer text within documents....

10.48550/arxiv.2306.06306 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Federated learning (FL) aims to collaboratively train a shared model across multiple clients without transmitting their local data. Data heterogeneity is a critical challenge in realistic FL settings, as it causes significant performance deterioration due to discrepancies in optimization among local models. In this work, we focus on label distribution skew, a common scenario in data heterogeneity, where the data label categories are imbalanced on each client. To address this issue, we propose FedBalance, which corrects the optimization bias among local models...

10.48550/arxiv.2311.08202 preprint EN other-oa arXiv (Cornell University) 2023-01-01

BACKGROUND AND OBJECTIVES: Labeling residents as “black” or “white” clouds based on perceived or presumed workloads is a timeworn custom across medical training and practice. Previous studies examining whether such perceptions align with objective workload patterns have offered conflicting results. We assessed whether peer-assigned labels were associated with between-resident differences in objective, on-call workload metrics across three classes of neurosurgery junior residents. In doing so, we introduce more inclusive...

10.1227/neu.0000000000002740 article EN Neurosurgery 2023-10-24

The climate and environmental pollution problems caused by carbon dioxide and other harmful gases emitted from traditional fossil fuel thermal power plants are increasingly threatening the living environment of mankind. In September 2020, the Chinese government clearly put forward the national strategic goal of "Carbon Peak and Carbon Neutrality". Distributed generation is the main means to effectively reduce carbon emissions, especially with the rapid development of wind power generation. Accurate and stable wind speed prediction can reasonably...

10.1016/j.egyr.2022.10.399 article EN cc-by-nc-nd Energy Reports 2022-11-01