Hao Tan

ORCID: 0000-0003-1774-4375
Research Areas
  • Neurological disorders and treatments
  • Multimodal Machine Learning Applications
  • Epilepsy research and treatment
  • Pain Management and Treatment
  • Computer Graphics and Visualization Techniques
  • Advanced Vision and Imaging
  • Topic Modeling
  • EEG and Brain-Computer Interfaces
  • Glioma Diagnosis and Treatment
  • Speech Recognition and Synthesis
  • Domain Adaptation and Few-Shot Learning
  • Meningioma and schwannoma management
  • Trigeminal Neuralgia and Treatments
  • Neuroscience and Neural Engineering
  • Musculoskeletal pain and rehabilitation
  • Transcranial Magnetic Stimulation Studies
  • Surgical Simulation and Training
  • Spine and Intervertebral Disc Pathology
  • Advanced Image and Video Retrieval Techniques
  • Cognitive and developmental aspects of mathematical skills
  • Human Pose and Action Recognition
  • Botulinum Toxin and Related Neurological Disorders
  • Advanced Neural Network Applications
  • Generative Adversarial Networks and Image Synthesis
  • Neural dynamics and brain function

Anhui University of Science and Technology
2025

Beijing Information Science & Technology University
2022-2025

Hubei University of Science and Technology
2025

Oregon Health & Science University
2021-2024

Neurological Surgery
2022-2024

Yunnan University
2022-2024

Chongqing Medical University
2022-2024

Peng Cheng Laboratory
2023-2024

Chongqing University
2024

China Mobile (China)
2023

This study addresses the prevalent challenges of inefficiency and suboptimal quality in indoor 3D scene generation and rendering by proposing a parameter-tuning strategy for 3D Gaussian Splatting (3DGS). Through systematic quantitative analysis of various performance indicators under differing resolution conditions, threshold settings for the average magnitude of the spatial position gradients, and adjustments to the scaling learning rate, the optimal parameter configuration for the 3DGS model, specifically tailored for indoor modeling scenarios,...

10.3390/ijgi14010021 article EN cc-by ISPRS International Journal of Geo-Information 2025-01-07

We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds. In contrast to many previous methods that are trained on small-scale datasets such as ShapeNet in a category-specific fashion, LRM adopts a highly scalable transformer-based architecture with 500 million learnable parameters to directly predict a neural radiance field (NeRF) from the input image. We train our model in an end-to-end manner on massive multi-view data containing around 1 million objects,...

10.48550/arxiv.2311.04400 preprint EN other-oa arXiv (Cornell University) 2023-01-01

With the continuous progress of computer and medical imaging technology, image segmentation has gradually become a hot topic in technology research, playing an essential role in the field. Magnetic resonance imaging (MRI) can sensitively detect changes in the water content of tissue components, display physiological and biochemical information such as function and metabolic processes, and provide a diagnostic basis for some early lesions; it is often more effective in detecting lesions than CT, and does not produce ionizing radiation...

10.1016/j.jrras.2023.100627 article EN cc-by-nc-nd Journal of Radiation Research and Applied Sciences 2023-07-11

Image-text matching has become a challenging task in the multimedia analysis field. Many advanced methods have been used to explore local and global cross-modal correspondence in matching. However, most methods ignore the importance of eliminating potential irrelevant features in the original features of each modality and in the common feature. Moreover, the features extracted from regions in images and words in sentences contain cluttered background noise and different occlusion noise, which negatively affects alignment. Different from these methods, we propose...

10.1109/tmm.2023.3243665 article EN IEEE Transactions on Multimedia 2023-01-01

Diffusion Transformers have emerged as the preeminent models for a wide array of generative tasks, demonstrating superior performance and efficacy across various applications. The promising results come at the cost of slow inference, as each denoising step requires running the whole transformer model with a large amount of parameters. In this paper, we show that performing the full computation of the model at each diffusion step is unnecessary, as some computations can be skipped by lazily reusing the results of previous steps. Furthermore, lower bound...

10.1609/aaai.v39i19.34248 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Traditional invasive blood glucose monitoring methods carry risks such as wound infections and patient discomfort. To address these issues, we propose a non‐invasive blood glucose monitoring method based on facial infrared thermography, aiming to enhance patient comfort and improve the accuracy and convenience of detection. To address the data imbalance problem, a wavelet‐based sample pairing fusion technique was used on the thermal imaging dataset. Features extracted by the MobileNetV3 network were input into an SVM model for training,...

10.1002/ima.70100 article EN International Journal of Imaging Systems and Technology 2025-05-01

Text-to-3D with diffusion models has achieved remarkable progress in recent years. However, existing methods either rely on score distillation-based optimization, which suffers from slow inference, low diversity and Janus problems, or are feed-forward methods that generate low-quality results due to the scarcity of 3D training data. In this paper, we propose Instant3D, a novel method that generates high-quality and diverse 3D assets from text prompts in a feed-forward manner. We adopt a two-stage paradigm, which first generates a sparse set of four...

10.48550/arxiv.2311.06214 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Idiopathic inflammatory myopathies (IIMs) are a clinically heterogeneous group of immune-mediated muscle diseases characterized by muscle weakness and multisystem involvement. Early diagnosis of IIMs can contribute to better prognosis and clinical management. A classifier based on a deep learning framework has proven to be an effective noninvasive technique for classifying IIMs from ultrasound images. However, as available data are always scarce, data augmentation is indispensable to improve classification...

10.1109/icet58434.2023.10211926 article EN 2023-05-12

We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion. Our reconstruction model incorporates a triplane NeRF representation and can denoise noisy multi-view images via NeRF reconstruction and rendering, achieving single-stage 3D generation in ~30s on a single A100 GPU. We train DMV3D on large-scale multi-view image datasets of highly diverse objects using only image reconstruction losses, without accessing 3D assets. We demonstrate state-of-the-art results for the single-image reconstruction problem where probabilistic...

10.48550/arxiv.2311.09217 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Vision-language pretraining models have achieved great success in supporting multimedia applications by understanding the alignments between images and text. While existing vision-language pretraining models primarily focus on understanding a single image associated with a single piece of text, they often ignore the alignment at the intra-document level, consisting of multiple sentences with multiple images. In this work, we propose DocumentCLIP, a salience-aware contrastive learning framework to enforce vision-language pretraining models to comprehend the interaction between images and longer text within documents....

10.48550/arxiv.2306.06306 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Federated learning (FL) aims to collaboratively train a shared model across multiple clients without transmitting their local data. Data heterogeneity is a critical challenge in realistic FL settings, as it causes significant performance deterioration due to discrepancies in optimization among local models. In this work, we focus on label distribution skew, a common scenario in data heterogeneity, where the data label categories are imbalanced on each client. To address this issue, we propose FedBalance, which corrects the optimization bias among local models...

10.48550/arxiv.2311.08202 preprint EN other-oa arXiv (Cornell University) 2023-01-01

BACKGROUND AND OBJECTIVES: Labeling residents as “black” or “white” clouds based on perceived or presumed workloads is a timeworn custom across medical training and practice. Previous studies examining whether such perceptions align with objective workload patterns have offered conflicting results. We assessed whether peer-assigned labels were associated with between-resident differences in objective, on-call workload metrics across three classes of neurosurgery junior residents. In doing so, we introduce more inclusive...

10.1227/neu.0000000000002740 article EN Neurosurgery 2023-10-24