NFDI4DS | UHH-SEMS - Publication Details

Tao Tu

ORCID: 0000-0001-9191-7938

About

Contact & Profiles

A5014724876

Research Areas

Neural dynamics and brain function
Speech Recognition and Synthesis
Functional Brain Connectivity Studies
EEG and Brain-Computer Interfaces
Identity, Memory, and Therapy
Speech and Audio Processing
Grief, Bereavement, and Mental Health
Advanced Neural Network Applications
Multimodal Machine Learning Applications
Music and Audio Processing
Visual perception and processing mechanisms
3D Surveying and Cultural Heritage
Advanced Image and Video Retrieval Techniques
Domain Adaptation and Few-Shot Learning
Face Recognition and Perception
Video Surveillance and Tracking Methods
Topic Modeling
Speech and dialogue systems
Radiomics and Machine Learning in Medical Imaging
Human Pose and Action Recognition
Image Enhancement Techniques
Robotics and Sensor-Based Localization
Neural Networks and Applications
Anxiety, Depression, Psychometrics, Treatment, Cognitive Processes
Machine Learning in Healthcare

Google (United States)
2024-2025

Google (United Kingdom)
2024

DeepMind (United Kingdom)
2024

University of Science and Technology of China
2023

National Tsing Hua University
2023

Beijing University of Chemical Technology
2021-2022

National Taiwan University
2019-2021

Columbia University
2017-2021

New York University
2017

Stanford University
2016

Towards Generalist Biomedical AI

OPENALEX - Publications

Tao Tu Shekoofeh Azizi Danny Driess Mike Schaekermann Mohamed Amin and 28 more

Background: Medicine is inherently multimodal, requiring the simultaneous interpretation and integration of insights between many data modalities spanning text, imaging, genomics, and more. Generalist biomedical artificial intelligence systems that flexibly encode, integrate, and interpret these data might better enable impactful applications ranging from scientific discovery to care delivery. Methods: To catalyze the development of these models, we curated MultiMedBench, a new multimodal benchmark. MultiMedBench...

10.1056/aioa2300138 article EN NEJM AI 2024-02-22

Capabilities of Gemini Models in Medicine

OPENALEX - Publications

Khaled Saab Tao Tu Wei‐Hung Weng Ryutaro Tanno David Stutz and 61 more

Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilities in multimodal and long-context reasoning, offer exciting possibilities in medicine. Building on these core strengths of Gemini, we introduce Med-Gemini, a family of highly capable multimodal models that are specialized in medicine with the ability to seamlessly use web search, and that can be efficiently tailored to novel...

10.48550/arxiv.2404.18416 preprint EN arXiv (Cornell University) 2024-04-29

Toward expert-level medical question answering with large language models

OPENALEX - Publications

K. K. Singhal Tao Tu Juraj Gottweis Rory Sayres Ellery Wulczyn and 30 more

Large language models (LLMs) have shown promise in medical question answering, with Med-PaLM being the first to exceed a 'passing' score in United States Medical Licensing Examination style questions. However, challenges remain in long-form medical question answering and handling real-world workflows. Here, we present Med-PaLM 2, which bridges these gaps with a combination of base LLM improvements, medical domain fine-tuning and new strategies for improving reasoning and grounding through ensemble refinement and chain of retrieval. Med-PaLM 2 scores up to 86.5% on...

10.1038/s41591-024-03423-7 article EN cc-by-nc-nd Nature Medicine 2025-01-08

Collaboration between clinicians and vision–language models in radiology report generation

OPENALEX - Publications

Ryutaro Tanno David G. T. Barrett Andrew Sellergren Sumedh Ghaisas Sumanth Dathathri and 24 more

Automated radiology report generation has the potential to improve patient care and reduce the workload of radiologists. However, the path toward real-world adoption has been stymied by the challenge of evaluating the clinical quality of artificial intelligence (AI)-generated reports. We build a state-of-the-art report generation system for chest radiographs, called Flamingo-CXR, and perform an expert evaluation of AI-generated reports by engaging a panel of board-certified radiologists. We observe a wide distribution of preferences across settings, with 56.1%...

10.1038/s41591-024-03302-1 article EN cc-by-nc-nd Nature Medicine 2024-11-07

End-to-End Text-to-Speech for Low-Resource Languages by Cross-Lingual Transfer Learning

OPENALEX - Publications

Yuan-Jui Chen Tao Tu Cheng-chieh Yeh Hung-yi Lee

End-to-end text-to-speech (TTS) has shown great success on large quantities of paired text plus speech data. However, laborious data collection remains difficult for at least 95% of the languages over the world, which hinders the development of TTS in different languages. In this paper, we aim to build TTS systems for such low-resource (target) languages where only very limited paired data are available. We show such TTS can be effectively constructed by transferring knowledge from a high-resource (source) language. Since the model trained on source...

10.21437/interspeech.2019-2730 article EN Interspeech 2022 2019-09-13

Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning

OPENALEX - Publications

Alexander H. Liu Tao Tu Hung-yi Lee Lin-Shan Lee

In this paper we propose a Sequential Representation Quantization AutoEncoder (SeqRQ-AE) to learn from primarily unpaired audio data and produce sequences of representations very close to phoneme sequences of speech utterances. This is achieved by proper temporal segmentation to make the representations phoneme-synchronized, and proper phonetic clustering to have the total number of distinct representations close to the number of phonemes. Mapping between the distinct representations and phonemes is learned from a small amount of annotated paired data. Preliminary experiments on LJSpeech demonstrated the learned representations for vowels have relative locations...

10.1109/icassp40776.2020.9053571 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

V-MIND: Building Versatile Monocular Indoor 3D Detector with Diverse 2D Annotations

OPENALEX - Publications

Jin-Cheng Jhang Tao Tu Fu-En Wang Ke Zhang Min Sun and 1 more

10.1109/wacv61041.2025.00927 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025-02-26

DreaMo: Articulated 3D Reconstruction from a Single Casual Video

OPENALEX - Publications

Tao Tu Mingfeng Li Chieh Hubert Lin Yen-Chi Cheng Min Sun and 1 more

10.1109/wacv61041.2025.00227 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025-02-26

Towards conversational diagnostic artificial intelligence

OPENALEX - Publications

Tao Tu Mike Schaekermann Anil Palepu Khaled Saab Jan Freyberg and 21 more

10.1038/s41586-025-08866-7 article EN cc-by Nature 2025-04-09

Towards accurate differential diagnosis with large language models

OPENALEX - Publications

Daniel McDuff Mike Schaekermann Tao Tu Anil Palepu Amy Wang and 23 more

10.1038/s41586-025-08869-4 article EN cc-by Nature 2025-04-09

Inferring Macroscale Brain Dynamics via Fusion of Simultaneous EEG-fMRI

OPENALEX - Publications

Marios G. Philiastides Tao Tu Paul Sajda

Advances in the instrumentation and signal processing for simultaneously acquired electroencephalography and functional magnetic resonance imaging (EEG-fMRI) have enabled new ways to observe the spatiotemporal neural dynamics of the human brain. Central to the utility of EEG-fMRI neuroimaging systems are the methods for fusing the two data streams, with machine learning playing a key role. These methods can be dichotomized into those that are symmetric and asymmetric in terms of how the two modalities inform the fusion. Studies using these methods have shown that fusion...

10.1146/annurev-neuro-100220-093239 article EN Annual Review of Neuroscience 2021-03-24

A multimodal encoding model applied to imaging decision-related neural cascades in the human brain

OPENALEX - Publications

Jordan Muraskin Truman R. Brown Jennifer M. Walz Tao Tu Bryan Conroy and 2 more

10.1016/j.neuroimage.2017.06.059 article EN publisher-specific-oa NeuroImage 2017-06-30

Multivariate dynamical systems-based estimation of causal brain interactions in fMRI: Group-level validation using benchmark data, neurophysiological models and human connectome project data

OPENALEX - Publications

Srikanth Ryali Tianwen Chen Kaustubh Supekar Tao Tu John Kochalka and 2 more

10.1016/j.jneumeth.2016.03.010 article EN Journal of Neuroscience Methods 2016-03-22

Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement

OPENALEX - Publications

Zhixiang Wei Lin Chen Tao Tu Pengyang Ling Huaian Chen and 1 more

Most prior semantic segmentation methods have been developed for day-time scenes, while typically underperforming in night-time scenes due to insufficient and complicated lighting conditions. In this work, we tackle this challenge by proposing a novel night-time semantic segmentation paradigm, i.e., disentangle then parse (DTP). DTP explicitly disentangles night-time images into light-invariant reflectance and light-specific illumination components and then recognizes semantics based on their adaptive fusion. Concretely, the proposed DTP comprises two key...

10.1109/iccv51070.2023.01974 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Attentional Bias to Reminders of the Deceased as Compared With a Living Attachment in Grieving

OPENALEX - Publications

Noam Schneck Tao Tu Christina A. Michel George A. Bonanno Paul Sajda and 1 more

10.1016/j.bpsc.2017.08.003 article EN publisher-specific-oa Biological Psychiatry Cognitive Neuroscience and Neuroimaging 2017-08-24

Large-scale network dynamics in neural response to emotionally negative stimuli linked to serotonin 1A binding in major depressive disorder

OPENALEX - Publications

Noam Schneck Tao Tu Harry Rubin Falcone Jeffrey M. Miller Francesca Zanderigo and 7 more

10.1038/s41380-020-0733-5 article EN Molecular Psychiatry 2020-04-30

Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation

OPENALEX - Publications

Tao Tu Qing Ping Govindarajan Thattai Gökhan Tür Prem Natarajan

GuessWhat?! is a visual dialog guessing game which incorporates a Questioner agent that generates a sequence of questions, while an Oracle agent answers the respective questions about a target object in an image. Based on this dialog history between the Questioner and the Oracle, a Guesser agent makes a final guess of the target object. While previous work has focused on dialogue policy optimization and visual-linguistic information fusion, most work learns the vision-linguistic encoding for the three agents solely on the GuessWhat?! dataset without shared and prior knowledge of vision-linguistic representation. To...

10.1109/cvpr46437.2021.00557 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Tracking Deceased-Related Thinking With Neural Pattern Decoding of a Cortical-Basal Ganglia Circuit

OPENALEX - Publications

Noam Schneck Stefan Haufe Tao Tu George A. Bonanno Kevin N. Ochsner and 2 more

10.1016/j.bpsc.2017.02.004 article EN Biological Psychiatry Cognitive Neuroscience and Neuroimaging 2017-03-01

Network Configurations in the Human Brain Reflect Choice Bias during Rapid Face Processing

OPENALEX - Publications

Tao Tu Noam Schneck Jordan Muraskin Paul Sajda

Network interactions are likely to be instrumental in processes underlying rapid perception and cognition. Specifically, high-level and perceptual regions must interact to balance pre-existing models of the environment with new incoming stimuli. Simultaneous electroencephalography (EEG) and fMRI (EEG/fMRI) enables temporal characterization of brain-network interactions combined with improved anatomical localization of regional activity. In this paper, we use simultaneous EEG/fMRI and multivariate dynamical systems (MDS) analysis...

10.1523/jneurosci.1677-17.2017 article EN cc-by-nc-sa Journal of Neuroscience 2017-11-08

Ongoing monitoring of mindwandering in avoidant grief through cortico-basal-ganglia interactions

OPENALEX - Publications

Noam Schneck Tao Tu Stefan Haufe George A. Bonanno Hanga GalfaIvy and 3 more

An avoidant grief style is marked by repeated and often unsuccessful attempts to prevent thinking about loss. Prior work shows avoidant grief involves monitoring the external environment in order to avoid reminders of the loss. Here we sought to determine whether avoidant grievers also monitor the internal environment in order to minimize conscious awareness of loss-related thoughts. Individuals bereaved of a first-degree relative, spouse or partner within the last 14 months participated in a functional magnetic resonance imaging (fMRI) study (N = 29). We first applied...

10.1093/scan/nsy114 article EN cc-by Social Cognitive and Affective Neuroscience 2018-12-05

Relating Deep Neural Network Representations to EEG-fMRI Spatiotemporal Dynamics in a Perceptual Decision-Making Task

OPENALEX - Publications

Tao Tu Jonathan Koss Paul Sajda

The hierarchical architecture of deep convolutional neural networks (CNN) resembles the multi-level processing stages of the human visual system during object recognition. Converging evidence suggests that this hierarchical organization is key to the CNN achieving human-level performance in object categorization [22]. In this paper, we leverage the hierarchical organization of the CNN to investigate the spatiotemporal dynamics of rapid visual processing in the human brain. Specifically we focus on perceptual decisions associated with different levels of visual ambiguity. Using simultaneous EEG-fMRI, we demonstrate the temporal...

10.1109/cvprw.2018.00267 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2018-06-01

Self-generated Unconscious Processing of Loss Linked to Less Severe Grieving

OPENALEX - Publications

Noam Schneck Tao Tu George A. Bonanno M. Katherine Shear Paul Sajda and 1 more

10.1016/j.bpsc.2018.08.003 article EN publisher-specific-oa Biological Psychiatry Cognitive Neuroscience and Neuroimaging 2018-08-25

ImGeoNet: Image-induced Geometry-aware Voxel Representation for Multi-view 3D Object Detection

OPENALEX - Publications

Tao Tu Shun-Po Chuang Yu-Lun Liu Cheng Sun Ke Zhang and 3 more

We propose ImGeoNet, a multi-view image-based 3D object detection framework that models a 3D space by an image-induced geometry-aware voxel representation. Unlike previous methods which aggregate 2D features into 3D voxels without considering geometry, ImGeoNet learns to induce geometry from multi-view images to alleviate the confusion arising from voxels of free space, and during the inference phase, only images from multiple views are required. Besides, a powerful pre-trained 2D feature extractor can be leveraged by our representation, leading...

10.1109/iccv51070.2023.00644 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

V-MIND: Building Versatile Monocular Indoor 3D Detector with Diverse 2D Annotations

OPENALEX - Publications

Jin-Cheng Jhang Tao Tu Fu-En Wang Ke Zhang Min Sun and 1 more

The field of indoor monocular 3D object detection is gaining significant attention, fueled by the increasing demand in VR/AR and robotic applications. However, its advancement is impeded by the limited availability and diversity of 3D training data, owing to the labor-intensive nature of 3D data collection and annotation processes. In this paper, we present V-MIND (Versatile Monocular INdoor Detector), which enhances the performance of indoor 3D detectors across a diverse set of object classes by harnessing publicly available large-scale 2D datasets. By...

10.48550/arxiv.2412.11412 preprint EN arXiv (Cornell University) 2024-12-15
