Haider Al-Tahan

ORCID: 0000-0002-2296-6477
Research Areas
  • Speech and Audio Processing
  • Neural dynamics and brain function
  • Face Recognition and Perception
  • Music and Audio Processing
  • Advanced Image Processing Techniques
  • Visual and Cognitive Learning Processes
  • Image Processing Techniques and Applications
  • Language, Metaphor, and Cognition
  • Geographic Information Systems Studies
  • Hearing Loss and Rehabilitation
  • Visual Attention and Saliency Detection
  • Spatial Cognition and Navigation
  • Domain Adaptation and Few-Shot Learning
  • Visual perception and processing mechanisms
  • Image Enhancement Techniques

Western University
2020-2021

Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having an assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, vision-language model (VLM) applications will significantly impact our relationship with technology. However, there are many challenges that need to be addressed to improve the reliability of those models. While language is discrete,...

10.48550/arxiv.2405.17247 preprint EN arXiv (Cornell University) 2024-05-27

While vision evokes a dense network of feedforward and feedback neural processes in the brain, visual processes are primarily modeled with feedforward hierarchical networks, leaving the computational role of feedback processes poorly understood. Here, we developed a generative autoencoder model and adversarially trained it on a categorically diverse data set of images. We hypothesized that feedback processes in the ventral visual pathway can be represented by the reconstruction of visual information performed by the model. We compared the representational similarity of activity patterns in the proposed model with temporal...

10.1371/journal.pcbi.1008775 article EN cc-by PLoS Computational Biology 2021-03-24
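The adversarial training described above pairs a reconstruction objective with a GAN-style discriminator. The sketch below illustrates one generic way such a combined objective can be written; it assumes a standard mean-squared reconstruction error and a non-saturating GAN loss, not necessarily the paper's exact formulation, and `autoencoder_losses` is an illustrative name.

```python
import numpy as np

def autoencoder_losses(x, x_hat, d_real, d_fake):
    """Loss terms for an adversarially trained reconstruction autoencoder
    (generic GAN-style sketch; the paper's exact objective may differ).

    x, x_hat : original and reconstructed images, same shape.
    d_real, d_fake : discriminator probabilities in (0, 1) for real images
                     and for reconstructions, respectively.
    """
    eps = 1e-12
    recon = np.mean((x - x_hat) ** 2)  # pixel-wise reconstruction error
    # discriminator: score real images high, reconstructions low
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    # generator (non-saturating): push reconstructions toward "real"
    g_loss = -np.mean(np.log(d_fake + eps))
    return recon, d_loss, g_loss
```

In practice the autoencoder would be trained on `recon + g_loss` while the discriminator minimizes `d_loss`, alternating updates between the two.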

Learning rich visual representations using contrastive self-supervised learning has been extremely successful. However, it is still a major question whether we could use a similar approach to learn superior auditory representations. In this paper, we expand on prior work (SimCLR) to learn better auditory representations. We (1) introduce various data augmentations suitable for auditory data and evaluate their impact on predictive performance, (2) show that training with time-frequency audio features substantially improves the quality of the learned...

10.48550/arxiv.2010.09542 preprint EN other-oa arXiv (Cornell University) 2020-01-01
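The SimCLR objective that the abstract builds on is the NT-Xent contrastive loss: two augmented views of the same clip form a positive pair, and all other embeddings in the batch act as negatives. The following is a minimal NumPy sketch of that generic loss, not the paper's implementation; the function name and default temperature are illustrative.

```python
import numpy as np

def nt_xent_loss(z_i, z_j, temperature=0.5):
    """SimCLR-style NT-Xent contrastive loss for two augmented views.

    z_i, z_j : (batch, dim) embeddings of two augmentations of the same clips.
    Positive pairs are (z_i[k], z_j[k]); every other embedding is a negative.
    """
    batch = z_i.shape[0]
    z = np.concatenate([z_i, z_j], axis=0)             # (2B, dim)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine-similarity space
    sim = z @ z.T / temperature                        # (2B, 2B) similarity matrix
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    # index of each embedding's positive partner in the concatenated batch
    pos = np.concatenate([np.arange(batch, 2 * batch), np.arange(batch)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * batch), pos].mean()
```

When the two views encode the same underlying clip consistently, the positive similarity dominates the row softmax and the loss is low; mismatched views drive it up.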

Significant research efforts have been made to scale and improve vision-language model (VLM) training approaches. Yet, with an ever-growing number of benchmarks, researchers are tasked with the heavy burden of implementing each evaluation protocol, bearing a non-trivial computational cost, and making sense of how all these benchmarks translate into meaningful axes of progress. To facilitate a systematic evaluation of VLM progress, we introduce UniBench: a unified implementation of 50+ benchmarks spanning a comprehensive range of carefully...

10.48550/arxiv.2408.04810 preprint EN arXiv (Cornell University) 2024-08-08
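The unified-harness idea behind UniBench, many benchmark protocols run through one interface and summarized along meaningful axes, can be illustrated with a small sketch. `Benchmark` and `run_suite` here are hypothetical names for illustration only, not the UniBench API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Benchmark:
    """Hypothetical wrapper: a named evaluation protocol that scores a model."""
    name: str
    axis: str                              # e.g. "recognition", "reasoning"
    evaluate: Callable[[object], float]    # model -> accuracy in [0, 1]

def run_suite(model, benchmarks: List[Benchmark]) -> Dict[str, float]:
    """Run every benchmark under one interface and average scores per axis."""
    by_axis: Dict[str, List[float]] = {}
    for b in benchmarks:
        by_axis.setdefault(b.axis, []).append(b.evaluate(model))
    return {axis: sum(scores) / len(scores) for axis, scores in by_axis.items()}
```

Grouping per-benchmark scores into axes is what turns 50+ raw numbers into a few interpretable measures of progress.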

Abstract While vision evokes a dense network of feedforward and feedback neural processes in the brain, visual processes are primarily modeled with feedforward hierarchical networks, leaving the computational role of feedback processes poorly understood. Here, we developed a generative autoencoder model and adversarially trained it on a categorically diverse data set of images. We hypothesized that feedback processes in the ventral visual pathway can be represented by the reconstruction of visual information performed by the model. We compared the representational similarity of activity patterns with the proposed...

10.1101/2020.07.23.218859 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-07-24

In less than the blink of an eye, the human brain processes visual sensory input, interprets the scene, identifies faces, and recognizes objects. Decades of neurophysiological studies have demonstrated that the brain accomplishes these complicated tasks through a dense network of feedforward and feedback neural processes in the ventral visual cortex. So far, these processes are primarily modeled with feedforward hierarchical networks, and the computational role of feedback remains poorly understood. In this study, we developed a generative autoencoder model and adversarially trained it on a large...

10.1167/jov.21.9.2746 article EN cc-by-nc-nd Journal of Vision 2021-09-01

Contrastive learning of auditory and visual perception has been extremely successful when investigated individually. However, there are still major questions on how we could integrate principles learned from both domains to attain effective audiovisual representations. In this paper, we present a contrastive framework to learn audiovisual representations from unlabeled videos. The type and strength of the augmentations utilized during self-supervised pre-training play a crucial role for contrastive frameworks to work sufficiently. Hence,...

10.48550/arxiv.2110.07082 preprint EN cc-by-nc-sa arXiv (Cornell University) 2021-01-01
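A common way to realize audiovisual contrastive learning is a symmetric cross-modal InfoNCE loss between paired audio and video embeddings of the same clips. The sketch below illustrates that generic objective, not necessarily the paper's exact loss; the function name and temperature are assumptions.

```python
import numpy as np

def cross_modal_infonce(audio, video, temperature=0.1):
    """Symmetric InfoNCE between audio and video embeddings (generic sketch).

    audio, video : (batch, dim); row k of each modality comes from the same clip,
    so the diagonal of the similarity matrix holds the positive pairs.
    """
    a = audio / np.linalg.norm(audio, axis=1, keepdims=True)
    v = video / np.linalg.norm(video, axis=1, keepdims=True)
    logits = a @ v.T / temperature                     # (B, B)
    # audio-to-video direction: softmax over rows
    log_sm_rows = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # video-to-audio direction: softmax over columns
    log_sm_cols = logits - np.log(np.exp(logits).sum(axis=0, keepdims=True))
    diag = np.arange(a.shape[0])
    return -(log_sm_rows[diag, diag].mean() + log_sm_cols[diag, diag].mean()) / 2
```

Symmetrizing over both matching directions keeps either modality from collapsing into a trivial encoder.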