Federico Raue

ORCID: 0000-0002-8604-6207
Research Areas
  • Advanced Image Processing Techniques
  • Advanced Image and Video Retrieval Techniques
  • Advanced Neural Network Applications
  • Music and Audio Processing
  • Visual Attention and Saliency Detection
  • Advanced Vision and Imaging
  • Generative Adversarial Networks and Image Synthesis
  • Speech and Audio Processing
  • Multimodal Machine Learning Applications
  • Video Surveillance and Tracking Methods
  • Image and Signal Denoising Methods
  • Domain Adaptation and Few-Shot Learning
  • Computational Geometry and Mesh Generation
  • Machine Learning and Data Classification
  • Anomaly Detection Techniques and Applications
  • Image Processing Techniques and Applications
  • Natural Language Processing Techniques
  • Neural Networks and Applications
  • Animal Vocal Communication and Behavior
  • Advanced Text Analysis Techniques
  • Diverse Musicological Studies
  • Advanced Graph Neural Networks
  • Network Security and Intrusion Detection
  • Adversarial Robustness in Machine Learning
  • Modular Robots and Swarm Intelligence

German Research Centre for Artificial Intelligence
2015-2024

University of Kaiserslautern
2015-2021

Fundación Centro Tecnológico de la Información y la Comunicación
2007

This paper addresses the problem of pixel-level segmentation and classification of scene images with an entirely learning-based approach using Long Short Term Memory (LSTM) recurrent neural networks, which are commonly used for sequence classification. We investigate two-dimensional (2D) LSTM networks for natural scene images, taking into account the complex spatial dependencies of labels. Prior methods generally have required separate classification and image segmentation stages and/or pre- and post-processing. In our approach, classification,...

10.1109/cvpr.2015.7298977 article EN 2015-06-01

The rapidly evolving field of sound classification has greatly benefited from the methods of other domains. Today, the trend is to fuse domain-specific tasks and approaches together, which provides the community with new outstanding models. We present AudioCLIP – an extension of the CLIP model that handles audio in addition to text and images. Utilizing the AudioSet dataset, our proposed model incorporates the ESResNeXt audio-model into the CLIP framework, thus enabling it to perform multimodal classification while keeping CLIP's zero-shot capabilities. AudioCLIP...

10.1109/icassp43922.2022.9747631 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Environmental Sound Classification (ESC) is an active research area in the audio domain and has seen a lot of progress in the past years. However, many of the existing approaches achieve high accuracy by relying on domain-specific features and architectures, making it harder to benefit from advances in other fields (e.g., the image domain). Additionally, some of the past successes have been attributed to a discrepancy in how results are evaluated (i.e., on unofficial splits of the UrbanSound8K (US8K) dataset), distorting the overall progression...

10.1109/icpr48806.2021.9413035 article EN 2020 25th International Conference on Pattern Recognition (ICPR) 2021-01-10

Diffusion Models (DMs) have disrupted the image Super-Resolution (SR) field and further closed the gap between image quality and human perceptual preferences. They are easy to train and can produce very high-quality samples that exceed the realism of those produced by previous generative methods. Despite their promising results, they also come with new challenges that need further research: high computational demands, comparability, lack of explainability, color shifts, and more. Unfortunately, entry into this field is overwhelming because...

10.1109/tnnls.2024.3476671 article EN cc-by IEEE Transactions on Neural Networks and Learning Systems 2024-01-01

10.1109/ijcnn60899.2024.10651227 article EN 2024 International Joint Conference on Neural Networks (IJCNN) 2024-06-30

Dataset distillation is the concept of condensing large datasets into smaller but highly representative synthetic samples. While previous research has primarily focused on image classification, its application to image Super-Resolution (SR) remains underexplored. This exploratory work studies multiple dataset distillation techniques applied to SR, including pixel- and latent-space approaches under different aspects. Our experiments demonstrate that a 91.12% size reduction can be achieved while maintaining...

10.48550/arxiv.2502.03656 preprint EN arXiv (Cornell University) 2025-02-05

10.1109/wacv61041.2025.00054 article EN 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025-02-26

10.1109/wacv61041.2025.00676 article EN 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025-02-26

We propose a novel way to measure and understand convolutional neural networks by quantifying the amount of input signal they let in. To do this, an autoencoder (AE) was fine-tuned on gradients from a pre-trained classifier with fixed parameters. We compared the reconstructed samples from AEs that were fine-tuned on a set of image classifiers (AlexNet, VGG16, ResNet-50, and Inception v3) and found substantial differences. The AE learns which aspects of the input space to preserve and which ones to ignore, based on the information encoded in the backpropagated gradients....

10.1109/cvpr.2018.00328 article EN 2018-06-01

Environmental Sound Classification (ESC) is a rapidly evolving field that recently demonstrated the advantages of the application of visual domain techniques to audio-related tasks. Previous studies indicate that domain-specific modifications of cross-domain approaches show promise in pushing the whole area of ESC forward. In this paper, we present a new time-frequency transformation layer based on complex frequency B-spline (fbsp) wavelets. Used with a high-performance audio classification model, the proposed...

10.1109/ijcnn52387.2021.9533654 article EN 2021 International Joint Conference on Neural Networks (IJCNN) 2021-07-18

In typical computer vision problems revolving around video data, pre-trained models are simply evaluated at test time, without adaptation. This general approach clearly cannot capture the shifts that will likely arise between the distributions from which training and test data have been sampled. Adapting a pre-trained model to a new video encountered at test time could be essential to avoid the potentially catastrophic effects of such shifts. However, given the inherent impossibility of labeling data only available at test-time, traditional...

10.1109/wacv51458.2022.00266 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022-01-01

Diffusion Models (DMs) have disrupted the image Super-Resolution (SR) field and further closed the gap between image quality and human perceptual preferences. They are easy to train and can produce very high-quality samples that exceed the realism of those produced by previous generative methods. Despite their promising results, they also come with new challenges that need further research: high computational demands, comparability, lack of explainability, color shifts, and more. Unfortunately, entry into this field is overwhelming...

10.48550/arxiv.2401.00736 preprint EN cc-by arXiv (Cornell University) 2024-01-01

The aim of this work is to investigate Long Short-Term Memory (LSTM) for finding the semantic associations between two parallel text lines of different instances of the same class sequence. In this work, we propose a new model called class-less classifier, which is cognitively motivated by a simplified version of infants' learning. The presented model not only learns the association but also the relation between labels and classes. In addition, our model uses two LSTM networks and a learning rule based on the alignment of both networks. For testing purposes, a sequence...

10.1109/icdar.2015.7333828 article EN 2015-08-01

Growing amounts of online user data motivate the need for automated processing techniques. In the case of user ratings, one interesting option is to use neural networks for learning to predict ratings given an item and a user. While training for the prediction, such an approach at the same time learns to map each user to a vector, a so-called user embedding. Such embeddings can, for example, be valuable for estimating user similarity. However, there are various ways in which item and user information can be combined in neural networks, and it is unclear how the way of combining affects the resulting embeddings. In...

10.1109/ijcnn.2019.8852259 article EN 2019 International Joint Conference on Neural Networks (IJCNN) 2019-07-01

The overwhelming success of Deep Learning approaches in recent years is often driven by the availability of large public datasets. However, in some domains like finance, creating and sharing realistic datasets is hindered by secrecy or privacy concerns. This can lead to a mismatch, where approaches that have proven to work well on public, research-oriented datasets end up underperforming when applied to real-world (private) datasets. In this work, we focus on the task of Outlier Detection (OD) and bridge the above gap by building an autoencoder based approach...

10.1109/ijcnn54540.2023.10191326 article EN 2023 International Joint Conference on Neural Networks (IJCNN) 2023-06-18