George Retsinas

ORCID: 0000-0001-6734-3575
Research Areas
  • Handwritten Text Recognition Techniques
  • Advanced Image and Video Retrieval Techniques
  • Image Retrieval and Classification Techniques
  • Image Processing and 3D Reconstruction
  • Natural Language Processing Techniques
  • Advanced Neural Network Applications
  • Face recognition and analysis
  • Anomaly Detection Techniques and Applications
  • Generative Adversarial Networks and Image Synthesis
  • Video Surveillance and Tracking Methods
  • Domain Adaptation and Few-Shot Learning
  • Mental Health Research Topics
  • Text and Document Classification Technologies
  • Human Pose and Action Recognition
  • Smart Agriculture and AI
  • Digital Mental Health Interventions
  • Matrix Theory and Algorithms
  • Hand Gesture Recognition Systems
  • Topic Modeling
  • Neural Networks and Applications
  • Context-Aware Activity Recognition Systems
  • Gait Recognition and Analysis
  • Multimodal Machine Learning Applications
  • Polynomial and algebraic computation
  • Facial Nerve Paralysis Treatment and Research

National Technical University of Athens
2015-2024

Athena Research and Innovation Center in Information, Communication & Knowledge Technologies
2023-2024

National Centre for Scientific Research "Demokritos"
2015-2020

Institute of Informatics & Telecommunications
2015-2019

Purpose: An overview of the current use of handwritten text recognition (HTR) on archival manuscript material, as provided by the EU H2020 funded Transkribus platform. It explains how HTR works, demonstrates the platform, gives example use cases, highlights the effect HTR may have on scholarship, and evidences this turning point for access to advanced digitised heritage content. The paper aims to discuss these issues. Design/methodology/approach: The paper adopts a case study approach, using the development and delivery of one openly available platform for...

10.1108/jd-07-2018-0114 article EN Journal of Documentation 2019-07-23

The recent state of the art on monocular 3D face reconstruction from image data has made some impressive advancements, thanks to the advent of Deep Learning. However, it has mostly focused on input coming from a single RGB image, overlooking the following important factors: a) Nowadays, the vast majority of facial images of interest do not originate from still images but rather from videos, which contain rich dynamic information. b) Furthermore, these videos typically capture individuals in some form of verbal communication (public talks,...

10.1109/cvprw59228.2023.00609 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2023-06-01

Agricultural robotics is an up-and-coming field which deals with the development of robotic systems able to tackle a multitude of agricultural tasks efficiently. The case of interest, in this work, is mushroom collection in industrial farms. Developing such a robot, able to select and out-root a mushroom, requires delicate actions that can only be conducted if a well-performing perception module exists. Specifically, one should accurately detect the 3D pose of a mushroom in order to facilitate the smooth operation of the system. In this work, we develop a vision module for...

10.3390/s23073576 article EN cc-by Sensors 2023-03-29

This paper presents an overview of the e-Prevention: Person Identification and Relapse Detection Challenge, which was an open call for researchers at ICASSP-2023. The challenge aimed at the analysis and processing of long-term continuous recordings of biosignals recorded from wearable sensors, namely accelerometers, gyroscopes and heart rate monitors embedded in smartwatches, as well as sleep information and daily step counts, in order to extract high-level representations of the wearer's activity and behavior, termed digital...

10.1109/ojsp.2024.3376300 article EN cc-by-nc-nd IEEE Open Journal of Signal Processing 2024-01-01
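
The challenge data are long-term multichannel recordings, so a typical first step is to slice them into fixed-length windows and summarise each window before any learning. A minimal Python sketch of that preprocessing is below; the 5-minute window, 5 Hz sampling rate and simple mean/std/min/max statistics are illustrative assumptions, not the challenge's official baseline.

```python
import numpy as np

def window_signal(x, win_len, hop):
    """Slice a (T, C) multichannel recording into overlapping windows of shape (N, win_len, C)."""
    starts = range(0, len(x) - win_len + 1, hop)
    return np.stack([x[s:s + win_len] for s in starts])

def summary_features(windows):
    """Per-window mean/std/min/max over time, concatenated across channels."""
    feats = [windows.mean(1), windows.std(1), windows.min(1), windows.max(1)]
    return np.concatenate(feats, axis=1)

# toy example: 1 hour of 5 Hz accelerometer + gyroscope + heart-rate data (7 channels)
recording = np.random.randn(3600 * 5, 7)
wins = window_signal(recording, win_len=5 * 60 * 5, hop=5 * 60 * 5 // 2)  # 5-minute windows, 50% overlap
print(summary_features(wins).shape)  # (n_windows, 28)
```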

Recognition of old Greek document images containing polytonic (multi-accent) characters is a challenging task due to the large number of existing character classes (more than 270), which cannot be handled sufficiently by current OCR technologies. Taking into account that the polytonic system was used from late antiquity until recently, a large amount of scanned documents still remains without full-text search capabilities. In order to assist the progress of relevant research, this paper introduces the first publicly available...

10.1109/icdar.2015.7333841 article EN 2015-08-01

In this paper we present a novel descriptor and method for segmentation-based keyword spotting. We introduce Zoning-Aggregated Hypercolumn features as pixel-level cues for document images. Motivated by recent research in machine vision, we use an appropriately pretrained convolutional network as a feature extraction tool. The resulting local features are subsequently aggregated to form word-level fixed-length descriptors. Encoding is computationally inexpensive and does not require learning a separate generative...

10.1109/icfhr.2016.0061 article EN 2016-10-01
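
A rough sketch of the idea behind zoning-aggregated hypercolumn descriptors, as read from the abstract: tap intermediate layers of a pretrained CNN, upsample them to the input resolution to obtain pixel-level hypercolumn features, and average them over a fixed grid of zones to get a fixed-length word descriptor. The VGG backbone, tapped layers and 4x12 grid below are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F
from torchvision import models

backbone = models.vgg16(weights="IMAGENET1K_V1").features.eval()
tap_layers = {4, 9, 16}  # illustrative conv blocks to tap for hypercolumns (not the paper's exact choice)

@torch.no_grad()
def hypercolumn_zoning_descriptor(img, grid=(4, 12)):
    """img: (1, 3, H, W) word image tensor -> fixed-length descriptor."""
    maps, x = [], img
    for i, layer in enumerate(backbone):
        x = layer(x)
        if i in tap_layers:
            # upsample each tapped feature map back to the input resolution
            maps.append(F.interpolate(x, size=img.shape[-2:], mode="bilinear", align_corners=False))
    hyper = torch.cat(maps, dim=1)               # pixel-level hypercolumn features
    zones = F.adaptive_avg_pool2d(hyper, grid)   # aggregate over a fixed grid of zones
    return zones.flatten(1).squeeze(0)           # fixed-length word descriptor

desc = hypercolumn_zoning_descriptor(torch.rand(1, 3, 64, 192))
print(desc.shape)
```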

In this article, a method for segmentation-based, learning-free Query-by-Example (QbE) keyword spotting on handwritten documents is proposed. The method consists of three steps, namely preprocessing, feature extraction and matching, which address critical variations of text images (e.g., skew, translation, different writing styles). During the feature extraction step, a sequence of descriptors is generated using a combination of a zoning scheme and a novel appearance descriptor, referred to as modified Projections of Oriented Gradients...

10.1109/tpami.2018.2845880 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2018-06-11
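
As a hedged illustration of a projections-of-oriented-gradients style descriptor (not the exact modified POG formulation of the paper), the sketch below bins gradient orientations and projects the per-bin gradient magnitude onto the horizontal and vertical axes; word images are assumed to be normalised to a common size so that descriptors are comparable.

```python
import numpy as np

def pog_descriptor(img, n_bins=8):
    """Simplified projections-of-oriented-gradients descriptor:
    per-orientation-bin gradient magnitude is projected (summed) onto the
    horizontal and vertical axes, then the result is L2-normalised."""
    img = img.astype(np.float32)
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)                 # unsigned orientation in [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    desc = []
    for b in range(n_bins):
        m = np.where(bins == b, mag, 0.0)
        desc.append(m.sum(axis=0))                          # projection onto the horizontal axis
        desc.append(m.sum(axis=1))                          # projection onto the vertical axis
    desc = np.concatenate(desc)
    return desc / (np.linalg.norm(desc) + 1e-8)

word_img = np.random.rand(64, 192)                          # stand-in for a normalised word image
print(pog_descriptor(word_img).shape)                       # (8 * (192 + 64),) = (2048,)
```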

In this work, we explore the discriminating ability of short-term signal patterns (e.g. a few minutes long) with respect to the person identification task. We focus on signals recorded by simple wearable devices, such as smart watches, which can measure movements (accelerometer and gyroscope sensors) and biosignals (heart rate monitor). To address the problem, we develop a deep neural network, based on one-dimensional convolutions, which receives raw signals from three different smartwatch sensors and predicts the wearing...

10.1109/icassp40776.2020.9053910 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09
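
A minimal PyTorch sketch of the kind of model the abstract describes: a small stack of one-dimensional convolutions that takes a raw multichannel smartwatch window (accelerometer, gyroscope, heart rate) and outputs per-subject logits. Channel count, window length and layer sizes are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class WearerIdCNN(nn.Module):
    """Minimal 1D-CNN over raw smartwatch windows (accelerometer + gyroscope + heart rate)."""
    def __init__(self, in_channels=7, n_subjects=30):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=7, padding=3), nn.BatchNorm1d(32), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.BatchNorm1d(64), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(128, n_subjects)

    def forward(self, x):            # x: (batch, channels, time)
        z = self.encoder(x).squeeze(-1)
        return self.classifier(z)    # per-subject logits

model = WearerIdCNN()
window = torch.randn(8, 7, 1500)     # e.g. 5-minute windows sampled at 5 Hz
print(model(window).shape)           # (8, 30)
```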

In this paper, we present a novel approach for segmentation-based handwritten keyword spotting. The proposed method relies upon the extraction of a simple yet efficient descriptor which is based on projections of oriented gradients. To this end, global and local word image descriptors, together with their combination, are proposed. Retrieval is performed using the euclidean distance between the descriptors of the query and the segmented word images. The methods have been evaluated on the dataset of the ICFHR 2014 Competition. Experimental results prove...

10.1109/das.2016.61 article EN 2016-04-01
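
Retrieval in this setting reduces to ranking segmented word images by euclidean distance between fixed-length descriptors; a minimal sketch, with random vectors standing in for real descriptors, is given below.

```python
import numpy as np

def rank_by_euclidean(query_desc, word_descs):
    """Return candidate word indices sorted by euclidean distance to the query descriptor."""
    dists = np.linalg.norm(word_descs - query_desc[None, :], axis=1)
    order = np.argsort(dists)
    return order, dists[order]

# toy retrieval list: 1000 segmented word images, each summarised by a 2048-d descriptor
word_descs = np.random.rand(1000, 2048).astype(np.float32)
query_desc = word_descs[42] + 0.01 * np.random.rand(2048).astype(np.float32)
order, dists = rank_by_euclidean(query_desc, word_descs)
print(order[:5], dists[:5])   # the true match (index 42) should rank first
```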

Timely detection of relapses constitutes an important step towards improving the quality of life in patients with psychotic disorders. In this paper, we design a novel framework for discovering relapse indications by modeling the digital phenotype of patients who wear smartwatches. We start by designing deep neural network architectures that can use biosignals for person identification with high discriminatory performance. Then, we show how these networks can be employed to identify relapses by looking at the per-person misclassification rate and...

10.1109/ichi57859.2023.00045 article EN 2023 IEEE 11th International Conference on Healthcare Informatics (ICHI) 2023-06-26
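
The relapse-detection signal suggested by the abstract can be summarised very simply: once an identification network is trained, the fraction of a patient's windows that it fails to attribute to that patient acts as an anomaly score. A toy sketch, with hypothetical prediction arrays, follows.

```python
import numpy as np

def daily_misclassification_rate(predicted_ids, true_id):
    """Fraction of windows in a day for which the identification network
    fails to recognise the wearer; higher values suggest behavioural change."""
    predicted_ids = np.asarray(predicted_ids)
    return float(np.mean(predicted_ids != true_id))

# toy example: predictions over 200 windows for subject 3
stable_day  = np.random.choice([3, 3, 3, 3, 7], size=200)   # mostly recognised correctly
relapse_day = np.random.choice([3, 7, 1, 5, 9], size=200)   # frequently confused with others
print(daily_misclassification_rate(stable_day, true_id=3))
print(daily_misclassification_rate(relapse_day, true_id=3))
```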

In this work, we present the Realistic Synthetic Mushroom Scenes Dataset, which encompasses images depicting mushrooms in various settings and relatively cluttered scenes. The dataset is composed of 15,000 high-quality, realistic images with useful annotations. It can be leveraged to address problems associated with mushroom detection, instance segmentation, and 3D pose estimation. These tasks are of paramount importance for automating the harvesting process in mushroom farms, a challenging and costly procedure. Also, we proffer a three-step...

10.1109/cvprw59228.2023.00668 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2023-06-01

Deep features, defined as the activations of the hidden layers of a neural network, have given promising results when applied to various vision tasks. In this paper, we explore the usefulness and transferability of deep features in the context of the keyword spotting (KWS) problem. We use a state-of-the-art convolutional network to extract the features. The optimal parameters concerning their application are subsequently studied: the impact of the choice of layer, of applying dimensionality reduction with a manifold learning technique, as well as the dissimilarity...

10.3390/proceedings2020089 article EN cc-by 2018-01-09
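
A hedged end-to-end sketch of the pipeline the abstract outlines: deep features taken from a hidden layer of a pretrained CNN, a manifold-learning dimensionality-reduction step, and a dissimilarity measure between query and candidate word images. The ResNet backbone, Isomap reduction and euclidean metric below are stand-ins, not necessarily the choices studied in the paper.

```python
import numpy as np
import torch
from torchvision import models
from sklearn.manifold import Isomap
from scipy.spatial.distance import cdist

# deep features = activations of a hidden layer of a pretrained CNN (layer choice is illustrative)
cnn = models.resnet18(weights="IMAGENET1K_V1")
feature_net = torch.nn.Sequential(*list(cnn.children())[:-1]).eval()   # drop the classifier head

@torch.no_grad()
def deep_features(batch):                       # batch: (N, 3, H, W) word images
    return feature_net(batch).flatten(1).numpy()

word_images = torch.rand(60, 3, 64, 192)        # stand-in for segmented word images
feats = deep_features(word_images)              # (60, 512)

# manifold-learning dimensionality reduction (Isomap used as an illustrative choice)
reduced = Isomap(n_neighbors=10, n_components=32).fit_transform(feats)

# dissimilarity between a query and the retrieval list (euclidean here; cosine is another option)
dists = cdist(reduced[:1], reduced[1:], metric="euclidean")
print(np.argsort(dists[0])[:5])                 # indices of the closest word images
```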

Current state-of-the-art approaches in the field of Handwritten Text Recognition are predominantly single-task with unigram, character-level target units. In our work, we utilize a Multi-task Learning scheme, training the model to perform decompositions of the target sequence into units of different granularity, from fine to coarse. We consider this method as a way to use n-gram information, implicitly, in the training process, while the final recognition is performed using only the unigram output. Unigram decoding of such a multi-task approach highlights...

10.1109/icpr48806.2021.9412351 article EN 2020 25th International Conference on Pattern Recognition (ICPR) 2021-01-10
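
One plausible reading of this multi-task scheme, sketched below in PyTorch: a shared sequence encoder feeds separate CTC heads for character-level (unigram) and coarser (here, bigram-like) target decompositions; training sums the CTC losses and test-time decoding uses only the unigram head. Vocabulary sizes, the LSTM encoder and the bigram head are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class MultiGranularityHTR(nn.Module):
    """Shared sequence encoder with separate CTC heads for character (unigram)
    and coarser (bigram-like) target decompositions; decoding uses only unigrams."""
    def __init__(self, feat_dim=64, hidden=128, n_chars=80, n_bigrams=400):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=2, bidirectional=True, batch_first=True)
        self.head_uni = nn.Linear(2 * hidden, n_chars + 1)    # +1 for the CTC blank
        self.head_bi = nn.Linear(2 * hidden, n_bigrams + 1)

    def forward(self, x):                        # x: (batch, time, feat_dim) column features
        h, _ = self.encoder(x)
        return self.head_uni(h), self.head_bi(h)

model = MultiGranularityHTR()
ctc = nn.CTCLoss(blank=0, zero_infinity=True)
x = torch.randn(4, 120, 64)                      # toy batch of line-image column features
uni_logits, bi_logits = model(x)

# dummy targets just to exercise the joint loss (label ids start at 1; 0 is the blank)
uni_t = torch.randint(1, 81, (4, 20))
bi_t = torch.randint(1, 401, (4, 10))
in_lens = torch.full((4,), 120, dtype=torch.long)
loss = ctc(uni_logits.log_softmax(-1).transpose(0, 1), uni_t, in_lens, torch.full((4,), 20, dtype=torch.long)) \
     + ctc(bi_logits.log_softmax(-1).transpose(0, 1), bi_t, in_lens, torch.full((4,), 10, dtype=torch.long))
loss.backward()

# at test time only the unigram head is decoded (greedy best path shown here)
best_path = uni_logits.argmax(-1)                # (batch, time); collapse repeats and drop blanks next
```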

Deep convolutional neural networks are today the new baseline for a wide range of machine vision tasks. The problem of keyword spotting is no exception to this rule. Many successful network architectures and learning strategies have been adapted from other tasks to create keyword spotting systems. In this paper, we argue that various details concerning this adaptation could be re-examined, with the aim of building stronger models. In particular, we examine the usefulness of a pyramidal spatial pooling layer versus a simpler approach, and show that zoning...

10.1109/das.2018.49 article EN 2018-04-01
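
The contrast the abstract draws can be shown in a few lines: a pyramidal spatial pooling layer pools a variable-size feature map at several grid resolutions and concatenates the results, whereas simple zoning uses a single fixed grid. Both yield fixed-length vectors; the grids below are illustrative choices, not the paper's.

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(fmap, levels=((1, 1), (2, 2), (4, 4))):
    """Pyramidal spatial pooling: pool at several grid resolutions and concatenate."""
    return torch.cat([F.adaptive_max_pool2d(fmap, lv).flatten(1) for lv in levels], dim=1)

def zoning_pool(fmap, grid=(4, 12)):
    """Simpler zoning: a single fixed grid of average-pooled zones."""
    return F.adaptive_avg_pool2d(fmap, grid).flatten(1)

fmap = torch.randn(2, 256, 8, 24)        # variable-size conv feature map of a word image
print(spatial_pyramid_pool(fmap).shape)  # (2, 256 * (1 + 4 + 16)) = (2, 5376)
print(zoning_pool(fmap).shape)           # (2, 256 * 48) = (2, 12288)
```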

Keyword spotting (KWS) is defined as the problem of detecting all instances of a given word, provided by the user either as a query word image (Query-by-Example, QbE) or as a query word string (Query-by-String, QbS), in a body of digitized documents. The detection is typically preceded by a preprocessing step where the text is segmented into text lines (line-level KWS). Methods following this paradigm are monopolized by test-time computationally expensive handwritten text recognition (HTR)-based approaches; furthermore, they typically cannot handle image queries (QbE). In...

10.1109/cvpr.2019.01294 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

The recent state of the art on monocular 3D face reconstruction from image data has made some impressive advancements, thanks to the advent of Deep Learning. However, it has mostly focused on input coming from a single RGB image, overlooking the following important factors: a) Nowadays, the vast majority of facial images of interest do not originate from still images but rather from videos, which contain rich dynamic information. b) Furthermore, these videos typically capture individuals in some form of verbal communication (public talks,...

10.48550/arxiv.2207.11094 preprint EN cc-by arXiv (Cornell University) 2022-01-01