Joan Andreu Sánchez

ORCID: 0000-0003-0423-2020
Research Areas
  • Natural Language Processing Techniques
  • Handwritten Text Recognition Techniques
  • Topic Modeling
  • Image Processing and 3D Reconstruction
  • Algorithms and Data Compression
  • Speech and Dialogue Systems
  • Image Retrieval and Classification Techniques
  • Mathematics, Computing, and Information Processing
  • Music and Audio Processing
  • Speech Recognition and Synthesis
  • Advanced Image and Video Retrieval Techniques
  • Machine Learning and Algorithms
  • DNA and Biological Computing
  • Semantic Web and Ontologies
  • Vehicle License Plate Recognition
  • Web Data Mining and Analysis
  • Software Engineering Research
  • Music Technology and Sound Studies
  • 3D Surveying and Cultural Heritage
  • Semigroups and Automata Theory
  • Speech and Audio Processing
  • Text and Document Classification Technologies
  • Space Satellite Systems and Control
  • Engineering and Material Science Research
  • Plasma and Flow Control in Aerodynamics

Universitat Politècnica de València
2015-2024

Universitat Politècnica de Catalunya
2015

Centro Tecnológico de Investigación, Desarrollo e Innovación en tecnologías de la Información y las Comunicaciones (TIC)
2010-2013

Los Alamos National Laboratory
2007

Universidad de Las Palmas de Gran Canaria
2003

This paper describes the Handwritten Text Recognition (HTR) competition on the READ dataset that has been held in the context of the International Conference on Frontiers in Handwriting Recognition 2016. The competition aims to bring together researchers working on off-line HTR and provide them a suitable benchmark to compare their techniques on the task of transcribing typical historical handwritten documents. Two tracks with different conditions on the use of training data were proposed. Ten research groups registered but finally five submitted results. The...

10.1109/icfhr.2016.0120 article EN 2016-10-01

Purpose: An overview of the current use of handwritten text recognition (HTR) on archival manuscript material, as provided by the EU H2020 funded Transkribus platform. It explains HTR, demonstrates Transkribus, gives examples of use cases, highlights the effect HTR may have on scholarship, and evidences this turning point of the advanced use of digitised heritage content. The paper aims to discuss these issues. Design/methodology/approach: This paper adopts a case study approach, using the development and delivery of one openly available platform for...

10.1108/jd-07-2018-0114 article EN Journal of Documentation 2019-07-23

A contest on Handwritten Text Recognition organised in the context of the ICFHR 2014 conference is described. Two tracks with increased freedom on the use of training data were proposed and three research groups participated in these two tracks. The handwritten images for this contest were drawn from an English data set which is currently being considered in the tranScriptorium project. The goal of this project is to develop innovative, efficient and cost-effective solutions for the transcription of historical handwritten document images, focusing on four languages: English,...

10.1109/icfhr.2014.137 article EN 2014-09-01

This paper describes the fourth edition of the Handwritten Text Recognition (HTR) competition that was prepared this time in the context of the International Conference on Document Analysis and Recognition (ICDAR) 2017. Previous editions were conducted, first, with datasets from the tranScriptorium project in ICFHR 2014 and ICDAR 2015, and then, with the "Recognition and Enrichment of Archival Documents (READ)" European project in ICFHR 2016. The competition aims to bring together researchers working on off-line HTR and provides them a suitable benchmark to compare their techniques on the task...

10.1109/icdar.2017.226 article EN 2017-11-01

tranScriptorium is a 3-year project that aims to develop innovative, cost-effective solutions for the indexing, search and full transcription of historical handwritten document images, using Handwritten Text Recognition (HTR) technology. The production of a ground-truth (GT) dataset of images is among the first tasks. We address novel approaches for the faster production of this GT based on crowd-sourcing and prior-knowledge methods. We also address here a low-cost semi-supervised procedure for obtaining pairs of correct line-level aligned...

10.1109/das.2014.23 article EN 2014-04-01

In mathematical expression recognition, symbol classification is a crucial step. Numerous approaches for recognizing handwritten math symbols have been published, but most of them are either an online approach or a hybrid approach. There is an absence of a study focused on offline features for handwritten math symbol recognition. Furthermore, many papers provide results that are difficult to compare. In this paper we assess the performance of several well-known offline features for this task. We also test a novel set of features based on polar histograms and the vertical repositioning method...

10.1109/icpr.2014.507 article EN 2014-08-01

The extraction of relevant information from historical handwritten document collections is one of the key steps in order to make these manuscripts available for access and searches. In this competition, the goal is to detect the named entities and assign each of them a semantic category, and therefore, to simulate the filling of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants' methods and the results.

10.1109/icdar.2017.227 article EN 2017-11-01

This paper describes the second edition of the Handwritten Text Recognition (HTR) contest on tranScriptorium datasets that has been held in the context of the International Conference on Document Analysis and Recognition 2015. Two tracks with different conditions on the use of training data were proposed. Nine research groups registered but finally three submitted results. The handwritten images for this contest were drawn from the English "Bentham collection" dataset used in the tranScriptorium project. A small subset of the collection has been chosen for the present HTR competition. The selected...

10.1109/icdar.2015.7333944 article EN 2015-08-01

Transcription of historical handwritten documents is a crucial problem for making access to these documents easier for the general public. Currently, a huge amount of historical handwritten documents are being made available by on-line portals worldwide. It is not realistic to obtain the transcription of these documents manually, and therefore automatic techniques have to be used. tranScriptorium is a project that aims at researching modern Handwritten Text Recognition (HTR) technology for transcribing historical handwritten documents. The HTR technology used in tranScriptorium is based on models that are learnt automatically from examples....

10.1145/2595188.2595193 article EN 2014-05-19

The tranScriptorium project aims to develop innovative, efficient and cost-effective solutions for annotating handwritten historical documents using modern, holistic Handwritten Text Recognition (HTR) technology. Three actions are planned in tranScriptorium: i) improve basic image preprocessing and HTR techniques; ii) develop novel indexing and keyword searching approaches; iii) capitalize on new, user-friendly interactive-predictive HTR approaches for computer-assisted operation.

10.1145/2494266.2494294 article EN 2013-09-03

An important problem related to the probabilistic estimation of stochastic context-free grammars (SCFGs) is guaranteeing the consistency of the estimated model. This problem was considered by Booth and Thompson (1973) and Wetherell (1980) and studied by Maryanski (1974) and Chaudhuri et al. (1983) for unambiguous SCFGs only, when the probability distributions were estimated by the relative frequencies in a training sample. In this work, we extend this result by proving that the property of consistency is guaranteed for all SCFGs without restrictions, when the probability distributions are learned from the classical...

10.1109/34.615455 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 1997-01-01

Text line segmentation is the process by which text lines in a document image are localized and extracted. It is an important step in off-line Handwritten Text Recognition (HTR) given that the input of these systems is the lines to be transcribed. A myriad of solutions to this problem have been proposed in the literature. Although these solutions may differ greatly on what is actually applied to perform the segmentation, they can be classified by the level of precision and detail of the final extracted lines. In this paper we study the influence and real needs of the different levels of segmentation on the HTR task. We...

10.1109/icdar.2015.7333819 article EN 2015-08-01

The main aim of the Carabela project was to develop and apply techniques that allow textual searching on massive Spanish collections of 15th-19th century manuscripts. The project focused on a relatively small subset of 125 000 images of interest to underwater archaeology. For this type of manuscripts, the state-of-the-art automatic transcription techniques generally fail to achieve usable accuracy. Therefore, rather than insisting on actual transcription, methodologies for probabilistic indexing of handwritten text images have been...

10.1109/icfhr2020.2020.00026 article EN 2020-09-01

Textual access to large collections of digitized images remains unfeasible because usually they lack transcripts. Transcribing such large collections is in turn typically unattainable in terms of costs. However, the use of probabilistic indices can facilitate textual access to large collections with only moderate demands of resources. Besides allowing effortless information retrieval, it will be shown that probabilistic indices can also be used to estimate features of the indexed but otherwise untranscribed collections, such as running words and Zipf's curves. Complete collections have been...

10.1109/icdar.2019.00026 article EN 2019-09-01

Automatic recognition of printed mathematical symbols is a fundamental problem for the recognition of mathematical expressions. Several classification techniques have been previously used, but there are very few works that compare different classification techniques on the same database and with the same experimental conditions. In this work we have tested classical and novel classification techniques for mathematical symbol recognition on two databases.

10.1109/icpr.2010.481 article EN 2010-08-01

Since the beginning of Neural Networks, different mechanisms have been required to provide a sufficient number of examples to avoid overfitting. Data augmentation, the most common one, is focused on the generation of new instances by performing different distortions in the real samples. Usually, these transformations are problem-dependent, and they result in a synthetic set of, likely, unseen examples. In this work, we have studied a generative model, based on the paradigm of encoder-decoder, that works directly in the data space, that is, with images. This...

10.5220/0006618600960104 article EN cc-by-nc-nd Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications 2018-01-01

Flow discharge at the blunt trailing edge of bodies immersed into a supersonic stream significantly alters the downstream aerodynamic characteristics. This paper studies the changes in the base region topology for a wide range of purge ejection, using Unsteady Reynolds Averaged Navier–Stokes solutions. At a low purge rate the base pressure increases, resulting in a reduction of the trailing edge shock wave intensity. Supersonic purge flows diminish the base pressure and thus strengthen the shock wave. A linear relationship between the base pressure and the shock wave strength was obtained for the investigated conditions....

10.1016/j.compfluid.2013.09.013 article EN cc-by-nc-sa Computers & Fluids 2013-09-19

Recognition of on-line handwritten mathematical symbols has been tackled using different methods, but the recognition rates achieved until now still leave room for improvement. Many of the published approaches are based on hidden Markov models, and some of them use off-line information extracted from the on-line data. In this paper, we present a set of hybrid features that combine both on-line and off-line information. Lately, recurrent neural networks have been shown to obtain good results and they have outperformed hidden Markov models in several sequence...

10.1109/icdar.2013.203 article EN 2013-08-01