NFDI4DS | UHH-SEMS - Publication Details

Markus Diem

ORCID: 0000-0002-5048-5128

About

Contact & Profiles

A5033628262

Research Areas

Handwritten Text Recognition Techniques
Image Processing and 3D Reconstruction
Image Retrieval and Classification Techniques
Advanced Image and Video Retrieval Techniques
Digital and Cyber Forensics
Natural Language Processing Techniques
Vehicle License Plate Recognition
Anomaly Detection Techniques and Applications
Currency Recognition and Detection
Single-cell and spatial transcriptomics
Digital Media Forensic Detection
Cultural Heritage Materials Analysis
Cell Image Analysis Techniques
Acute Myeloid Leukemia Research
Acute Lymphoblastic Leukemia research
Image and Object Detection Techniques
Archaeological Research and Protection
Mobile Agent-Based Network Management
Microfluidic and Bio-sensing Technologies
Gene expression and cancer classification
Bone and Joint Diseases
Mathematics, Computing, and Information Processing
3D Surveying and Cultural Heritage
Music and Audio Processing
Advanced Neural Network Applications

TU Wien
2011-2021

University of Vienna
2019

University of Applied Sciences Technikum Wien
2013

Institute of Automation
2010

CVL-DataBase: An Off-Line Database for Writer Retrieval, Writer Identification and Word Spotting

OPENALEX - Publications

Florian Kleber Stefan Fiel Markus Diem Robert Sablatnig

In this paper a public database for writer retrieval, writer identification and word spotting is presented. The CVL-Database consists of 7 different handwritten texts (1 German and 6 English texts) by 311 writers. For each text, an RGB color image (300 dpi) comprising the printed sample is available, as well as a cropped version (only handwritten). A unique ID identifies the writer, whereas the bounding boxes of single words are stored in an XML file. An evaluation of the best algorithms of the ICDAR and ICHFR contests has been performed on the CVL-Database.

10.1109/icdar.2013.117 article EN 2013-08-01

Transforming scholarship in the archives through handwritten text recognition

OPENALEX - Publications

Guenter Muehlberger Louise Seaward Melissa Terras Sofia Ares Oliveira V. Herrero and 49 more

Purpose: An overview of the current use of handwritten text recognition (HTR) on archival manuscript material, as provided by the EU H2020-funded Transkribus platform. It explains HTR, demonstrates Transkribus, gives examples of use cases, highlights the effect HTR may have on scholarship, and evidences this turning point in the advanced use of digitised heritage content. The paper aims to discuss these issues. Design/methodology/approach: This paper adopts a case study approach, using the development and delivery of one openly available platform for...

10.1108/jd-07-2018-0114 article EN Journal of Documentation 2019-07-23

cBAD: ICDAR2017 Competition on Baseline Detection

OPENALEX - Publications

Markus Diem Florian Kleber Stefan Fiel Tobias Grüning Basilis Gatos

The cBAD competition aims at benchmarking state-of-the-art baseline detection algorithms. It is in line with previous competitions such as the ICDAR 2013 Handwriting Segmentation Contest. A new, challenging dataset was created to test the behavior of systems on real-world data. Since traditional evaluation schemes are not applicable to the size and modality of this dataset, we present a new one that introduces baselines to measure performance. We received submissions from five different teams for both tracks.

10.1109/icdar.2017.222 article EN 2017-11-01

READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents

OPENALEX - Publications

Tobias Grüning Roger Labahn Markus Diem Florian Kleber Stefan Fiel

Text line detection is crucial for any application associated with Automatic Text Recognition or Keyword Spotting. Modern algorithms perform well on well-established datasets since they either comprise clean data or simple/homogeneous page layouts. We have collected and annotated 2036 archival document images from different locations and time periods. The dataset contains varying layouts and degradations that challenge text segmentation methods. Well-established evaluation schemes such as the Detection Rate...

10.1109/das.2018.38 article EN 2018-04-01

Automated Flow Cytometric MRD Assessment in Childhood Acute B‐ Lymphoblastic Leukemia Using Supervised Machine Learning

OPENALEX - Publications

Michael J. Reiter Markus Diem Angela Schumich Margarita Maurer Leonid Karawajew and 7 more

Minimal residual disease (MRD) as measured by multiparameter flow cytometry (FCM) is an independent and strong prognostic factor in B-cell acute lymphoblastic leukemia (B-ALL). However, reliable flow cytometric detection of MRD strongly depends on operator skills and expert knowledge. Hence, an objective, automated tool for FCM-MRD quantification, able to overcome the technical diversity and analytical subjectivity, would be most helpful. We developed a supervised machine learning approach using a combination...

10.1002/cyto.a.23852 article EN Cytometry Part A 2019-07-07

ICDAR2017 Competition on Historical Document Writer Identification (Historical-WI)

OPENALEX - Publications

Stefan Fiel Florian Kleber Markus Diem Vincent Christlein Georgios Louloudis and 2 more

The ICDAR 2017 Competition on Historical Document Writer Identification is dedicated to recording the most recent advances made in the field of writer identification. The goal of the writer identification task is the retrieval of pages which have been written by the same author. The test dataset used in this competition consists of 3600 handwritten pages originating from the 13th to the 20th century. It contains manuscripts from 720 different writers, where each writer contributed five pages. This paper describes the dataset, as well as the details of the competition. Five...

10.1109/icdar.2017.225 article EN 2017-11-01

ICDAR 2013 Competition on Handwritten Digit Recognition (HDRC 2013)

OPENALEX - Publications

Markus Diem Stefan Fiel Angelika Garz Manuel Keglevic Florian Kleber and 1 more

This paper presents the results of the HDRC 2013 competition for the recognition of handwritten digits, organized in conjunction with ICDAR 2013. The general objective of this competition is to identify, evaluate and compare recent developments in character recognition and to introduce a new challenging dataset for benchmarking. We describe the competition details, including the evaluation measures used, and give a comparative performance analysis of the nine (9) submitted methods along with a short description of the respective methodologies.

10.1109/icdar.2013.287 article EN 2013-08-01

Layout Analysis for Historical Manuscripts Using Sift Features

OPENALEX - Publications

Angelika Garz Robert Sablatnig Markus Diem

We propose a layout analysis method for historical manuscripts that relies on the part-based identification of entities. An entity (such as letters of the text, initials or headings) is composed of a set of characteristic segments or structures, which are dissimilar for distinct classes in the manuscripts under consideration. This fact is exploited in order to segment a manuscript page into homogeneous regions. Historical documents traditionally involve challenges such as uneven writing support and varying shapes of characters, fluctuating text lines,...

10.1109/icdar.2011.108 article EN International Conference on Document Analysis and Recognition 2011-09-01

End-to-End Text Recognition Using Local Ternary Patterns, MSER and Deep Convolutional Nets

OPENALEX - Publications

Michael Opitz Markus Diem Stefan Fiel Florian Kleber Robert Sablatnig

Text recognition in natural scene images is a component of several computer vision applications such as licence plate recognition, automated translation of street signs, help for visually impaired people or image retrieval. In this work an end-to-end text recognition system is presented. For detection, an AdaBoost ensemble with a modified Local Ternary Pattern (LTP) feature set and a post-processing stage built upon Maximally Stable Extremal Regions (MSER) is used. The recognition is done using a deep Convolutional Neural Network (CNN) trained...

10.1109/das.2014.29 article EN 2014-04-01

ICFHR 2014 Competition on Handwritten Digit String Recognition in Challenging Datasets (HDSRC 2014)

OPENALEX - Publications

Markus Diem Stefan Fiel Florian Kleber Robert Sablatnig José M. Saavedra and 3 more

This paper presents the results of the HDSRC 2014 competition on handwritten digit string recognition in challenging datasets, organized in conjunction with ICFHR 2014. The general objective of this competition is to identify, evaluate and compare recent developments in the recognition of Western Arabic digit strings of varying length. In addition, the competition introduces two new datasets for benchmarking. We describe the competition details, including the evaluation measures used, and give a comparative performance analysis of the six (6) participating methods along with a short description of the respective methodologies.

10.1109/icfhr.2014.136 article EN 2014-09-01

Recognition of Degraded Handwritten Characters Using Local Features

OPENALEX - Publications

Markus Diem Robert Sablatnig

The main problems of Optical Character Recognition (OCR) systems are solved if printed Latin text is considered. Since OCR systems are based upon binary images, their results are poor if the text is degraded. In this paper a codex consisting of ancient manuscripts is investigated. Due to environmental effects, the characters analyzed are washed out, which leads to poor results gained by state-of-the-art binarization methods. Hence, a segmentation-free approach based on local descriptors is being developed. Regarding local information allows for recognizing characters that are only...

10.1109/icdar.2009.158 article EN 2009-01-01

Text Line Detection for Heterogeneous Documents

OPENALEX - Publications

Markus Diem Florian Kleber Robert Sablatnig

Text line detection is a pre-processing step for automated document analysis such as word spotting or OCR. It is additionally used for structure and layout analysis. Considering mixed layouts, degraded documents and handwritten documents, text line detection is still challenging. We present a novel approach that targets torn documents having varying layouts and writing. The proposed method is a bottom-up approach that fuses words to globally minimize their fusing distance. In order to improve the processing time of further analysis, text lines are represented by...

10.1109/icdar.2013.152 article EN 2013-08-01

cBAD: ICDAR2019 Competition on Baseline Detection

OPENALEX - Publications

Markus Diem Florian Kleber Robert Sablatnig Basilis Gatos

Baseline detection is a simplified text-line extraction that typically serves as pre-processing for Automated Text Recognition. The cBAD competition benchmarks state-of-the-art baseline detection algorithms. It is the successor of cBAD 2017, with a larger dataset that contains more diverse document pages. The images, together with the manually annotated ground truth, are made publicly available, which allows other teams to benchmark and compare their methods. We could also evaluate the winning method of 2017 on the newly introduced dataset, which is now the baseline....

10.1109/icdar.2019.00240 article EN 2019-09-01

Clustering of cell populations in flow cytometry data using a combination of Gaussian mixtures

OPENALEX - Publications

Michael J. Reiter Paolo Rota Florian Kleber Markus Diem Stefanie Groeneveld‐Krentz and 1 more

10.1016/j.patcog.2016.04.004 article EN Pattern Recognition 2016-05-01

Recognizing characters of ancient manuscripts

OPENALEX - Publications

Markus Diem Robert Sablatnig

Considering printed Latin text, the main issues of Optical Character Recognition (OCR) systems are solved. However, for degraded handwritten document images, basic preprocessing steps such as binarization yield poor results with state-of-the-art methods. In this paper, ancient Slavonic manuscripts from the 11th century are investigated. In order to minimize the consequences of false character segmentation, a binarization-free approach based on local descriptors is proposed. Additionally, local information allows...

10.1117/12.843532 article EN Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE 2010-01-19

Strip shredded document reconstruction using optical character recognition

OPENALEX - Publications

John Perl Markus Diem Florian Kleber R. Sablatnig

Document reconstruction affects different areas such as archeology, philology and forensics. A reconstruction of fragmented writing materials allows the lost content to be retrieved and analyzed. Due to the complexity of reconstruction, automated algorithms are necessary. A methodology for shredded documents is presented in this paper, which recognizes characters at the strips' borders and matches them subsequently. In order to achieve this, an Optical Character Recognition (OCR) system is exploited that is capable of recognizing partially...

10.1049/ic.2011.0132 article EN 2011-01-01

Text Classification and Document Layout Analysis of Paper Fragments

OPENALEX - Publications

Markus Diem Florian Kleber Robert Sablatnig

In general, document image analysis methods are pre-processing steps for Optical Character Recognition (OCR) systems. In contrast, the proposed method aims at clustering document snippets, so that an automated clustering of documents can be performed. Therefore, words are classified according to printed text, manuscripts, and noise, where the third class corrects falsely segmented background elements. Having classified the text elements, a layout analysis is carried out which groups words into text lines and paragraphs. A back propagation of the weights - assigned...

10.1109/icdar.2011.175 article EN International Conference on Document Analysis and Recognition 2011-09-01

Registration and enhancing of multispectral manuscript images

OPENALEX - Publications

Martin Lettner Markus Diem Robert Sablatnig Heinz Miklas

Two medieval manuscripts are recorded, investigated and analyzed by philologists in collaboration with computer scientists. Due to mold, air humidity and water, the parchment is partially damaged and consequently hard to read. In order to enhance the readability of the text, the manuscript pages are imaged in different spectral bands ranging from 360 to 1000 nm. A registration process is necessary for further image processing methods which combine the information gained from the different bands. Therefore, the images are coarsely aligned using rotationally...

10.5281/zenodo.41086 article EN European Signal Processing Conference 2008-08-25

Document analysis applied to fragments

OPENALEX - Publications

Markus Diem Florian Kleber Robert Sablatnig

Document analysis is done to analyze entire forms (e.g. intelligent form analysis, table detection) or to describe the layout/structure of a document. In this paper, document analysis is applied to snippets of torn documents to calculate features that can be used for reconstruction. The main intention is to handle snippets of varying size and different contents (e.g. handwritten or printed text). Documents are either destroyed with the intention to make the content unavailable (e.g. business crime) or due to time-induced degeneration of ancient documents (e.g. bad storage conditions). Current...

10.1145/1815330.1815381 article EN 2010-06-09

Detecting Text Areas and Decorative Elements in Ancient Manuscripts

OPENALEX - Publications

Angelika Garz Markus Diem Robert Sablatnig

An approach for the detection of decorative elements (such as initials and headlines) and text regions, focused on ancient manuscripts, is presented. Due to their age, ancient manuscripts suffer from degradation and staining, as well as ink that has faded over time. Identifying decorative elements and text regions allows indexing a manuscript and serves as input for Optical Character Recognition (OCR), as it localizes regions of interest within document pages. We propose a robust method inspired by state-of-the-art object recognition methodologies. Scale Invariant Feature...

10.1109/icfhr.2010.35 article EN 2010-11-01

WGAN Latent Space Embeddings for Blast Identification in Childhood Acute Myeloid Leukaemia

OPENALEX - Publications

Roxane Licandro Thomas Schlegl Michael J. Reiter Markus Diem Michael Dworzak and 3 more

Acute Myeloid Leukaemia (AML) is a rare type of childhood acute leukaemia. During treatment, the assessment of the number of cancer cells is particularly important to determine treatment response and consequently adapt the treatment scheme if necessary. Minimal Residual Disease (MRD) is a diagnostic measure based on Flow CytoMetry (FCM) data that captures the amount of blasts in a blood sample and is a clinical tool for planning patients' individual therapy, which requires reliable blast identification. In this work we propose a novel...

10.1109/icpr.2018.8546177 article EN 2018 24th International Conference on Pattern Recognition (ICPR) 2018-08-01

MSIO: MultiSpectral Document Image BinarizatIOn

OPENALEX - Publications

Markus Diem Fabian Hollaus Robert Sablatnig

MultiSpectral (MS) imaging enriches document digitization by increasing the spectral resolution. We present a methodology which detects a target ink in document images, taking into account this additional information. The proposed method performs a rough foreground estimation to localize possible ink regions. Then, the Adaptive Coherence Estimator (ACE), a target detection algorithm, transforms the MS input space into a single gray-scale image where values close to one indicate ink. A spatial segmentation using GrabCut on the detection's...

10.1109/das.2016.39 article EN 2016-04-01

Torn Document Analysis as a Prerequisite for Reconstruction

OPENALEX - Publications

Florian Kleber Markus Diem Robert Sablatnig

An automated assembling of torn documents (2D) will support philologists, archaeologists and forensic experts. Especially if the amount of fragments is large (up to 1000), a human puzzle solver will not be feasible due to cost and time. Ancient manuscripts may be broken due to bad storage conditions, or documents are manually torn to make the information unreadable. In Germany, a project to reconstruct the "Stasi-files" is running for historical investigations. Also disasters like the collapse of the archive of the city of Cologne (Germany), where part of the archived documents have been...

10.1109/vsmm.2009.27 article EN 2009-09-01

Skew Estimation of Sparsely Inscribed Document Fragments

OPENALEX - Publications

Markus Diem Florian Kleber Robert Sablatnig

Document analysis is done to analyze entire forms (e.g. intelligent form analysis, table detection) or to describe the layout/structure of a document for further processing. A pre-processing step of document analysis methods is the skew estimation of scanned or photographed documents. Current skew estimation methods require the existence of large text areas, are dependent on the text type and can be limited to a specific angle range. The proposed method is gradient based in combination with a Focused Nearest Neighbor Clustering of interest points and has no limitations regarding...

10.1109/das.2012.81 article EN 2012-03-01
