NFDI4DS | UHH-SEMS - Publication Details

Sequential sentence classification in research papers using cross-domain multi-task learning

OPENALEX - Publications

Arthur Brack Elias Entrup Markos Stamatakis Pascal Buschermöhle Anett Hoppe and 1 more

The automatic semantic structuring of scientific text allows for more efficient reading of research articles and is an important indexing step for academic search engines. Sequential sentence classification is an essential task that targets the categorisation of sentences based on their content and context. However, the potential of transfer learning across different domains and types, such as full papers and abstracts, has not yet been explored in prior work. In this paper, we present a systematic analysis of sequential...

10.1007/s00799-023-00392-z article EN cc-by International Journal on Digital Libraries 2024-01-22

Text detection in images based on unsupervised classification of high-frequency wavelet coefficients

Julinda Gllavata Ralph Ewerth Bernd Freisleben

Text localization and recognition in images is important for searching information in digital photo archives, video databases and Web sites. However, since text is often printed against a complex background, it is difficult to detect. In this paper, a robust approach is presented, which can automatically detect horizontally aligned text with different sizes, fonts, colors and languages. First, a wavelet transform is applied to the image and the distribution of high-frequency wavelet coefficients is considered to statistically characterize text and non-text...

10.1109/icpr.2004.896 article EN Deleted Journal 2004-08-23

TVCalib: Camera Calibration for Sports Field Registration in Soccer

Jonas Theiner Ralph Ewerth

Sports field registration in broadcast videos is typically interpreted as the task of homography estimation, which provides a mapping between a planar field and the corresponding visible area of the image. In contrast to previous approaches, we consider the task as a camera calibration problem. First, we introduce a differentiable objective function that is able to learn the camera pose and focal length from segment correspondences (e.g., lines, point clouds), based on pixel-level annotations for segments of a known object. The module iteratively...

10.1109/wacv56688.2023.00122 article EN 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01

Learning analytics and the Universal Design for Learning (UDL): A clustering approach

Marvin Roski Ratan Sebastian Ralph Ewerth Anett Hoppe Andreas Nehring

In the context of inclusive education, Universal Design for Learning (UDL) is a framework used worldwide to create learning opportunities accessible to all learners. While much research has focused on the design and students' perceptions of UDL-based settings, studies on usage patterns in UDL-guided elements, particularly in digital environments, are still scarce. Therefore, we analyze and cluster the usage patterns of 9th and 10th graders on a web-based platform called [anonymized project name]. The platform focuses on chemistry learning, UDL principles...

10.1016/j.compedu.2024.105028 article EN cc-by Computers & Education 2024-03-02

Text detection in images based on unsupervised classification of high-frequency wavelet coefficients

Julinda Gllavata Ralph Ewerth Bernd Freisleben

Text localization and recognition in images is important for searching information in digital photo archives, video databases and Web sites. However, since text is often printed against a complex background, it is difficult to detect. In this paper, a robust approach is presented, which can automatically detect horizontally aligned text with different sizes, fonts, colors and languages. First, a wavelet transform is applied to the image and the distribution of high-frequency wavelet coefficients is considered to statistically characterize text and non-text...

10.1109/icpr.2004.1334146 article EN Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004. 2004-01-01

Supervised Video Summarization Via Multiple Feature Sets with Parallel Attention

Junaid Ahmed Ghauri Sherzod Hakimov Ralph Ewerth

The assignment of importance scores to particular frames or (short) segments in a video is crucial for summarization, but also a difficult task. Previous work utilizes only one source of visual features. In this paper, we suggest a novel model architecture that combines three feature sets for visual content and motion to predict importance scores. The proposed architecture utilizes an attention mechanism before fusing motion features and features representing the (static) visual content, i.e., derived from an image classification model. Comprehensive experimental evaluations...

10.1109/icme51207.2021.9428318 article EN 2021 IEEE International Conference on Multimedia and Expo (ICME) 2021-06-09

Exploring Data Mining in Chemistry Education: Building a Web-Based Learning Platform for Learning Analytics

Marvin Roski Ralph Ewerth Anett Hoppe Andreas Nehring

The integration of learning analytics and artificial intelligence methods into education is part of the latest developments and significantly affects chemistry education (research): researchers might face the challenge of collecting and analyzing content-rich data sets involving interdisciplinary approaches from computer science, chemistry, and education. Developing a learning platform offers a higher degree of freedom compared to using existing Learning Management Systems. This paper presents a step-by-step overview of how we designed...

10.1021/acs.jchemed.3c00794 article EN cc-by Journal of Chemical Education 2024-02-13

Unraveling the Impact of Visual Complexity on Search as Learning

Wolfgang Gritz Anett Hoppe Ralph Ewerth

Information search has become essential for learning and knowledge acquisition, offering broad access to information resources. The visual complexity of web pages is known to influence search behavior, with previous work suggesting that searchers make evaluative judgments within the first second on a page. However, there is a significant gap in our understanding of how visual complexity impacts searches specifically conducted with a learning intent. This is particularly relevant for the development of optimized information retrieval (IR) systems that effectively support...

10.48550/arxiv.2501.05289 preprint EN arXiv (Cornell University) 2025-01-09

Verifying Cross-modal Entity Consistency in News using Vision-language Models

Sahar Tahmasebi Eric Müller-Budack Ralph Ewerth

The web has become a crucial source of information, but it is also used to spread disinformation, often conveyed through multiple modalities like images and text. The identification of inconsistent cross-modal information, in particular entities such as persons, locations, and events, is critical to detect disinformation. Previous works either identify out-of-context disinformation by assessing the consistency of images to the whole document, neglecting relations of individual entities, or focus on generic entities that are not relevant to news. So...

10.48550/arxiv.2501.11403 preprint EN arXiv (Cornell University) 2025-01-20

Patent Figure Classification using Large Vision-language Models

Sushil Awale Eric Müller-Budack Ralph Ewerth

Patent figure classification facilitates faceted search in patent retrieval systems, enabling efficient prior art search. Existing approaches have explored patent figure classification for only a single aspect and for aspects with a limited number of concepts. In recent years, large vision-language models (LVLMs) have shown tremendous performance across numerous computer vision downstream tasks; however, they remain unexplored for patent figure classification. Our work explores the efficacy of LVLMs in patent figure visual question answering (VQA) and classification,...

10.48550/arxiv.2501.12751 preprint EN arXiv (Cornell University) 2025-01-22

A robust algorithm for text detection in images

Julinda Gllavata Ralph Ewerth Bernd Freisleben

Text detection in images or videos is an important step to achieve multimedia content retrieval. In this paper, an efficient algorithm which can automatically detect, localize and extract horizontally aligned text in images (and digital videos) with complex backgrounds is presented. The proposed approach is based on the application of a color reduction technique, a method for edge detection, and the localization of text regions using projection profile analyses and geometrical properties. The output of the algorithm are text boxes with a simplified background,...

10.1109/ispa.2003.1296349 article EN 2004-07-09

Multimodal Analytics for Real-world News using Measures of Cross-modal Entity Consistency

Eric Müller-Budack Jonas Theiner Sebastian Diering Maximilian Idahl Ralph Ewerth

The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases, such measures might give hints to detect fake news,...

10.1145/3372278.3390670 article EN 2020-06-02

Extraction of Positional Player Data from Broadcast Soccer Videos

Jonas Theiner Wolfgang Gritz Eric Müller-Budack Robert Rein Daniel Memmert and 1 more

Computer-aided support and analysis are becoming increasingly important in the modern world of sports. The scouting of potential prospective players, performance as well as match analysis, and the monitoring of training programs rely more and more on data-driven technologies to ensure success. Therefore, many approaches require large amounts of data, which are, however, not easy to obtain in general. In this paper, we propose a pipeline for the fully-automated extraction of positional data from broadcast video recordings of soccer...

10.1109/wacv51458.2022.00153 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022-01-01

SoccerNet 2022 Challenges Results

Silvio Giancola Anthony Cioppa Adrien Deliège Floriane Magera Vladimir Somers and 89 more

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on detecting line and goal part elements, (4) camera calibration, dedicated to retrieving the intrinsic and extrinsic camera parameters, (5) player re-identification, focusing on retrieving the same players across multiple views, and (6) multiple object tracking, focusing on tracking...

10.1145/3552437.3558545 preprint EN 2022-09-30

Deep learning for content-based video retrieval in film and television production

Markus Mühling Nikolaus Korfhage Eric Müller-Budack Christian Otto Matthias Springstein and 4 more

10.1007/s11042-017-4962-9 article EN Multimedia Tools and Applications 2017-07-05

Pushing the button: Why do learners pause online videos?

Martin Merkt Anett Hoppe Gerrit Bruns Ralph Ewerth Markus Huff

With the recent surge in digitalization across all levels of education, online video platforms gained educational relevance. Therefore, optimizing such platforms in line with learners' actual needs should be considered a priority for scientists and educators alike. In this project, we triangulate logfiles of a large German online video platform for educational videos with behavioral data from a laboratory study and objective characteristics of selected videos. We aim to understand potential motives for why participants pause videos while watching them online. Our...

10.1016/j.compedu.2021.104355 article EN cc-by Computers & Education 2021-10-21

Interpretable Semantic Photo Geolocation

Jonas Theiner Eric Müller-Budack Ralph Ewerth

Planet-scale photo geolocalization is the complex task of estimating the location depicted in an image solely based on its visual content. Due to the success of convolutional neural networks (CNNs), current approaches achieve superhuman performance. However, previous work has exclusively focused on optimizing geolocalization accuracy. Due to the black-box property of deep learning systems, their predictions are difficult to validate for humans. State-of-the-art methods treat the task as a classification problem, where the choice of the classes, that...

10.1109/wacv51458.2022.00154 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022-01-01

Fast Motion Estimation on Graphics Hardware for H.264 Video Encoding

M. Schwalb Ralph Ewerth Bernd Freisleben

The video coding standard H.264 supports video compression with a higher coding efficiency than previous standards. However, this comes at the expense of an increased encoding complexity, in particular for motion estimation, which becomes a very time-consuming task even for today's central processing units (CPU). On the other hand, modern graphics hardware includes a powerful graphics processing unit (GPU) whose computing power remains idle most of the time. In this paper, we present a GPU-based approach to motion estimation for the purpose of H.264 video encoding. A small diamond search is...

10.1109/tmm.2008.2008873 article EN IEEE Transactions on Multimedia 2008-12-24

Content-based video retrieval in historical collections of the German Broadcasting Archive

Markus Mühling Manja Meister Nikolaus Korfhage Jörg Wehling Angelika Hörth and 2 more

10.1007/s00799-018-0236-z article EN International Journal on Digital Libraries 2018-03-08

Characterization and classification of semantic image-text relations

Christian Otto Matthias Springstein Avishek Anand Ralph Ewerth

The beneficial, complementary nature of visual and textual information to convey information is widely known, for example, in entertainment, news, advertisements, science, or education. While the complex interplay of image and text to form semantic meaning has been thoroughly studied in linguistics and communication sciences for several decades, computer vision and multimedia research remained on the surface of the problem more or less. An exception is previous work that introduced the two metrics Cross-Modal Mutual Information and Semantic...

10.1007/s13735-019-00187-6 article EN cc-by International Journal of Multimedia Information Retrieval 2020-01-22

A Recommender System For Open Educational Videos Based On Skill Requirements

Mohammadreza Tavakoli Sherzod Hakimov Ralph Ewerth Gábor Kismihók

In this paper, we suggest a novel method to help learners find relevant open educational videos to master skills demanded on the labour market. We have built a prototype, which 1) applies text classification and text mining methods on job vacancy announcements to match jobs and their required skills; 2) predicts the quality of videos; and 3) creates an open educational video recommender system to suggest personalized learning content to learners. For the first evaluation of this prototype we focused on the area of data science related jobs. Our prototype was evaluated by in-depth,...

10.1109/icalt49669.2020.00008 article EN 2020-07-01

Identification of Speaker Roles and Situation Types in News Videos

Gullal S. Cheema Judi Arafat Chiao-I Tseng John Bateman Ralph Ewerth and 1 more

The proliferation of news sources on the web amplifies the problem of disinformation and misinformation, impacting public perception and societal stability. These issues necessitate the identification of bias in news broadcasts, whereby the analysis and understanding of speaker roles and news contexts are essential prerequisites. Although there is prior research on multimodal speaker role recognition (mostly) in the news domain, modern feature representations have not been explored yet, and no comprehensive dataset is available. In this paper, we propose a novel...

10.1145/3652583.3658101 article EN cc-by 2024-05-30

Estimation of arbitrary camera motion in MPEG videos

Ralph Ewerth M. Schwalb P. Tessmann Bernd Freisleben

Several algorithms have been proposed to solve the problem of camera motion estimation in digital videos. However, the distinction between translation along the x-axis (y-axis) and rotation around the y-axis (x-axis) has only rarely been considered, and no approach of this kind is known to us for the MPEG domain. In this paper, we present such an algorithm. For performance reasons, it is reasonable to extract motion vectors directly from the compressed MPEG stream. However, since motion vectors are optimal with respect to compression, they often do not model real motion adequately...

10.1109/icpr.2004.339 article EN Deleted Journal 2004-08-23

Ralph Ewerth