Roeland Ordelman

ORCID: 0000-0001-9229-0006
Research Areas
  • Video Analysis and Summarization
  • Music and Audio Processing
  • Speech Recognition and Synthesis
  • Natural Language Processing Techniques
  • Speech and dialogue systems
  • Advanced Image and Video Retrieval Techniques
  • Image Retrieval and Classification Techniques
  • Diverse Musicological Studies
  • Multimodal Machine Learning Applications
  • Digital and Traditional Archives Management
  • AI in Service Interactions
  • Social Robot Interaction and HRI
  • Topic Modeling
  • Multimedia Communication and Technology
  • Digital Humanities and Scholarship
  • Speech and Audio Processing
  • Radio, Podcasts, and Digital Media
  • Semantic Web and Ontologies
  • Hate Speech and Cyberbullying Detection
  • Music Technology and Sound Studies
  • Advanced Data Compression Techniques
  • Scientific Computing and Data Management
  • Bullying, Victimization, and Aggression
  • Advanced Text Analysis Techniques
  • Data Visualization and Analytics

University of Twente
2013-2024

Netherlands Institute for Sound and Vision
2009-2022

Human Media
2001-2017

Delft University of Technology
2009

Fraunhofer Institute for Intelligent Analysis and Information Systems
2009

Netherlands Organisation for Applied Scientific Research
2009

Radboud University Nijmegen
2009

University of Edinburgh
2005

University of Sheffield
2005

Brno University of Technology
2005

Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We overview three tasks offered in the MediaEval 2010 benchmarking initiative, for each, describing its use scenario, definition and the data set released. For each task, a reference algorithm is presented that was used within MediaEval 2010, and comments are included on lessons learned. The Tagging Task, Professional involves automatically matching episodes in a collection of Dutch television with subject...

10.1145/1991996.1992047 article EN 2011-04-18

Work on expressive speech synthesis has long focused on the expression of basic emotions. In recent years, however, interest in other expressive styles has been increasing. The research presented in this paper aims at the generation of a storytelling speaking style, which is suitable for storytelling applications and, more in general, for applications aimed at children. Based on an analysis of human storytellers' speech, we designed and implemented a set of prosodic rules for converting "neutral" speech, as produced by a text-to-speech system, into storytelling speech. An evaluation of our system...

10.1109/tasl.2006.876129 article EN IEEE Transactions on Audio Speech and Language Processing 2006-06-21

Searching for relevant webpages and following hyperlinks to related content is a widely accepted and effective approach to information seeking on the textual web. Existing work on multimedia information retrieval has focused on search for individual relevant items or on content linking without specific attention to search results. We describe our research exploring integrated multimodal search and hyperlinking for multimedia data. Our investigation is based on the MediaEval 2012 Search and Hyperlinking task. This includes a known-item search task using the Blip10000 internet video collection, where...

10.1145/2461466.2461511 preprint EN 2013-04-16

In this technical demonstration, we showcase a multimedia search engine that facilitates semantic access to archival rock n' roll concert video. The key novelty is the crowdsourcing mechanism, which relies on online users to improve, extend, and share automatically detected results in video fragments using an advanced timeline-based video player. The user feedback serves as valuable input to further improve automated multimedia retrieval results, such as concepts and transcribed interviews. The search engine has been operational to harvest...

10.1145/1873951.1874278 article EN Proceedings of the 18th ACM International Conference on Multimedia 2010-10-25

The automatic processing of speech collected in conference style meetings has attracted considerable interest, with several large scale projects devoted to this area. In this paper we explore the use of various meeting corpora for the purpose of speech recognition. In particular we investigate the similarity of these resources and how to efficiently use them in the construction of a meeting transcription system. The analysis shows distinctive features for each resource. However, the benefit of pooling data, and hence the similarity, seems sufficient to speak of a generic "conference...

10.21437/interspeech.2005-543 article EN Interspeech 2005 2005-09-04

This paper addresses compound splitting for Dutch in the context of broadcast news transcription. Language models were created using original text versions and text versions that were decomposed using a data-driven compound splitting algorithm. Language model performances were compared in terms of out-of-vocabulary rates and word error rates in a real-world broadcast news transcription task. It was concluded that compound splitting does improve ASR performance. Best results were obtained when frequent compounds were not decomposed.

10.21437/eurospeech.2003-105 article EN 2003-09-01

In this paper we discuss the speech activity detection system that we used for detecting speech regions in the Dutch TRECVID video collection. The system is designed to filter non-speech like music or sound effects out of the signal without the use of predefined non-speech models. Because the system trains its models on-line, it is robust for handling out-of-domain data. The speech activity error rate on an out-of-domain test set, recordings of English conference meetings, was 4.4%. The overall error rate on twelve randomly selected five minute TRECVID fragments was 11.5%.

10.21437/interspeech.2007-729 article EN Interspeech 2007 2007-08-27

The MediaEval Multimedia Benchmark leveraged community cooperation and crowdsourcing to develop a large Internet video dataset for its Genre Tagging and Rich Speech Retrieval tasks.

10.1109/mmul.2012.27 article EN IEEE Multimedia 2012-05-24