NFDI4DS | UHH-SEMS - Publication Details

Jonathan Foote

ORCID: 0000-0003-4411-1362

About

Contact & Profiles

A5087467671

Research Areas

Music and Audio Processing
Video Analysis and Summarization
Speech and Audio Processing
Advanced Image and Video Retrieval Techniques
Music Technology and Sound Studies
Speech Recognition and Synthesis
Advanced Vision and Imaging
Image Retrieval and Classification Techniques
Multimedia Communication and Technology
Architecture and Art History Studies
Augmented Reality Applications
Speech and dialogue systems
Neuroscience and Music Perception
Renaissance and Early Modern Studies
Interactive and Immersive Displays
Advanced Text Analysis Techniques
Architecture and Computational Design
Computer Graphics and Visualization Techniques
Architecture, Modernity, and Design
Video Coding and Compression Technologies
Advanced Data Compression Techniques
3D Surveying and Cultural Heritage
Algorithms and Data Compression
Recommender Systems and Techniques
Natural Language Processing Techniques

Pomona College
2024

Aarhus School of Architecture
2016-2023

Virginia Tech
2012

FX Palo Alto Laboratory
1998-2010

Fuji Xerox (Japan)
2005

Xerox (France)
1999-2002

University of Cambridge
1995-2002

Xerox (United States)
2002

National University of Singapore
1997-1999

Brown University
1991-1994

Content-based retrieval of music and audio

OPENALEX - Publications

Jonathan Foote

Though many systems exist for content-based retrieval of images, little work has been done on the audio portion of the multimedia stream. This paper presents a system to retrieve audio documents by acoustic similarity. The similarity measure is based on statistics derived from a supervised vector quantizer, rather than on matching simple pitch or spectral characteristics. The system is thus able to learn distinguishing features while ignoring unimportant variation. Both theoretical and experimental results are presented,...

10.1117/12.290336 article EN Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE 1997-10-06

Automatic audio segmentation using a measure of audio novelty

Jonathan Foote

The paper describes methods for automatically locating points of significant change in music or audio by analyzing local self-similarity. This method can find individual note boundaries or even natural segment boundaries such as verse/chorus or speech/music transitions, even in the absence of cues such as silence. The approach uses the signal to model itself, and thus neither relies on particular acoustic cues nor requires training. We present a wide variety of applications, including indexing, segmenting, and beat tracking of music and audio. The method works well...

10.1109/icme.2000.869637 article EN 2002-11-07
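The novelty measure summarized in the abstract above can be illustrated with a small sketch. It assumes each audio frame has already been reduced to a feature vector, compares frames with cosine similarity, and slides a checkerboard kernel along the diagonal of the resulting self-similarity matrix. The function names, the cosine measure, and the untapered kernel are illustrative assumptions, not details taken from this listing.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(frames):
    """Pairwise similarity between every two frames of the signal."""
    return [[cosine_similarity(a, b) for b in frames] for a in frames]


def checkerboard_kernel(half_width):
    """+1 in the two within-segment quadrants, -1 in the two cross-segment quadrants."""
    size = 2 * half_width
    return [
        [1.0 if (r < half_width) == (c < half_width) else -1.0 for c in range(size)]
        for r in range(size)
    ]


def novelty_score(S, half_width):
    """Correlate the kernel along the main diagonal of S.

    scores[t] is high when frames t-half_width..t-1 resemble each other,
    frames t..t+half_width-1 resemble each other, and the two groups differ.
    Positions where the kernel does not fit are left at 0.0.
    """
    n = len(S)
    kernel = checkerboard_kernel(half_width)
    scores = [0.0] * n
    for t in range(half_width, n - half_width + 1):
        total = 0.0
        for r in range(2 * half_width):
            for c in range(2 * half_width):
                total += kernel[r][c] * S[t - half_width + r][t - half_width + c]
        scores[t] = total
    return scores
```

Peaks in the returned list are candidate segment boundaries. A wider kernel responds to longer-term structure such as section changes, while a narrow one responds to short events; the right width depends on the material and is left as a parameter here.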

WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition

T. Robinson J. Fransen David Pye Jonathan Foote Steve Renals

A significant new speech corpus of British English has been recorded at Cambridge University. Derived from the Wall Street Journal text corpus, WSJCAM0 constitutes one of the largest corpora of spoken British English currently in existence. It has been specifically designed for the construction and evaluation of speaker-independent speech recognition systems. The database consists of 140 speakers each speaking about 110 utterances. This paper describes the motivation for the corpus, the processes undertaken in its construction, and the utilities needed as support tools. All utterance...

10.1109/icassp.1995.479278 article EN International Conference on Acoustics, Speech, and Signal Processing 2002-11-19

An overview of audio information retrieval

Jonathan Foote

10.1007/s005300050106 article EN Multimedia Systems 1999-01-01

Visualizing music and audio using self-similarity

Jonathan Foote

This paper presents a novel approach to visualizing the time structure of music and audio. The acoustic similarity between any two instants of an audio recording is displayed in a 2D representation, allowing identification of structural and rhythmic characteristics. Examples are presented for classical and popular music. Applications include content-based analysis and segmentation, as well as tempo extraction.

10.1145/319463.319472 article EN 1999-10-30

Video Manga

Shingo Uchihashi Jonathan Foote Andreas Girgensohn John Boreczky

This paper presents methods for automatically creating pictorial video summaries that resemble comic books. The relative importance of video segments is computed from their length and novelty. Image and audio analysis is used to detect and emphasize meaningful events. Based on this importance measure, we choose relevant keyframes. Selected keyframes are sized by importance, then efficiently packed into a pictorial summary. We present a quantitative measure of how well a summary captures the salient events in a video, and show it can be...

10.1145/319463.319654 article EN 1999-10-30

Temporal event clustering for digital photo collections

Matthew Cooper Jonathan Foote Andreas Girgensohn Lynn Wilcox

Organizing digital photograph collections according to events such as holiday gatherings or vacations is a common practice among photographers. To support photographers in this task, we present similarity-based methods to cluster digital photos by time and image content. The approach is general and unsupervised, and makes minimal assumptions regarding the structure or statistics of the photo collection. We present several variants of an automatic unsupervised algorithm to partition a collection of digital photographs based either on temporal...

10.1145/1083314.1083317 article EN ACM Transactions on Multimedia Computing Communications and Applications 2005-08-01

The beat spectrum: a new approach to rhythm analysis

Jonathan Foote Shingo Uchihashi

We introduce the beat spectrum, a new method of automatically characterizing the rhythm and tempo of music and audio. The beat spectrum is a measure of acoustic self-similarity as a function of time lag. Highly structured or repetitive music will have strong beat spectrum peaks at the repetition times. This reveals both tempo and the relative strength of particular beats, and therefore can distinguish between different kinds of rhythms at the same tempo. We also introduce the beat spectrogram, which graphically illustrates rhythm variation over time. Unlike previous approaches to tempo analysis, the beat spectrum does not...

10.1109/icme.2001.1237863 article EN IEEE International Conference on Multimedia and Expo (ICME) 2001-01-01
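The definition in the abstract above, self-similarity as a function of time lag, can be sketched as follows. This version assumes a precomputed square similarity matrix `S` (one row and column per audio frame) and averages each off-diagonal; the function name, the averaging, and the `max_lag` parameter are assumptions made for illustration rather than details given in this listing.

```python
def beat_spectrum(S, max_lag):
    """Average similarity between frames that are `lag` steps apart.

    S is a square similarity matrix (list of lists), and the result holds
    one value per lag from 0 to max_lag inclusive. Repetitive material
    produces peaks at lags matching its repetition period.
    """
    n = len(S)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(S)")
    spectrum = []
    for lag in range(max_lag + 1):
        # The lag-th superdiagonal pairs frame i with frame i + lag.
        diagonal = [S[i][i + lag] for i in range(n - lag)]
        spectrum.append(sum(diagonal) / len(diagonal))
    return spectrum
```

Averaging rather than summing keeps long and short lags comparable, since longer lags have fewer frame pairs. Computing the same quantity over successive short windows of the signal would give one column per window, which is one way to read the "beat spectrogram" mentioned in the abstract.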

Automatic content-based retrieval of broadcast news

Monica Brown Jonathan Foote Gareth J. F. Jones Karen Spärck Jones S.J. Young

10.1145/217279.215080 article EN 1995-01-01

A semi-automatic approach to home video editing

Andreas Girgensohn John Boreczky Patrick Chiu John Doherty Jonathan Foote Gene Golovchinsky Shingo Uchihashi Lynn Wilcox

Published in UIST '00: Proceedings of the 13th annual ACM symposium on User interface software and technology, November 2000, pages 81–89. Lead author affiliation: FX Palo Alto Laboratory, 3400 Hillview Avenue, Palo Alto, CA.

10.1145/354401.354415 article EN 2000-01-01

Media segmentation using self-similarity decomposition

Jonathan Foote Matthew Cooper

We present a framework for analyzing the structure of digital media streams. Though our methods work for video, text, and audio, we concentrate on detecting the structure of digital music files. In the first step, spectral data is used to construct a similarity matrix calculated from inter-frame spectral similarity. The digital audio can be robustly segmented by correlating a kernel along the diagonal of the similarity matrix. Once segmented, spectral statistics of each segment are computed. In the second step, segments are clustered based on the self-similarity of their statistics. This reveals in...

10.1117/12.476302 article EN Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE 2003-01-20

Retrieving spoken documents by combining multiple index sources

Gareth J. F. Jones Jonathan Foote Karen Spärck Jones S. Young

Published in SIGIR '96: Proceedings of the 19th annual international ACM conference on Research and development in information retrieval, August 1996, pages... Lead author affiliation: Computer Laboratory, University of Cambridge, New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.

10.1145/243199.243208 article EN Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '96 1996-01-01

Summarizing popular music via structural similarity analysis

Matthew Cooper Jonathan Foote

We present a framework for summarizing digital media based on structural analysis. Though these methods are applicable to general media, we concentrate here on characterizing the repetitive structure in popular music. In the first step, a similarity matrix is calculated from interframe spectral similarity. Segment boundaries, such as verse-chorus transitions, are found by correlating a kernel along the diagonal of the matrix. Once segmented, spectral statistics of each segment are computed. In the second step, segments are clustered, pairwise...

10.1109/aspaa.2003.1285836 article EN 2004-05-06

Summarizing video using non-negative similarity matrix factorization

Matthew Cooper Jonathan Foote

We present a novel approach to automatically extracting summary excerpts from audio and video. Our approach is to maximize the average similarity between the excerpt and the source. We first calculate a similarity matrix by comparing each pair of time samples using a quantitative similarity measure. To determine the segment with the highest average similarity, we maximize the summation of the self-similarity matrix over the support of the segment. To select multiple excerpts while avoiding redundancy, we compute the non-negative matrix factorization (NMF) of the similarity matrix into its essential structural components. We then build...

10.1109/mmsp.2002.1203239 article EN 2004-01-23
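The single-excerpt step described in the abstract above, choosing the segment whose summed self-similarity is largest, can be sketched as a sliding-window search. The sketch assumes a precomputed square similarity matrix `S` and a fixed excerpt length in frames; the function name and the fixed-length restriction are illustrative assumptions, and the NMF step for multiple excerpts is not shown.

```python
def best_excerpt(S, length):
    """Find the contiguous excerpt most similar, on average, to the whole source.

    S is a square similarity matrix (list of lists) and `length` is the
    excerpt length in frames. Returns (start_index, average_similarity),
    where the average is taken over all pairs (excerpt frame, source frame).
    Ties are resolved in favor of the earliest start.
    """
    n = len(S)
    if not 0 < length <= n:
        raise ValueError("length must satisfy 0 < length <= len(S)")

    # Summing row i gives the total similarity of frame i to every frame.
    row_sums = [sum(row) for row in S]

    # Slide a window of `length` rows and keep the largest total.
    window = sum(row_sums[:length])
    best_start, best_total = 0, window
    for start in range(1, n - length + 1):
        window += row_sums[start + length - 1] - row_sums[start - 1]
        if window > best_total:
            best_start, best_total = start, window

    return best_start, best_total / (n * length)
```

Precomputing the row sums means each candidate excerpt is scored in constant time, so the search is dominated by the single pass over the matrix.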

FlyCam: practical panoramic video and automatic camera control

Jonathan Foote Don Kimber

We describe computationally and materially inexpensive methods for panoramic video imaging. Digitally combining images from an array of cameras results in a wide-field camera built from off-the-shelf hardware. We present methods that both correct lens distortion and seamlessly merge images into a panoramic image. Electronically selecting a region of this image results in a rapidly steerable "virtual camera". Because the camera is fixed with respect to the background, simple motion analysis can be used to track objects and people of interest. We present algorithms for automatic camera control...

10.1109/icme.2000.871033 article EN 2002-11-07

Region of interest extraction and virtual camera control based on panoramic video capturing

Xinding Sun Jonathan Foote Don Kimber B. S. Manjunath

We present a system for automatically extracting the region of interest and controlling virtual cameras based on panoramic video. It targets applications such as classroom lectures and video conferencing. For capturing panoramic video, we use the FlyCam system, which produces high-resolution, wide-angle video by stitching images from multiple stationary cameras. To generate conventional video, a region of interest (ROI) can be cropped from the panoramic video. We propose methods for ROI detection, tracking, and virtual camera control that work in both the uncompressed and compressed domains. The ROI is...

10.1109/tmm.2005.854388 article EN IEEE Transactions on Multimedia 2005-09-20

Open-vocabulary speech indexing for voice and video mail retrieval

Monica Brown Jonathan Foote Gareth J. F. Jones Karen Spärck Jones S.J. Young

Published in MULTIMEDIA '96: Proceedings of the fourth ACM international conference on Multimedia, February 1997, pages... Author affiliations as listed: M. G. Brown, Olivetti Research Limited, 24a Trumpington St., Cambridge, CB2 1QA, UK; J. T. Foote, Cambridge University Engineering Department; G. J. F. Jones, Computer Laboratory, University of Cambridge.

10.1145/244130.244232 article EN 1996-01-01

Creating music videos using automatic media analysis

Jonathan Foote Matthew Cooper Andreas Girgensohn

We present methods for automatic and semi-automatic creation of music videos, given an arbitrary audio soundtrack and source video. Significant audio changes are automatically detected; similarly, the source video is automatically segmented and analyzed for suitability based on camera motion and exposure. Video with excessive camera motion or poor contrast is penalized with a high unsuitability score, and is more likely to be discarded in the final edit. High quality video clips are then selected and aligned in time with significant audio changes. Video clips are adjusted to match the audio segments by selecting the most...

10.1145/641007.641119 article EN 2002-12-01

Temporal event clustering for digital photo collections

Matthew Cooper Jonathan Foote Andreas Girgensohn Lynn Wilcox

We present similarity-based methods to cluster digital photos by time and image content. The approach is general, unsupervised, and makes minimal assumptions regarding the structure or statistics of the photo collection. We present results for the algorithm based solely on temporal similarity, and jointly on temporal and content-based similarity. We also describe a supervised algorithm based on learning vector quantization. Finally, we include experimental results for the proposed algorithms and several competing approaches on two test collections.

10.1145/957013.957093 article EN 2003-11-02

FlySPEC

Qiong Liu Don Kimber Jonathan Foote Lynn Wilcox John Boreczky

FlySPEC is a video camera system designed for real-time remote operation. A hybrid design combines the high resolution of an optomechanical video camera with the wide field of view always available from a panoramic camera. The control system integrates requests from multiple users so that each controls a virtual camera, and it seamlessly integrates manual and fully automatic control. It supports a range of options from untended automatic to full manual control. The system can also learn control strategies from user requests. Additionally, the panoramic view is always available for an intuitive interface, and objects are never out of view regardless of the zoom factor. We...

10.1145/641007.641110 article EN 2002-12-01

Discriminative Techniques for Keyframe Selection

Matthew Cooper Jonathan Foote

A convenient representation of a video segment is a single "keyframe". Keyframes are widely used in applications such as non-linear browsing and video editing. With existing methods of keyframe selection, similar video segments result in very similar keyframes, with the drawback that actual differences between the segments may be obscured. We present methods for keyframe selection based on two criteria: capturing the similarity to the represented segment, and preserving the differences from other segments, so that different segments will have visually distinct representations. discriminative...

10.1109/icme.2005.1521470 article EN 2005-10-24
