NFDI4DS | UHH-SEMS - Publication Details

Fernando Pereira

ORCID: 0000-0001-6100-947X

About

Contact & Profiles

A5044708805

Research Areas

Video Coding and Compression Technologies
Advanced Vision and Imaging
Advanced Data Compression Techniques
Wireless Communication Security Techniques
Natural Language Processing Techniques
Topic Modeling
Video Analysis and Summarization
Advanced Image and Video Retrieval Techniques
3D Shape Modeling and Analysis
Image and Video Quality Assessment
Computer Graphics and Visualization Techniques
Chaos-based Image/Signal Encryption
Speech Recognition and Synthesis
Face recognition and analysis
Algorithms and Data Compression
Multimedia Communication and Technology
Advanced Image Processing Techniques
Advanced Steganography and Watermarking Techniques
Music and Audio Processing
3D Surveying and Cultural Heritage
Video Surveillance and Tracking Methods
Visual Attention and Saliency Detection
Image Retrieval and Classification Techniques
Face and Expression Recognition
Digital Rights Management and Security

University of Lisbon
2016-2025

Instituto de Telecomunicações
2016-2025

Iscte – Instituto Universitário de Lisboa
2024

Weatherford College
2024

Instituto Politécnico de Lisboa
1990-2023

Governo do Estado de São Paulo
2023

Instituto Superior de Tecnologias Avançadas
2015-2023

Voith (Germany)
2023

Secretaria Regional do Ambiente e Recursos Naturais
2023

Zimmer Biomet (United States)
2023

A theory of learning from different domains

OPENALEX - Publications

Shai Ben-David, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, and 1 more

Discriminative learning methods for classification perform well when training and test data are drawn from the same distribution. Often, however, we have plentiful labeled training data from a source domain but wish to learn a classifier which performs well on a target domain with a different distribution and little or no labeled training data. In this work we investigate two questions. First, under what conditions can a classifier trained from source data be expected to perform well on target data? Second, given a small amount of labeled target data, how should we combine it during training with the large amount of labeled source data to achieve the lowest target error at test time? We...

10.1007/s10994-009-5152-4 article EN cc-by-nc Machine Learning 2009-10-22

The Unreasonable Effectiveness of Data

Alon Halevy, Peter Norvig, Fernando Pereira

Problems that involve interacting with humans, such as natural language understanding, have not proven to be solvable by concise, neat formulas like F = ma. Instead, the best approach appears to be to embrace the complexity of the domain and address it by harnessing the power of data: if other humans engage in the tasks and generate large amounts of unlabeled, noisy data, new algorithms can be used to build high-quality models from the data.

10.1109/mis.2009.36 article EN IEEE Intelligent Systems 2009-03-01

Domain adaptation with structural correspondence learning

John Blitzer, Ryan McDonald, Fernando Pereira

Discriminative learning methods are widely used in natural language processing. These methods work best when their training and test data are drawn from the same distribution. For many NLP tasks, however, we are confronted with new domains in which labeled data is scarce or non-existent. In such cases, we seek to adapt existing models from a resource-rich source domain to a resource-poor target domain. We introduce structural correspondence learning to automatically induce correspondences among features from different domains. We test our technique on...

10.3115/1610075.1610094 article EN 2006-01-01

Shallow parsing with conditional random fields

Fei Sha, Fernando Pereira

Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluation datasets and extensive comparison among methods. We show here how to train a conditional random field to achieve performance as good as any reported base noun-phrase chunking method on the CoNLL task, and better than any reported single model. Improved training...

10.3115/1073445.1073473 article EN 2003-01-01

Distributional clustering of English words

Fernando Pereira, Naftali Tishby, Lillian Lee

We describe and evaluate experimentally a method for clustering words according to their distribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the similarity measure for clustering. Clusters are represented by average context distributions derived from the given words according to their probabilities of cluster membership. In many cases, the clusters can be thought of as encoding coarse sense distinctions. Deterministic annealing is used to find lowest distortion...

10.3115/981574.981598 article EN 1993-01-01
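
The abstract above names relative entropy between context distributions as the similarity measure. As a minimal sketch of that one ingredient only (not the paper's clustering procedure), the Python below computes relative frequencies from context counts and the relative entropy D(p || q) between two such distributions. The function names and the toy counts are hypothetical and chosen purely for illustration.

```python
import math


def relative_frequencies(counts):
    """Turn raw context counts into a relative frequency distribution."""
    total = sum(counts.values())
    return {context: k / total for context, k in counts.items()}


def relative_entropy(p, q):
    """Relative entropy (KL divergence) D(p || q) in bits.

    p and q map contexts to probabilities. Returns math.inf when p puts
    mass on a context to which q assigns zero probability.
    """
    total = 0.0
    for context, p_c in p.items():
        if p_c == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        q_c = q.get(context, 0.0)
        if q_c == 0.0:
            return math.inf
        total += p_c * math.log2(p_c / q_c)
    return total


if __name__ == "__main__":
    # Hypothetical verb-context counts for two nouns (made-up numbers).
    wine = relative_frequencies({"drink": 6, "pour": 3, "buy": 1})
    beer = relative_frequencies({"drink": 7, "pour": 2, "buy": 1})
    print(relative_entropy(wine, beer))
```

Relative entropy is asymmetric, so D(p || q) and D(q || p) generally differ; which direction a clustering method uses is a design choice that this sketch does not settle.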

Video coding with H.264/AVC: tools, performance, and complexity

Jörn Östermann, J. Bormans, Peter List, Detlev Marpe, Matthias Narroschke, and 3 more

H.264/AVC, the result of the collaboration between the ISO/IEC Moving Picture Experts Group and the ITU-T Video Coding Experts Group, is the latest standard for video coding. The goals of this standardization effort were enhanced compression efficiency and a network-friendly video representation for interactive (video telephony) and non-interactive applications (broadcast, streaming, storage, video on demand). H.264/AVC provides gains in compression efficiency of up to 50% over a wide range of bit rates and video resolutions compared to previous standards. Compared...

10.1109/mcas.2004.1286980 article EN IEEE Circuits and Systems Magazine 2004-01-01

Non-projective dependency parsing using spanning tree algorithms

Ryan McDonald, Fernando Pereira, Kiril Ribarov, Jan Hajič

We formalize weighted dependency parsing as searching for maximum spanning trees (MSTs) in directed graphs. Using this representation, the parsing algorithm of Eisner (1996) is sufficient for searching over all projective trees in O(n^3) time. More surprisingly, the representation is extended naturally to non-projective parsing using the Chu-Liu-Edmonds (Chu and Liu, 1965; Edmonds, 1967) MST algorithm, yielding an O(n^2) parsing algorithm. We evaluate these methods on the Prague Dependency Treebank using online large-margin learning techniques (Crammer et al.,...

10.3115/1220575.1220641 article EN 2005-01-01

Weighted finite-state transducers in speech recognition

Mehryar Mohri, Fernando Pereira, Michael Riley

10.1006/csla.2001.0184 article EN Computer Speech & Language 2002-01-01

Online large-margin training of dependency parsers

Ryan McDonald, Koby Crammer, Fernando Pereira

We present an effective training algorithm for linearly-scored dependency parsers that implements online large-margin multi-class training (Crammer and Singer, 2003; Crammer et al., 2003) on top of efficient parsing techniques for dependency trees (Eisner, 1996). The trained parsers achieve a competitive dependency accuracy for both English and Czech with no language-specific enhancements.

10.3115/1219840.1219852 article EN 2005-01-01

Confidence-weighted linear classification

Mark Dredze, Koby Crammer, Fernando Pereira

We introduce confidence-weighted linear classifiers, which add parameter confidence information to linear classifiers. Online learners in this setting update both classifier parameters and the estimate of their confidence. The particular online algorithms we study here maintain a Gaussian distribution over parameter vectors and update the mean and covariance of the distribution with each instance. Empirical evaluation on a range of NLP tasks shows that our algorithm improves over other state-of-the-art online and batch methods, learns faster in the online setting, and lends itself to better...

10.1145/1390156.1390190 article EN 2008-01-01

JPEG Pleno: Toward an Efficient Representation of Visual Reality

Touradj Ebrahimi, Siegfried Föessel, Fernando Pereira, Peter Schelkens

In discussing the rationale behind the vision for JPEG Pleno and how the new standardization initiative aims to reinvent the future of imaging, the authors review plenoptic representation and its underlying practical implications and challenges in implementing real-world applications with an enhanced quality of experience.

10.1109/mmul.2016.64 article EN IEEE Multimedia 2016-10-01

Inside-outside reestimation from partially bracketed corpora

Fernando Pereira, Yves Schabes

The inside-outside algorithm for inferring the parameters of a stochastic context-free grammar is extended to take advantage of constituent information (constituent bracketing) in a partially parsed corpus. Experiments on formal and natural language parsed corpora show that the new algorithm can achieve faster convergence and better modeling of hierarchical structure than the original one. In particular, over 90% test set bracketing accuracy was achieved for grammars inferred by our algorithm from a training set of handparsed part-of-speech strings...

10.3115/981967.981984 article EN 1992-01-01

MPEG-7: the generic multimedia content description standard, part 1

José M. Martínez, Rob Koenen, Fernando Pereira

The recently completed ISO/IEC International Standard 15938, formally called the Multimedia Content Description Interface (but better known as MPEG-7), provides a rich set of tools for completely describing multimedia content. The standard wasn't just designed from a content management viewpoint (classical archival information). It includes an innovative description of the media's content, which we can extract via content analysis and processing. MPEG-7 also isn't aimed at any one application; rather, the elements...

10.1109/93.998074 article EN IEEE Multimedia 2002-01-01

Prolog - the language and its implementation compared with Lisp

David H. Warren, Luís M.C. Pereira, Fernando Pereira

Prolog is a simple but powerful programming language founded on symbolic logic. The basic computational mechanism is a pattern matching process ("unification") operating on general record structures ("terms" of logic). We briefly review the language and compare it especially with pure Lisp. The remainder of the paper discusses techniques for implementing Prolog efficiently; in particular we describe how to compile the patterns involved in the matching process. These techniques are as incorporated in our DECsystem-10 Prolog compiler (written in Prolog). The code it generates...

10.1145/800228.806939 article EN 1977-01-01

Multilingual dependency analysis with a two-stage discriminative parser

Ryan McDonald, Kevin Lerman, Fernando Pereira

We present a two-stage multilingual dependency parser and evaluate it on 13 diverse languages. The first stage is based on the unlabeled dependency parsing models described by McDonald and Pereira (2006) augmented with morphological features for a subset of the languages. The second stage takes the output from the first and labels all the edges in the dependency graph with appropriate syntactic categories using a globally trained sequence classifier over components of the graph. We report results on the CoNLL-X shared task (Buchholz et al., 2006) data sets and present an error analysis.

10.3115/1596276.1596317 article EN 2006-01-01

Identifying gene and protein mentions in text using conditional random fields

Ryan McDonald, Fernando Pereira

Background: We present a model for tagging gene and protein mentions from text using the probabilistic sequence tagging framework of conditional random fields (CRFs). Conditional random fields model the probability P(t|o) of a tag sequence given an observation sequence directly, and have previously been employed successfully for other tagging tasks. The mechanics of CRFs and their relationship to maximum entropy are discussed in detail. Results: We employ a diverse feature set containing standard orthographic features combined with expert features in the form of gene and biological term...

10.1186/1471-2105-6-s1-s6 article EN cc-by BMC Bioinformatics 2005-05-01
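
For readers unfamiliar with the conditional probability P(t | o) mentioned in the abstract above, the textbook linear-chain CRF form is sketched below. The feature functions f_j, weights λ_j, and normalizer Z(o) are standard notation assumed here for illustration; the listing itself does not give the paper's exact parameterization.

```latex
P(t \mid o) = \frac{1}{Z(o)}
  \exp\!\Big( \sum_{i=1}^{n} \sum_{j} \lambda_j \, f_j(t_{i-1}, t_i, o, i) \Big),
\qquad
Z(o) = \sum_{t'} \exp\!\Big( \sum_{i=1}^{n} \sum_{j} \lambda_j \, f_j(t'_{i-1}, t'_i, o, i) \Big)
```

Here t = (t_1, ..., t_n) is the tag sequence, o is the observation sequence, and Z(o) sums over all candidate tag sequences t' so that the probabilities normalize to one.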

Correlation Noise Modeling for Efficient Pixel and Transform Domain Wyner–Ziv Video Coding

Catarina Brites, Fernando Pereira

In recent years, practical Wyner-Ziv (WZ) video coding solutions have been proposed with promising results. Most of the solutions available in the literature model the correlation noise (CN) between the original frame and its estimation made at the decoder, which is the so-called side information (SI), by a given distribution whose relevant parameters are estimated using an offline process, assuming that the SI is available at the encoder or the originals are available at the decoder. The major goal of this paper is to propose a more realistic WZ video coding approach by performing online CN...

10.1109/tcsvt.2008.924107 article EN IEEE Transactions on Circuits and Systems for Video Technology 2008-09-01

MPEG-21: goals and achievements

Lee Burnett, Rik Van de Walle, Kathy Hill, J. Bormans, Fernando Pereira

MPEG-21 is an open standards-based framework for multimedia delivery and consumption. It aims to enable the use of multimedia resources across a wide range of networks and devices. We discuss MPEG-21's parts, achievements, ongoing activities, and opportunities for new technologies.

10.1109/mmul.2003.1237551 article EN IEEE Multimedia 2003-10-01

The Need for Open Source Software in Machine Learning

Sören Sonnenburg, Mikio L. Braun, Cheng Soon Ong, Samy Bengio, Léon Bottou, and 11 more

Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a large body of powerful learning algorithms for diverse applications. However, the true potential of these methods is not used, since existing implementations are not openly shared, resulting in software with low usability and weak interoperability. We argue that this situation can be significantly improved by increasing incentives...

10.5555/1314498.1314577 article EN Journal of Machine Learning Research 2007-12-01
