- Natural Language Processing Techniques
- Speech and Dialogue Systems
- Semantic Web and Ontologies
- Topic Modeling
- Peer-to-Peer Network Technologies
- Advanced Text Analysis Techniques
- Caching and Content Delivery
- Data Management and Algorithms
- Web Data Mining and Analysis
- Multi-Agent Systems and Negotiation
- Data Mining Algorithms and Applications
- Algorithms and Data Compression
- Information Retrieval and Search Behavior
- Speech Recognition and Synthesis
- Recommender Systems and Techniques
- Rough Sets and Fuzzy Logic
- French Language Learning Methods
- Semigroups and Automata Theory
- Service-Oriented Architecture and Web Services
- AI in Service Interactions
- Text Readability and Simplification
- Linguistics and Terminology Studies
- Data Quality and Management
- Usability and User Interface Design
- Access Control and Trust
École Polytechnique Fédérale de Lausanne
2006-2021
Amazon (Germany)
2021
University of Neuchâtel
2010
Laboratoire de Recherche en Informatique
2007
University of Geneva
2005
Nemocnice Pardubického Kraje
2002-2005
École Polytechnique
2004
University of Lausanne
2001-2002
Télécom Paris
1998
The suitability of peer-to-peer (P2P) approaches for full-text Web retrieval has recently been questioned because of the claimed unacceptable bandwidth consumption induced by retrieval from very large document collections. In this contribution we formalize a novel indexing/retrieval model that achieves high-performance, cost-efficient indexing with highly discriminative keys (HDKs) stored in a distributed global index maintained in a structured P2P network. HDKs correspond to carefully selected terms and term...
In this paper, we present a query-driven indexing/retrieval strategy for efficient full-text retrieval from large document collections distributed within a structured P2P network. Our indexing strategy is based on two important properties: (1) the generated index stores posting lists for carefully chosen term combinations, and (2) posting lists containing too many references are truncated to a bounded number of their top-ranked elements. These properties guarantee acceptable storage and bandwidth requirements, essentially...
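The two properties named in this abstract can be illustrated with a minimal, single-machine sketch. It is not the authors' implementation: the function name `build_truncated_index`, the per-document `scores`, and the choice to index every term combination up to a fixed size (rather than "carefully chosen" ones) are all assumptions made for illustration, and no P2P distribution is modelled.

```python
import heapq
from collections import defaultdict
from itertools import combinations


def build_truncated_index(docs, scores, max_key_size=2, top_k=2):
    """Map each term combination to at most top_k (score, doc_id) postings.

    docs:   dict of doc_id -> list of terms (hypothetical toy input)
    scores: dict of doc_id -> ranking score (assumed given, query-independent)
    """
    postings = defaultdict(list)
    for doc_id, terms in docs.items():
        vocab = sorted(set(terms))
        # Property (1): keys are term combinations, not only single terms.
        for size in range(1, max_key_size + 1):
            for key in combinations(vocab, size):
                postings[key].append((scores[doc_id], doc_id))
    # Property (2): truncate every posting list to its top_k best entries.
    return {key: heapq.nlargest(top_k, plist) for key, plist in postings.items()}


docs = {
    "d1": ["p2p", "index"],
    "d2": ["p2p", "index", "query"],
    "d3": ["p2p", "query"],
}
scores = {"d1": 0.9, "d2": 0.5, "d3": 0.7}
index = build_truncated_index(docs, scores, max_key_size=2, top_k=2)
```

In this toy run the single-term key `("p2p",)` matches three documents but keeps only the two best-scored ones, which is the bounding effect the abstract relies on to limit storage and bandwidth.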
We present Alvis peers, a full-text P2P retrieval engine designed to offer performance comparable to centralized solutions while scaling to a very large number of peers. It is the result of our research efforts within the Alvis project (European FP6 STREP ALVIS, http://www.alvis.info/), which aims at building a truly distributed semantic search engine. To cope with the problem of unscalable bandwidth consumption in the network, the engine implements a novel model that indexes highly discriminative keys (HDKs): terms and term sets...
We present a query-driven algorithm for the distributed indexing of large document collections within structured P2P networks. To cope with the bandwidth consumption that has been identified as the major problem of the standard approach of single-term indexing, we leverage an index that stores up to top-k references only for carefully chosen term combinations. In addition, since the number of possible term combinations extracted from a collection can be very large, we propose to use query statistics to index only such combinations that are indeed frequently requested by users...
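As a rough illustration of the query-statistics idea, the sketch below counts how often each term combination occurs in a query log and keeps only those seen at least a threshold number of times. The function name `popular_keys`, the threshold `min_count`, and the toy query log are hypothetical; the abstract does not specify how the statistics are gathered or thresholded.

```python
from collections import Counter
from itertools import combinations


def popular_keys(query_log, min_count=2, max_key_size=2):
    """Return multi-term combinations requested at least min_count times.

    query_log: iterable of queries, each a list of terms (hypothetical input).
    """
    counts = Counter()
    for query in query_log:
        terms = sorted(set(query))  # canonical order so keys compare equal
        for size in range(2, max_key_size + 1):
            for key in combinations(terms, size):
                counts[key] += 1
    return {key for key, n in counts.items() if n >= min_count}


query_log = [
    ["p2p", "retrieval"],
    ["retrieval", "p2p", "index"],
    ["index", "compression"],
]
selected = popular_keys(query_log, min_count=2)
```

Only the pair `("p2p", "retrieval")` appears in two queries here, so it is the only combination that would be indexed under this assumed threshold.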
This paper presents an FPGA-based implementation of a co-processing unit able to parse context-free grammars of real-life sizes. The application fields of such a parser range from programming language syntactic analysis to very demanding natural language applications where parsing speed is an important issue.
In this paper we present the AlvisP2P IR engine, which enables efficient retrieval with multi-keyword queries from a global document collection available in a P2P network. In such a network, each peer publishes its local index and invests part of its computing resources (storage, CPU, bandwidth) to maintain a fraction of the global index. This investment is rewarded by network-wide accessibility of its documents via the search facility. The engine uses an optimized overlay network and relies on novel indexing/retrieval mechanisms...
The design of efficient textual similarities is an important issue in the domain of data exploration. Textual similarities are, for example, central to document collection structuring (e.g. clustering) and to information retrieval (IR), which relies on the computation of similarities measuring the adequacy between a query and documents. The objective of this paper is to present and compare several similarity measures in the framework of the distributional semantics (DS) model for IR. This model is an extension of the standard vector space model, which further takes into account the co-frequencies of terms in a given...