Martin Rajman

ORCID: 0000-0002-1521-4920
Research Areas
  • Natural Language Processing Techniques
  • Speech and dialogue systems
  • Semantic Web and Ontologies
  • Topic Modeling
  • Peer-to-Peer Network Technologies
  • Advanced Text Analysis Techniques
  • Caching and Content Delivery
  • Data Management and Algorithms
  • Web Data Mining and Analysis
  • Multi-Agent Systems and Negotiation
  • Data Mining Algorithms and Applications
  • Algorithms and Data Compression
  • Information Retrieval and Search Behavior
  • Speech Recognition and Synthesis
  • Recommender Systems and Techniques
  • Rough Sets and Fuzzy Logic
  • French Language Learning Methods
  • Semigroups and automata theory
  • Service-Oriented Architecture and Web Services
  • AI in Service Interactions
  • Text Readability and Simplification
  • Linguistics and terminology studies
  • Data Quality and Management
  • Usability and User Interface Design
  • Access Control and Trust

École Polytechnique Fédérale de Lausanne
2006-2021

Amazon (Germany)
2021

University of Neuchâtel
2010

Laboratoire de Recherche en Informatique
2007

University of Geneva
2005

Nemocnice Pardubického Kraje
2002-2005

École Polytechnique
2004

University of Lausanne
2001-2002

Télécom Paris
1998

The suitability of peer-to-peer (P2P) approaches for full-text Web retrieval has recently been questioned because of the claimed unacceptable bandwidth consumption induced by retrieval from very large document collections. In this contribution we formalize a novel indexing/retrieval model that achieves high-performance, cost-efficient retrieval by indexing with highly discriminative keys (HDKs) stored in a distributed global index maintained in a structured P2P network. HDKs correspond to carefully selected terms and term...

10.1109/icde.2007.368968 article EN 2007-04-01

In this paper, we present a query-driven indexing/retrieval strategy for efficient full-text retrieval from large document collections distributed within a structured P2P network. Our indexing strategy is based on two important properties: (1) the generated index stores posting lists for carefully chosen term combinations, and (2) posting lists containing too many references are truncated to a bounded number of their top-ranked elements. These properties guarantee acceptable storage and bandwidth requirements, essentially...

10.1145/1277741.1277857 article EN 2007-07-23

We present Alvis peers, a full-text P2P retrieval engine designed to offer performance comparable to centralized solutions while scaling to a very large number of peers. It is the result of our research efforts within the Alvis project (European FP 6 STREP ALVIS, http://www.alvis.info/), which aims at building a truly distributed semantic search engine. To cope with the problem of unscalable bandwidth consumption in the network, it implements a novel model that indexes highly-discriminative keys (HDKs): terms and term sets...

10.1145/1183579.1183588 article EN 2006-11-11

We present a query-driven algorithm for the distributed indexing of large document collections within structured P2P networks. To cope with the bandwidth consumption that has been identified as a major problem for the standard approach of single-term indexing, we leverage an index that stores up to top-k references only for carefully chosen term combinations. In addition, since the number of possible combinations extracted from a collection can be very large, we propose to use query statistics to index only those combinations that are indeed frequently requested by users....

10.5555/1366804.1366823 article EN Scalable Information Systems 2007-06-06

This paper presents an FPGA-based implementation of a co-processing unit able to parse context-free grammars of real-life sizes. The application fields of such a parser range from programming language syntactic analysis to very demanding natural language applications where parsing speed is an important issue.

10.1109/fpga.2000.903411 article EN 2002-11-11

In this paper we present the AlvisP2P IR engine, which enables efficient retrieval with multi-keyword queries from a global document collection available in a P2P network. In such a network, each peer publishes its local index and invests part of its computing resources (storage, CPU, bandwidth) to maintain a fraction of the global index. This investment is rewarded by the network-wide accessibility of its documents via the search facility. The engine uses an optimized overlay network and relies on novel indexing/retrieval mechanisms...

10.14778/1454159.1454190 article EN Proceedings of the VLDB Endowment 2008-08-01

The design of efficient textual similarities is an important issue in the domain of data exploration. Textual similarities are, for example, central to document collection structuring (e.g. clustering) and to information retrieval (IR), which relies on the computation of similarities measuring the adequacy between a query and documents. The objective of this paper is to present and compare several similarity measures in the framework of the distributional semantics (DS) model for IR. This model is an extension of the standard vector space model, which further takes the co-frequencies of terms given...

10.1109/dexa.1999.795163 article EN 1999-01-01