NFDI4DS | UHH-SEMS - Publication Details

Éric Gaussier

ORCID: 0000-0002-8858-3233

About

Contact & Profiles

A5014352159

Research Areas

Topic Modeling
Natural Language Processing Techniques
Text and Document Classification Technologies
Advanced Text Analysis Techniques
Information Retrieval and Search Behavior
Semantic Web and Ontologies
Time Series Analysis and Forecasting
Algorithms and Data Compression
Image Retrieval and Classification Techniques
Biomedical Text Mining and Ontologies
Face and Expression Recognition
Rough Sets and Fuzzy Logic
Advanced Image and Video Retrieval Techniques
Web Data Mining and Analysis
Data Management and Algorithms
Complex Network Analysis Techniques
Spam and Phishing Detection
Data Mining Algorithms and Applications
Bayesian Modeling and Causal Inference
Machine Learning and Algorithms
Advanced Graph Neural Networks
Speech and Dialogue Systems
Neural Networks and Applications
Recommender Systems and Techniques
Radiomics and Machine Learning in Medical Imaging

Université Grenoble Alpes
2015-2024

Laboratoire d'Informatique de Grenoble
2015-2024

Centre National de la Recherche Scientifique
2014-2024

Institut polytechnique de Grenoble
2017-2024

Laboratoire Interdisciplinaire de Physique
2022

Université Joseph Fourier
2008-2019

Académie de Grenoble
2019

Xerox (France)
1998-2018

Heriot-Watt University
2018

Laboratoire d'Informatique et d'Automatique pour les Systèmes
2011-2018

Complex Embeddings for Simple Link Prediction

OPENALEX - Publications

Théo Trouillon Johannes Welbl Sebastian Riedel Éric Gaussier Guillaume Bouchard

In statistical relational learning, the link prediction problem is key to automatically understand the structure of large knowledge bases. As in previous studies, we propose to solve this problem through latent factorization. However, here we make use of complex-valued embeddings. The composition of complex embeddings can handle a large variety of binary relations, among them symmetric and antisymmetric relations. Compared to state-of-the-art models such as Neural Tensor Network and Holographic Embeddings, our approach based on complex embeddings is arguably...

10.48550/arxiv.1606.06357 preprint EN other-oa arXiv (Cornell University) 2016-01-01
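
The abstract above states that complex-valued embeddings can represent both symmetric and antisymmetric relations. A minimal sketch of why, assuming the score of a triple is the real part of a trilinear product between a relation vector, a subject embedding, and the conjugate of an object embedding. The names `complex_score`, `rand_vec`, `w_sym`, and `w_anti` are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
import random

def complex_score(w_r, e_s, e_o):
    """Real part of sum_k w_r[k] * e_s[k] * conj(e_o[k]) (assumed scoring form)."""
    return sum(w * s * o.conjugate() for w, s, o in zip(w_r, e_s, e_o)).real

def rand_vec(rng, dim):
    """Random complex vector; a stand-in for a learned entity embedding."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]

rng = random.Random(0)
dim = 4
e_a, e_b = rand_vec(rng, dim), rand_vec(rng, dim)

# A purely real relation vector gives a symmetric score:
# score(a, b) == score(b, a).
w_sym = [complex(rng.gauss(0, 1), 0.0) for _ in range(dim)]

# A purely imaginary relation vector gives an antisymmetric score:
# score(a, b) == -score(b, a).
w_anti = [complex(0.0, rng.gauss(0, 1)) for _ in range(dim)]

print(complex_score(w_sym, e_a, e_b), complex_score(w_sym, e_b, e_a))
print(complex_score(w_anti, e_a, e_b), complex_score(w_anti, e_b, e_a))
```

Relation vectors with both real and imaginary parts fall between these two extremes, which is one way to read the claim that a single model covers a variety of binary relations.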

An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition

OPENALEX - Publications

George Tsatsaronis Georgios Balikas Prodromos Malakasiotis Ioannis Partalas Matthias Zschunke and 17 more

This article provides an overview of the first BIOASQ challenge, a competition on large-scale biomedical semantic indexing and question answering (QA), which took place between March and September 2013. BIOASQ assesses the ability of systems to semantically index very large numbers of biomedical scientific articles, and to return concise and user-understandable answers to given natural language questions by combining information from biomedical articles and ontologies. The 2013 BIOASQ competition comprised two tasks, Task 1a and Task 1b. In Task 1a participants were asked to automatically...

10.1186/s12859-015-0564-6 article EN cc-by BMC Bioinformatics 2015-04-29

Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

OPENALEX - Publications

Benjamin Piwowarski Max Chevalier Éric Gaussier

10.1145/3331184 preprint EN 2019-07-18

Knowledge graph completion via complex tensor factorization

OPENALEX - Publications

Théo Trouillon Christopher R. Dance Éric Gaussier Johannes Welbl Sebastian Riedel and 1 more

In statistical relational learning, knowledge graph completion deals with automatically understanding the structure of large knowledge graphs (labeled directed graphs) and predicting missing relationships (labeled edges). State-of-the-art embedding models propose different trade-offs between modeling expressiveness, and time and space complexity. We reconcile both expressiveness and complexity through the use of complex-valued embeddings and explore the link between such complex-valued embeddings and unitary diagonalization. We corroborate our approach theoretically...

10.5555/3122009.3208011 article EN arXiv (Cornell University) 2017-01-01

Deep k-Means: Jointly clustering with k-Means and learning representations

OPENALEX - Publications

Maziar Moradi Fard Thibaut Thonet Éric Gaussier

10.1016/j.patrec.2020.07.028 article EN publisher-specific-oa Pattern Recognition Letters 2020-07-18

Relation between PLSA and NMF and implications

OPENALEX - Publications

Éric Gaussier Cyril Goutte

Non-negative Matrix Factorization (NMF, [5]) and Probabilistic Latent Semantic Analysis (PLSA, [4]) have been successfully applied to a number of text analysis tasks such as document clustering. Despite their different inspirations, both methods are instances of multinomial PCA [1]. We further explore this relationship and first show that PLSA solves the problem of NMF with KL divergence, and then explore the implications of this relationship.

10.1145/1076034.1076148 article EN 2005-08-15
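
The abstract above relates PLSA to NMF under KL divergence. As a rough illustration of the NMF side only, here is a minimal sketch using the standard multiplicative updates for the generalized KL objective. This is a swapped-in textbook scheme, not code from the paper; the function names and the toy count matrix `V` are hypothetical.

```python
import math
import random

def matmul(W, H):
    """Plain-Python matrix product W @ H."""
    return [[sum(W[i][a] * H[a][j] for a in range(len(H)))
             for j in range(len(H[0]))] for i in range(len(W))]

def kl_divergence(V, R):
    """Generalized KL divergence D(V || R) = sum(v*log(v/r) - v + r)."""
    total = 0.0
    for v_row, r_row in zip(V, R):
        for v, r in zip(v_row, r_row):
            if v > 0:
                total += v * math.log(v / r)
            total += r - v
    return total

def nmf_kl(V, rank, iters=100, seed=0):
    """Factor V ~ W @ H with multiplicative updates for the KL objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    history = [kl_divergence(V, matmul(W, H))]
    for _ in range(iters):
        R = matmul(W, H)
        # Update H, holding W fixed.
        H = [[H[a][j] * sum(W[i][a] * V[i][j] / R[i][j] for i in range(n))
              / sum(W[i][a] for i in range(n))
              for j in range(m)] for a in range(rank)]
        R = matmul(W, H)
        # Update W, holding H fixed.
        W = [[W[i][a] * sum(H[a][j] * V[i][j] / R[i][j] for j in range(m))
              / sum(H[a][j] for j in range(m))
              for a in range(rank)] for i in range(n)]
        history.append(kl_divergence(V, matmul(W, H)))
    return W, H, history

# Toy term-document count matrix (hypothetical data, two rough "topics").
V = [[3, 2, 0, 0],
     [4, 1, 0, 1],
     [0, 0, 5, 2],
     [0, 1, 3, 4]]

W, H, history = nmf_kl(V, rank=2)
print(history[0], history[-1])
```

After each full iteration the reconstruction `W @ H` has the same total mass as `V`, so dividing both by that total yields a probability table over (term, document) pairs, which is the form a PLSA model takes.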

Word sequence kernels

OPENALEX - Publications

Nicola Cancedda Éric Gaussier Cyril Goutte J.-M. Renders

We address the problem of categorising documents using kernel-based methods such as Support Vector Machines. Since the work of Joachims (1998), there is ample experimental evidence that SVMs using the standard word frequencies as features yield state-of-the-art performance on a number of benchmark problems. Recently, Lodhi et al. (2002) proposed the use of string kernels, a novel way of computing document similarity based on matching non-consecutive subsequences of characters. In this article, we propose the use of this technique with sequences...

10.5555/944919.944963 article EN Journal of Machine Learning Research 2003-03-01

Towards automatic extraction of monolingual and bilingual terminology

OPENALEX - Publications

Béatrice Daille Éric Gaussier Jean-Marc Langé

In this paper, we make use of linguistic knowledge to identify certain noun phrases, both in English and French, which are likely to be terms. We then test and compare different statistical scores to select the "good" ones among the candidate terms, and finally propose a method to build correspondences of multi-word units across languages.

10.3115/991886.991975 article EN 1994-01-01

LSHTC: A Benchmark for Large-Scale Text Classification

OPENALEX - Publications

Ioannis Partalas Aris Kosmopoulos Nicolas Baskiotis Thierry Artières George Paliouras and 4 more

LSHTC is a series of challenges which aims to assess the performance of classification systems in large-scale classification with a large number of classes (up to hundreds of thousands). This paper describes the datasets that have been released along the LSHTC series. The paper details the construction of the datasets and the design of the tracks, as well as the evaluation measures that we implemented, and gives a quick overview of the results. All of these datasets are available online and runs may still be submitted on the online server of the challenges.

10.48550/arxiv.1503.08581 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Information-based models for ad hoc IR

OPENALEX - Publications

Stéphane Clinchant Éric Gaussier

We introduce in this paper the family of information-based models for ad hoc information retrieval. These models draw their inspiration from a long-standing hypothesis in IR, namely the fact that the difference in the behaviors of a word at the document and collection levels brings information on the significance of the word for the document. This hypothesis has been exploited in the 2-Poisson mixture models, in the notion of eliteness in BM25, and more recently in DFR models. We show here that, combined with notions related to burstiness, it can lead to simpler and better models.

10.1145/1835449.1835490 preprint EN 2010-07-19

Evaluation measures for hierarchical classification: a unified view and novel approaches

OPENALEX - Publications

Aris Kosmopoulos Ioannis Partalas Éric Gaussier Γεώργιος Παλιούρας Ion Androutsopoulos

10.1007/s10618-014-0382-x article EN Data Mining and Knowledge Discovery 2014-09-05

Period-aware content attention RNNs for time series forecasting with missing values

OPENALEX - Publications

Yagmur Gizem Cinar Hamid Mirisaee Parantapa Goswami Éric Gaussier Ali Aït-Bachir

10.1016/j.neucom.2018.05.090 article EN Neurocomputing 2018-05-30

Survey and Evaluation of Causal Discovery Methods for Time Series

OPENALEX - Publications

Karim Assaad Emilie Devijver Éric Gaussier

We introduce in this survey the major concepts, models, and algorithms proposed so far to infer causal relations from observational time series, a task usually referred to as causal discovery in time series. To do so, after a description of the underlying concepts and modelling assumptions, we present different methods according to the family of approaches they belong to: Granger causality, constraint-based approaches, noise-based approaches, score-based approaches, logic-based approaches, topology-based approaches, and difference-based approaches. We then evaluate several...

10.1613/jair.1.13428 article EN cc-by Journal of Artificial Intelligence Research 2022-02-28

A geometric view on bilingual lexicon extraction from comparable corpora

OPENALEX - Publications

Éric Gaussier J.-M. Renders Irina Matveeva Cyril Goutte Hervé Déjean

We present a geometric view on bilingual lexicon extraction from comparable corpora, which allows us to re-interpret the methods proposed so far and identify unresolved problems. This motivates three new methods that aim at solving these problems. Empirical evaluation shows the strengths and weaknesses of these methods, as well as a significant gain in the accuracy of extracted lexicons.

10.3115/1218955.1219022 article EN 2004-01-01

Improving backfilling by using machine learning to predict running times

OPENALEX - Publications

Éric Gaussier David Glesser Valentin Reis Denis Trystram

The job management system is the HPC middleware responsible for distributing computing power to applications. While such systems generate an ever increasing amount of data, they are characterized by uncertainties on some parameters like running times. The question raised in this work is: to what extent is it possible/useful to take into account predictions of running times for improving the global scheduling?

10.1145/2807591.2807646 preprint EN 2015-10-27

Generalized k-means-based clustering for temporal data under weighted and kernel time warp

OPENALEX - Publications

Saeid Soheily-Khah Ahlame Douzal-Chouakria Éric Gaussier

10.1016/j.patrec.2016.03.007 article EN Pattern Recognition Letters 2016-03-18

Enhanced Retrieval of Long Documents: Leveraging Fine-Grained Block Representations with Large Language Models

OPENALEX - Publications

Minghan Li Éric Gaussier Guodong Zhou

In recent years, large language models (LLMs) have demonstrated exceptional power in various domains, including information retrieval. Most of the previous practices involve leveraging these models to create a single embedding for each query, passage, or document individually, a strategy exemplified and used by the Retrieval-Augmented Generation (RAG) framework. While this method has proven effective, we argue that it falls short in fully capturing the nuanced intricacies of document-level texts due to its reliance...

10.48550/arxiv.2501.17039 preprint EN arXiv (Cornell University) 2025-01-28

An approach based on multilingual thesauri and model combination for bilingual lexicon extraction

OPENALEX - Publications

Hervé Déjean Éric Gaussier Fatia Sadat

This paper focuses on exploiting different models and methods in bilingual lexicon extraction, either from parallel or comparable corpora, in specialized domains. First, special attention is given to the use of multilingual thesauri, and different search strategies based on such thesauri are investigated. Then, a method to combine the different models for bilingual lexicon extraction is presented. Our results show that the combination of the models significantly improves results, and that the use of the hierarchical information contained in our thesaurus, UMLS/MeSH, is of primary importance. Lastly,...

10.3115/1072228.1072394 article EN Proceedings of the 17th international conference on Computational linguistics 2002-01-01
