Éric Gaussier

ORCID: 0000-0002-8858-3233
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Text and Document Classification Technologies
  • Advanced Text Analysis Techniques
  • Information Retrieval and Search Behavior
  • Semantic Web and Ontologies
  • Time Series Analysis and Forecasting
  • Algorithms and Data Compression
  • Image Retrieval and Classification Techniques
  • Biomedical Text Mining and Ontologies
  • Face and Expression Recognition
  • Rough Sets and Fuzzy Logic
  • Advanced Image and Video Retrieval Techniques
  • Web Data Mining and Analysis
  • Data Management and Algorithms
  • Complex Network Analysis Techniques
  • Spam and Phishing Detection
  • Data Mining Algorithms and Applications
  • Bayesian Modeling and Causal Inference
  • Machine Learning and Algorithms
  • Advanced Graph Neural Networks
  • Speech and dialogue systems
  • Neural Networks and Applications
  • Recommender Systems and Techniques
  • Radiomics and Machine Learning in Medical Imaging

Université Grenoble Alpes
2015-2024

Laboratoire d'Informatique de Grenoble
2015-2024

Centre National de la Recherche Scientifique
2014-2024

Institut polytechnique de Grenoble
2017-2024

Laboratoire Interdisciplinaire de Physique
2022

Université Joseph Fourier
2008-2019

Académie de Grenoble
2019

Xerox (France)
1998-2018

Heriot-Watt University
2018

Laboratoire d'Informatique et d'Automatique pour les Systèmes
2011-2018

In statistical relational learning, the link prediction problem is key to automatically understanding the structure of large knowledge bases. As in previous studies, we propose to solve this problem through latent factorization. However, here we make use of complex-valued embeddings. The composition of complex embeddings can handle a variety of binary relations, among them symmetric and antisymmetric relations. Compared to state-of-the-art models such as Neural Tensor Network and Holographic Embeddings, our approach based on complex embeddings is arguably...

10.48550/arxiv.1606.06357 preprint EN other-oa arXiv (Cornell University) 2016-01-01
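
The abstract above says that composing complex-valued embeddings can represent both symmetric and antisymmetric relations. A minimal sketch of why that can work, assuming the score of a triple is the real part of a trilinear product in which the object embedding is conjugated (a common way to write such models; the abstract itself is truncated before the details). The function name `complex_score` and all embedding values are hypothetical and chosen only for illustration.

```python
def complex_score(subj, rel, obj):
    """Real part of sum_k rel[k] * subj[k] * conj(obj[k])."""
    return sum(r * s * o.conjugate() for r, s, o in zip(rel, subj, obj)).real


# Hypothetical 2-dimensional entity embeddings (illustrative values only).
alice = [1 + 2j, 0.5 - 1j]
bob = [-1 + 1j, 2 + 0.5j]

# A purely real relation vector gives the same score in both directions.
symmetric_rel = [1.5 + 0j, -0.7 + 0j]
# A purely imaginary relation vector flips the sign when the arguments swap.
antisymmetric_rel = [0 + 2j, 0 - 1.2j]

sym_forward = complex_score(alice, symmetric_rel, bob)
sym_backward = complex_score(bob, symmetric_rel, alice)
anti_forward = complex_score(alice, antisymmetric_rel, bob)
anti_backward = complex_score(bob, antisymmetric_rel, alice)

print("symmetric:", sym_forward, sym_backward)
print("antisymmetric:", anti_forward, anti_backward)
```

A relation vector with both real and imaginary parts mixes the two behaviours, which is one way to read the claim that a single parameterization covers a variety of binary relations.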

This article provides an overview of the first BIOASQ challenge, a competition on large-scale biomedical semantic indexing and question answering (QA), which took place between March and September 2013. BIOASQ assesses the ability of systems to semantically index very large numbers of biomedical scientific articles, and to return concise and user-understandable answers to given natural language questions by combining information from biomedical articles and ontologies. The 2013 BIOASQ competition comprised two tasks, Task 1a and Task 1b. In Task 1a participants were asked to automatically...

10.1186/s12859-015-0564-6 article EN cc-by BMC Bioinformatics 2015-04-29

In statistical relational learning, knowledge graph completion deals with automatically understanding the structure of large knowledge graphs (labeled directed graphs) and predicting missing relationships (labeled edges). State-of-the-art embedding models propose different trade-offs between modeling expressiveness, and time and space complexity. We reconcile both expressiveness and complexity through the use of complex-valued embeddings and explore the link between such complex-valued embeddings and unitary diagonalization. We corroborate our approach theoretically...

10.5555/3122009.3208011 article EN arXiv (Cornell University) 2017-01-01

10.1016/j.patrec.2020.07.028 article EN publisher-specific-oa Pattern Recognition Letters 2020-07-18

Non-negative Matrix Factorization (NMF, [5]) and Probabilistic Latent Semantic Analysis (PLSA, [4]) have been successfully applied to a number of text analysis tasks such as document clustering. Despite their different inspirations, both methods are instances of multinomial PCA [1]. We further explore this relationship and first show that PLSA solves the problem of NMF with KL divergence, and then explore the implications of this relationship.

10.1145/1076034.1076148 article EN 2005-08-15
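
To make the "NMF with KL divergence" objective mentioned in the abstract above concrete, here is a minimal sketch of the standard multiplicative updates for that objective on a tiny term-document count matrix. It illustrates the NMF side only and is not the paper's derivation; the matrix `counts`, the rank, the helper names, and the random initialization are all hypothetical.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kl_divergence(v, approx):
    """Generalized KL divergence D(V || WH), taking 0 * log(0) as 0."""
    total = 0.0
    for row_v, row_a in zip(v, approx):
        for x, y in zip(row_v, row_a):
            if x > 0:
                total += x * math.log(x / y)
            total += y - x
    return total


def nmf_kl_step(v, w, h):
    """One round of multiplicative updates: first H, then W."""
    n, m, r = len(v), len(v[0]), len(h)
    wh = matmul(w, h)
    h = [[h[k][j] * sum(w[i][k] * v[i][j] / wh[i][j] for i in range(n))
          / sum(w[i][k] for i in range(n))
          for j in range(m)] for k in range(r)]
    wh = matmul(w, h)
    w = [[w[i][k] * sum(h[k][j] * v[i][j] / wh[i][j] for j in range(m))
          / sum(h[k][j] for j in range(m))
          for k in range(r)] for i in range(n)]
    return w, h


random.seed(0)
# Hypothetical term-document counts (rows: terms, columns: documents).
counts = [[3, 0, 1, 0],
          [2, 1, 0, 0],
          [0, 0, 4, 2],
          [0, 1, 3, 3]]
rank = 2
w = [[random.uniform(0.1, 1.0) for _ in range(rank)] for _ in counts]
h = [[random.uniform(0.1, 1.0) for _ in counts[0]] for _ in range(rank)]

history = [kl_divergence(counts, matmul(w, h))]
for _ in range(50):
    w, h = nmf_kl_step(counts, w, h)
    history.append(kl_divergence(counts, matmul(w, h)))

print(f"KL divergence: {history[0]:.4f} -> {history[-1]:.4f}")
```

Normalizing the columns of `w` and the rows of `h` so that they sum to one gives factors that can be read as conditional probabilities, which is the sense in which the two methods are usually compared.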

We address the problem of categorising documents using kernel-based methods such as Support Vector Machines. Since the work of Joachims (1998), there is ample experimental evidence that SVMs using standard word frequencies as features yield state-of-the-art performance on a number of benchmark problems. Recently, Lodhi et al. (2002) proposed the use of string kernels, a novel way of computing document similarity based on matching non-consecutive subsequences of characters. In this article, we propose the use of this technique with sequences...

10.5555/944919.944963 article EN Journal of Machine Learning Research 2003-03-01
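
The similarity described in the abstract above, based on matching non-consecutive subsequences, can be sketched by brute force. This is a minimal illustration, assuming the usual gap-weighted definition in which each matching pair of length-n subsequences contributes a decay factor raised to the total span the two occurrences cover. It enumerates all index combinations, so it only suits short inputs; the function name, the decay value, and the example inputs are hypothetical.

```python
from itertools import combinations


def gap_weighted_kernel(s, t, n, decay):
    """Sum decay**(span in s + span in t) over matching length-n subsequences."""
    total = 0.0
    for i in combinations(range(len(s)), n):
        u = [s[k] for k in i]
        span_s = i[-1] - i[0] + 1
        for j in combinations(range(len(t)), n):
            if u == [t[k] for k in j]:
                span_t = j[-1] - j[0] + 1
                total += decay ** (span_s + span_t)
    return total


# Character-level example: "cat" and "car" share only the subsequence "ca".
char_value = gap_weighted_kernel("cat", "car", n=2, decay=0.5)

# The same function accepts any sequence, for example lists of tokens.
doc_a = "the cat sat on the mat".split()
doc_b = "the cat lay on the mat".split()
token_value = gap_weighted_kernel(doc_a, doc_b, n=2, decay=0.5)

print(char_value, token_value)
```

Practical implementations replace the enumeration with dynamic programming and usually normalize the value by the self-similarities of the two inputs.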

In this paper, we make use of linguistic knowledge to identify certain noun phrases, both in English and French, which are likely to be terms. We then test and compare different statistical scores to select the "good" ones among the candidate terms, and finally propose a method to build correspondences of multi-word units across languages.

10.3115/991886.991975 article EN 1994-01-01

LSHTC is a series of challenges which aims to assess the performance of classification systems in large-scale classification with a large number of classes (up to hundreds of thousands). This paper describes the datasets that have been released along the LSHTC series. The paper details the construction of the datasets and the design of the tracks, as well as the evaluation measures that we implemented and a quick overview of the results. All of these datasets are available online, and runs may still be submitted on the online server of the challenges.

10.48550/arxiv.1503.08581 preprint EN other-oa arXiv (Cornell University) 2015-01-01

We introduce in this paper the family of information-based models for ad hoc information retrieval. These models draw their inspiration from a long-standing hypothesis in IR, namely the fact that the difference in the behaviors of a word at the document and collection levels brings information on the significance of the word for the document. This hypothesis has been exploited in the 2-Poisson mixture models, in the notion of eliteness in BM25, and more recently in DFR models. We show here that, combined with notions related to burstiness, it can lead to simpler and better models.

10.1145/1835449.1835490 preprint EN 2010-07-19

We introduce in this survey the major concepts, models, and algorithms proposed so far to infer causal relations from observational time series, a task usually referred to as causal discovery in time series. To do so, after a description of the underlying concepts and modelling assumptions, we present different methods according to the family of approaches they belong to: Granger causality, constraint-based approaches, noise-based approaches, score-based approaches, logic-based approaches, topology-based approaches, and difference-based approaches. We then evaluate several...

10.1613/jair.1.13428 article EN cc-by Journal of Artificial Intelligence Research 2022-02-28

We present a geometric view on bilingual lexicon extraction from comparable corpora, which allows us to re-interpret the methods proposed so far and identify unresolved problems. This motivates three new methods that aim at solving these problems. Empirical evaluation shows the strengths and weaknesses of these methods, as well as a significant gain in the accuracy of extracted lexicons.

10.3115/1218955.1219022 article EN 2004-01-01

The job management system is the HPC middleware responsible for distributing computing power to applications. While such systems generate an ever increasing amount of data, they are characterized by uncertainties on some parameters like the job running times. The question raised in this work is: To what extent is it possible/useful to take into account predictions on the job running times for improving the global scheduling?

10.1145/2807591.2807646 preprint EN 2015-10-27

In recent years, large language models (LLMs) have demonstrated exceptional power in various domains, including information retrieval. Most of the previous practices involve leveraging these models to create a single embedding for each query, each passage, or each document individually, a strategy exemplified and used by the Retrieval-Augmented Generation (RAG) framework. While this method has proven effective, we argue that it falls short in fully capturing the nuanced intricacies of document-level texts due to its reliance...

10.48550/arxiv.2501.17039 preprint EN arXiv (Cornell University) 2025-01-28

This paper focuses on exploiting different models and methods in bilingual lexicon extraction, either from parallel or comparable corpora, in specialized domains. First, special attention is given to the use of multilingual thesauri, and different search strategies based on such thesauri are investigated. Then, a method to combine the different models for bilingual lexicon extraction is presented. Our results show that the combination of the models significantly improves results, and that the use of the hierarchical information contained in our thesaurus, UMLS/MeSH, is of primary importance. Lastly,...

10.3115/1072228.1072394 article EN Proceedings of the 17th international conference on Computational linguistics 2002-01-01