Théo Gigant

ORCID: 0009-0003-6392-8519
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Semantic Web and Ontologies
  • Biomedical Text Mining and Ontologies
  • Artificial Intelligence in Healthcare and Education
  • Video Analysis and Summarization
  • Music and Audio Processing

Laboratoire des signaux et systèmes
2024

CentraleSupélec
2023

Université Paris-Saclay
2023

Centre National de la Recherche Scientifique
2023

Training and evaluating language models increasingly requires the construction of meta-datasets: diverse collections of curated data with clear provenance. Natural language prompting has recently led to improved zero-shot generalization by transforming existing, supervised datasets into a diversity of novel pretraining tasks, highlighting the benefits of meta-dataset curation. While successful in general-domain text, translating these data-centric approaches to biomedical language modeling remains challenging, as labeled...

10.48550/arxiv.2206.15076 preprint EN cc-by-sa arXiv (Cornell University) 2022-01-01

10.18653/v1/2024.emnlp-main.1078 article EN Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01

Large language models and multimodal language-vision models give impressive results on currently available summarization benchmarks, but are not designed to handle long multimodal documents. Most summarization datasets are composed of either mono-modal documents or short multimodal documents. In order to develop models designed for understanding and summarizing real-world videoconference records that are typically around 1 hour long, we propose a dataset of 9,103 videoconference records extracted from the German National Library of Science and Technology (TIB) archive, along with their abstract. Additionally,...

10.1145/3617233.3617238 article EN 2023-09-20