Julien Chaumond

ORCID: 0000-0003-3188-1616
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Speech Recognition and Synthesis
  • Speech and dialogue systems
  • Domain Adaptation and Few-Shot Learning
  • Advancements in Photolithography Techniques
  • Explainable Artificial Intelligence (XAI)
  • AI in cancer detection
  • Biomedical Text Mining and Ontologies
  • Machine Learning and Data Classification
  • Machine Learning in Materials Science
  • Multimodal Machine Learning Applications
  • Non-Destructive Testing Techniques
  • Semiconductor Lasers and Optical Devices
  • Neural Networks and Applications
  • Ferroelectric and Negative Capacitance Devices
  • Medical Image Segmentation Techniques
  • Speech and Audio Processing
  • Cell Image Analysis Techniques
  • Power Systems and Technologies
  • Parallel Computing and Optimization Techniques
  • Advanced Text Analysis Techniques

Hugging Face
2018-2020

FACE Foundation
2018

Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, Alexander Rush. Transformers: State-of-the-Art Natural Language Processing. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 2020.

10.18653/v1/2020.emnlp-demos.6 article EN cc-by 2020-01-01

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models on-the-edge and/or under constrained computational training or inference budgets remains challenging. In this work, we propose a method to pre-train a smaller general-purpose language representation model, called DistilBERT, which can then be fine-tuned with good performances on a wide range of tasks like its larger counterparts. While most prior work...
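The distillation objective behind DistilBERT trains the student on the teacher's temperature-softened output distribution. A minimal sketch of that soft-target cross-entropy in plain Python (the logits and temperature values below are illustrative, not from the paper):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: T > 1 flattens the distribution,
    exposing the teacher's 'dark knowledge' about wrong classes."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's soft targets."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

# Toy example: a student that mostly agrees with its teacher.
teacher = [4.0, 1.0, -2.0]
student = [3.5, 1.2, -1.8]
loss = distillation_loss(student, teacher)
```

The cross-entropy is minimized when the student's distribution matches the teacher's exactly, so the loss drives the student toward the teacher's full output distribution rather than only its argmax.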

10.48550/arxiv.1910.01108 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models, and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. \textit{Transformers} is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the-art Transformer architectures under a unified API. Backing the library is a curated collection...
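The unified API the abstract refers to lets users swap architectures behind a single entry point. A toy, from-scratch sketch of that registry pattern (class and model names here are illustrative, not the real `transformers` API, which also handles weight download and loading):

```python
class AutoModel:
    """Minimal registry mapping architecture names to model classes."""
    _registry = {}

    @classmethod
    def register(cls, name):
        def decorator(model_cls):
            cls._registry[name] = model_cls
            return model_cls
        return decorator

    @classmethod
    def from_pretrained(cls, name):
        # The real library resolves a checkpoint and loads weights here;
        # this sketch only instantiates the registered class.
        return cls._registry[name]()

@AutoModel.register("toy-bert")
class ToyBert:
    def __call__(self, text):
        return {"tokens": text.split()}

model = AutoModel.from_pretrained("toy-bert")
out = model("hello world")
```

The payoff of the pattern is that user code depends only on the single `from_pretrained` entry point, so switching architectures is a one-string change.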

10.48550/arxiv.1910.03771 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models, and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. Transformers is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the-art Transformer architectures under a unified API. Backing the library is a curated collection...

10.5281/zenodo.5347031 article EN cc-by 2020-10-01

We introduce a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo, which is a combination of a Transfer learning based training scheme and a high-capacity Transformer model. Fine-tuning is performed by using a multi-task objective which combines several unsupervised prediction tasks. The resulting fine-tuned model shows strong improvements over the current state-of-the-art end-to-end conversational models like memory augmented seq2seq and information-retrieval models....
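A distinctive step in this kind of persona-conditioned fine-tuning is flattening the persona, dialogue history, and candidate reply into one token sequence with per-speaker segment tags. A simplified sketch (the special-token strings are hypothetical; the actual model uses learned segment embeddings):

```python
def build_inputs(persona, history, reply,
                 bos="<bos>", eos="<eos>",
                 spk1="<speaker1>", spk2="<speaker2>"):
    """Flatten persona sentences and dialogue turns into a single
    word sequence, tagging every token with who produced it."""
    words, segments = [bos], [spk2]
    for sentence in persona:            # persona belongs to the bot
        toks = sentence.split()
        words += toks
        segments += [spk2] * len(toks)
    for i, turn in enumerate(history + [reply]):
        speaker = spk1 if i % 2 == 0 else spk2  # alternate speakers
        toks = turn.split()
        words += toks
        segments += [speaker] * len(toks)
    words.append(eos)
    segments.append(segments[-1])
    return words, segments

words, segs = build_inputs(persona=["i like tea"],
                           history=["hello"],
                           reply="hi there")
```

The parallel segment sequence is what lets a single Transformer input distinguish persona text from each speaker's turns.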

10.48550/arxiv.1901.08149 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, Abhishek Thakur, Patrick von Platen, Suraj Patil, Julien Chaumond, Mariama Drame, Julien Plu, Lewis Tunstall, Joe Davison, Mario Šaško, Gunjan Chhablani, Bhavitvya Malik, Simon Brandeis, Teven Le Scao, Victor Sanh, Canwen Xu, Nicolas Patry, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugger, Clément Delangue, Théo Matussière, Lysandre Debut, Stas Bekman, Pierric Cistac, Thibault Goehringer, Victor Mustar, François Lagunas, Alexander Rush,...

10.18653/v1/2021.emnlp-demo.21 preprint EN cc-by 2021-01-01

Evaluation is a key part of machine learning (ML), yet there is a lack of support and tooling to enable its informed and systematic practice. We introduce Evaluate and Evaluation on the Hub -- a set of tools to facilitate the evaluation of models and datasets in ML. Evaluate is a library to support best practices for measurements, metrics, and comparisons of data and models. Its goal is to support reproducibility of evaluation, centralize and document the evaluation process, and broaden evaluation to cover more facets of model performance. It includes over 50 efficient canonical implementations for a variety of domains and scenarios,...
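Metric libraries in this style typically expose an incremental add-batch step and a final compute step, so evaluation loops do not need to hold all predictions in memory. A toy accuracy metric mimicking that interface (a from-scratch sketch, not the real `evaluate` API):

```python
class Accuracy:
    """Toy metric with an add_batch/compute interface."""
    def __init__(self):
        self.correct = 0
        self.total = 0

    def add_batch(self, predictions, references):
        """Accumulate one batch of predictions against gold labels."""
        for p, r in zip(predictions, references):
            self.correct += int(p == r)
            self.total += 1

    def compute(self):
        """Finalize and return the metric as a dict, so multiple
        values (e.g. precision and recall) could share one call."""
        return {"accuracy": self.correct / self.total}

metric = Accuracy()
metric.add_batch(predictions=[0, 1, 1], references=[0, 1, 0])
metric.add_batch(predictions=[1], references=[1])
result = metric.compute()  # 3 of 4 correct
```

Keeping only running counts between batches is what makes the interface scale to evaluation sets that do not fit in memory.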

10.48550/arxiv.2210.01970 preprint EN cc-by arXiv (Cornell University) 2022-01-01

The scale, variety, and quantity of publicly-available NLP datasets has grown rapidly as researchers propose new tasks, larger models, and novel benchmarks. Datasets is a community library for contemporary NLP designed to support this ecosystem. Datasets aims to standardize end-user interfaces, versioning, and documentation, while providing a lightweight front-end that behaves similarly for small datasets as for internet-scale corpora. The design of the library incorporates a distributed, community-driven approach to adding datasets and documenting usage. After a year...
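The "lightweight front-end" idea is a uniform record interface plus a functional `map` for preprocessing. A toy stand-in showing the shape of that interface (illustrative only; the real library additionally memory-maps and caches results on disk so internet-scale corpora stay out of RAM):

```python
class ToyDataset:
    """Minimal dataset front-end: uniform access over a list of
    dict records, with a functional map() for preprocessing."""
    def __init__(self, records):
        self._records = list(records)

    def __len__(self):
        return len(self._records)

    def __getitem__(self, i):
        return self._records[i]

    def map(self, fn):
        # Returns a new dataset rather than mutating in place,
        # so preprocessing pipelines stay composable.
        return ToyDataset(fn(r) for r in self._records)

ds = ToyDataset([{"text": "Hello World"}, {"text": "Bonjour"}])
ds = ds.map(lambda r: {**r, "n_chars": len(r["text"])})
```

Because every transformation goes through `map`, the same user code works whether the backing storage is an in-memory list, as here, or an on-disk arrow table.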

10.48550/arxiv.2109.02846 preprint EN cc-by arXiv (Cornell University) 2021-01-01

We reformulate the problem of encoding a multi-scale representation of a sequence in a language model by casting it in a continuous learning framework. We propose a hierarchical model in which short time-scale dependencies are encoded in the hidden state of a lower-level recurrent neural network while longer time-scale dependencies are encoded in the dynamic of its weights by having a meta-learner update these weights in an online meta-learning fashion. We use elastic weights consolidation as a higher-level mechanism to prevent catastrophic forgetting in our...
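Elastic weight consolidation, mentioned as the mechanism against catastrophic forgetting, adds a quadratic penalty that anchors weights in proportion to their estimated importance (the Fisher information) for earlier data. A small numeric sketch with made-up values:

```python
def ewc_penalty(weights, old_weights, fisher, lam=1.0):
    """Quadratic penalty keeping important weights (high Fisher)
    close to their previously consolidated values."""
    return 0.5 * lam * sum(
        f * (w - w_old) ** 2
        for w, w_old, f in zip(weights, old_weights, fisher)
    )

# Weight 0 is "important" (high Fisher), weight 1 is not, so the
# same drift is penalized far more heavily for weight 0.
penalty = ewc_penalty(weights=[1.5, 0.2],
                      old_weights=[1.0, 1.0],
                      fisher=[10.0, 0.1])
```

During training this penalty is simply added to the task loss, letting unimportant weights adapt freely while important ones stay put.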

10.18653/v1/p18-2001 article EN cc-by 2018-01-01

The advancement of speech technologies has been remarkable, yet its integration with African languages remains limited due to the scarcity of African speech corpora. To address this issue, we present AfroDigits, a minimalist, community-driven dataset of spoken digits for African languages, currently covering 38 languages. As a demonstration of its practical applications, we conduct audio digit classification experiments on six languages [Igbo (ibo), Yoruba (yor), Rundi (run), Oshiwambo (kua), Shona (sna), and Oromo (gax)] using...
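At its core, audio digit classification is feature extraction over waveforms followed by a classifier. A deliberately tiny, self-contained version using synthetic tones and a zero-crossing-rate feature (purely illustrative; the actual experiments fine-tune pretrained speech models on real recordings):

```python
import math

def tone(freq, sr=8000, seconds=0.1):
    """Synthesize a pure sine tone as a stand-in for a recording."""
    return [math.sin(2 * math.pi * freq * t / sr)
            for t in range(int(sr * seconds))]

def zero_crossing_rate(wave):
    """Fraction of adjacent samples that change sign -- a crude
    frequency proxy used here as the only feature."""
    crossings = sum(1 for a, b in zip(wave, wave[1:]) if a * b < 0)
    return crossings / (len(wave) - 1)

def nearest_centroid(feature, centroids):
    """Predict the label whose reference feature is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - feature))

# "Train": one reference tone per digit class.
centroids = {"zero": zero_crossing_rate(tone(200)),
             "one": zero_crossing_rate(tone(800))}

# "Test": a slightly detuned utterance of "one".
pred = nearest_centroid(zero_crossing_rate(tone(780)), centroids)
```

Real systems replace the single hand-crafted feature with learned representations, but the train/extract/classify pipeline has the same shape.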

10.48550/arxiv.2303.12582 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01

We consider the task of word-level language modeling and study the possibility of combining hidden-states-based short-term representations with medium-term representations encoded in the dynamical weights of a language model. Our work extends recent experiments on language models with dynamically evolving weights by casting the problem into an online learning-to-learn framework in which a meta-learner is trained by gradient-descent to continuously update the language model weights.
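The learning-to-learn setup replaces a fixed gradient-descent rule with a meta-learner that decides how each gradient is applied as data streams in. A scalar sketch where the "meta-learner" is reduced to a single learned step-size (illustrative only; the paper's meta-learner is itself a trained network):

```python
def meta_update(weight, grad, meta_scale):
    """The meta-learner decides how much of the gradient to apply;
    here it is one learned scalar instead of a network."""
    return weight - meta_scale * grad

def run_online(weight, grads, meta_scale=0.5):
    """Continuously update the model weight as gradients stream in,
    mirroring the online (per-token) nature of the framework."""
    for g in grads:
        weight = meta_update(weight, g, meta_scale)
    return weight

final = run_online(1.0, grads=[0.2, -0.1, 0.4])
```

Training the meta-learner then means differentiating through these updates so that `meta_scale` (or, in the paper, the meta-network's parameters) is itself optimized by gradient descent.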

10.48550/arxiv.1803.10631 preprint EN other-oa arXiv (Cornell University) 2018-01-01

We reformulate the problem of encoding a multi-scale representation of a sequence in a language model by casting it in a continuous learning framework. We propose a hierarchical model in which short time-scale dependencies are encoded in the hidden state of a lower-level recurrent neural network while longer time-scale dependencies are encoded in the dynamic of its weights by having a meta-learner update these weights in an online meta-learning fashion. We use elastic weights consolidation as a higher-level mechanism to prevent catastrophic forgetting in our...

10.48550/arxiv.1805.05758 preprint EN other-oa arXiv (Cornell University) 2018-01-01