MIAI MEETING - 9th November 2023

On November 9, 2023


11:00 – 11:10:

Introduction, Hervé Martin & Eric Gaussier (Director and Scientific Director of MIAI)

11:10 – 11:45:

NLP Beyond Correlation, Maxime Peyrard (CNRS Junior Professor Chair)

In this presentation, we will provide a concise overview of the importance of a causal perspective and demonstrate various applications in Natural Language Processing (NLP). For training language models, we will show how the principle of invariance can be harnessed to create models that generalize better. Then, we will delve into the necessity of causal analysis for model interpretability and introduce the notion of causal abstraction, laying out a research direction for the future. Lastly, we will explore how adopting a causal perspective can offer valuable insights to improve our evaluation methodologies.

11:45 – 12:20:

Evaluation of large language models for French: from FlauBERT to Pantagruel, Didier Schwab (Senior lecturer, LIG) & Lorraine Goeuriot (Senior lecturer, UGA)

12:20 – 12:55:

Synthetic Data and Large Language Models - a Curse in Disguise? Matthias Gallé (Machine Learning manager, Cohere)

The use of synthetically generated data has grown in popularity over the last year. This popularity rests on a double hope: continuing the meteoric rise of non-annotated data that powered self-training, and overcoming the obstacle prophesied by scaling-law prophets that "we are running out of textual data".
This somewhat counterintuitive popularity has raised concerns about hidden biases and model collapse. In this talk, we will review a few real-world uses of synthetic data as well as pitfalls to avoid.

12:55 – 13:00:

Conclusion, Hervé Martin

13:00 – 14:00


Published on December 11, 2023
Updated on December 12, 2023