Extending text-driven 3D human motion generation with Diffusion Models using LLM paraphrasing
- With: Félix Fourreau, supervised by Leore Bensabath
- Resources: Slides; Report; Code
- When: 2025
- Associated with: MVA
In this project, we built a text-driven 3D human motion generation model based on diffusion models. We studied how well the model generalizes across datasets whose textual descriptions differ in style, and we show that paraphrasing the text descriptions with an LLM improves this generalization. We also explore the impact of different augmentations on the model’s performance. Finally, inspired by the success of diffusion models in image generation, we explore convolutional U-Nets (ConvUnets) with attention mechanisms as the backbone of the model.
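
As an illustration of the paraphrasing idea, here is a minimal sketch of how paraphrased captions could be mixed into training. It assumes the paraphrases were generated offline by an LLM and stored next to each original caption. The function name `sample_caption`, the parameter `p_paraphrase`, and the example captions are hypothetical and are not taken from the project's code.

```python
import random


def sample_caption(original, paraphrases, p_paraphrase=0.5, rng=random):
    """Return the original caption or one of its paraphrases.

    With probability `p_paraphrase` (and only if paraphrases exist), a
    paraphrase is drawn uniformly at random; otherwise the original is kept.
    """
    if paraphrases and rng.random() < p_paraphrase:
        return rng.choice(paraphrases)
    return original


# Hypothetical example data; in practice the paraphrases would be
# produced by an LLM ahead of training.
caption = "a person walks forward and then sits down"
paraphrases = [
    "someone steps ahead, then takes a seat",
    "the figure moves forward before sitting",
]

rng = random.Random(0)  # seeded for reproducibility
batch_captions = [sample_caption(caption, paraphrases, rng=rng) for _ in range(4)]
```

Sampling at load time, rather than duplicating every motion once per paraphrase, keeps the dataset size unchanged while still exposing the text encoder to several description styles for the same motion.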
