KTH School of Electrical Engineering and Computer Science seeks two PhD students in generative AI for 3D gesture motion.
Project description
In face-to-face communication, humans use co-speech gestures – such as hand, arm, head, and body movements – to enhance clarity, naturalness, and efficiency. Our lab is a pioneer and global leader in leveraging 3D motion-capture data and generative AI to synthesise these gestures for virtual characters and humanoid robots.
The specific focus of this project is to use data on human opinion – e.g., subjective ratings of whether a gesture looks good visually, or whether it fits the character’s speech – to improve gesture generation. You will not only try to predict human opinion of these aspects, but also use the ratings data to improve the generative models themselves, using direct preference optimisation (DPO) or other reinforcement learning from human feedback (RLHF) techniques. There may also be opportunities to pursue improved benchmarking and evaluation practices for gesture motion, sign-language synthesis, or joint multimodal synthesis of speech audio and gesture motion.
Supervision: Assistant Prof. Gustav Eje Henter; co-supervisors Prof. Jonas Beskow and Assistant Prof. Éva Székely
KTH Royal Institute of Technology in Stockholm is Sweden’s largest technical research and learning institution and home to students, researchers and faculty from around the world.
