Chalmers is seeking a PhD student in theoretical machine learning to uncover how and why transformers work using information theory.
Project description
Transformers are central to many of today’s most successful AI models, from language understanding to computer vision. Yet their success remains largely empirical, and theoretical understanding of why they work is limited. This PhD project aims to close that gap by developing a rigorous mathematical framework to analyze and explain transformers using information-theoretic tools.
You will be co-supervised by:
- Prof. Giuseppe Durisi, Department of Electrical Engineering
- Prof. Rebecka Jörnsten, Department of Mathematical Sciences
You will join the Communication Systems (CS) group within the Division of Communications, Antennas, and Optical Networks (CAOS) at the Department of Electrical Engineering. The group includes nine professors and a dynamic team of PhD students and postdocs working in areas such as wireless and optical communication, information security, information theory, and machine learning.