Postdoc position at Umeå University in the WASP NEST project STING – Synthesis and analysis with Transducers and Invertible Neural Generators.
Project description and working tasks
Human communication is multimodal in nature, and occurs through combinations of speech, language, gesture, facial expression, and similar signals. STING aims to design models that capture this richness, uniting synthesis and analysis with the help of transducers and deep neural generative models. This involves connecting concrete, continuous-valued sensory data such as images, sound, and motion with high-level, predominantly discrete representations of meaning. Doing so has the potential to endow synthesis output with human-understandable high-level explanations, while simultaneously improving the ability to attach probabilities to semantic representations. The bidirectionality also allows us to create efficient mechanisms for explainability, and to inspect and enforce fairness in the models.
The partner research groups bring complementary expertise to the project. KTH has extensive experience with probabilistic deep learning for analysis and synthesis of human verbal and nonverbal communication. Umeå University contributes expertise in transducer and grammar models for generating semantic graphs, and has recently started to apply these to the task of parsing multimodal data. It also contributes experience with bias analysis and mitigation. Linköping University complements these aspects with in-depth knowledge of natural language processing, language being a discrete yet observable signal modality of great interest for bridging the two ends of the project.
In addition to its scientific value, the project is expected to have a substantial societal impact. The resulting technologies may, for example, be used to create virtual patients for medical training, to model non-playable characters in video games, and to derive affective states and underlying health issues from human speech and nonverbal behaviour. Read more about the NEST project at:
Possible research directions for the postdoc are:
- Working with the latest deep generative models, such as diffusion models and normalising flows
- Combining discrete and continuous methods for synthesis and/or analysis
The project is conducted within the research group for the Foundations of Language Processing at Umeå University. The group studies theoretical and practical aspects of representing language on computers, and its interconnection with other sources of information. The work of the group ranges from formal language theory to applied natural language processing. The group consists of five senior researchers and five PhD students. More information is available at:
https://www.umu.se/en/research/groups/foundations-of-language-processing/