WARA Media and Language

The WASP Research Arena for Media and Language (WARA M&L) builds a multidisciplinary ecosystem that bridges scientific fields and industrial sectors.

Our Objectives

The arena builds strong partnerships through collaborative projects in Media AI, leveraging ongoing investments in media infrastructure to evaluate emerging technologies and promote cross-industry knowledge sharing. Our research is centered on the generation and analysis of media data, as well as understanding its broader societal impacts. To accelerate advancements through various technology-readiness levels, we offer comprehensive support in data management, benchmarking, and engineering.

Photo: Peter Karlsson, Svarteld form & foto

 

Research Focus Areas

Through active dialogue with our community, we’ve identified key areas of research: Embodied Machine Learning, Graph-Based Models, Multimodal Foundation Models, and Interactive and Creative AI.

Embodied machine learning

Machine learning models in robotics and autonomous systems are often trained on small, task-specific datasets and struggle with skill transfer to new tasks. In contrast, foundation models, pretrained on large datasets, show better generalization and can solve problems not directly represented in their training data. These models, especially when multimodal, have the potential to significantly improve robot autonomy, from perception and human-robot interaction to planning. Vision-language models, for example, enhance visual recognition and generalizable action planning. Furthermore, robots that autonomously interact with their environment and update their models through real-time data collection are key to building more informed foundation models. A combination of reinforcement learning, representation learning, and language grounding could help solve many current challenges. The arena benefits from expertise in this area through its collaboration with Danica Kragic and KTH’s Robotics, Perception, and Learning lab, and sees potential for further interaction with WARA Robotics.

Graph-based models

Generative AI is a popular research area, with diffusion models and normalizing flows being applied to diverse domains such as language, images, source code, gestures, and music. However, current methods often generate media without a symbolic representation, for example raster images instead of vector-based ones, which limits user flexibility and makes it harder for systems to maintain semantics when editing images. For example, generative systems like Midjourney may misinterpret images when asked to create variations, leading to distorted results. To address this, a two-step generation process, first producing a graph-based representation and then the surface form, could improve accuracy. Key researchers in this area include Frank Drewes, Anastasia Varava, Henrik Björklund, and Ruibo Tu.

Multimodal foundation models

The recent collaboration with RISE, NVIDIA, and AI Sweden on Large Language Models (LLMs) has been valuable for both practical and academic insights, particularly around the GPT-SW3 model series (access the model here). The project has deepened understanding of user needs and challenges, with many favoring model-agnostic systems to integrate the best cost-performance LLMs. Some organizations, however, may require private, cloud-based instances to protect intellectual property. Public bodies with sensitive data, such as the Swedish Tax Agency and Swedish Armed Forces, need open models that can be hosted on premises. Looking ahead, the focus is on developing small to medium-sized foundation models, especially for multimodal data like time-series or graphs, where significant scientific and practical gains are expected. Ongoing collaboration with AI Sweden will complement this by providing larger, more versatile models. Key researchers in this area include Love Börjesson (KB Labs) and Marco Kuhlmann (LiU), with an emphasis on attracting international talent.

Interactive and Creative AI

AI is fundamentally transforming how we interact with data, necessitating the parallel development of the fields of Human-Computer Interaction and User Experience. One set of challenges concerns Trustworthy and Explainable AI. For instance, there is the question of how to convey to the user the information and assumptions on which an AI bases its decisions, and how to help the user understand which tasks fall outside the scope of the AI’s capabilities.

Another important question is human-AI teaming, which requires effective methods for interpreting and controlling semi-automatic systems. The context can be autonomous mining or forestry, but can also encompass gaming and media production workflows. To this end, we want to collaborate with international contacts acquired through the Gaming Stream, and with the researchers linked to WASP-HS. Arena members with specific expertise in this domain include Gustav Eje Henter and Konrad Tollmar. Relevant partners include Electronic Arts, King, and Motorica, with which we organize events and projects, including the GENEA Challenge.

Photo: Peter Karlsson, Svarteld form & foto

 

Our Community 

At WARA Media & Language, our community is a vibrant, inclusive, and collaborative network that brings together industry experts, academic researchers, and PhD students. We believe in the power of diverse perspectives, and our community thrives on the exchange of ideas across these sectors. Throughout the year, we host a variety of seminars and conferences that foster deeper engagement and knowledge sharing. Two key recurring events are the WASP Summer School on Generative AI, where participants explore the latest advancements in AI, and the WARA Community Days, which serve as a hub for our community to connect, collaborate, and dive deeper into cutting-edge research and applications in Media AI.

To join our community, sign up here. You will also be added to our email list, so that you get the latest news and opportunities in the arena.

Photo: Peter Karlsson, Svarteld form & foto

 

Opportunities for PhD Students

We host an annual WASP Summer School on Generative AI at the Visualization Center in Norrköping. This event features expert lectures on generative language models, invertible neural networks, and speech synthesis. Students also collaborate on designing prototype avatars, with final presentations held in the 3D dome theatre at the Visualization Center. This week-long event offers both in-depth knowledge and valuable networking opportunities with peers, professors, and industry experts.

We also offer various workshops and events in collaboration with our partners.

Video: Peter Karlsson, Svarteld form & foto

 

WARA Media & Language Podcast

Stay updated on the latest AI research in Media, Language, and Gaming by tuning into the WARA Media & Language podcast. Hear insights from industry leaders, tech companies, and startups within our community.

Collaborate with us

We welcome collaborations with organizations and individuals dedicated to advancing AI in Media and Language. Our network includes both established enterprises and international initiatives like Mila in Montreal and the British DPP.

Contact: johanna@cs.umu.se to explore collaborative opportunities.

Stay Connected: Follow us on LinkedIn to stay updated on upcoming events and opportunities.

Video: Peter Karlsson, Svarteld form & foto

 

The Core Team

Johanna Björklund, Project Manager, Umeå University & Codemill AB

Sandor Albrecht, Co-project Manager, KAW

Ivana von Proschwitz, Community Manager, WARA Media & Language

Anastasia Varava, Data Scientist, SEB

Konrad Tollmar, Research Director, EA Games / Associate Professor, KTH Royal Institute of Technology

Gustav Eje Henter, Assistant Professor, KTH Royal Institute of Technology

Jonas Unger, Professor, Linköping University

Alexandra Kafka Larsson, CEO, Parsd

Contact

Ivana von Proschwitz

WARA Media & Language Community Manager

Johanna Björklund

Project Manager WARA Media & Language, Adj. member AMG, Assoc. Professor, Department of Computing Science, Umeå University, Co-founder, Adlede and Codemill