Seamlessly integrating simulation models with machine learning can take AI systems to a whole new level. This is what we set out to do in the cross-disciplinary project __main__.
Our aim is to develop new methodology for systematically combining data-driven and simulation-based models at an unprecedented level of integration, thereby bridging the gap between the physical and virtual worlds.
AI systems, in particular machine learning and algorithmic decision making, are rapidly becoming important components in virtually all aspects of technology and exhibit an increasing societal impact. Although data-driven and learning-based algorithms and systems have paved the way for disruptive technological breakthroughs, several challenges remain unsolved.
One central challenge is to exploit the benefits of classical approaches, such as engineered models and simulators, as a complement to or in combination with data-driven models. In other words, future research is required to develop models and methods for the systematic integration of data from both the real and virtual worlds in machine learning frameworks.
Three technological challenges are tightly coupled to the research in __main__:
- Zero latency augmented reality (AR)
- Zero artifact data augmentation (DA)
- Zero failure hybrid modelling (HM)
Read more about these challenges in the scientific presentation below.
The research in __main__ is characterized by N-E-S-T
The integration of physical and virtual worlds has the potential to be the next disruptive step in diverse applications of AI and autonomous systems, ranging from augmented reality experiences for human consumption, to data generation for machine learning, and reliable predictive modeling of dynamical and spatio-temporal processes.
This NEST gathers research groups in computer vision, computer graphics, human-system interaction, and statistical machine learning, forming a unique combination of expertise required to address the challenge posed by alignment and integration of physical and virtual worlds.
The NEST itself forms a cross-disciplinary environment with strong links to various application domains that will all benefit from the basic methodological research. Together with our partners IKEA Communications, Arriver, the Swedish Transport Administration, AI Sweden, and WASP WARA Media and Language, we will explore different applications including product visualization, generation of synthetic data for autonomous vehicles, and simulation of traffic scenarios.
The __main__ core team consists of researchers from the Computer Graphics and Image Processing group at Linköping University, the Computer Vision Laboratory at Linköping University, the division for Statistics and Machine Learning at Linköping University, and the AASS Machine Perception and Interaction Lab at Örebro University.
The aim of __main__ is to establish a new modeling paradigm for end-to-end learning and systematic integration of data and simulators. The three technological challenges approach the problem from three different vantage points and present fundamental theoretical research questions that overlap and span across them all.
The ability to seamlessly insert virtual objects into streaming video sequences, AR, is a holy grail application intersecting the fields of computer vision and graphics. Current solutions suffer from limited segmentation and tracking accuracy, temporal discontinuity, and lags in rendering. Eliminating the delays and temporal instability of object alignments in current AR systems requires predicting the observer’s pose, motion, attention, and quality expectation.
Here we take a machine learning approach to predict those aspects of human-system interaction to attain zero latency. Driven by weakly annotated user experience, the whole process of segmentation, geometric and photometric prediction, and rendering is trained to minimize perceivable lags and spatio-temporal misalignments. This results in a seamless integration of traditional and neural rendering.
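As a toy illustration of the latency-compensation idea (our own minimal sketch, not the project's method), the simplest baseline a learned predictor must beat is constant-velocity extrapolation of the observer's pose: render the virtual object not where the observer is now, but where they will be when the frame reaches the display. The function name and pose layout below are illustrative assumptions.

```python
# Minimal sketch: constant-velocity extrapolation of observer pose to
# compensate for a known rendering latency. A learned predictor would
# replace this, conditioned on motion history, attention, and expected
# quality; all names here are illustrative, not from the project.

def predict_pose(pose_prev, pose_curr, dt, latency):
    """Linearly extrapolate each pose component (e.g. x, y, z, yaw)
    ahead by `latency` seconds, given two samples `dt` seconds apart."""
    return [c + (c - p) / dt * latency
            for p, c in zip(pose_prev, pose_curr)]

# Observer moving at 1 unit/s along x, sampled at 100 Hz; render the
# frame for the pose expected 50 ms in the future.
future = predict_pose([0.0, 0.0], [0.01, 0.0], dt=0.01, latency=0.05)
print(future)
```

A trained model replaces the linear rule with a prediction that also accounts for head-motion dynamics and where the observer is likely to look, which is what makes "zero latency" plausible beyond very short horizons.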
To take data augmentation to the next level, we are developing frameworks for mixing real data with virtual objects. For instance, in autonomous driving and active safety systems, data augmentation makes it possible to efficiently populate captured training data sets with complex traffic scenarios, including scenarios, locations, or rare events that cannot easily and safely be captured in real footage. The augmentation needs to determine which features are relevant for learning, in order to control the trade-off between rendering complexity and achieved performance.
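The basic operation behind mixing real footage with rendered content is per-pixel alpha compositing of the virtual object over the captured frame. The sketch below is a deliberately simplified grayscale version (our assumption for illustration; a real pipeline also handles geometry, lighting, and sensor noise).

```python
# Illustrative sketch (not the project's pipeline): alpha-compositing a
# rendered virtual object onto a captured frame, the core operation in
# augmenting real training data with synthetic content.

def composite(real, virtual, alpha):
    """Per-pixel blend: alpha=1 keeps the virtual object, alpha=0 keeps
    the real background. Inputs are flat lists of grayscale intensities
    in [0, 1]; a real system would operate on RGB image tensors."""
    return [a * v + (1 - a) * r
            for r, v, a in zip(real, virtual, alpha)]

frame  = [0.2, 0.4, 0.6]   # captured background pixels
render = [1.0, 1.0, 1.0]   # rendered object (white)
mask   = [0.0, 0.5, 1.0]   # object coverage per pixel
print(composite(frame, render, mask))
```

"Zero artifact" augmentation amounts to making this blend, together with the rendering that produces `render` and `mask`, indistinguishable from real footage with respect to the features the downstream learner actually uses.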
Spatio-temporal prediction is a central task in many domains, e.g., epidemiology, climatology, and transportation systems, usually addressed by either data-driven methods or sophisticated simulation models, both suffering from respective failure modes: Simulation is limited by the underlying (inevitably simplifying) assumptions, and machine learning degrades in boundary cases with scarce data or non-stationary data distributions.
We therefore aim at optimally integrating both approaches to achieve zero failure. In principle, the system will take both the real data and the simulator as inputs and learn a predictor end-to-end. This involves learning if and to what extent the different components can be trusted, and whether the integrated information is in itself sufficient for accurate and robust predictions.
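The simplest instance of learning "how much each component can be trusted" is fitting a single blend weight between the simulator's prediction and the data-driven one by least squares. This is a minimal sketch under our own assumptions, not the project's actual method, which targets end-to-end learning of the full integration.

```python
# Minimal hybrid-modelling sketch (illustrative assumption, not the
# project's method): learn a scalar trust weight w that blends a
# simulator's predictions with a data-driven model's predictions.

def fit_blend_weight(sim_preds, data_preds, targets):
    """Closed-form w minimizing sum((w*s + (1-w)*d - y)^2).
    Rewriting the blend as d + w*(s - d) makes this a 1-D
    least-squares problem in the difference (s - d)."""
    num = sum((s - d) * (y - d)
              for s, d, y in zip(sim_preds, data_preds, targets))
    den = sum((s - d) ** 2 for s, d in zip(sim_preds, data_preds))
    return num / den if den else 0.5  # 0.5 if both models agree everywhere

def blend(w, sim_pred, data_pred):
    """Combine one pair of predictions with the learned weight."""
    return w * sim_pred + (1 - w) * data_pred

# Simulator biased high, data model biased low; the fitted weight
# lands in between so the blend corrects both biases.
w = fit_blend_weight([1.2, 2.2, 3.2], [0.9, 1.9, 2.9], [1.0, 2.0, 3.0])
print(w, blend(w, 1.2, 0.9))
```

The research question is what replaces the scalar `w`: a context-dependent, learned trust estimate that degrades gracefully where data is scarce or the simulator's assumptions break down.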
Key research challenges
To enable this leap forward, data from the physical world and the models from the virtual worlds need to be tightly integrated. This integration must consider spatial and temporal alignment, physical constraints, and incorporation of user-centric, external, or unmodeled factors.
This requires addressing a number of key challenges such as prediction over time, determination of context in complex, open, and dynamic environments, modeling of human interaction, and estimation of non-deterministic or indeterminable physical processes. The main multi-disciplinary scientific challenge is how to learn the entire alignment and integration process end-to-end under weak, user-centric supervision.
A key challenge central to all research and development is how to meet the UN's Sustainable Development Goals (SDGs). Our NEST relates to, and has potential impact on, several of the 17 SDGs.
- The applications in hybrid modelling for traffic planning and road safety estimation relate to SDG 11.2 on safe and sustainable transportation systems.
- The AR applications for "at home" visualization of furniture and living space optimization relate to SDG 11.3 on urbanization.
- With a foundation in the research within __main__, we are collaborating with climate researchers to develop AI-powered decision support systems in support of SDG 13, by increasing knowledge about climate change and global warming, and by providing researchers and decision-makers with tools for visualization and analysis.
Work with us
We are currently seeking PhD students for this project:
- PhD student in Computer Vision
- PhD student in Computer Science
- PhD student in Computer Graphics and Machine Learning
As a PhD student in WASP you participate in the WASP Graduate School. It is a national graduate school with a strong focus on scientific excellence. A key value is also supporting networking with industry and academia, both nationally and internationally.