The aim of the AI/MATH area is to provide a mathematical theory of the fundamental building blocks of AI. Recently, the fast progress in AI/ML has, to a large extent, been based on heuristic best practices obtained from result-driven engineering. Mathematical understanding and justification have not been able to keep up with these rapid developments.
Our aim is to develop and formulate the underlying mathematical concepts and theorems that can explain this progress. Distilling the mathematical ideas and mechanisms behind different successful applications will lead to a deeper understanding of the field and will promote interdisciplinary cross-fertilization.
The coming recruitments will be instrumental in shaping the final program. We see several possible research directions that are relevant to the mathematical foundations, for example:
Verifiable, rigorous methods: This is a challenge as AI is used for increasingly important tasks, some of which may be a matter of life and death. Without a thorough understanding of the inner workings of AI, it is impossible to predict a model’s behavior on previously untested input data. Developing a firm theoretical foundation of reachability analysis for AI is a challenge for mathematics.
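To make the notion of reachability analysis concrete, the following is a minimal sketch of interval bound propagation, one simple technique of this kind. It pushes a box of inputs through a small ReLU network and returns a box guaranteed to contain every reachable output. The network weights, the input box, and all function names are illustrative assumptions, not taken from any particular system.

```python
def affine_bounds(lower, upper, weights, bias):
    """Propagate an input box through y = W x + b with interval arithmetic."""
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps the lower input bound to the lower
            # output bound; a negative weight swaps the roles.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def network_bounds(lower, upper, layers):
    """Bound the outputs of a ReLU network (no ReLU after the last layer)."""
    for index, (weights, bias) in enumerate(layers):
        lower, upper = affine_bounds(lower, upper, weights, bias)
        if index < len(layers) - 1:
            lower, upper = relu_bounds(lower, upper)
    return lower, upper


def network_output(x, layers):
    """Evaluate the same network at a single input point."""
    for index, (weights, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, bias)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


# Hypothetical two-layer network: 2 inputs -> 2 hidden units -> 1 output.
LAYERS = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, -1.0]),
    ([[1.0, 1.0]], [0.0]),
]

low, high = network_bounds([-1.0, -1.0], [1.0, 1.0], LAYERS)
print("guaranteed output range:", low, high)
print("output at (1, -1):", network_output([1.0, -1.0], LAYERS))
```

For this toy network the computed box is sound but not tight: the largest output actually attained on the input box is 2, while the propagated upper bound is 3.5. Closing that gap for realistic models, at acceptable cost, is one concrete form of the open problem.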
Reproducibility/robustness: Large-scale numerical computations that are performed on heterogeneous platforms are known to be non-deterministic, producing different results when repeatedly run with the same initial data. Large AI models share this problem. How can we find an appropriate standard for reproducibility? What is a correct measure to use? Which properties will guarantee that a system produces reproducible results, and which properties will prevent this? How robust is a model to small perturbations of its parameters, or to small changes in the training data? These questions are important to address, and require fundamental research.
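One well-understood source of such non-determinism is that floating-point addition is not associative, so a parallel reduction whose summation order varies between runs can return different results. The sketch below illustrates this with three hand-picked values; the helper name `sum_in_order` is an assumption made for illustration only.

```python
import math


def sum_in_order(values, order):
    """Add the values one at a time in the given index order."""
    total = 0.0
    for index in order:
        total += values[index]
    return total


values = [1e16, 1.0, -1e16]

# Adding 1.0 to 1e16 is lost to rounding, so the order matters.
first = sum_in_order(values, [0, 1, 2])   # (1e16 + 1.0) - 1e16
second = sum_in_order(values, [0, 2, 1])  # (1e16 - 1e16) + 1.0

# math.fsum tracks partial sums exactly and is independent of order.
exact = math.fsum(values)

print(first, second, exact)
```

Here the two orderings give 0.0 and 1.0, while `math.fsum` returns 1.0 regardless of order. Compensated or exactly rounded summation is one possible remedy, at a computational cost, and it addresses only this particular source of irreproducibility.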
Optimization: Here the AI community is facing large challenges. Today, the workhorse of training algorithms is gradient descent. Despite thousands of academic papers on this topic, and countless ways of applying the method, the process still relies on trial and error. Matters are not helped by the apparent disconnect between researchers in the field of AI and large parts of the optimization community. In AI there seems to be very little use of the highly advanced theory of nonlinear constrained optimization already developed within the optimization community. Promoting a transfer of knowledge to the field of AI would certainly improve matters.
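The sensitivity behind this trial and error can already be seen on the simplest possible problem. The sketch below runs plain gradient descent on the one-dimensional quadratic f(x) = 0.5 · c · x², where each step multiplies x by (1 − η·c), so the iteration converges only if the learning rate η lies below 2/c. The function name, the curvature c = 10, and the learning rates are illustrative assumptions.

```python
def gradient_descent(curvature, learning_rate, start=1.0, steps=50):
    """Minimize f(x) = 0.5 * curvature * x**2 by plain gradient descent."""
    x = start
    for _ in range(steps):
        gradient = curvature * x       # f'(x) = curvature * x
        x -= learning_rate * gradient  # x <- (1 - learning_rate * curvature) * x
    return x


# With curvature 10 the stability threshold is 2 / 10 = 0.2.
for learning_rate in (0.05, 0.15, 0.25):
    final_x = gradient_descent(curvature=10.0, learning_rate=learning_rate)
    print(f"learning rate {learning_rate}: x after 50 steps = {final_x:.3e}")
```

The first two learning rates drive x toward the minimizer at 0, while the third, just above the threshold, makes the iterates grow without bound. For a quadratic this threshold is known exactly; for the nonconvex losses used in practice, the local curvature changes during training and is generally not known in advance, which is part of why step sizes are tuned empirically.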
Complexity: In short, this boils down to the question “What problems can we learn to solve, and at what cost?” Understanding the computational complexity of a given class of problem instances helps us find, in a quantifiable manner, a good balance between training set size, model complexity, and prediction quality. Complexity results in terms of worst-case and average-case scenarios will lead to a deeper understanding of the absolute potential of methods used for AI. Here, classical approximation theory has a large role to play. Applying this large field of mathematics to AI will open up new possibilities and can provide the necessary tools for solving these problems.
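As a small illustration of the kind of quantitative statement approximation theory provides, consider approximating f(x) = x² on [0, 1] by a piecewise-linear interpolant with n equal pieces, a function class that a one-hidden-layer ReLU network can represent. The classical interpolation estimate gives a worst-case error of exactly 1/(4n²) for this target, so doubling the model size reduces the error by a factor of four. The sketch below checks this numerically; the function name and the choice of target are assumptions made for illustration.

```python
def max_interpolation_error(num_pieces, samples_per_piece=100):
    """Largest gap between x**2 and its piecewise-linear interpolant on [0, 1]."""
    width = 1.0 / num_pieces
    worst = 0.0
    for k in range(num_pieces):
        left, right = k * width, (k + 1) * width
        for j in range(samples_per_piece + 1):
            t = j / samples_per_piece
            x = left + t * (right - left)
            # Straight line through (left, left**2) and (right, right**2).
            interpolant = (1 - t) * left**2 + t * right**2
            worst = max(worst, abs(interpolant - x**2))
    return worst


for n in (2, 4, 8, 16):
    measured = max_interpolation_error(n)
    predicted = 1 / (4 * n**2)
    print(f"n = {n:2d}: measured {measured:.6f}, predicted {predicted:.6f}")
```

Results of this type relate model complexity to achievable accuracy for a given function class. They say nothing yet about how much data or computation is needed to find the approximant, which is where the complexity questions above come in.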