Topological Data Analysis

The course in Topological Data Analysis, 6 credits, is held in Autumn 2019. It is an elective course. More information about dates, location and the teachers responsible for the course will be given later.

How can we give a machine a sense of geometry? A sense has two aspects: a technical tool and the ability to learn to use it. This learning ability is essential. For example, we are born with the technical ability to detect smells, and throughout our lives we develop it depending on our needs and the environment around us. In this course, the technical tool we introduce to describe geometry is based on homology. The main aim of the course is to explain how versatile this tool is and how to use this versatility to give a machine the ability to learn to sense geometry.

Technical tool
Homology, a central theme of 20th-century geometry, has been particularly useful for studying spaces with controllable cell decompositions, such as Grassmann varieties. During the last decade there has been an explosion of applications, ranging from neuroscience to vehicle tracking, protein structure analysis and the nano-characterization of materials, which testifies to the usefulness of homology for describing spaces related to data sets as well. One might ask: why homology? Data is often very hard to understand because of heterogeneity or the presence of noise. In these cases, rather than trying to fit the data with complicated models, a good strategy is to first investigate its shape properties. This is where homology comes into play.

Learning
We explain how to use homology to convert the geometry of datasets into features suitable for statistical analysis and machine learning. This process translates spatial and geometrical information into information that can be analysed through more basic operations such as counting and integration. Furthermore, we provide an entire space of such translations. Learning how to choose an appropriate translation in this space can be done in the spirit of machine learning.
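As a rough illustration of what "counting and integration" can mean here, the following sketch takes a barcode (a list of intervals, each recording the scale at which a homological feature appears and disappears) and turns it into two numbers. The barcode values and the function names betti_count and betti_integral are hypothetical, chosen only for this example; the course itself may use different translations.

```python
# Hypothetical barcode: each pair is (birth, death) of one homological feature.
bars = [(0.0, 1.0), (0.0, 2.0), (0.0, 4.0)]


def betti_count(bars, t):
    """Counting: number of features alive at scale t (birth <= t < death)."""
    return sum(1 for birth, death in bars if birth <= t < death)


def betti_integral(bars):
    """Integration: area under the curve t -> betti_count(bars, t).

    For finite bars this area equals the sum of the bar lengths.
    """
    return sum(death - birth for birth, death in bars)


count_at_scale = betti_count(bars, 1.5)
total_area = betti_integral(bars)
print(count_at_scale, total_area)
```

Both outputs are plain numbers, so they can be used directly as entries of a feature vector for statistical analysis.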

Contact persons 

  • Wojciech Chachólski, wojtek at kth.se
  • Florian Pokorny, fpokorny at kth.se
  • Martina Scolamiero, scola at kth.se

 

Module #1 (12-13 September, KTH)
Using homology to encode spatial and geometric information by collections of vector spaces.

We will present the interplay between structures on sets, such as partitions, pseudo-metrics, simplicial complexes and dendrograms, and parametrised vector spaces.
We show how parametrised homology is a natural generalization of hierarchical clustering.
The introduction to homology will be unconventional. Prior knowledge of algebraic topology is in fact not necessary to follow the course. Practical tools will be offered to build up an understanding of homology through computer calculations and experiments.
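The link between hierarchical clustering and degree-zero homology can be explored with a small experiment. The sketch below is a minimal illustration, not the course's own software: it computes the scales at which connected components of a point cloud merge, using a union-find structure over edges sorted by length. These merge scales are the heights of the single-linkage dendrogram and, at the same time, the scales at which degree-zero homology classes disappear. The sample points and the function names h0_merge_heights and components_at are assumptions made for this example.

```python
import itertools
import math


def h0_merge_heights(points):
    """Return, in increasing order, the distances at which components merge."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    heights = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at scale d: one degree-zero class ends here.
            parent[ri] = rj
            heights.append(d)
    return heights


def components_at(heights, n_points, t):
    """Number of connected components at scale t (rank of degree-zero homology)."""
    return n_points - sum(1 for h in heights if h <= t)


# Hypothetical sample: two pairs of nearby points, far from each other.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.0, 2.0)]
heights = h0_merge_heights(points)
print(heights)
print([components_at(heights, len(points), t) for t in (0.5, 3.0, 4.0)])
```

Varying the scale t and recording the number of components gives a simple example of a parametrised invariant; higher-degree homology extends the same idea from components to loops and voids.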

The meeting will end with a discussion of some software packages, with a focus on the challenges of large-scale computations.

Module #2 (24-25 October, KTH)
Stability of homological invariants.

We will present how functions with positive values lead to metrics on parametrised vector spaces, and explain how such metrics lead to stable descriptors of parametrised homology via a process called hierarchical stabilisation. In this way, the question of how effectively geometrical information can be extracted from data is converted into metric learning. The content of this meeting is based on our own current research.

Module #3 (14-15 November, KTH)
Learning with homological invariants.

During this meeting we illustrate how to analyse data using feature vectors obtained from hierarchical stabilisation. We focus primarily on classification tasks. We present statistical properties of these methods, including their robustness with respect to re-sampling. The meeting will end with a discussion of research frontiers in understanding the geometry of data in a supervised learning context.