Organizers: Valentina Disarlo, Diaaeldin Taha
Location: The Seminar takes place in Seminar Raum C (Mathematikon) or online via Zoom. Contact us or join the HEGL Mailing List to get the Zoom coordinates.
23.01.23 – Linear Causal Disentanglement
Speaker: Anna Seigal (Harvard)
Time: 15.15–16.15 (Zoom) – note the unusual time
Abstract: Causal disentanglement seeks a representation of data involving latent variables that relate to one another via a causal model. We consider linear causal disentanglement: observed variables that are a linear transformation of a linear latent causal model. The setup is identifiable if the linear transformation and the latent causal model are unique. We show that one intervention on each latent variable is sufficient and, in the worst case, necessary for identifiability. Based on joint work with Chandler Squires and Caroline Uhler.
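As an illustrative aside (not from the paper): the setup described above can be simulated in a few lines. The matrices `A` (latent causal weights) and `G` (mixing transformation) below are invented toy values; a "perfect" intervention on a latent variable severs its incoming causal edges and pins its value.

```python
import random

random.seed(0)
d = 3
# Toy latent causal model (strictly lower-triangular, so z0 -> z1 -> z2):
A = [[0.0, 0.0, 0.0],
     [0.8, 0.0, 0.0],
     [0.0, 0.5, 0.0]]
# Toy linear transformation from latents to observed variables:
G = [[1.0, 0.2, 0.0],
     [0.0, 1.0, 0.3],
     [0.4, 0.0, 1.0]]

def sample_latent(intervene_on=None, value=0.0):
    """Ancestral sampling of Z in topological order; an intervention on
    latent i severs its incoming edges and pins it to `value`."""
    z = [0.0] * d
    for i in range(d):
        if i == intervene_on:
            z[i] = value
        else:
            z[i] = sum(A[i][j] * z[j] for j in range(i)) + random.gauss(0, 1)
    return z

def observe(z):
    """Observed variables X = G @ Z."""
    return [sum(G[i][j] * z[j] for j in range(d)) for i in range(d)]

x = observe(sample_latent())                          # observational sample
x_int = observe(sample_latent(intervene_on=1, value=2.0))  # interventional sample
```

The identifiability question is then: given enough observational and interventional samples of `x`, can `G` and `A` be recovered uniquely? The result above says one intervention per latent suffices.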
09.01.23 – Emergence of Lie Symmetries in Functional Architectures Learned by CNNs
Speaker: Noemi Montobbio (IIT)
Time: 14.15–15.15 (Zoom)
Abstract: Convolutional Neural Networks (CNNs) are a powerful tool providing outstanding performance on image classification tasks, based on an architecture designed in analogy with information processing in biological visual systems. The functional architectures of the early visual pathways have often been described in terms of geometric invariances, and several studies have leveraged this framework to investigate the analogies between CNN models and biological mechanisms. Remarkably, upon training on natural images, the translation-invariant filters of the first layer of a CNN have been shown to develop as approximate Gabor functions, resembling the orientation-selective receptive profiles found in the primary visual cortex (V1). With a similar approach, we modified a standard CNN architecture to insert computational blocks compatible with specific biological processing stages, and studied the spontaneous development of approximate geometric invariances after training the network on natural images. In particular, inserting a pre-filtering step mimicking the Lateral Geniculate Nucleus (LGN) led to the emergence of a radially symmetric profile well approximated by a Laplacian of Gaussian, which is a well-known model of receptive profiles of LGN cells. Moreover, we introduced a lateral connectivity kernel acting on the feature space of the first network layer. We then studied the learned connectivity as a function of relative tuning of first-layer filters, thus re-mapping it into the roto-translation space. This analysis revealed orientation-specific patterns, which we compared qualitatively and quantitatively with established group-based models of V1 horizontal connectivity.
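As an illustrative aside (not code from the talk): the Laplacian-of-Gaussian profile mentioned above, the classical model of LGN receptive fields, is easy to write down explicitly. The kernel size and `sigma` below are arbitrary illustrative choices.

```python
import math

def log_kernel(size=7, sigma=1.0):
    """Discrete Laplacian-of-Gaussian filter: the radially symmetric
    centre-surround profile used as a classical model of LGN cells."""
    c = size // 2
    k = [[((x * x + y * y - 2 * sigma * sigma) / sigma**4)
          * math.exp(-(x * x + y * y) / (2 * sigma * sigma))
          for x in range(-c, c + 1)] for y in range(-c, c + 1)]
    # Subtract the mean so the filter has zero DC response, as the
    # continuous LoG does (it integrates to zero).
    m = sum(map(sum, k)) / size**2
    return [[v - m for v in row] for row in k]

k = log_kernel()
# The kernel is radially symmetric with a negative centre and a
# positive surround: the centre-surround antagonism of LGN cells.
```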
19.12.22 – Topological Graph Neural Networks
Speaker: Edward de Brouwer (KU Leuven)
Time: 14.15–15.15 CET (Zoom)
Abstract: Graph neural networks (GNNs) are a powerful architecture for tackling graph learning tasks, yet have been shown to be oblivious to prominent substructures such as cycles. In this talk, we introduce TOGL, a novel layer that incorporates global topological information of a graph using persistent homology. TOGL can be easily integrated into any type of GNN and is strictly more expressive (in terms of the Weisfeiler–Lehman graph isomorphism test) than message-passing GNNs. Augmenting GNNs with TOGL leads to improved predictive performance for graph and node classification tasks, both on synthetic data sets, which can be classified by humans using their topology but not by ordinary GNNs, and on real-world data.
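As an illustrative aside (this is not TOGL itself): the basic ingredient, persistent homology of a node-filtered graph, can be sketched in dimension 0 with a union-find. Each vertex is born at its filtration value; when an edge merges two connected components, the younger one dies (the "elder rule"). Cycle-creating edges, which TOGL also tracks as 1-dimensional features, are skipped in this minimal sketch.

```python
def zeroth_persistence(values, edges):
    """Birth/death pairs of connected components for a graph whose
    vertices carry filtration values (edges appear at the max of
    their endpoints' values)."""
    parent = list(range(len(values)))

    def find(i):
        # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Process edges in the order they enter the filtration.
    for u, v in sorted(edges, key=lambda e: max(values[e[0]], values[e[1]])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # edge closes a cycle: a 1-dim feature, ignored here
        if values[ru] > values[rv]:
            ru, rv = rv, ru  # ru is now the elder (earlier-born) root
        # The younger component dies at this edge's filtration value.
        pairs.append((values[rv], max(values[u], values[v])))
        parent[rv] = ru
    return pairs
```

On a path graph with node values 0, 1, 2, for instance, `zeroth_persistence([0, 1, 2], [(0, 1), (1, 2)])` pairs each later-born vertex with the edge that absorbs it into the oldest component.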
12.12.22 – Machine Learning with Mathematicians
Speaker: Alex Davies (DeepMind)
Time: 13.00–14.00 (Zoom) – note the unusual time
Abstract: Can machine learning be a useful tool for research mathematicians? There are many examples of mathematicians pioneering new technologies to aid our understanding of the mathematical world: using very early computers to help formulate the Birch and Swinnerton-Dyer conjecture and using computer aid to prove the four colour theorem are among the most notable. Up until now, there hasn’t been significant use of machine learning in the field and it hasn’t been clear where it might be useful for the questions that mathematicians care about. In this talk, we will discuss how we worked with top mathematicians to use machine learning to achieve two new results: proving a new connection between the hyperbolic and geometric structure of knots, and conjecturing a resolution to a 50-year-old problem in representation theory, the combinatorial invariance conjecture. Through these examples, we demonstrate a way that machine learning can be used by mathematicians to help guide the development of surprising and beautiful new conjectures.
21.11.22 – ML for Pure Math
Speaker: Jim Halverson (Northeastern University, Boston)
Time: 14.15–15.15 (Zoom)
Abstract: Progress in machine learning (ML) is poised to revolutionize a variety of STEM fields. But how could these techniques — which are often stochastic, error-prone, and black-box — lead to progress in pure mathematics, which values rigor and understanding? I will exemplify how ML can be used to generate conjectures in a Calabi-Yau singularity problem that is relevant for physics, and will demonstrate how reinforcement learning can yield truth certificates that rigorously demonstrate properties of knots. The second half of the talk will utilize ML theory instead of applied ML. Specifically, I will develop a neural tangent kernel theory appropriate for flows in the space of metrics (realized as neural networks), and will realize Perelman’s formulation of Ricci flow as a specialization of the general theory.
07.11.22 – Benchmarking Machine Learning Methods Using OpenML and mlr3
Speaker: Sebastian Fischer (LMU Munich)
Time: 14.15–15.15 (Zoom)
Abstract: Benchmark studies are an integral part of machine learning research. The two main components are the datasets that are used to compare the methods and the software that supports the researcher in carrying out the experiment. OpenML is a platform for sharing datasets and machine learning results and is a great tool to obtain datasets.
mlr3 is an ecosystem of machine learning packages in the R language, which among other things allows for benchmarking algorithms with ease. This presentation will show how OpenML and mlr3 can be used together to make benchmarking machine learning methods as easy as possible, by using the interface R package mlr3oml.
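As an illustrative aside: mlr3 and mlr3oml are R packages, but the benchmarking pattern they support — crossing a set of learners with resampling folds over a task and averaging a performance measure — is language-agnostic. Below is a minimal standard-library Python caricature of that design; the toy task (two 1-D Gaussian blobs) and the two learners are invented for illustration and have nothing to do with the actual mlr3 API.

```python
import random
from statistics import mode

random.seed(1)
# Invented toy task: two 1-D Gaussian blobs, one per class.
X = [random.gauss(0, 1) for _ in range(50)] + [random.gauss(3, 1) for _ in range(50)]
y = [0] * 50 + [1] * 50

def majority(train_X, train_y, test_X):
    """Featureless baseline: always predict the most common training label."""
    m = mode(train_y)
    return [m for _ in test_X]

def one_nn(train_X, train_y, test_X):
    """1-nearest-neighbour classifier in 1-D."""
    return [train_y[min(range(len(train_X)), key=lambda i: abs(train_X[i] - x))]
            for x in test_X]

def benchmark(learners, X, y, folds=5):
    """Cross each learner with a k-fold resampling and report mean accuracy."""
    idx = list(range(len(X)))
    random.shuffle(idx)
    scores = {name: [] for name in learners}
    for f in range(folds):
        test = set(idx[f::folds])
        tr = [i for i in idx if i not in test]
        te = [i for i in idx if i in test]
        for name, fit in learners.items():
            preds = fit([X[i] for i in tr], [y[i] for i in tr], [X[i] for i in te])
            acc = sum(p == y[i] for p, i in zip(preds, te)) / len(te)
            scores[name].append(acc)
    return {name: sum(s) / len(s) for name, s in scores.items()}

results = benchmark({"majority": majority, "1-nn": one_nn}, X, y)
```

In mlr3 terms, `benchmark()` above plays the role of crossing tasks, learners, and resamplings; OpenML's contribution is supplying the datasets that define the tasks.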
24.10.22 – Riemannian Geometry in Machine Learning
Speaker: Isay Katsman (Yale)
Time: 14.15–15.15 (Zoom)
Abstract: Although machine learning researchers have introduced a plethora of useful constructions for learning over Euclidean space, numerous types of data in various applications benefit from, if not necessitate, a non-Euclidean treatment. In this talk I cover the need for Riemannian geometric constructs to (1) build more principled generalizations of common Euclidean operations used in geometric machine learning models as well as to (2) enable general manifold density learning in contexts that require it. Said contexts include theoretical physics, robotics, and computational biology. I will cover one of my papers that fits into (1) above, namely the ICML 2020 paper “Differentiating through the Fréchet Mean.” I will also cover two of my papers that fit into (2) above, namely the NeurIPS 2020 paper “Neural Manifold ODEs” and the NeurIPS 2021 paper “Equivariant Manifold Flows.” Finally, I will briefly discuss directions of relevant ongoing work.
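As an illustrative aside (not from the papers above): the Fréchet mean — the Riemannian generalization of the average, minimising the sum of squared geodesic distances — can be computed by Riemannian gradient descent. The circle is the simplest non-Euclidean example; the learning rate and step count below are arbitrary illustrative choices.

```python
import math

def wrap(a):
    """Signed angular difference mapped to (-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def frechet_mean_circle(angles, lr=1.0, steps=100):
    """Gradient descent for the Frechet mean on S^1: minimise
    sum_i d(mu, theta_i)^2, with d the geodesic (angular) distance.
    The Riemannian gradient at mu is -2 * mean_i wrap(theta_i - mu)."""
    mu = angles[0]
    for _ in range(steps):
        grad = sum(wrap(t - mu) for t in angles) / len(angles)
        mu = mu + lr * grad
    return wrap(mu)
```

For points that do not straddle the cut, this agrees with the arithmetic mean; for antipodal-ish data like `[3.0, -3.0]` it correctly returns a point near pi rather than the meaningless Euclidean average 0 — exactly the failure of naive averaging that motivates differentiating through the Fréchet mean in a learning pipeline.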
26.09.22 – Detecting Danger in Gridworlds using Gromov’s Link Condition
Speaker: Tom Burns (OIST Graduate University)
Time: 14.15–15.15 (Seminar Raum 5, Mathematikon) – note the unusual location
Abstract: Gridworlds have long been utilised in AI research, particularly in reinforcement learning, as they provide simple yet scalable models for many real-world applications such as robot navigation, emergent behaviour, and operations research. We initiate a study of gridworlds using the mathematical framework of reconfigurable systems and state complexes due to Abrams, Ghrist & Peterson. State complexes represent all possible configurations of a system as a single geometric space, thus making them conducive to study using geometric, topological, or combinatorial methods. The main contribution of this work is a modification to the original Abrams, Ghrist & Peterson setup which we introduce to capture agent braiding and thereby more naturally represent the topology of gridworlds. With this modification, the state complexes may exhibit geometric defects (failure of Gromov’s Link Condition). Serendipitously, we discover these failures occur exactly where undesirable or dangerous states appear in the gridworld. Our results therefore provide a novel method for seeking guaranteed safety limitations in discrete task environments with single or multiple agents, and offer useful safety information (in geometric and topological forms) for incorporation in or analysis of machine learning systems. More broadly, our work introduces tools from geometric group theory and combinatorics to the AI community and demonstrates a proof-of-concept for this geometric viewpoint of the task domain through the example of simple gridworld environments.
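As an illustrative aside (this builds only the 1-skeleton of a state complex, not the full cube complex where Gromov's Link Condition would be checked): the vertices of a gridworld's state complex are the collision-free placements of the agents, and edges correspond to single-agent moves. For two distinguishable agents on a tiny 2×2 grid:

```python
from itertools import product

W, H = 2, 2  # a tiny 2x2 gridworld with two distinguishable agents
cells = list(product(range(W), range(H)))

def neighbours(c):
    """Grid cells adjacent to c (up/down/left/right, staying in bounds)."""
    x, y = c
    return [(x + dx, y + dy) for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]
            if 0 <= x + dx < W and 0 <= y + dy < H]

# Vertices of the state complex: collision-free placements of the two agents.
states = [(a, b) for a in cells for b in cells if a != b]

# Edges: one agent moves to an adjacent empty cell, the other stays put.
edges = set()
for a, b in states:
    for a2 in neighbours(a):
        if a2 != b:
            edges.add(frozenset({(a, b), (a2, b)}))
    for b2 in neighbours(b):
        if b2 != a:
            edges.add(frozenset({(a, b), (a, b2)}))
```

Here there are 4 × 3 = 12 states. The full construction additionally glues in higher-dimensional cubes for commuting simultaneous moves, and it is the local geometry (links) of that cube complex that detects the dangerous states described in the abstract.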