Diving into new scientific fields with machine learning
Jakob Macke, professor of machine learning in science, spoke with us about the similarities and differences between computational neuroscience and machine learning, and about the fun of diving into different scientific fields.
Jakob Macke, © Friedhelm Albrecht/ Universität Tübingen
BN/ Duppé: Mr. Macke, last year you were appointed professor for “Machine Learning in Science” at the University of Tübingen, which was established as part of the Cluster of Excellence “Machine Learning – New Perspectives for the Sciences”. How did you, as an established neuroscientist, arrive in machine learning?
Macke: I have always been interested in both machine learning and neuroscience. After all, computational neuroscience builds the bridge between these disciplines. In Tübingen, the focus of my research will expand: in the Cluster of Excellence, we are working not ‘only’ on applications in the neurosciences, but also on applications in other fields.
Asked the other way around: how did you originally become a neuroscientist?
Originally, I studied mathematics and had a keen interest in machine learning and statistics even then. During my student days, I added neuroscience to my areas of interest. To put it simply, I always wondered: how does the world get into our heads? How do we measure environmental influences with our senses and use them to create a model of our world that allows us to carry out actions? During my studies, I spent one summer in a neuroscience lab in Cold Spring Harbor, New York, analyzing experimental data from two-photon microscopy. Thanks to this hands-on experience, I decided not to pursue my PhD in mathematics or statistics, but to combine neuroscience and machine learning.
In this way you are building exactly the bridge computational neuroscience stands for: interpreting and processing experimental data by means of algorithms. How does that differ from machine learning per se?
We are looking at two separate disciplines, each with its own scientific processes, but they do overlap: computational neuroscience deals with the basic question of how to better understand nervous systems through mathematical or computational modeling.
One tries to explain specific experimental findings with the help of models. The experimental observation might be as follows: during a memory task, some neurons in the brain fire strongly. The modeling task is to find a model that explains why this is so: What are the neuronal mechanisms? What purpose does this serve? In practice, this often means that the theoretical work is derived from the experimental observations.
Often, one focuses on very simple experimental observations. Since the brain is so complex, the questions asked must be strongly reduced in order to be able to make specific statements in the presence of this complexity.
Ultimately, all this means that modelers often spend a lot of time creating a very precise model for a specific issue ‘by hand’, or rather with their expert knowledge.
In contrast, machine learning is more concerned with the general question of how to learn from data. So, initially it’s not about the brain at all, but primarily about artificial information processing systems. At the beginning there is a complex high-dimensional data set in which one tries to recognize patterns. The goal is less to understand why something is the way it is than to arrive at a system that works.
Would algorithms then be the means to an end in computational neuroscience, and accordingly, machine learning the tool? What about pure machine learning? Which questions do you address there?
In machine learning, one asks how to develop algorithms that can learn from data. There are different ways to go about this. The classical task in machine learning is so-called ‘supervised learning’, i.e. you give an algorithm input and output signals; the algorithm should then learn how these input signals can be associated with the output signals. To take the classic example: an algorithm is fed many images, each labeled as showing a cat or not. From these examples, the algorithm must learn by itself to tell which pictures show cats and which do not. Here, one does not try to understand what it means to be a cat or what constitutes a cat; the algorithm merely learns to recognize statistical patterns that help it recognize a cat.
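The supervised-learning setup described here can be sketched in a few lines. This is only a minimal illustration under stated assumptions: real images are replaced by two made-up numeric features per example, and the learner is a plain logistic regression trained by gradient descent. The function names (`make_toy_data`, `train`, `predict`, `accuracy`) and all numbers are hypothetical and do not come from the interview.

```python
import math
import random


def make_toy_data(n, seed):
    """Hypothetical stand-in for images: two numeric features per example,
    with label 1 ('cat') or 0 ('no cat')."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label == 1 else -2.0
        features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((features, label))
    return data


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(data, epochs=200, lr=0.1):
    """Fit logistic-regression weights by batch gradient descent."""
    w0, w1, b = 0.0, 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        g0 = g1 = gb = 0.0
        for (x0, x1), y in data:
            # Prediction error on one labeled example (input -> output pair).
            err = sigmoid(w0 * x0 + w1 * x1 + b) - y
            g0 += err * x0
            g1 += err * x1
            gb += err
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
        b -= lr * gb / n
    return w0, w1, b


def predict(model, features):
    """Return 1 ('cat') if the predicted probability is at least 0.5."""
    w0, w1, b = model
    x0, x1 = features
    return 1 if sigmoid(w0 * x0 + w1 * x1 + b) >= 0.5 else 0


def accuracy(model, data):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(model, x) == y for x, y in data)
    return correct / len(data)


model = train(make_toy_data(200, seed=0))
held_out = make_toy_data(200, seed=1)  # examples not seen during training
print(f"held-out accuracy: {accuracy(model, held_out):.2f}")
```

The model never receives a definition of a cat. It only adjusts three numbers until the labeled examples are separated well, which is the sense in which it picks up statistical patterns.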
The Cluster of Excellence Machine Learning for Science is a big investment in machine learning research, intended to make this knowledge available to different disciplines. In this context, the classical example of object recognition is very simple. Your own work with such algorithms is a lot more complex.
Research in the Cluster of Excellence is based on the idea that machine learning provides very powerful tools that help scientists do their work faster and more efficiently than conventional methods.
Machine learning can make many research tasks less time-consuming. Algorithms can, for example, reduce the work required to find models that explain the data. Scientists save time because they no longer need to construct, try out and compare a large number of different models in order to find a suitable one. The computer can take care of this and thus expedite scientific discovery.
You are referring to the ‘algorithm detective’ you have developed. Hence the question: how does a research question come about? Are you (still) acting as a neuroscientist, or do you always work with different disciplines?
Most of our work does indeed emerge from collaborations with other scientific disciplines. Often it starts with a specific question or with a scientific problem that can be solved with machine learning.
The work on the ‘algorithm detective’, for instance, was about how we can fit biophysical models to data: we had measurements from a cell, which we wanted to explain with a biophysical model. We had a tough time doing that ourselves. Eventually, we decided to automate it. And then we found that not only could we solve this specific problem, we could actually use the approach to find algorithms applicable to other problems. We started with a specific problem and found out that the methodology can be applied to other models, first within neuroscience and then in other disciplines as well.
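To give a rough idea of what automatically fitting a model to measurements can look like, here is a minimal sketch. The interview does not say which method the group uses, so this example substitutes a simple textbook technique, rejection-based approximate Bayesian computation, and should not be read as the group's algorithm. The toy "cell" model, the parameter `gain`, the stimulus values and every function name are assumptions made for illustration.

```python
import random

STIMULI = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical input strengths


def simulate(gain, rng, noise_sd=0.5):
    """Toy 'cell' model: response = gain * stimulus + Gaussian noise."""
    return [gain * s + rng.gauss(0.0, noise_sd) for s in STIMULI]


def distance(a, b):
    """Mean squared difference between two response traces."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rejection_abc(observed, n_draws=5000, n_keep=50, seed=0):
    """Draw parameters from a uniform prior, simulate each one, and keep
    the draws whose simulated data lie closest to the observation."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_draws):
        gain = rng.uniform(0.0, 5.0)  # prior over the unknown parameter
        scored.append((distance(simulate(gain, rng), observed), gain))
    scored.sort()  # smallest distance first
    return [gain for _, gain in scored[:n_keep]]


true_gain = 2.0  # known here only because the 'measurement' is simulated
observed = simulate(true_gain, random.Random(42))
posterior = rejection_abc(observed)
estimate = sum(posterior) / len(posterior)
print(f"estimated gain: {estimate:.2f} from {len(posterior)} accepted draws")
```

Nothing in `rejection_abc` depends on what the simulator represents. Swapping `simulate` for a different model leaves the inference loop unchanged, which is one way to see why the same methodology can carry over between fields.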
Currently, we are working with geophysicists. Naturally, it is no longer about the model of how a cell responds; we are trying to find a model that describes the melting of ice in Antarctica. This is a completely different scientific question, but the underlying mathematics, the underlying inference problems that we solve, are actually related, whether for biophysical models, for the melting of ice in Antarctica or for models of gravitational waves in physics.
It is also great fun to dive into completely different fields; naturally, it is extremely important to work together with experts in the respective field. We tend to be amateurs there, relying on the expertise of our colleagues. However, it is always fun to be able to ‘get your teeth into’ the challenges and gain insight into exciting scientific fields to which our work can then contribute.
Returning to the beginning of the question: neuroscience remains our scientific home. It is still the scientific field from which the majority of our applications come. And, of course, the one in which we are most likely to be able to also contribute our own neuroscience expertise.
Do you detect a change in the research field of neuroscience given the new possibilities in machine learning?
The interplay between machine learning and neuroscience is extremely exciting right now.
First, ML gives us powerful tools to analyze data. Second, the brain is in some ways the most powerful information processing algorithm there is. The hope is that the better we understand the brain, the better we can adopt its processing principles and try to transfer them into better algorithms. There are many examples – not so much from my work – showing that this inspiration has also been important to some of the advances in machine learning in recent years. That is, the explosion in ML and AI has been fueled by ideas from neuroscience. There is a very clear indication of how these two fields can interact.
Looking at it from the other side, it’s also clear that neuroscience can learn from machine learning. ML creates artificial information processing algorithms, which can lead to hypotheses for how the brain can solve tasks. If we have taught an algorithm a difficult task, we can analyze how the algorithm solved the task and then check whether the brain solves that task in a similar way or not. Thus, inspiration for new neuroscientific questions can also arise from the analysis of ML algorithms. Modern ML algorithms are often so complex that it takes a lot of work to understand how they work. Once you understand it, or even aspects of it, you retrace the steps and investigate whether the brain solves it similarly or differently. In both cases one can learn something. Our goal is not to show that both the algorithm and the brain work in the same way, but to learn from their differences and to advance science.
So we’re talking bio-inspired vs algorithm-inspired learning. Both are of course very data-focused and require large amounts of data. In this context, data integration/data management is a major topic for the science of the future. The first preliminary assessments for data infrastructures of different disciplines have just been finished. What role does this topic play in your research?
I think more professional and more transparent data management has been one of the crucial developments in neuroscience over the last couple of years. The field has learned a lot about how important it is to approach this together and in a professional way. I think this is great, especially since it should be professionalized even further. A basic prerequisite for our work is, of course, that we not only have access to the data, but that this data is also managed in a professional manner so that we can use it straight away. This is why I consider these initiatives not only extremely valuable but also highly important for our work and for the scientific field as a whole.
Now, the situation has changed quite recently for you as a key scientist in Machine Learning and Computational Neuroscience. You will become the head of the Bernstein Center Tübingen. Will computational neuroscience in Tübingen be more machine learning-oriented from now on?
Since this is a new challenge for me, I also see this leadership role as a coordinating job. To me, it is crucial that we reflect, as a center, on where we are heading scientifically. Tübingen has a great deal of expertise in both neuroscience and machine learning. Therefore, it seems obvious to try and combine the strengths of both disciplines even more intensively than before.
A second aspect is the connection with clinical neuroscience. This will become more important for my own research in the next few years and will also feed back into the Bernstein Center. Tübingen has very strong clinical neuroscience, and I believe it holds great potential: machine learning could strengthen not only basic neuroscience research, but also clinical research, such as neurology. These two aspects will play a major role in my own research in the future, which is why I will also try to make sure that within the Bernstein Network and the Bernstein Center we can exploit the strengths of this connection.
How do you see the connection to the Bernstein Network that you just mentioned? And how would you assess the networking with the other Bernstein Centers with their different research foci?
I believe that the Bernstein Network thrives on the fact that there are different centers with different scientific foci. It would be a pity if the individual centers only differed geographically but not scientifically.
I think it will be very exciting and challenging for the Bernstein Network in the coming years to combine these strengths in a way that highlights what each location does best, while taking into account that there are strong research groups that cannot be mapped one-to-one to these centers. Tübingen, as I said, might have a focus on bringing together machine learning and neuroscience, but of course there are also excellent groups working in this area at other locations in Germany. Bringing together local expertise and scientific focal points irrespective of location is, I believe, both an opportunity and a challenge for the network.
Second, the world is changing and the scientific field is changing as well. Aspects that may have been the focus for years are being dropped and replaced to some extent by new developments, such as a stronger focus on machine learning, which inevitably creates links to other topics. It is important that the Bernstein Network embraces and actively shapes these changes.
Interview: C. Duppé (Bernstein Network), February 2021