This course is organized in 3 blocks over 9 weeks, starting the week of 24.02.2025.
Note that lectures will start at 5pm sharp; from 4.30pm until 5pm, there will be a debrief of the exercise sheet of the previous week!
The course format will comprise a weekly 60-minute online lecture and a weekly hybrid (in-person/online) practical Python session. Lectures will be given by teachers from all participating universities. Lectures and practical exercises on all three application areas will be centered around one recent publication illustrating a specific application and method.
The course will end with a 2-day workshop and hackathon meeting in Heidelberg (May 30/31 to June 1, 2025), during which students will be able to implement a short project and listen to scientific lectures.
Students attending this course are expected to have basic knowledge of statistics and machine-learning fundamentals. You can use the lecture material from last year’s edition, in particular the four introductory lectures:
Lecture | Title | Speaker | Links |
---|---|---|---|
Intro Lecture 1 | Intro and Mathematical foundation to DL | Bartek Wilczynski (Warsaw) | Lecture materials, Practical session, Video recording (26.03) |
Intro Lecture 2 | Convolutional and Recurrent neural networks | Marco Frasca (Milano) | Lecture materials, Practical session, Video recording (4.03) |
Intro Lecture 3 | Autoencoders and variational autoencoders | Carl Herrmann (Heidelberg) | Lecture materials, Practical session, Video recording |
Intro Lecture 4 | Attention mechanisms and transformers | Dario Malchiodi (Milano) | Lecture materials, Practical session, Video recording |
Recommended books include, among others:
As the practical sessions will be mostly based on Python and PyTorch, some basic knowledge of Python is required (see, for example, reference [4] for a good overview of PyTorch).
Specifically, we expect you to be familiar with the following theoretical concepts:
Link to the weekly Zoom lecture (Note: you will get the passphrase to join the meeting from your local instructor!)
Note that lectures start at 5pm; from 4.30 until 5pm, there will be a short debrief of the exercise sheet of the previous week!
Date | Title | Speaker | Content | Links |
---|---|---|---|---|
Week 1 - 24.02 | Models for multimodal data integration | Britta Velten (Heidelberg) | This lecture will provide an overview of the basic statistical concepts that are important for the joint analysis of multi-modal omics data, with a focus on probabilistic models for data integration. We will discuss statistical properties of multi-omics data and the resulting challenges for data analysis, followed by an overview of different strategies for both supervised and unsupervised integration. Taking MOFA as an example of an unsupervised method, we will discuss the properties of probabilistic factor models for joint dimension reduction of multiple omics data sets. We will also discuss avenues to account for multiple sample groups and omics data with temporal and spatial resolution in probabilistic models. | Slides and Notebook Video recording |
Week 2 - 3.03 | VAE in single-cell genomics | Carl Herrmann (Heidelberg) | In this lecture, I will review recent applications of AE and VAE in the field of genomics, in particular single-cell genomics. We will see how these applications can help cluster cell populations and denoise sparse data. Finally, I will present some recent VAE models which are interpretable, i.e. in which the neurons of the model can be interpreted as biological entities. For those not familiar with genomics, I will start with a brief review of some concepts and data types. | Slides and Notebooks Video recording |
Week 3 - 10.03 | Deep learning for predicting non-coding DNA activity | Bartek Wilczynski (Warsaw) | We know that transcriptional gene regulation in multicellular organisms depends on the action of hundreds of thousands of non-coding regulatory sequences scattered in the genome. Since there are so many of them, and we usually cannot assess their activity directly in the cells and tissues, annotating their activity experimentally in full is difficult. If we simplify their function into activation of transcription in different cellular contexts, the task of annotation becomes similar to a classical ML problem of multi-class classification. Recently, many studies have been published attempting to solve this problem to different degrees using Convolutional Neural Networks. We will discuss a few such recent papers, covering their successes as well as some pitfalls of training classifier models without clearly defined negative examples. | Slides and notebook Video recording |
Week 4 - 17.03 | AlphaFold and ESMFold for protein structure prediction | Joanna Sulkowska (Warsaw) | ||
Week 5 - 24.03 | Protein design in the deep learning era, from inverse folding to diffusion models | Elodie Laine (Paris) | The revolution in protein structure prediction has boosted the development of deep learning-based methods for designing de novo proteins with desired properties. This session will introduce the modern computational pipeline for protein design, from generating novel protein folds to designing amino acid sequences compatible with them. We will cover a wide range of deep learning architectures and frameworks, including graph neural networks and diffusion models. We will discuss a few recent groundbreaking applications that were unimaginable just a few years ago. | |
Week 6 - 31.03 | Deep Architectures for sampling macromolecules | Grégoire Sergeant-Perthuis (Paris) | ||
Week 7 - 7.04 | Deep learning models for protein-ligand binding site prediction | David Hoksza (Prague) | We will introduce methods for predicting protein-small molecule binding sites, with a focus on structure-based approaches. We will briefly cover traditional non-ML and ML methods, followed by deep learning techniques such as convolutional neural networks, graph neural networks, and the latest addition to the field—protein language models. | |
Week 8 - 14.04 | Deep learning for image segmentation | Karl Rohr (Heidelberg) | The lecture introduces deep learning methods for image segmentation. The focus is on Convolutional Neural Networks (CNNs) and encoder-decoder network architectures for cell segmentation. We will discuss the well-known networks U-Net and Cellpose, and their application for computer-based analysis of cell microscopy image data. | |
Week 9 - 28.04 | Intro to BioImage Analysis and Deep Learning Utilization | Martin Schatz (Prague) | ||
14.04 - 31.05 | Project phase | |||
30.05-1.06 | Final meeting Heidelberg | | The course will end with a 2-day workshop and hackathon meeting in Heidelberg during which students will be able to implement a short project and listen to scientific lectures. | |
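
To give a flavor of the kind of computation behind the Week 3 topic (Convolutional Neural Networks applied to DNA sequences), here is a minimal sketch in plain Python. It is not taken from the course notebooks: the function names `one_hot` and `conv1d_scan` and the hand-built "TATA" filter are illustrative assumptions. It shows how a sequence is one-hot encoded and how a single convolutional filter scores every window of the sequence; in a real CNN the filter weights would be learned from data rather than fixed by hand.

```python
# Illustrative sketch only; names and the example filter are assumptions,
# not part of the course material.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T).

    Characters outside ACGT (e.g. N) become an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d_scan(encoded, kernel):
    """Slide a (width x 4) kernel along the sequence without padding.

    Returns one score per window: the sum of element-wise products
    between the kernel and the one-hot rows it covers.
    """
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * kernel[i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(score)
    return scores


# Hypothetical hand-built filter that responds most strongly to "TATA".
tata_kernel = one_hot("TATA")

sequence = "GCTATAAG"
scores = conv1d_scan(one_hot(sequence), tata_kernel)
best = max(range(len(scores)), key=scores.__getitem__)

print(scores)  # [2.0, 0.0, 4.0, 1.0, 2.0]
print(best)    # 2, the window where "TATA" starts
```

With this hand-built filter, each score simply counts the positions at which a window matches the motif, which is why the window starting at index 2 scores highest.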