GIACOMO BORACCHI - TEACHING


Machine Learning for Non-Matrix Data
AA 2019/2020, PhD Course, Politecnico di Milano

Overview: Deep learning models have proven very successful in multiple fields of science and engineering, ranging from autonomous driving to human-machine interaction. Deep networks and data-driven models have often outperformed traditional hand-crafted algorithms and achieved superhuman performance on many complex tasks, such as image recognition. The vast majority of these methods, however, are still designed for numerical input data represented as vectors or matrices, like images. More recently, the deep-learning paradigm has been successfully extended to cover non-matrix data, which are challenging due to their sparse and scattered nature (e.g., point clouds or 3D meshes) or to the presence of relational information (e.g., graphs). Neural architectures have been proposed to process input data such as graphs and point clouds: these extensions were not straightforward, and they indicate one of the most interesting research directions in computer vision and pattern recognition.

Mission and Goals: This course presents data-driven methods for handling non-matrix data, i.e., data that are not represented as arrays. The course gives an overview of machine learning and deep learning models for handling graphs, point clouds, texts and data in bioinformatics. Moreover, the most relevant approaches in reinforcement learning and self-supervised learning will be presented. More information on the Course program page.

Organizers: Giacomo Boracchi, Cesare Alippi, Matteo Matteucci. Politecnico di Milano.

Dates: From June 23rd 2020 to June 26th 2020, 6 seminars of 4 hours each.

Invited Speakers:
PhD students from Polimi and affiliated PhD programs (e.g. Bologna) need to officially register for the course in order to take the exam and get credits or attendance certificates. Students from other universities are welcome to attend the lectures, but cannot take the exam nor receive an official attendance certificate from our secretariat. Students who need an official attendance certificate have to register officially, following the procedure on the PhD website. In this case, an administrative fee (32€) is requested from students of other universities. Both officially registered students and informal attendees need to fill in this Google form to receive the MS Teams link and attend the sessions.

Teaching Modality: Online lectures in an MS Teams session. Registered students will be provided with the link.

Course/Exam Logistics: You can find details in these Slides.

Schedule and Abstracts:

Self-supervised Learning and Domain Adaptation
Alessandro Giusti, Senior Researcher at IDSIA, Lugano
Tuesday June 23rd 2020, 14:30 - 18:30

Real-world applications of machine learning often face challenges due to two main issues which recur in many application scenarios: the cost of acquiring reliable, large, labeled training datasets, and the difficulty of generalizing trained models to the deployment domain. The tutorial will cover a set of state-of-the-art techniques to overcome these issues. First, we discuss several successful examples of self-supervised learning, a classic approach in robotics which consists in the automated acquisition of ground-truth labels by exploiting multiple sensors during the robot's operation; more recently, a related but broader line of research has grown in the field of deep learning, which aims to use the data itself as a supervisory signal, based on simple, intuitive ideas with compelling results.
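As a minimal illustration of the "data itself as a supervisory signal" idea (a toy sketch of my own, not material from the lecture), a pretext task can be built by applying a known transformation to unlabeled data and asking a model to predict which transformation was applied, so that labels come for free:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pretext_dataset(signals):
    """Self-supervised pretext task: each signal is either kept as-is or
    time-reversed, and the label (0 = original, 1 = reversed) is derived
    from the data itself rather than from human annotation."""
    X, y = [], []
    for s in signals:
        flip = int(rng.integers(0, 2))
        X.append(s[::-1] if flip else s)
        y.append(flip)
    return np.stack(X), np.array(y)

signals = rng.standard_normal((100, 16)).cumsum(axis=1)  # random walks
X, y = make_pretext_dataset(signals)
```

A classifier trained to predict `y` from `X` must learn features of the signals' temporal structure, which can then be reused for a downstream task with few labels.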
Then, we delve into domain adaptation techniques, which tackle the issue of handling differences between the training and the deployment domains; this is a key challenge in many practical applications, where large datasets are available (or cheap to acquire) in some domain (e.g. in simulations), but models must be deployed in a different domain (e.g. the real world) where labeled training data are expensive. This section of the tutorial will feature hands-on experiments implementing state-of-the-art techniques.
Talk Slides: Self-Supervised Learning, Domain Adaptation

Reinforcement Learning and Application of Deep-Learning Models in RL
Alessandro Lazaric, Facebook Paris
Wednesday June 24th 2020, 9:00 - 13:00

Reinforcement learning (RL) focuses on designing agents that learn how to maximize reward in unknown dynamic environments. This very general framework is motivated by a wide variety of applications, ranging from recommendation systems to robotics, from treatment optimization to computer games. Unlike in other fields of machine learning, an RL agent needs to learn without direct supervision of the best actions to take: it relies solely on the interaction with the environment and on a (possibly sparse and sporadic) reward signal that implicitly defines the task to solve. Solving this problem poses several challenges, such as credit assignment (understanding which actions performed in the past are responsible for achieving high reward in the future), efficient exploration of the environment (to discover how the environment behaves and where most of the reward is), and approximation and generalization (to generalize the experience collected in some parts of the environment to the rest of it). In the lecture, we will mostly focus on the first and last challenges. In particular, we will study how deep learning techniques can be effectively integrated into "standard" RL algorithms to learn representations of the state of the environment that allow for generalization.
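The core learning mechanism these methods build on can be sketched in tabular form (a toy chain MDP of my own, not from the lecture; deep RL methods such as DQN replace the table below with a neural network trained to regress the same bootstrapped target):

```python
import numpy as np

# Tabular Q-learning on a toy 5-state chain: actions 0 (left) and 1 (right),
# reward 1 only upon reaching the rightmost state. Credit assignment happens
# through the bootstrapped temporal-difference target in the update below.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    reward = 1.0 if s2 == n_states - 1 else 0.0
    return s2, reward, s2 == n_states - 1  # next state, reward, done

for episode in range(500):
    s = 0
    for _ in range(50):
        # epsilon-greedy action selection (random tie-breaking)
        if rng.random() < eps or Q[s, 0] == Q[s, 1]:
            a = int(rng.integers(n_actions))
        else:
            a = int(Q[s].argmax())
        s2, r, done = step(s, a)
        # temporal-difference update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
        if done:
            break
```

After training, the greedy policy moves right from every non-terminal state, even though the reward is only observed at the end of the chain.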
Some of these techniques, such as DQN and TRPO, are nowadays at the core of the major successes of RL, such as achieving superhuman performance in games (e.g., Atari, StarCraft, Dota, and Go) as well as in simulated and real robotic tasks.
Talk Slides

Deep Learning Models for Text Mining and Analysis
Mark Carman, Politecnico di Milano
Wednesday June 24th 2020, 14:30 - 18:30

Deep learning has recently revolutionised the area of text processing. Up until a few years ago, it was inconceivable that one might train a classifier on text in one language and then apply it directly to text in another language (without any form of training on the latter). Now it is commonplace to do so. This is possible through the use of powerful language models that have been pre-trained on large multilingual corpora. The application of sophisticated unsupervised pre-training thus provides the ability to easily transfer knowledge from one domain (or natural language) to another. In this talk I'll run through a brief history of language- and sequence-modelling techniques. I'll describe the state-of-the-art transformer architectures that are used to build famous models like GPT-2 and BERT. We'll discuss how these models can be used for various types of prediction problems, and describe some interesting applications to problems in multilingual classification, image question answering, data integration, and bioinformatics.
Talk Slides

Machine Learning and Deep Learning Models for Handling Graphs
Jonathan Masci, NNAISENSE SA, Switzerland
Thursday June 25th 2020, 9:00 - 13:00

Deep learning methods have achieved unprecedented performance in computer vision, natural language processing and speech analysis, enabling many industry-first applications. Autonomous driving, image synthesis and deep reinforcement learning are just a few examples of what is now possible on grid-structured data with deep learning at scale on GPUs and dedicated hardware.
However, tasks for which data come arranged on grids and sequences cover only a small fraction of the fundamental problems of interest. Most of the interesting problems have, in fact, to deal with data that lie on non-Euclidean domains, for which deep learning methods were not originally designed. The need to operate powerful nonlinear data-driven models on such data led to the creation of geometric deep learning, a new and rapidly growing area of research that focuses on methods and applications for graph- and manifold-structured data. Despite the field still being in its infancy, it can already list numerous breakthroughs on classic graph-theory problems such as graph matching, 3D shape analysis and registration, fMRI and structural connectivity networks, scene reconstruction and parsing, drug design and protein synthesis. At the core of this new wave of deep learning successes is the ability of models to deal directly with non-Euclidean data through generalizations of the convolution and subsampling operators and, more generally, thanks to models that can use structure to induce computation, in what I call the Structured Computation Model. The lecture will start with a pragmatic introduction to graph convolutions, both in the spectral and in the spatial domain, and to the message-passing framework. Applications and recent achievements in the field will then follow, starting with node and graph classification in the inductive and transductive settings, and progressing to the realization that popular methods in meta-learning, one-shot and few-shot learning, and structured latent-space models are particular cases of the Structured Computation Model. I will show how such a general and unified framework can help the cross-fertilization of different disciplines to achieve better results, faster. The lecture will finally give an outlook on where the field is going and on new and exciting research directions and industrial applications that are waiting to be revolutionized.
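The message-passing view of graph convolution mentioned above can be sketched in a few lines (a toy, untrained layer of my own, using mean aggregation; the weight matrices and graph are placeholders):

```python
import numpy as np

# One spatial graph-convolution / message-passing layer on a toy graph:
# every node averages its neighbours' features and combines them with its
# own features through two (here random, untrained) weight matrices.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)  # symmetric adjacency, 4 nodes
X = np.eye(4)                              # one-hot node features

def message_passing_layer(A, X, W_self, W_neigh):
    deg = A.sum(axis=1, keepdims=True)
    messages = (A @ X) / np.maximum(deg, 1.0)                 # mean over neighbours
    return np.maximum(X @ W_self + messages @ W_neigh, 0.0)  # ReLU

rng = np.random.default_rng(0)
W_self = rng.standard_normal((4, 8))
W_neigh = rng.standard_normal((4, 8))
H = message_passing_layer(A, X, W_self, W_neigh)  # (4 nodes, 8 features)
```

Because aggregation runs over neighbours rather than over fixed grid positions, relabeling the nodes simply permutes the output rows: the layer is permutation-equivariant, which is what makes it suitable for graph-structured data.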
Talk Slides

3D Shape Matching and Registration
Maks Ovsjanikov, Laboratoire d'Informatique (LIX), Ecole Polytechnique, France
Thursday June 25th 2020, 14:30 - 18:30

In this talk, I will describe several learning-based techniques for shape comparison and for computing correspondences between non-rigid 3D shapes. This problem arises in many areas of shape processing, from statistical shape analysis to anomaly detection and deformation transfer, among others. Traditionally this problem has been tackled with purely axiomatic methods, but with the recent availability of large-scale datasets, new approaches have been proposed that exploit learning to find dense maps (correspondences) between 3D shapes. In this talk, I will give an overview of recent successful methods in this area and will especially highlight how geometric information and principles can be injected into the learning pipeline, resulting in robust and effective matching methods (both supervised and unsupervised). Finally, time permitting, I will also describe a link between matching and 3D shape synthesis, pointing out how similar methods can be used to achieve both tasks.
Talk Slides Part I
Talk Slides Part II
Talk Slides Part III

From Supervised Learning to Causal Inference in Large-Dimensional Settings
Gianluca Bontempi, Universite Libre de Bruxelles, Belgium
Friday June 26th 2020, 9:00 - 13:00

"We are drowning in data and starving for knowledge" is an old adage of data scientists that nowadays should be rephrased as "we are drowning in associations and starving for causality". The democratization of machine learning software and big data platforms is increasing the risk of ascribing causal meaning to simple and sometimes brittle associations. This risk is particularly evident in settings (like bioinformatics, social sciences, economics) characterised by high dimension, multivariate interactions and dynamic behaviour, where direct manipulation is not only unethical but also impractical.
The conventional ways to recover a causal structure from observational data are score-based and constraint-based algorithms. Their limitations, mainly in high dimension, opened the way to alternative learning algorithms that pose the problem of causal inference as the classification of probability distributions. The rationale of these algorithms is that the existence of a causal relationship induces a constraint on the observational multivariate distribution. In other words, causality leaves footprints in the data distribution that can hopefully be used to reduce the uncertainty about the causal structure. The first part of the presentation will introduce some basics of causal inference and will discuss the state of the art in machine learning for causality (notably causal feature selection) and some applications to bioinformatics. The second part of the talk will focus on the D2C approach, which featurizes observed data by means of asymmetric information-theoretic measures to extract meaningful hints about the causal structure. The D2C algorithm performs three steps to predict the existence of a directed causal link between two variables in a multivariate setting: (i) it estimates the Markov blankets of the two variables of interest and ranks their components in terms of their causal nature, (ii) it computes a number of asymmetric descriptors, and (iii) it learns a classifier (e.g. a Random Forest) returning the probability of a causal link given the descriptor values. The final part of the presentation is more prospective and will introduce some recent work on implementing counterfactual prediction in a data-driven setting.
Talk Slides
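The idea that causality leaves asymmetric footprints in the distribution can be illustrated with a toy score (my own deliberately simplified stand-in, not the actual D2C descriptors, which are information-theoretic and computed on Markov-blanket members): for an additive-noise model y = f(x) + noise, the spread of y within bins of x varies much less than the spread of x within bins of y, so comparing the two directions hints at which one is causal.

```python
import numpy as np

rng = np.random.default_rng(0)

def asymmetry_score(x, y, bins=10):
    """Toy asymmetric descriptor: bin the regressor into quantile bins and
    measure how much the conditional spread of the target varies across
    bins. For an additive-noise causal pair, the causal direction tends
    to yield the lower (more homogeneous) score."""
    edges = np.quantile(x, np.linspace(0.0, 1.0, bins + 1))
    idx = np.clip(np.searchsorted(edges, x) - 1, 0, bins - 1)
    stds = np.array([y[idx == b].std() for b in range(bins)])
    return (stds / stds.mean()).std()  # variability of conditional spread

x = rng.uniform(-2.0, 2.0, 5000)
y = x ** 3 + rng.normal(0.0, 1.0, 5000)  # ground truth: x causes y
forward = asymmetry_score(x, y)   # causal direction
backward = asymmetry_score(y, x)  # anti-causal direction
```

In D2C proper, step (iii) would feed many such descriptors, computed on the ranked Markov-blanket components, into a Random Forest that outputs the probability of a directed causal link.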