Data Science Seminar

March 14, 2019, at Télécom Paris. The seminar was held entirely in English.

The seminar took place from 2 PM to 4 PM in room C49 and featured two talks:

Talk 1: Thomas Bonald (Télécom Paris): Hierarchical graph clustering: quality metrics and algorithms

You can download the slides of this talk.

Abstract: This talk is about hierarchical clustering for graph-structured data. We present a novel agglomerative algorithm as well as a quality metric based on the sampling distribution of node pairs induced by the graph. A fundamental property of the proposed metric is that it is interpretable in terms of graph reconstruction: it is exactly the information loss induced by the representation of the graph as a tree. It is applicable to any tree and, in particular, to a tree of height 2 corresponding to a usual graph clustering, i.e., a partition of the set of nodes. It can also be used to compress the binary tree returned by any hierarchical clustering algorithm so as to get a compact, meaningful representation of the hierarchical structure of the graph. The results are illustrated on both synthetic and real datasets.

Talk 2: Alessandro Rudi (Inria Sierra): Structured prediction via implicit embeddings

You can download the slides of this talk.

Abstract: In this talk we analyze a regularization approach for structured prediction problems. We characterize a large class of loss functions that allow structured outputs to be naturally embedded in a linear space. We exploit this fact to design learning algorithms using a surrogate loss approach and regularization techniques. We prove universal consistency and finite sample bounds characterizing the generalization properties of the proposed methods. Experimental results are provided to demonstrate the practical usefulness of the proposed approach.