Paper: MLSP-L2.1
Session: Kernel Machines
Time: Wednesday, May 17, 16:30 - 16:50
Presentation: Lecture
Topic: Machine Learning for Signal Processing: Learning Theory and Modeling
Title: Kernel Density Estimation, Affinity-Based Clustering, and Typical Cuts
Authors: Deniz Erdogmus, Miguel A. Carreira-Perpinan, Umut Ozertem, Oregon Health & Science University, United States
Abstract: The typical cut [1,2,3] is a clustering method based on the probability $p_{nm}$ that points $\mathbf{x}_n$ and $\mathbf{x}_m$ are in the same cluster over all possible partitions (under the Boltzmann distribution for the mincut cost function). We present two contributions regarding this algorithm. (1) We show that, given a kernel density estimate of the data, minimising the overlap between cluster densities is equivalent to the mincut criterion. This gives a principled way to determine which affinities and scales to use in the typical-cut algorithm, and more generally in clustering and dimensionality-reduction algorithms based on pairwise affinities. (2) We introduce an iterated version of the typical-cut algorithm, in which the estimated $p_{nm}$ are used to refine the affinities. We show that this procedure is equivalent to finding stationary points of a certain objective function over clusterings, and that at a stationary point $p_{nm}$ equals $1$ if $n$ and $m$ are in the same cluster and a small value otherwise. Thus, the iterated typical-cut algorithm sharpens the $p_{nm}$ matrix and makes the cluster structure more obvious.
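To illustrate the affinity construction the abstract alludes to, here is a minimal sketch (not the authors' implementation): Gaussian pairwise affinities whose scale is tied to the bandwidth of a Gaussian kernel density estimate, so that the affinity scale follows from the density model rather than being tuned separately. The function name and the choice of `sigma` are illustrative assumptions.

```python
import numpy as np

def gaussian_affinities(X, sigma):
    """Pairwise affinities w_nm = exp(-||x_n - x_m||^2 / (2 sigma^2)).

    Sharing the bandwidth sigma with a Gaussian kernel density estimate
    of X links the affinity scale to the density model (illustrative
    sketch of the idea, not the paper's exact procedure).
    """
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared Euclidean distances
    np.maximum(d2, 0.0, out=d2)                     # guard against round-off
    return np.exp(-d2 / (2.0 * sigma**2))

# Two well-separated 2-D blobs: within-cluster affinities stay large,
# between-cluster affinities are essentially zero.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (5, 2)),
               rng.normal(5.0, 0.1, (5, 2))])
W = gaussian_affinities(X, sigma=0.5)
```

On such a matrix, the block structure of `W` already suggests the two clusters; the typical-cut procedure would then estimate $p_{nm}$ from partitions sampled under the Boltzmann distribution of the mincut cost.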