VLFeat offers a hierarchical version of integer k-means, which
recursively applies vl_ikmeans
to compute finer and finer
partitions. For more details see
Hierarchical Integer
k-means API reference and the Integer
k-means tutorial.
First, we generate some random data to cluster in [0,255]^2
:
To cluster this data, we simply use vl_hikmeans
:
Here nleaves
is the desired number of leaf
clusters. The algorithm terminates when there are at least
nleaves
nodes, creating a tree with depth =
floor(log(K nleaves))
To assign labels to the new data, we use vl_hikmeanspush
:
The output tree
is a MATLAB structure representing the tree of
clusters:
The field centers
is the matrix of the cluster centers at the
root node. If the depth of the tree is larger than 1, then the field
sub
is a structure array with one entry for each cluster. Each
element is in turn a tree:
with a field centers
for its clusters and a field
sub
for its children. When there are no children, this
field is equal to the empty matrix
VLFeat supports two different implementations of k-means. While they
produce identical output, the Elkan method is sometimes faster.
The method
parameters controls which method is used. Consider the case when K=10
and our data is now 128 dimensional (e.g. SIFT descriptors):