ClusterEngine

Class template defines a framework for performing clustering operation. The class is receives a list of data elements and clusters them using the algorithms in the ClusterList class template.

[ GAMB | Source | Keywords | Summary | Ancestors | All Members | Descendants ]

Quick Index

AUTHORS
CHANGES LOG
GOALS
USAGE

Class Summary

class ClusterEngine
: public ClusterList< unsigned int, ClusterType>

{
public:
ClusterEngine(const float maxDist, const unsigned int dim, const unsigned int preClusterFactor );
~ClusterEngine();
void addElement(const ElementType& data);
void cluster();
void cluster(const float maxDist);
void fastCluster();
void fastCluster(const float maxDist);
protected:
}; // ClusterEngine

Back to the top of ClusterEngine


AUTHORS

Meir Fuchs (meirfux@math.tau.ac.il).

Copyright: SAMBA group, Tel-Aviv Univ. Israel, 1997.

Back to the top of ClusterEngine


CHANGES LOG

Back to the top of ClusterEngine


GOALS

The ClusterEngine class template serves as a framework for clustering algorithms. It is based on the ClusterList class template. ClusterEngine's added value is its ability to efficiently generate potential neighbor pairs. The user may only add the data elements themselves without specifying which of them actually neighbor each other.

Back to the top of ClusterEngine


USAGE

The ClusterEngine class template is in many ways similar to the ClusterList class template. It need only receive the type or class of elements being clusered (ClusterType) and the type of cluster that holds them ClusterType.

Several methods must be defined if the Cluster type if it is to be used with ClusterEngine. Most of these methods are described with the ClusterList class but are described here again:

a. A dist method returning the distance between two Cluster instances.

  float dist(const Cluster&);
b. A join method capable of joining two clusters. The join method receives a single argument of type Cluster and should join the given Cluster object with the parent object (this).

  void join(const Cluster&);
c. The updates() method. Under normal circumstances the updates() method should return the number of times the method join was called on the given class instance. The updates method is used as a signature for the Cluster instance wether it has been updated. If two Cluster instances have not been updated then the distance between them need not be recalculated. The updates function may be used to minimize the calls to the dist method when calculating the distance between two Cluster objects is time consuming. For example if the user has the updates() method return a constant 0, the distances between the Cluster objects will never be calculated and the distance between the Cluster elements will be used. (This is good for nearest neighbor clustering).

  unsigned int updates() const;
d. A constructor that constructs a single-element ClusterType from a given ElementType

  ClusterType(const ElementType&);
e. An element() method returning an ElementType for a single-element cluster. This function is used by ClusterEngine at the outset of the algorithm when all clusters are single-element clusters.

    const ElementType& element() const;

ElementType must support a single method - operator[]. This method is used in generating the neighbors lists. Data elements are thrown into a Grid (a GeomHash object) and neighbors in the grid are added to the list of neighbor pairs in ClusterList. operator[] is used by the GeomHash insert method.

    float operator[](const int dim);

The following is an example of a class named Transformation used to cluster 3-dimensional rigid transformations using 3 rotational parameter and 3 translational parameters. The transformations are clustered (joined) using a weighted average. For brevity we use ElementType = ClusterType = Transformation

    class Transformation
    {
    public:
      Transformation(const ReferenceFrame& rt, const float transScore)
        : trans(rt), scr(transScore), joins(0)  {}

      float dist(const Transformation& t)  const  {
        return (trans.rotation().dist(t.trans.rotation()) +
                trans.translation().dist(t.trans.translation())/5);
      }
      void join(const Transformation& t)  {
        float sum = scr + t.scr;
        const Rotation3& rot = trans.rotation();
        const Rotation3& trot = t.trans.rotation();
        trans = ReferenceFrame(Rotation3(rot[0]*scr + trot[0]*t.scr,
                                         rot[1]*scr + trot[1]*t.scr,
                                         rot[2]*scr + trot[2]*t.scr,
                                         rot[3]*scr + trot[3]*t.scr),
                               (trans.translation()*scr + 
                                t.trans.translation()*t.scr) / sum);
        scr = sum;
        ++joins;
      }

      unsigned int updates() const          { return joins; }

      float operator[](const unsigned int dim) const {
        if (dim < 3)
          return trans.rotation()[dim+1];
        else
          return trans.translation()[dim-3];
      }

      const Transformation& element() const { return *this; }

    private:
      ReferenceFrame trans;
      float scr;
      int joins;
    };

To use the ClusterEngine class the user adds a list of single element clusters and then calls cluster or fastCluster. At any point the user may use ClusterEngine as an STL-like container to inspect what clusters are present.

The clustering algorithms used by the ClusterList keep the neighbor pairs sorted by their distance. When cluster or fastCluster are called, neighbor pairs with shorter distances are accessed first. Each such pair is a ssociated with a Cluster. If the distance between the two Cluster objects does not exceed a given threshold they are joined. In the fastCluster version of the algorithm neighbors are processed in the order in which they are found. In the cluster version, neighbor pairs may be repositioned in the distance sorted list if the clusters are changed via previous join operations. Both versions are quite efficient although putting an upper bound on the complexity of cluster method seems difficult.

A function that uses Transformation class to cluster 3-D rigid transformations may look something like the following:

    typedef ClusterEngine<Transformation, Transformation> TransCluster;
    TransCluster clusterList(maxDist*2, 6);
    for (int i=0; i<count; ++i)
      clusterList.addElement(transformation[i]);

    clusterList.cluster(maxDist);

    for (TransCluster::iterator it = clusterList.begin();
         it != clusterList.end();
         ++it) {
      cout << (*it);
This example assumes the existance of an array of Transformation objects. Also a maxDist parameter sets an upper-bound on the inter-cluster distance for clusters that may be joined. It is assumed that operator<< was defined.

Back to the top of ClusterEngine


ClusterEngine(const float maxDist, const unsigned int dim, const unsigned int preClusterFactor );

Constructor. Receives three parameters. The first, maxDist, defines the maximal distance between two neighboring single-element clusters. If two single-element clusters are more distant then the given maxDist they will not be added to the neighbors lists. The second parameter, dim, specifies the dimension of the grid into which the single element clusters are thrown.

The third parameter, preClusterFactor specifies the level of pre-clustering desired. This parameter is set to 0 by default, meaning no pre-clustering at all. When parameter is > 0 a finer grid with cube size of maxDist/preClusterFactor is used to pre-cluster the elements. Within the finer grid elments are clustered so that there is no more then one cluster for each bucket/grid cube. The bigger the pre-clustering factor, the finer the grid, the closer the results are to the normal clustering run. Pre-clustering may be used when elements are clumped together into large groups causing a near n^2 complexity on the number of neighbor pairs.

  ClusterEngine(const float maxDist, const unsigned int dim,
                const unsigned int preClusterFactor = 0);

Back to the top of ClusterEngine


~ClusterEngine();

Since the preCluster grid is a pointer to such a grid invoked by new there is a necessity to write a destructor

  ~ClusterEngine();

Back to the top of ClusterEngine


void addElement(const ElementType& data);

Add an element to be clustered. The elements are added as single-element clusters.

  void addElement(const ElementType& data);

Back to the top of ClusterEngine


void cluster();

Prepares neighbors lists and calls ClusterList's cluster() method with the maxDist given in the constructor.

  void cluster();

Back to the top of ClusterEngine


void cluster(const float maxDist);

Prepares neighbors lists and calls ClusterList's cluster() method.

  void cluster(const float maxDist);

Back to the top of ClusterEngine


void fastCluster();

Prepares neighbors lists and calls ClusterList's fastCluster() method with the maxDist given in the constructor.

  void fastCluster();

Back to the top of ClusterEngine


void fastCluster(const float maxDist);

Prepares neighbors lists and calls ClusterList's fastCluster() method.

  void fastCluster(const float maxDist);

Back to the top of ClusterEngine


All Members

public:
void addNeighbors(const Data& data1, const Data& data2);
void addNeighbors(const Data& data1, const Data& data2, float dist);
iterator addCluster(const Data& data);
iterator addCluster(const Data& data, const Cluster& cluster);
void cluster(float maxDist);
void fastCluster(float maxDist);
const_iterator getCluster(const Data& data) const;
iterator getCluster(const Data& data);
// Container Methods
const_iterator begin() const;
iterator begin();
const_iterator end() const;
iterator end();
size_type size() const;
void addElement(const ElementType& data);
void cluster();
void cluster(const float maxDist);
void fastCluster();
void fastCluster(const float maxDist);
protected:

Back to the top of ClusterEngine


Ancestors

Inheritance chain for ClusterEngine:

Back to the top of ClusterEngine


Descendants

Class is not inherited by any others.

Back to the top of ClusterEngine


Generated from source by the Cocoon utilities on Sun Nov 15 13:35:25 2009 .

Report problems to jkotula@unimax.com