Affinity Propagation

Affinity Propagation is a clustering algorithm that identifies a set of exemplars that best represent the data points in a given dataset. The algorithm works by iteratively sending messages between the data points to determine which points should be considered as exemplars, and which points should be assigned to those exemplars.

The algorithm then proceeds to iteratively update the responsibility and availability matrices until convergence is reached. At each iteration, the responsibility matrix is updated based on the availability matrix, and the availability matrix is updated based on the responsibility matrix.

Similarity Matrix (s)

The similarity matrix is a square matrix that represents the pairwise similarity between each pair of data points in a given dataset. The similarity between two data points can be calculated using any appropriate similarity measure, such as Euclidean distance, cosine similarity, or correlation coefficient.

Barring those on the diagonal, every cell in the similarity matrix is calculated by negating the sum of the squares of the differences between participants (Euclidean Distance).
The higher the similarity between two data points, the more likely they are to be assigned to the same cluster or have one of them act as an exemplar for the other.

Preference Value

In Affinity Propagation clustering algorithm, preference value (also called "self-similarity" i.e S(k,k) or "preference parameter") is a parameter that is used to control the number of clusters generated by the algorithm.

The preference value represents the degree to which each data point wants to be an exemplar or a cluster center. It measures the suitability of a point to be chosen as an exemplar. A higher preference value will lead to more exemplars and smaller clusters, while a lower preference value will lead to fewer exemplars and larger clusters.

Responsibility Matrix (r)

The responsibility matrix is a square matrix that represents the degree to which a data point is responsible for suggesting another data point as an exemplar. The responsibility matrix is updated in each iteration of the algorithm along with the availability matrix, and the two matrices together determine which data points are selected as exemplars and which data points are assigned to those exemplars.

We start off by constructing an availability matrix with all elements set to zero. Then, we calculate every cell in the responsibility matrix using the following formula:

The responsibility matrix reflects the competition between data points to suggest each other as exemplars, and can be seen as a measure of how strongly a data point thinks that another data point should become an exemplar. If the responsibility value for a data point is high, it means that it thinks that another data point should become an exemplar, and therefore it is more likely to be assigned to that exemplar.

Availability Matrix (a)

In Affinity Propagation, the availability matrix is a square matrix that represents the degree to which a data point is available to act as an exemplar for other data points. The availability matrix is updated in each iteration of the algorithm along with the responsibility matrix, and the two matrices together determine which data points are selected as exemplars and which data points are assigned to those exemplars.

We use a separate equation for updating the elements on the diagonal of the availability matrix than we do the elements off the diagonal of the availability matrix.

The proceeding formula is used to fill in the elements on the diagonal:

where i refers to the row and k the column of the associated matrix.

The following equation is used to update off-diagonal elements:

The availability matrix reflects the competition between data points to act as exemplars, and can be seen as a measure of how strongly a data point wants to become an exemplar. If the availability value for a data point is high, it means that there are many other data points that would like to be assigned to it, and therefore it is more likely to be selected as an exemplar.

The availability matrix is used in conjunction with the responsibility matrix to update the exemplar and cluster assignments in each iteration of the algorithm, until convergence is achieved.

Criterion Matrix (c)

The criterion matrix is a square matrix that represents the quality of the clustering solution at each iteration of the algorithm. The criterion matrix is calculated as the sum of the similarity values between each data point and its assigned exemplar, minus the maximum similarity value between any data point and any other exemplar that is not its own.

Each cell in the criterion matrix is simply the sum of the availability matrix and responsibility matrix at that location.

The criterion matrix is used as a measure of the quality of the clustering solution, and is used to determine when the algorithm has converged. Convergence is achieved when the criterion matrix no longer changes significantly between iterations, indicating that the exemplar and cluster assignments have stabilized.

The criterion matrix is also used to determine the optimal number of clusters in the dataset. The optimal number of clusters can be determined by analyzing the criterion matrix and looking for a "knee point" or an elbow in the curve of the criterion values. The knee point represents the number of clusters at which adding additional clusters does not significantly improve the quality of the clustering solution.

The highest criterion value of each row is designated as the exemplar. Rows that share the same exemplar are in the same cluster.

Damping Factor

Another point to note is that a damping factor is always used to avoid numerical oscillations in the iterations. Specifically, a new value in the matrices is defined as

val_new = (dampfact)(val_old) + (1-dampfact)(val_new)

Termination Conditions

After a fixed number of iterations.
After changes in the messages fall below a threshold.
After the local decisions stay constant for some number of iterations

Final Note :

Affinity Propagation is an unsupervised machine learning algorithm that is particularly well suited for problems where we don’t know the optimal number of clusters.

Search This Blog

Clustering Visualization V-Team