A collection of methods for computing similarity scores (also called similarity indexes) between two (possibly fuzzy) clusterings. Indexes typically lie in the range [0, 1], but many pairs of random clusterings produce values very close to 1, which makes the raw values difficult to interpret. An Adjusted Similarity Index is computed using the following equation:
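For reference, the general chance-corrected form used by the Adjusted Rand Index and discussed in [2] can be written as below. The notation is assumed for this sketch: $s$ is the unadjusted index, the expectation is taken under the chosen random model, and $\max(s) = 1$ for indexes normalized to [0, 1].

```latex
\[
  \mathrm{ASI}(Z_1, Z_2)
  = \frac{s(Z_1, Z_2) - \mathbb{E}\left[s\right]}{\max(s) - \mathbb{E}\left[s\right]}
\]
```

Under this form, a value of 0 means the two clusterings agree exactly as much as expected under the random model, and a value of 1 means they agree perfectly.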
Adjusted Indexes are in the range
Computing an adjusted similarity index requires choosing both a score and a random model. This package provides implementations of several indexes and random models that are extensions of the Rand Index to fuzzy clusterings. The choice of a particular index and random model is problem dependent. See [1] and [2] for a discussion of this selection.
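To make the effect of the adjustment concrete, here is a small worked sketch. The helper `adjust` and the numbers are purely illustrative and are not part of the package; it assumes the usual chance-corrected form with a maximum index value of 1.

```julia
# Hypothetical helper: chance-correct a raw index value, assuming the index
# has maximum value `smax` (1 for indexes normalized to [0, 1]).
adjust(s, expected; smax=1) = (s - expected) / (smax - expected)

raw = 0.92       # illustrative unadjusted index
expected = 0.90  # illustrative expectation under some random model

# (0.92 - 0.90) / (1 - 0.90): a raw value that looks high is only modestly
# above what the random model would produce by chance.
adjusted = adjust(raw, expected)

# A raw value below the expectation gives a negative adjusted value.
below_chance = adjust(0.85, expected)
```

The same raw value can lead to very different adjusted values depending on the expectation, which is why the choice of random model matters.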
The following indexes and random models are implemented. If you use any of these indexes, please consider citing the paper where they were introduced.
The NDC combined with the permutation model is called the Adjusted Degree of Concordance [6]. Note that the Frobenius Index may only be adjusted with the permutation model (see [4] for details).
This package is available from the Julia General registry.
using Pkg
Pkg.add("FuzzyClusteringSimilarity")
Then import the module.
using FuzzyClusteringSimilarity
You can run the unit tests to ensure the package was properly installed.
Pkg.test("FuzzyClusteringSimilarity")
The package exports three main functions. It also exports a type hierarchy of similarity indexes and random models, which is used to dispatch each function to the desired implementation. For an example of using the package, see the code that produced the figures in [1].
function adjustedsimilarity(
    z1::AbstractMatrix{<:Real},
    z2::AbstractMatrix{<:Real},
    index::AbstractIndex,
    model::AbstractRandAdjustment;
    onesided::Bool=true
)
This is the main function for comparing two fuzzy clusterings, z1 and z2. Each clustering is a c x n membership matrix describing the assignment of n objects to c clusters. If the clustering is hard, ensure the matrix entries have an integer type (<:Integer, for example Int or Bool) so that the proper random model is called.
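As a minimal sketch of the expected input layout, the snippet below builds one hard and one fuzzy membership matrix with clusters as rows and objects as columns. The labels, the matrices, and the helper `iscolumnstochastic` are illustrative and not part of the package. The requirement that each column sums to 1 is the usual convention for fuzzy partitions and is assumed here.

```julia
# Hard clustering of n = 5 objects into c = 2 clusters.
labels = [1, 1, 2, 2, 1]
c = 2
n = length(labels)

# c x n indicator matrix with Bool entries: hard[i, j] is true when
# object j belongs to cluster i.
hard = [labels[j] == i for i in 1:c, j in 1:n]

# Fuzzy clustering of the same 5 objects: column j holds the membership
# degrees of object j in each of the c clusters.
fuzzy = [0.9 0.8 0.1 0.3 0.5;
         0.1 0.2 0.9 0.7 0.5]

# Illustrative check (not a package function): entries are non-negative
# and every column sums to 1.
iscolumnstochastic(z; atol=1e-8) =
    all(>=(0), z) && all(s -> isapprox(s, 1; atol=atol), sum(z, dims=1))
```

Matrices like `hard` and `fuzzy` are the kind of input that `z1` and `z2` are described as accepting. The call itself is omitted here because the concrete index and random model type names are not listed in this section.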
function similarity(
    z1::AbstractMatrix{<:Real},
    z2::AbstractMatrix{<:Real},
    index::AbstractIndex
)
The similarity function can be used to compute an unadjusted index.
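To illustrate what an unadjusted index looks like, here is a self-contained sketch of a fuzzy extension of the Rand index in the spirit of [3]. It is not the package's implementation, and the names `equivalence` and `ndc_sketch` are hypothetical. The sketch assumes that the degree of equivalence of two objects is 1 minus half the L1 distance between their membership columns.

```julia
# Degree of equivalence of objects i and j within one clustering:
# 1 when their membership columns are identical, 0 when they are disjoint.
equivalence(z, i, j) = 1 - sum(abs.(z[:, i] .- z[:, j])) / 2

# Sketch of a normalized degree of concordance between two c x n
# membership matrices defined on the same n objects.
function ndc_sketch(z1::AbstractMatrix{<:Real}, z2::AbstractMatrix{<:Real})
    n = size(z1, 2)
    size(z2, 2) == n ||
        throw(ArgumentError("both clusterings must cover the same n objects"))
    discordance = 0.0
    # Accumulate, over all unordered pairs, how much the two clusterings
    # disagree about whether the pair belongs together.
    for i in 1:n-1, j in i+1:n
        discordance += abs(equivalence(z1, i, j) - equivalence(z2, i, j))
    end
    return 1 - discordance / (n * (n - 1) / 2)
end

# Two hard clusterings of four objects: {1,2},{3,4} versus {1,3},{2,4}.
a = Bool[1 1 0 0; 0 0 1 1]
b = Bool[1 0 1 0; 0 1 0 1]
value = ndc_sketch(a, b)
```

For hard clusterings this sketch reduces to the classical Rand index, since each pair contributes either full agreement or full disagreement.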
function expectedsimilarity(
    z1::AbstractMatrix,
    z2::AbstractMatrix,
    index::AbstractIndex,
    model::AbstractRandAdjustment;
    onesided::Bool=true
)
The expectedsimilarity function computes the expected similarity index between random clusterings under the provided random model.
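The idea of an expected index can be sketched with a Monte Carlo estimate. The code below assumes that the permutation model amounts to randomly shuffling which object receives which membership column. The package may compute this expectation differently (for instance analytically), and the names `pair_equiv`, `index_sketch`, and `expected_by_permutation` are hypothetical.

```julia
using Random

# Illustrative unadjusted index (a fuzzy Rand-style index, see [3]);
# any function of two c x n membership matrices could be used instead.
pair_equiv(z, i, j) = 1 - sum(abs.(z[:, i] .- z[:, j])) / 2

function index_sketch(z1, z2)
    n = size(z1, 2)
    total = 0.0
    for i in 1:n-1, j in i+1:n
        total += abs(pair_equiv(z1, i, j) - pair_equiv(z2, i, j))
    end
    return 1 - total / (n * (n - 1) / 2)
end

# Monte Carlo estimate of the expected index when the columns (objects)
# of z2 are assigned at random, keeping z1 fixed.
function expected_by_permutation(z1, z2; samples=2000, rng=Random.default_rng())
    n = size(z2, 2)
    total = 0.0
    for _ in 1:samples
        shuffled = z2[:, randperm(rng, n)]  # random relabeling of the objects
        total += index_sketch(z1, shuffled)
    end
    return total / samples
end

a = Bool[1 1 0 0; 0 0 1 1]
b = Bool[1 0 1 0; 0 1 0 1]
estimate = expected_by_permutation(a, b; rng=MersenneTwister(1))
```

For this small example the exact expectation is 5/9: of the three ways to split four objects into two pairs, one matches `a` exactly (index 1) and the other two give an index of 1/3. The estimate should be close to that value.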
function massageMatrix(matrix::AbstractMatrix)
Massages a matrix to enable Julia's multiple dispatch. The matrix is formatted with objects as columns and clusters as rows. If the matrix is a hard clustering, its element type is converted to Bool.
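The kind of conversion described above can be sketched as follows. This is not the package's code: `massage_sketch` is a hypothetical stand-in, and it assumes that "hard" means every entry is exactly 0 or 1 and that the input is already laid out with clusters as rows.

```julia
# Hypothetical stand-in for the described behavior: give hard clusterings a
# Bool element type so that methods specialized on integer-valued matrices
# are selected, and leave fuzzy clusterings as floating point.
function massage_sketch(z::AbstractMatrix{<:Real})
    ishard = all(x -> x == 0 || x == 1, z)
    return ishard ? Matrix{Bool}(z) : float.(z)
end

hard_as_float = [1.0 0.0 1.0; 0.0 1.0 0.0]  # hard clustering stored as Float64
fuzzy = [0.7 0.2 0.5; 0.3 0.8 0.5]

converted = massage_sketch(hard_as_float)   # element type becomes Bool
unchanged = massage_sketch(fuzzy)           # stays Float64
```

Because `Bool <: Integer` in Julia, a matrix converted this way satisfies an integer element-type constraint, while a fuzzy matrix does not.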
[1] DeWolfe, R., Andrews, J.L. Random models for adjusting fuzzy rand index extensions. Adv Data Anal Classif (2025). https://doi.org/10.1007/s11634-025-00625-w
[2] Gates AJ, Ahn Y-Y (2017) The impact of random models on clustering similarity. J Mach Learn Res 18(87):1–28. http://jmlr.org/papers/v18/17-039.html
[3] E. Hüllermeier, M. Rifqi, S. Henzgen and R. Senge, "Comparing Fuzzy Partitions: A Generalization of the Rand Index and Related Measures," in IEEE Transactions on Fuzzy Systems, vol. 20, no. 3, pp. 546-556, (2012). https://doi.org/10.1109/TFUZZ.2011.2179303
[4] Andrews, J.L., Browne, R. & Hvingelby, C.D. On Assessments of Agreement Between Fuzzy Partitions. J Classif 39, 326–342 (2022). https://doi.org/10.1007/s00357-021-09407-3
[5] T. Denoux, S. Li, and S. Sriboonchitta, “Evaluating and Comparing Soft Partitions: An Approach Based on Dempster–Shafer Theory,” IEEE Trans. Fuzzy Syst., vol. 26, no. 3, pp. 1231–1244, (2018). https://doi.org/10.1109/TFUZZ.2017.2718484
[6] D’Ambrosio, A., Amodio, S., Iorio, C. et al. Adjusted Concordance Index: an Extension of the Adjusted Rand Index to Fuzzy Partitions. J Classif 38, 112–128 (2021). https://doi.org/10.1007/s00357-020-09367-0