Clustering with Fuzzy C-means

Oleh Dubetcky
5 min read · May 5, 2024

Fuzzy C-means (FCM) clustering is an extension of the traditional K-means clustering algorithm, allowing data points to belong to multiple clusters with varying degrees of membership. Instead of hard assignments like in K-means, where each data point is assigned to the cluster with the nearest centroid, FCM assigns a membership value for each data point indicating its degree of belongingness to each cluster.


The FCM algorithm works by iteratively updating cluster centroids and membership degrees until convergence. The membership degree of a data point for a particular cluster is determined based on its distance to the cluster centroid relative to the centroids of other clusters.

Here’s a simplified outline of the FCM algorithm:

  1. Choose the number of clusters 𝑘 and the fuzziness parameter 𝑚.
  2. Initialize cluster centroids randomly or using some other method.
  3. Compute the membership degrees for each data point for each cluster based on their distances to the cluster centroids using the formula:

μ_ij = 1 / Σ_{k=1..c} ( ‖x_i − v_j‖ / ‖x_i − v_k‖ )^( 2/(m−1) )

where:

  • 𝜇𝑖𝑗 is the membership degree of data point 𝑖 for cluster 𝑗
  • 𝑥𝑖​ is the 𝑖th data point
  • 𝑣𝑗​ is the centroid of cluster 𝑗
  • 𝑐 is the total number of clusters
  • 𝑚 is the fuzziness parameter (typically 𝑚>1)

  4. Update cluster centroids using the formula:

v_j = ( Σ_{i=1..N} μ_ij^m · x_i ) / ( Σ_{i=1..N} μ_ij^m )

where N is the total number of data points.

5. Repeat steps 3 and 4 until convergence (when the centroids and membership degrees stop changing significantly).
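Steps 3–5 above can be sketched directly in NumPy. This is a minimal illustration with my own variable names and convergence test, not a reference implementation:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal Fuzzy C-means. X has shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Step 2 (variant): random initial membership matrix, rows sum to 1
    u = rng.random((n, c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        um = u ** m
        # Step 4: centroids are membership-weighted means of the data
        v = (um.T @ X) / um.sum(axis=0)[:, None]
        # Step 3: distance of every point to every centroid
        d = np.linalg.norm(X[:, None, :] - v[None, :, :], axis=2)
        d = np.fmax(d, 1e-10)  # avoid division by zero
        # Membership update: mu_ij proportional to d_ij^(-2/(m-1))
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 5: stop when memberships stabilize
        if np.abs(u_new - u).max() < tol:
            u = u_new
            break
        u = u_new
    return v, u

# Quick check on two well-separated blobs
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (20, 2)),
               np.random.default_rng(2).normal(5, 0.1, (20, 2))])
centroids, memberships = fuzzy_c_means(X, c=2)
labels = memberships.argmax(axis=1)  # harden memberships into cluster labels
```

Each row of `memberships` sums to 1, and with well-separated blobs the hardened labels recover the two groups.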

FCM is sensitive to the initial selection of cluster centroids and may converge to different solutions for different initializations. Therefore, it’s common to run the algorithm multiple times with different initializations and choose the solution with the lowest objective function value.

FCM is widely used in various applications, including image segmentation, pattern recognition, and data mining, where the boundaries between clusters are not well-defined and data points may belong to multiple clusters simultaneously.

Clustering is a versatile machine learning technique for identifying inherent patterns and grouping similar data points together based on their characteristics. Here’s a bit more detail on how it’s applied in two common scenarios:

  1. Market Segmentation: In marketing, clustering can be immensely useful for segmenting customers based on their preferences, behaviors, demographics, or purchasing patterns. By clustering customers into groups with similar characteristics, marketers can tailor their products, services, and marketing strategies to each segment’s specific needs and preferences. For example, a retail company might use clustering to identify segments of customers who are interested in high-end fashion, budget-friendly deals, or eco-friendly products.
  2. Text Clustering: Clustering can also be applied to texts, documents, or news articles for various purposes such as topic modeling, document organization, and recommendation systems. By clustering similar documents together, it becomes easier to summarize large corpora, identify emerging topics, or recommend related content to users. For instance, a news aggregator might use text clustering to group news articles into categories like politics, sports, technology, entertainment, etc., enabling users to explore content based on their interests.

In both cases, clustering algorithms such as K-means, hierarchical clustering, DBSCAN, or fuzzy clustering (like Fuzzy C-means) can be employed depending on the nature of the data and the desired outcome. These algorithms help identify meaningful clusters within the data, allowing for deeper insights and more targeted decision-making in various domains.

Here’s a simple example of how you can perform text clustering using the Fuzzy c-means (FCM) algorithm:

Let’s say you have a collection of news articles and you want to cluster them based on their content. Here’s how you could approach it:

  1. Preprocessing:
  • Tokenize the documents: Split each document into words or tokens.
  • Lowercase the tokens: Convert all words to lowercase to ensure consistency.
  • Remove stop words: Eliminate common words like “the”, “and”, “is”, etc., that don’t carry much meaning.
  • Stemming or lemmatization: Reduce words to their root form to reduce dimensionality.
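The first preprocessing steps can be sketched in plain Python. The stop-word list below is a tiny illustrative set; real pipelines use the larger lists shipped with NLTK or scikit-learn, plus a stemmer such as Porter’s:

```python
import re

# Tiny illustrative stop-word list (real pipelines use much larger ones)
STOP_WORDS = {"the", "and", "is", "a", "of", "in", "to"}

def preprocess(text):
    # Tokenize: split on runs of non-letter characters
    tokens = re.findall(r"[a-zA-Z]+", text)
    # Lowercase and drop stop words
    return [t.lower() for t in tokens if t.lower() not in STOP_WORDS]

print(preprocess("The cat AND the dog are in the garden."))
# → ['cat', 'dog', 'are', 'garden']
```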

2. Feature Extraction:

  • Represent each document as a TF-IDF (term frequency-inverse document frequency) vector. This vector represents the importance of each word in the document relative to the entire corpus.

3. Fuzzy c-means Clustering:

  • Initialize the parameters: Determine the number of clusters (k) and the fuzziness exponent (m).
  • Initialize cluster centers: Randomly initialize cluster centers.
  • Calculate the degree of membership: Compute the degree of membership of each document to each cluster center using the TF-IDF vectors and the distance metric (e.g., Euclidean distance).
  • Update cluster centers: Update the cluster centers based on the membership values.
  • Repeat the membership calculation and cluster center update until convergence.

4. Evaluation:

  • Assess the quality of clusters using evaluation metrics such as cluster cohesion, cluster separation, or silhouette score.
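The silhouette score is available in scikit-learn. It needs hard labels, so in practice you would take the argmax over the FCM membership matrix first; the synthetic data below stands in for TF-IDF vectors:

```python
import numpy as np
from sklearn.metrics import silhouette_score

# Synthetic data: two separated groups (stand-ins for TF-IDF vectors)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (25, 4)), rng.normal(3, 0.3, (25, 4))])

# Hard labels (in practice: np.argmax over the FCM membership matrix)
labels = np.array([0] * 25 + [1] * 25)

score = silhouette_score(X, labels)  # in [-1, 1]; higher is better
print(f"silhouette score: {score:.3f}")
```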

Here’s a Python code snippet using the skfuzzy library for Fuzzy c-means clustering:

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from skfuzzy.cluster import cmeans

# Sample documents
documents = [
    "Machine learning is the future of technology.",
    "Natural language processing is a branch of artificial intelligence.",
    "Data science involves analyzing and interpreting large datasets.",
    "Deep learning models can achieve state-of-the-art performance.",
    "Information retrieval techniques are used in search engines.",
]

# Preprocessing: TF-IDF feature extraction
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(documents).toarray()  # Convert to dense array

# Fuzzy c-means clustering; cmeans expects data of shape (n_features, n_samples)
n_clusters = 2  # Number of clusters
cntr, u, u0, d, jm, p, fpc = cmeans(X.T, n_clusters, m=2, error=0.005, maxiter=1000)

# u is the membership matrix of shape (n_clusters, n_documents)
# Assign each document to the cluster with the highest membership
clusters = np.argmax(u, axis=0)

# Print cluster assignments
for i, doc in enumerate(documents):
    print(f"Document '{doc}' belongs to cluster {clusters[i] + 1}")

Complete example at Colab

If you liked the article, you can support the author by clapping below 👏🏻 Thanks for reading!

Oleh Dubetsky | LinkedIn


Oleh Dubetcky

I am a management consultant with a unique focus on delivering comprehensive solutions in both human resources (HR) and information technology (IT).