My Data Science Journey: K Means Clustering of Mall Customer Data

As part of the Udemy Machine Learning A-Z course, I got my hands dirty with a little K-Means clustering. The problem was fairly simple: we were given a sample of 200 customers of a local mall, with each customer's gender, age, annual income, and spending score. We want to determine whether each customer is Careless, Sensible, Standard, Careful, or a Target for marketing campaigns due to their high income and high spending.

In the Jupyter notebook below, only annual income and spending score were used for the clustering, after noticing that age and gender had no significant impact on the spending score. The "Elbow Method" was used to determine the optimal number of clusters: run K-Means for a range of cluster counts, plot the within-cluster sum of squares for each, and look for the point where the curve bends. In the visualization, five looks to be a good choice.
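If you'd like to see the Elbow Method without opening the notebook, here is a minimal sketch. It assumes scikit-learn and NumPy are installed, and it runs on made-up data (five synthetic blobs of 40 points each) rather than the real mall dataset, so the numbers it prints are not the ones from my analysis. The `init`, `n_init`, and `random_state` settings are my choices for the illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the two columns used in the post:
# annual income (k$) and spending score (1-100).
# ASSUMPTION: these five blobs are invented for illustration only.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [88, 17], [87, 82]])
X = np.vstack([rng.normal(loc=c, scale=5.0, size=(40, 2)) for c in centers])

# Within-cluster sum of squares (WCSS) for k = 1..10.
# scikit-learn exposes it as the fitted model's `inertia_` attribute.
wcss = []
for k in range(1, 11):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    model.fit(X)
    wcss.append(model.inertia_)

# The "elbow" is the k after which WCSS stops dropping sharply.
for k, value in enumerate(wcss, start=1):
    print(f"k={k:2d}  WCSS={value:,.0f}")
```

Plotting `wcss` against `k` gives the usual elbow chart. On this synthetic data the curve flattens after five clusters by construction, which is not evidence about the real dataset.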

Once we've found our optimal number of clusters, the K-Means algorithm is fitted to the two chosen columns and assigns each customer to one of the clusters.
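The fitting step can be sketched roughly like this, again assuming scikit-learn and the same made-up data rather than the actual customer file. Picking out the "Target" group as the cluster whose centroid has the highest income plus spending score is a shortcut I'm using for the illustration, not something taken from the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

# ASSUMPTION: synthetic income / spending-score data, not the mall dataset.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [88, 17], [87, 82]])
X = np.vstack([rng.normal(loc=c, scale=5.0, size=(40, 2)) for c in centers])

# Fit K-Means with five clusters and get one cluster id per customer.
kmeans = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Cluster ids are arbitrary integers, so look at the centroids to
# decide which group is which. Illustrative heuristic: the "Target"
# cluster is the one with the highest income + spending score.
centroids = kmeans.cluster_centers_
target_id = int(np.argmax(centroids.sum(axis=1)))

print("Customers per cluster:", np.bincount(labels))
print("Target cluster centroid (income, score):", centroids[target_id].round(1))
```

The other four names (Careless, Sensible, Standard, Careful) would be attached the same way, by reading where each centroid sits on the income and spending axes.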

We then visualize our predictions in Python, but ultimately decide that Tableau will more clearly describe our dataset.

In the dashboard above, the customers can be clearly seen in their clusters. Five clusters looks like a good fit, and the "Target Customers" are well defined. I found this exercise really useful, and you're more than welcome to give it a shot! I've embedded the Jupyter notebook below and linked to the GitHub repository as well.

GitHub:
https://github.com/SLPeoples/Machine-Learning-A-Z/tree/master/Part%2004%20-%20Clustering/24_K_Means


Wednesday, January 10, 2018
