Customer segmentation is the process of dividing a broad customer base into smaller segments based on shared characteristics, in order to identify high-yield groups that are likely to be the most profitable or to have growth potential.

The process not only provides a deeper understanding of the customer base but is also typically the starting point for identifying potential high-value target customers within a larger market or population.

When segmenting customers to identify new targets, analysts typically look for common characteristics (a rule-set) such as shared needs, demographics or other similar attributes. This is known as rules-based modeling and is effective when the rules are relatively simple and well defined, with a limited number of patterns.
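As a rough illustration of a simple rule-set, the sketch below assigns each customer to a segment using two hand-written rules. The attributes (age, annual_spend), the thresholds and the segment names are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    age: int
    annual_spend: float  # hypothetical attribute, in dollars


def assign_segment(c: Customer) -> str:
    """Apply a small hand-written rule-set; the first matching rule wins."""
    if c.annual_spend >= 5000:  # assumed threshold, for illustration only
        return "high-value"
    if c.age < 30:  # assumed threshold, for illustration only
        return "young-growth"
    return "general"


# Made-up customers used only to exercise the rules.
customers = [
    Customer("A", 45, 7200.0),
    Customer("B", 24, 900.0),
    Customer("C", 52, 1500.0),
]

segments = {c.name: assign_segment(c) for c in customers}
print(segments)
```

Rules like these are easy to read and audit, but every threshold has to be chosen by hand, which is why the approach works best when the patterns are few and well understood.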

What if your rule-set isn’t well defined? What if you have a complex customer base and you aren’t sure what patterns exist? What if you don’t know which combination of attributes will yield the best results? What if you don’t know what you don’t know?

Fortunately, machine learning is especially effective at applying statistical modeling to data to answer these questions. As the name implies, the technology is also extremely good at continually optimizing a given model. Machine learning can be separated into supervised and unsupervised learning models.

Supervised learning is exactly what it sounds like. Models are supervised in that they are constructed to produce a predefined output. For example, a sales team needs to know which customers may be at risk of defecting. The statistical model is therefore designed specifically to use labeled data to predict which customers will churn.

These algorithms are trained and optimized on a sample set of data known as training data, before being applied to a larger data set for analysis.
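To make the idea of training on labeled data concrete, here is a minimal sketch of a churn predictor. It uses a nearest-centroid classifier, a deliberately simple stand-in for the models used in practice. The features (months since last purchase, number of support tickets) and every data point are made up for illustration.

```python
from math import dist
from statistics import mean

# Hypothetical labeled training data:
# (months_since_last_purchase, support_tickets) -> churned?
training_data = [
    ((1.0, 0.0), False),
    ((2.0, 1.0), False),
    ((1.5, 0.0), False),
    ((8.0, 4.0), True),
    ((9.0, 5.0), True),
    ((7.0, 3.0), True),
]


def fit_centroids(rows):
    """Learn one centroid (mean feature vector) per label."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(col) for col in zip(*points))
        for label, points in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# "Training": summarize the labeled sample.
centroids = fit_centroids(training_data)

# Apply the trained model to customers whose outcome is not yet known.
new_customers = {"D": (1.0, 1.0), "E": (8.5, 4.0)}
predictions = {name: predict(centroids, f) for name, f in new_customers.items()}
print(predictions)
```

The labels are what make this supervised: the model only learns what "churned" looks like because past examples were tagged with the outcome.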

Unsupervised learning, on the other hand, is used to uncover and describe hidden structure within “unlabeled” data, meaning that no classification or categorization is included in the observations. These statistical models are generally exploratory and are used to gain deeper insights that may not be readily apparent.
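One common way to look for hidden structure in unlabeled data is clustering. The sketch below is a bare-bones k-means, shown as one possible technique rather than the method this article prescribes. The two features (annual spend in thousands, visits per month), the data points, the choice of two clusters and the naive initialization are all assumptions made for illustration.

```python
from math import dist
from statistics import mean


def k_means(points, k, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = list(points[:k])  # naive initialization: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(mean(col) for col in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled observations: (annual_spend_thousands, visits_per_month)
points = [
    (1.0, 2.0), (9.0, 8.0), (1.5, 1.0),
    (8.0, 9.0), (1.0, 1.5), (9.5, 8.5),
]

centroids, clusters = k_means(points, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels are supplied here. The algorithm groups customers purely by similarity, and it is then up to the analyst to interpret what each group represents.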

Determining which approach is right for you depends on your specific needs. If you have a relatively simple rule-set and all the necessary data, you may be able to fast forward to rules-based modeling. Chances are, though, that you will benefit from using supervised learning to generate high-value customer attributes that dramatically improve on the rules-based approach.

If, like many, you are missing information, or you do not know whether you have all the information or whether unseen insights exist in your customer base, you will likely benefit from unsupervised learning.

The benefits of coordinating multiple modeling approaches extend far beyond the final output itself. The very process of applying data science will raise questions you did not know needed answers, helping you build a better business.