Gain Actionable Insights on Your Customers with Machine Learning

Expanding your customer base is key to your company's growth. But how many of you also focus on keeping your existing customers?

If you don't, you should! The cost of acquiring a new customer is anywhere from five to 25 times more expensive than the cost of retaining one. Your customers are much too valuable to lose!

How then, do you reduce the rate at which your customers leave you, also known as customer churn? Broad strategies like customer loyalty programs and continuously improving your competitive advantage can help.

Using data to gain insights

But is there a way to focus your efforts on your “highest risk” customers? If only you knew who was about to leave, you could give them the extra attention that could convince them to stay.

How do you identify these people? Is there any past behavior that hints at customer dissatisfaction? Are there signs they've outgrown your service or product? Could there be a competitor that's pulling them away?

In other words, are they leaving a trail of evidence showing they are on their way out? If so, how do you find this trail?

Thanks to today's modern technology, we have more data available about our customers than ever before.

This includes things like:

  • their age
  • where they live
  • their likes and dislikes
  • the number of times they visited your website
  • their order history
  • the number of calls to customer service

... just to name a few. This information can help us build a profile of and learn about each and every customer. But how feasible would that be for our entire, potentially large, customer bases? And returning to the original problem, how would we know which ones are high-risk? Thankfully, computers are great at working with large datasets like customer data.

And more recently, computers have become very good at learning and predicting too!

Machine Learning

Machine learning is a subset of artificial intelligence. It enables computers to learn without being explicitly programmed. While not a new concept, advancements in low-cost, distributed, parallel computing and the availability of large datasets have resulted in faster and more accurate results.

Traditionally, computers need specific instructions to operate. For example, if we know beforehand the characteristics of a high-risk customer, we can program our computers to watch for these customers. The problem is, we don't truly know what these characteristics are. With machine learning, however, computers can now figure that out for us!

Behind the “magic” of machine learning are complex mathematical algorithms. These algorithms uncover the correlations in the data, and use them to generate analytical models to base predictions on. How would we apply this ability to predict customer churn?

Predicting Customer Churn

At Project Ricochet, we specialize in building web and mobile apps. These apps usually generate a lot of user data. We're discovering how this data, combined with machine learning, can lead to valuable insights for our clients, including information about users they're likely to lose as customers.

The first step in any machine learning undertaking is understanding the problem we're trying to solve. In this case, the problem is, "How can we predict which customers are about to churn?" This will help us determine the data we need and the algorithm to use.

Next, we gather the data that we'll use to train and test our model. For customer churn, we'll pull customer data from databases, customer support logs, and web analytics to name just three sources. For every customer in these datasets, we'll also need to classify them according to whether they've churned or not.

You might be wondering... "Isn't that what we're trying to solve?" Not quite. Remember, we're not building a model to identify who has and hasn't churned. Instead, we're predicting who is likely to churn.

We then prepare the data. This includes removing any corrupted or bad data, as well as outliers that can skew our model. We keep important features like the customers’ number of calls and order history. Phone numbers, however, are unique to each customer, and have no bearing on whether a customer will churn. We can safely drop features like that. We then randomize and split the data into two parts – training data and test data.

Next, we choose the model and its machine learning algorithm to train it with our data. This choice depends on a number of factors, including the problem we're trying to solve and the type and amount of data we have. For predicting market churn, we'll use K-Nearest Neighbors (kNN). It's one of the simplest machine learning models to use and understand.

With kNN, every customer in the training dataset is represented as a data point in an n-dimension space based on their features (where n is the number of features). This step is called training our model. By doing this, we can predict if an active customer is likely to churn depending on whether most or all of their k nearest neighbors have churned.


A simplified example of kNN using two features (# of orders, # of complaints) and k = 3

We'll then evaluate our model using our test data by comparing what the model predicts with the actual value. Parameter tuning, such as changing the k value, can help us achieve higher accuracy.

Finally, predicting and operationalizing our solution will involve regularly running customers (including any new ones) through the model so that it can alert the client when a customer is likely to churn. This allows the client to engage with this customer before they leave.

The future today

Business Insider expects there to be nearly 10 million self-driven cars on the road by 2020, powered by machine learning. Deloitte expects that same year will see 95% of the top 100 enterprise software companies release AI-enabled apps. Gartner predicts that by 2022, more than 80% of enterprise IoT projects will include an AI component.

However, you don't have to wait for this grand AI future to take advantage of it today. Using the data you likely already have, forecasting sales, detecting fraud, uncovering process optimizations, and reducing customer churn are just some of the ways your business can take advantage of machine learning today.




Project Ricochet is a full-service digital agency specializing in Open Source.

Is there something we can help you or your team out with?




A bit about Marty:

Marty has a B.S. in Computer Science and over 15 years of software architecture, development, and management experience - building everything from large scale, enterprise back end systems to innovative client side apps using Meteor.js. In his free time, he loves playing hockey and imbibing Tim Horton's coffee when he's not playing with his three kids.