Random Forest on iOS
This chapter gives you an overview of the random forest algorithm. We will first look at the decision tree algorithm and, once we have a handle on it, build on it to understand the random forest algorithm. Then, we will use Core ML to create a machine learning program that uses a random forest to predict, from a given set of breast cancer patient data, how likely a patient is to be diagnosed with breast cancer.
As we saw in Chapter 1, Introduction to Machine Learning on Mobile, any machine learning program has four phases: define the machine learning problem, prepare the data, build/rebuild/test the model, and deploy it for usage. In this chapter, we will relate these phases to the random forest algorithm and work through them to solve the underlying machine learning problem.
Problem definition: Breast cancer data for a set of patients is provided, and we want to predict the likelihood of a breast cancer diagnosis for a new data item.
We will be covering the following topics:
- Understanding decision trees and how to apply them to solve an ML problem
- Understanding decision trees through a sample dataset and Excel
- Understanding random forests
- Solving the problem using a random forest in Core ML:
  - Technical requirements
  - Creating a model file using the scikit-learn and pandas libraries
  - Testing the model
  - Importing the scikit-learn model into the Core ML project
  - Writing an iOS mobile application and using the scikit-learn model in it to perform the breast cancer prediction
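Before going into the details, it may help to see the core idea in miniature. The following is only a rough sketch under stated assumptions, not the model built later in this chapter: it uses plain Python rather than scikit-learn, one-split trees (stumps) instead of full decision trees, and a tiny invented dataset with two unnamed numeric features in place of the real breast cancer data. The names train_stump, train_forest, and predict are hypothetical helpers introduced for illustration. What it shows is the pattern a random forest follows: train many trees on random resamples of the data and let them vote.

```python
import random
from collections import Counter

# Invented toy rows: ([feature_0, feature_1], label). NOT real patient data.
rows = [
    ([1.0, 2.0], "benign"), ([1.5, 1.8], "benign"),
    ([2.0, 2.2], "benign"), ([1.2, 2.5], "benign"),
    ([6.0, 7.0], "malignant"), ([6.5, 6.8], "malignant"),
    ([7.0, 7.5], "malignant"), ([5.8, 6.5], "malignant"),
]

def train_stump(sample):
    """Fit a one-split tree: try every feature/threshold, keep the fewest errors."""
    best = None
    for f in range(len(sample[0][0])):
        for x, _ in sample:
            t = x[f]
            left = [y for xs, y in sample if xs[f] <= t]
            right = [y for xs, y in sample if xs[f] > t]
            # Each side of the split predicts its most common label.
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda x: left_label if x[f] <= t else right_label

def train_forest(data, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(data) for _ in data]) for _ in range(n_trees)]

def predict(trees, x):
    """Majority vote across all trees in the forest."""
    return Counter(tree(x) for tree in trees).most_common(1)[0][0]

forest = train_forest(rows)
print(predict(forest, [0.5, 1.0]))
print(predict(forest, [8.0, 9.0]))
```

A full random forest usually also restricts each split to a random subset of the features and grows deeper trees; the sketch leaves both out to keep the voting idea visible.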