Ways to carry out inference – making predictions
Once the model is created, we need to apply it to new data in order to infer, or make, predictions. Just as there were various ways to carry out the training process, there are multiple approaches to carrying out inference (each is illustrated with a brief sketch after the following list):
- On a server:
- General cloud computing
- Hosted machine learning
- Private cloud/simple server machine
- On a device
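For any of the server-based options, inference typically takes the form of a network request from the client application to an endpoint that hosts the model. The following is a minimal sketch in Python; the endpoint URL, the JSON payload shape, and the `predictions` response field are illustrative assumptions, not any specific provider's API:

```python
import requests

# Hypothetical inference endpoint; in practice this would be the URL of
# your cloud service, hosted ML platform, or private server.
ENDPOINT = "https://example.com/v1/models/my_model:predict"

def predict_remote(features):
    """Send one feature vector to the server and return its prediction."""
    # This call fails if the device is offline -- the key trade-off of
    # server-side inference discussed below.
    response = requests.post(
        ENDPOINT, json={"instances": [features]}, timeout=5
    )
    response.raise_for_status()
    return response.json()["predictions"][0]

if __name__ == "__main__":
    print(predict_remote([5.1, 3.5, 1.4, 0.2]))
```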
Inference on a server requires a network request, so the application must be online to use this approach. Inference on the device, by contrast, means the application can work completely offline. An offline application therefore avoids the speed and performance overheads, such as network latency, that an online application incurs.
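On-device inference instead loads a model file bundled with the application and runs it locally. One common option is TensorFlow Lite; the sketch below uses its Python interpreter, with the model filename and input shape as assumptions for illustration:

```python
import numpy as np
import tensorflow as tf

# Hypothetical path to a model file shipped with the application.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def predict_on_device(features):
    """Run the bundled model locally -- no network connection required."""
    x = np.array([features], dtype=np.float32)  # batch of one sample
    interpreter.set_tensor(input_details[0]["index"], x)
    interpreter.invoke()
    return interpreter.get_tensor(output_details[0]["index"])[0]

print(predict_on_device([5.1, 3.5, 1.4, 0.2]))
```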
However, if inference demands more compute resources, that is, more processing power or memory than the device can provide, it cannot be done on the device and must run on a server.