Validating/testing
Software engineers are familiar with testing and debugging source code, but how should ML models be tested? Individual pieces of an algorithm and the data input/output routines can be unit tested in the usual way, but it is often unclear how to verify that the ML model itself, which presents as a black box, is correct.
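The pieces that surround the model can at least be tested conventionally. The sketch below assumes pytest as the test runner and a hypothetical normalize() preprocessing helper; neither is prescribed by the text.

```python
# Unit test for a deterministic preprocessing step (illustrative only).
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    """Scale each feature column to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def test_normalize_is_zero_mean_unit_variance():
    x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    z = normalize(x)
    # Deterministic code around the model can be checked exactly,
    # unlike the black-box model itself.
    assert np.allclose(z.mean(axis=0), 0.0)
    assert np.allclose(z.std(axis=0), 1.0)
```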
The first step toward ensuring correctness and sufficient accuracy of an ML model is validation: applying the model to predict or classify the validation data subset and measuring the resulting accuracy against the project's objectives. Because the training data subset was already seen by the algorithm, it cannot be used to validate correctness; the model could suffer from poor generalizability (also known as overfitting). To take an extreme example, imagine an ML model that consists of a hash map memorizing each training input sample and mapping it to the corresponding output sample. Such a model would score 100% accuracy on the training data subset it memorized, but very low accuracy on any unseen data subset, and therefore would not solve the problem it was intended for. Validation guards against exactly this failure mode.
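The following is a minimal sketch of hold-out validation; scikit-learn is an assumed choice of library, and MemorizingModel is a deliberately pathological stand-in for the hash-map example above.

```python
# Demonstrates why training accuracy alone is meaningless: a model that
# memorizes its inputs aces the training data and falls back to guessing
# on everything else.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

class MemorizingModel:
    """Hash map from seen inputs to their training labels."""
    def fit(self, X, y):
        self.memory = {tuple(row): label for row, label in zip(X, y)}
        self.default = np.bincount(y).argmax()  # majority-class fallback
        return self

    def predict(self, X):
        return np.array([self.memory.get(tuple(row), self.default) for row in X])

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = MemorizingModel().fit(X_train, y_train)
print("training accuracy:  ", accuracy_score(y_train, model.predict(X_train)))  # 1.0
print("validation accuracy:", accuracy_score(y_val, model.predict(X_val)))      # ~0.5
```

Only the second number says anything about whether the model generalizes.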
In addition, it is a good idea to validate model outputs against user acceptance criteria. For example, if you are building a recommender system for TV series, you may wish to ensure that recommendations made to children are never rated PG-13 or higher. Rather than trying to encode this constraint into the model, which will always have a nonzero failure rate, it is better to enforce it in the application itself, because the cost of a violation would be too high. Such constraints and business rules should be captured at the start of the project.
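One way to realize this is a hard post-filter between the model and the user, as sketched below; the Recommendation type, the rating whitelist, and recommend_for() are all illustrative assumptions, not a prescribed API.

```python
# Enforce the business rule in the application layer: the model proposes
# candidates, but the filter guarantees the constraint regardless of scores.
from dataclasses import dataclass

ALLOWED_FOR_CHILDREN = {"TV-Y", "TV-Y7", "TV-G", "G", "PG"}  # assumed rating scheme

@dataclass
class Recommendation:
    title: str
    rating: str
    score: float  # model's relevance score

def recommend_for(is_child: bool, candidates: list[Recommendation],
                  k: int = 5) -> list[Recommendation]:
    if is_child:
        # Hard constraint: drop disallowed titles before ranking,
        # so a misbehaving model can never surface them.
        candidates = [c for c in candidates if c.rating in ALLOWED_FOR_CHILDREN]
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:k]

shows = [Recommendation("Show A", "TV-Y", 0.80),
         Recommendation("Show B", "TV-MA", 0.95)]
print([r.title for r in recommend_for(is_child=True, candidates=shows)])  # ['Show A']
```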