Feature Engineering Made Easy

Evaluating unsupervised learning algorithms

This is a bit trickier. Because unsupervised learning is not concerned with making predictions, we cannot evaluate performance directly by how well the model predicts a value. That said, if we are performing a cluster analysis, such as in the previous marketing segmentation example, then we will usually utilize the silhouette coefficient (a measure of cluster cohesion and separation that ranges from -1 to 1), along with some human-driven analysis, to decide whether a feature engineering procedure has improved model performance or whether we are merely wasting our time.

Here is an example of using Python and scikit-learn to import and calculate the silhouette coefficient for some fake data:

from sklearn.metrics import silhouette_score

# attributes holds the tabular data; cluster_labels holds the labels
# produced by whatever clustering algorithm was run
attributes = tabular_data
cluster_labels = outputted_labels_from_clustering

silhouette_score(attributes, cluster_labels)
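To make this concrete, here is a self-contained sketch of the same call on synthetic data. The choice of `make_blobs` and KMeans here is ours, purely for illustration; any clustering algorithm that outputs labels would work:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Generate fake tabular data: three well-separated blobs of points
attributes, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Cluster the data; these labels play the role of
# outputted_labels_from_clustering above
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(attributes)

# Silhouette is bounded in [-1, 1]; higher means denser, better-separated clusters
score = silhouette_score(attributes, cluster_labels)
print(score)
```

Because the blobs are well separated, the score lands near the high end of the scale; a feature engineering step that raised this number (confirmed by some human inspection of the clusters) would count as an improvement.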

We will spend much more time on unsupervised learning later on in this book as it becomes more relevant. Most of our examples will revolve around predictive analytics/supervised learning.

It is important to remember that the reason we are standardizing algorithms and metrics is so that we may showcase the power of feature engineering and so that you may repeat our procedures with success. Practically, it is conceivable that you are optimizing for something other than accuracy (such as a true positive rate, for example) and wish to use decision trees instead of logistic regression. This is not only fine but encouraged. You should always remember, though, to follow the steps for evaluating a feature engineering procedure and compare baseline and post-engineering performance.
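That baseline-versus-post-engineering recipe can be sketched as follows. The dataset, the feature selection step, and the decision tree are our own illustrative choices; note that in scikit-learn, recall is exactly the true positive rate:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import recall_score  # recall == true positive rate
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline: a decision tree on all raw features
baseline = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
baseline_tpr = recall_score(y_test, baseline.predict(X_test))

# Post-engineering: the same model after a simple feature selection step
selector = SelectKBest(f_classif, k=4).fit(X_train, y_train)
engineered = DecisionTreeClassifier(random_state=0).fit(selector.transform(X_train), y_train)
engineered_tpr = recall_score(y_test, engineered.predict(selector.transform(X_test)))

print(baseline_tpr, engineered_tpr)
```

Whatever metric and model you substitute in, the procedure stays the same: fix a metric, measure the baseline, apply the feature engineering step, and measure again.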

It is possible that you are not reading this book for the purposes of improving machine learning performance. Feature engineering is useful in other domains, such as hypothesis testing and general statistics. In a few examples in this book, we will take a look at feature engineering and data transformations as applied to the statistical significance of various statistical tests. We will explore metrics such as p-values in order to make judgements about how our procedures are helping.
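As a small, hedged illustration of that idea (the lognormal data and the two-sample t-test here are our own choices, not a procedure from this book), a simple log transformation of a skewed feature can change the p-value a test reports:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# A skewed feature measured for two groups whose means differ on the log scale
group_a = rng.lognormal(mean=0.0, sigma=1.0, size=200)
group_b = rng.lognormal(mean=0.5, sigma=1.0, size=200)

# p-value of a two-sample t-test on the raw, skewed data
_, p_raw = stats.ttest_ind(group_a, group_b)

# p-value after a log transformation, a simple feature engineering step
# that makes the data approximately normal
_, p_log = stats.ttest_ind(np.log(group_a), np.log(group_b))

print(p_raw, p_log)
```

Comparing the p-value before and after the transformation is the same baseline-versus-post-engineering logic, just with a statistical test standing in for a predictive model.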

In general, we will quantify the benefits of feature engineering in the context of three categories:

  • Supervised learning: Otherwise known as predictive analytics
    • Regression analysis—predicting a quantitative variable:
      • Will utilize MSE as our primary metric of measurement
    • Classification analysis—predicting a qualitative variable
      • Will utilize accuracy as our primary metric of measurement
  • Unsupervised learning: Clustering—the assigning of meta-attributes as denoted by the behavior of data:
    • Will utilize the silhouette coefficient as our primary metric of measurement
  • Statistical testing: Using correlation coefficients, t-tests, chi-squared tests, and others to evaluate and quantify the usefulness of our raw and transformed data
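The two supervised metrics in this list are available directly in scikit-learn; the toy predictions below are ours, chosen only to show the calls:

```python
from sklearn.metrics import accuracy_score, mean_squared_error

# Regression: MSE compares predicted and true quantitative values
y_true_reg = [3.0, -0.5, 2.0, 7.0]
y_pred_reg = [2.5, 0.0, 2.0, 8.0]
mse = mean_squared_error(y_true_reg, y_pred_reg)

# Classification: accuracy is the fraction of correct qualitative predictions
y_true_clf = [0, 1, 1, 0]
y_pred_clf = [0, 1, 0, 0]
acc = accuracy_score(y_true_clf, y_pred_clf)

print(mse, acc)  # 0.375 0.75
```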

In the following few sections, we will look at what will be covered throughout this book.