Vectorizing the training data