Hello Tony, thanks for visiting! I was busy, so I couldn’t get back to you earlier.

Jul 10, 2019

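Cross-validation repeatedly splits the training data into training folds and a held-out validation fold. This is different from the real test set that you obtained from train_test_split. So when you apply normalization (StandardScaler) to the whole training set before cross-validation, without a pipeline, the scaler's mean and standard deviation are computed from every fold, including the one that is later held out. The held-out fold in each split is therefore no longer truly unseen by the training step, and the best parameters obtained this way are biased. With a pipeline, the scaler is re-fit on the training folds only within each split.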
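
Here is a minimal sketch of the pipeline approach, in case it helps. The synthetic data, the SVC model, and the values in the parameter grid are my assumptions for illustration, not the setup from the article; the point is only that StandardScaler sits inside the Pipeline that GridSearchCV cross-validates.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data stands in for the real data set (an assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scaler and model are chained, so within each CV split the scaler is
# fit on the training folds only and then applied to the held-out fold.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Hypothetical grid; keys follow the "<step name>__<parameter>" convention.
param_grid = {"svc__C": [0.1, 1, 10]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
# The real test set is touched only once, after the search is finished.
test_score = search.score(X_test, y_test)
```

By contrast, calling StandardScaler().fit_transform(X_train) first and then passing the scaled array to GridSearchCV is the situation described above, where every held-out fold has already contributed to the scaling statistics.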

Hopefully this helps!

Written by Saptashwa Bhattacharyya
