Hello Tony, thanks for visiting! I was busy, so I couldn't get back to you earlier.
Cross-validation splits the training data into training and validation folds at random. These validation folds are different from the real test set you obtained from train_test_split. If you apply normalization (StandardScaler) without a pipeline, the scaler is fit on the entire training set, so the validation folds inside cross-validation have already "seen" information from the data used to fit them. The best parameters selected this way are therefore optimistically biased.
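Here's a minimal sketch of the difference. The dataset, the SVC classifier, and the parameter grid are just placeholders for illustration; the point is that putting StandardScaler inside a Pipeline makes GridSearchCV re-fit the scaler on the training folds of every split.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Leaky version (avoid): the scaler is fit on all of X_train, including
# the rows that later become validation folds inside GridSearchCV.
# X_train_scaled = StandardScaler().fit_transform(X_train)
# GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5).fit(X_train_scaled, y_train)

# Leak-free version: scaling happens inside the pipeline, so on each
# cross-validation split the scaler is fit only on the training folds.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])
param_grid = {"svc__C": [0.1, 1, 10]}  # hypothetical grid for illustration
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)
print(grid.score(X_test, y_test))  # evaluated on the untouched real test set
```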
Hopefully this helps!