Thread
We can't use all of our data for model training: without held-out data to test on, we have no way to detect overfitting.
We could of course split off a test set randomly, but there is a better option:
Cross Validation.
1/5
The steps Cross Validation takes:
1. Divide the data into groups (folds).
2. Iterate through the groups. In each iteration:
- Use all but one group as training data.
- Use the remaining group as testing data.
Let's see an example!
2/5
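The steps above can be sketched in a few lines of plain Python. This is a minimal illustration on a toy dataset of 9 samples split into 3 equal groups (the names `data`, `groups`, and `splits` are just for this example):

```python
# Toy stand-in for a real dataset: 9 samples, split into k = 3 groups.
data = list(range(9))
k = 3
group_size = len(data) // k
groups = [data[i * group_size:(i + 1) * group_size] for i in range(k)]

# Step 2: iterate through the groups. Each iteration holds one group
# out for testing and uses the rest for training.
splits = []
for i in range(k):
    test = groups[i]
    train = [x for j, g in enumerate(groups) if j != i for x in g]
    splits.append((train, test))

for train, test in splits:
    print("train:", train, "test:", test)
```

Every sample ends up in the test set exactly once, which is the whole point of iterating through the groups.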
With 3 groups, the first iteration can be:
- Groups 1 & 2 as training data
- Group 3 as testing data
Of course, every iteration results in a different model.
In this case we end up with 3 models,
each tested on a different group.
Why is this useful?
3/5
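Here is that 3-fold example made concrete. To keep it self-contained, a trivial "predict the training mean" model stands in for a real learner (the dataset and model are hypothetical, purely for illustration):

```python
# 6 target values, divided into Group 1, Group 2, Group 3.
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = [ys[0:2], ys[2:4], ys[4:6]]

# One iteration per group: train on the other two groups,
# producing 3 different models (here, 3 different means).
models = []
for i in range(3):
    train = [y for j, g in enumerate(groups) if j != i for y in g]
    mean = sum(train) / len(train)  # "training" the trivial model
    models.append(mean)

print(models)  # one model per fold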
Different results mean that we can compare them.
Using different testing datasets, the prediction errors will differ for each model.
With Cross Validation you can select the best performing model.
4/5
Using different testing datasets, the prediction errors will differ for each model.
With Cross Validation you can select the best performing model.
4/5
That's it for today.
I hope you've found this thread helpful.
Like/Retweet the first tweet below for support and follow @levikul09 for more Data Science threads.
Thanks ๐
5/5
I hope you've found this thread helpful.
Like/Retweet the first tweet below for support and follow @levikul09 for more Data Science threads.
Thanks ๐
5/5