Training loss in machine learning is a measure of how well the model fits the training data. It is calculated by a loss function (for example, mean squared error or cross-entropy) that compares the model’s predictions with the actual values in the training set, and it is typically reported as a single number. The goal of training is to minimize the training loss, which is done by adjusting the model’s parameters to reduce the difference between the predictions and the actual values.
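To make the calculation concrete, here is a minimal sketch of one common loss function, mean squared error. The function name mse_loss and the sample numbers are made up for illustration; real projects would normally use the loss function provided by their framework.

```python
def mse_loss(predictions, targets):
    """Mean squared error: the average of the squared differences."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("need at least one prediction")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only (not taken from any real model).
targets = [1.0, 2.0, 3.0]
close_predictions = [1.1, 1.9, 3.2]  # near the targets
far_predictions = [2.0, 0.5, 5.0]    # far from the targets

close_loss = mse_loss(close_predictions, targets)
far_loss = mse_loss(far_predictions, targets)
print(f"close predictions: {close_loss:.4f}")
print(f"far predictions:   {far_loss:.4f}")
```

The predictions that sit nearer the targets produce the smaller number, which is the sense in which the loss summarizes fit in a single value.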
So is a bigger or smaller training loss better?
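A smaller training loss is better, all else being equal. The training loss is a measure of how well the model is able to fit the training data, so a smaller training loss means that the model’s predictions are closer to the actual values in the training set. This is desirable because it indicates that the model has picked up patterns in the training data. On its own, however, a low training loss does not guarantee that the model will generalize well to new, unseen data: a model can memorize the training set (overfitting), which is why the loss on a held-out validation set is usually tracked as well.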
On the other hand, a bigger training loss means that the model’s predictions are farther away from the actual values. This implies that the model is not fitting the training data well (underfitting), and a model that cannot fit its own training data is unlikely to do well on new, unseen data either.
What would a good step / training loss table look like?
A good step/training loss table would typically show a decrease in the training loss as the number of training steps increases. The decrease is usually fastest at the beginning of the training process and slows down as the model approaches a minimum. Ideally the loss levels off at a low value; in practice, training is not guaranteed to reach the global minimum of the loss function. In general, the lower the final number, the better the fit to the training data.
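The shape described above can be reproduced with a toy training run. The following is a minimal sketch, assuming a one-parameter model y = w * x trained with plain gradient descent on made-up data; the function name train, the learning rate, and the data are all illustrative choices, not a recommended setup.

```python
def train(xs, ys, learning_rate=0.01, steps=50):
    """Fit y = w * x by gradient descent, recording the loss at each step."""
    w = 0.0
    n = len(xs)
    history = []  # list of (step, training loss) pairs
    for step in range(steps):
        # Mean squared error of the current model on the training data.
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
        history.append((step, loss))
        # Gradient of the loss with respect to w, then one update.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w, history


# Toy data generated from y = 2x (an assumption for this sketch).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w, history = train(xs, ys)

print("step  training loss")
for step, loss in history[::10]:  # print every 10th step
    print(f"{step:>4}  {loss:.6f}")
```

On this noise-free toy problem the loss falls at every step, with the largest drops at the start and progressively smaller ones later. Real training runs are rarely this smooth.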
Here’s an example of what a step/training loss table might look like:
This table shows that the training loss starts at 0.5 and decreases over time to 0.05, a tenfold reduction. It’s worth mentioning that the training loss can sometimes fluctuate or even increase temporarily before decreasing again. This happens because each update is usually computed on a small random batch of the data, and because a large learning rate can overshoot a minimum, so optimization is a noisy process. What matters is the overall downward trend rather than any single step.
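One way to see the trend through the noise is to smooth the per-step loss with a moving average. The sketch below assumes a toy model y = w * x updated on one randomly drawn, noisy example per step (stochastic gradient descent); the helper moving_average, the window size, the seed, and the data distribution are all hypothetical choices for illustration.

```python
import random


def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
w = 0.0
learning_rate = 0.02
raw_losses = []

for step in range(200):
    # Draw one noisy training example from y = 2x + noise (toy assumption).
    x = rng.uniform(1.0, 3.0)
    y = 2.0 * x + rng.gauss(0.0, 0.3)
    error = w * x - y
    raw_losses.append(error ** 2)          # loss on this single example
    w -= learning_rate * 2 * error * x     # gradient step on that example

smoothed = moving_average(raw_losses, window=10)

# Count how often the raw loss went up from one step to the next.
increases = sum(1 for a, b in zip(raw_losses, raw_losses[1:]) if b > a)
print(f"steps where the raw loss increased: {increases}")
print(f"smoothed loss, first window: {smoothed[0]:.4f}")
print(f"smoothed loss, last window:  {smoothed[-1]:.4f}")
```

The raw per-step loss rises on some steps, but the smoothed value at the end of the run is lower than at the start, which is the pattern to look for in a noisy loss table.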
Written with ChatGPT