TLDR
Proper scoring rules offer a model evaluation framework for probabilistic forecasts. Calibration tells whether our predictions are statistically consistent with observed events or values. Sharpness captures uncertainty in the predictions (predictive distribution) without considering the actual outcomes. We want our model evaluation technique to be immune to ‘hedging’. Hedging your bets means betting on both sides of an argument or teams in a competition. If your metric or a score can be ‘hacked’ then you might have a problem.via the TL;DR App
no story
Written by nikolao | Combines ideas from data science, humanities and social sciences. Views are my own.