This blog and the Passing Project as a whole are putting sound methodologies in place to ensure the accuracy of our data. Most importantly, we test for inter-rater reliability. Inter-rater reliability is the degree to which two or more "raters" record the same data for the same games. While this is difficult to do with few data trackers, we test all games for which there are multiple trackers. The reliability of the data increases as the number of trackers increases (just one of the many reasons you should join the Passing Project! Hit us up on Twitter @cofstats / @RK_Stimp or send an email to hockeypassingstats@gmail.com.)
Inter-rater reliability testing is critical for several reasons:
- It corrects for errors. We're human. We miss things, we make typos, you name it.
- It corrects for bias. I'm a Flames fan. I think I'm unbiased because I started this blog to gain a deeper understanding of the Calgary Flames, good or bad. But at the end of the day it doesn't matter what I think.
- Inter-rater reliability testing highlights data points on which trackers disagree. Those specific plays can be reviewed to ensure trackers know how to appropriately code that type of play. Tracker disagreement can also suggest the need for improved data definitions. A sketch of this kind of check follows this list.
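To give a concrete sense of what this testing can look like, here is a minimal Python sketch that compares two trackers' codings of the same plays, computes raw agreement and Cohen's kappa (agreement corrected for chance), and flags the disagreements for review. The play codes and data below are hypothetical and not the Project's actual tracking sheet format.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if both raters coded independently at their own base rates.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical per-play codes from two trackers for the same game
# (e.g., "SAG1" = primary shot assist, "SAG2" = secondary, "NONE" = no pass).
tracker_1 = ["SAG1", "SAG2", "NONE", "SAG1", "SAG1", "NONE"]
tracker_2 = ["SAG1", "SAG2", "SAG1", "SAG1", "SAG1", "NONE"]

agreement = sum(a == b for a, b in zip(tracker_1, tracker_2)) / len(tracker_1)
kappa = cohens_kappa(tracker_1, tracker_2)
print(f"Raw agreement: {agreement:.0%}, Cohen's kappa: {kappa:.2f}")

# Flag the specific plays where the trackers disagree so they can be reviewed.
for i, (a, b) in enumerate(zip(tracker_1, tracker_2), start=1):
    if a != b:
        print(f"Play {i}: tracker 1 coded {a}, tracker 2 coded {b} -- review")
```

Kappa is worth reporting alongside raw agreement because two trackers who mostly code "NONE" will agree often just by chance; kappa discounts that.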
It cannot be said enough: the accuracy of the data is critical. Without it, all else fails. We're doing our due diligence here at the Passing Project.
***
If you'd like to join the Passing Project and collect data for an NHL team, you can reach out to Ryan Stimson on Twitter @RK_Stimp or by email at hockeypassingstats@gmail.com.
***
References
http://en.wikipedia.org/wiki/Inter-rater_reliability