Authors:
(1) Yuan Wang, University of Rochester (e-mail: [email protected]);
(2) Yangxin Fan, University of Rochester (e-mail: [email protected]).
WP fatal police shooting dataset insight
Fatal police shooting rate and victims’ race prediction
We defined the reporting deviation rate and the total absolute reporting deviation rate to evaluate the media’s reporting bias.
In the WP dataset analysis, we used FP-growth and word clouds to reveal frequent patterns, and DBSCAN clustering to find fatal shooting hotspots. We also ran a correlation analysis between multiple numeric attributes and the fatal police shooting rate and tested the significance of those correlations. We used t-tests and ANOVA to test whether the fatal police shooting rate differs significantly across categorical attributes.
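As a rough illustration of the correlation step, the sketch below computes Pearson's r between one numeric attribute and a shooting rate, then estimates significance with a permutation test. The text does not say which significance test was used, so the permutation test is an assumption made here to keep the example dependency-free. The function names, the attribute (`poverty_rate`), and all data values are hypothetical and invented for illustration only.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the observed Pearson correlation."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Count shuffles whose correlation is at least as strong as observed.
        if abs(pearson_r(x, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical state-level values, invented purely for illustration.
poverty_rate = [10.1, 12.4, 15.3, 9.8, 18.2, 13.7, 11.0, 16.5, 14.2, 17.9]
shooting_rate = [2.1, 2.9, 3.8, 1.9, 5.0, 3.1, 2.4, 4.2, 3.5, 4.7]

r = pearson_r(poverty_rate, shooting_rate)
p = permutation_p_value(poverty_rate, shooting_rate)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

Attributes whose correlation passes a chosen significance threshold would then be candidates for the numeric predictors in the regression models.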
In fatal police shooting rate prediction, we used the results of the correlation analysis to select numeric predictors. We constructed a series of regression models, including Kstar, K-Nearest Neighbor, Random Forest, and Linear Regression, to predict the state-level fatal police shooting rate, and measured their performance by ten-fold cross-validation scores.
In victims’ race prediction, we used Chi-square testing for variable selection. We built a series of classification models, including Gradient Boosting Machine, Multi-class Classifier, Logistic Regression, and Naïve Bayes Classifier, to predict the race of fatal police shooting victims, and measured their performance by stratified five-fold cross-validation scores.
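To make the evaluation procedure concrete, here is a minimal sketch of ten-fold cross-validation for a one-predictor linear regression. It is not the authors' pipeline: the error metric (mean absolute error), the single predictor, the helper names `fit_line` and `k_fold_mae`, and the synthetic data are all assumptions chosen for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mae(xs, ys, k=10, seed=0):
    """Mean absolute error averaged over k held-out folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Striding over the shuffled indices gives k disjoint, near-equal folds.
    folds = [indices[i::k] for i in range(k)]
    fold_errors = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in indices if i not in held_out_set]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        # Score the model only on rows it never saw during fitting.
        errors = [abs(ys[i] - (slope * xs[i] + intercept)) for i in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Synthetic data for 50 hypothetical states, generated for illustration only.
rng = random.Random(42)
predictor = [rng.uniform(5, 20) for _ in range(50)]
rate = [0.25 * x + rng.gauss(0, 0.3) for x in predictor]

cv_mae = k_fold_mae(predictor, rate, k=10)
print(f"10-fold CV mean absolute error: {cv_mae:.3f}")
```

The stratified five-fold variant used for race prediction follows the same loop, except that folds are built separately within each class so every fold keeps roughly the same class proportions as the full dataset.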
This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.