Too Long; Didn't Read
Data imbalance occurs when the classes in a dataset are not equally represented, which can skew a model trained on that data toward the majority class. Several methods can balance training data and counter this imbalance, including resampling and weight balancing. As AI proliferates, training data deserves particular attention to reduce the risk of biased outputs. For example, an AI system trained on an imbalanced crime dataset to predict criminal behavior would perpetuate the racial and gender biases present in that data.
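
As a rough illustration of the two balancing methods named above, the sketch below shows random oversampling (one simple form of resampling) and inverse-frequency class weights (one common way to do weight balancing). The function names `oversample` and `class_weights`, the toy data, and the weighting formula `n_samples / (n_classes * class_count)` are assumptions chosen for this example, not something prescribed by the article.

```python
import random
from collections import Counter


def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def class_weights(labels):
    """Inverse-frequency weights: rarer classes get larger weights
    (assumed formula: n_samples / (n_classes * class_count))."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


# Toy imbalanced dataset: 8 samples of class 0, 2 samples of class 1.
labels = [0] * 8 + [1] * 2
samples = list(range(10))

balanced_x, balanced_y = oversample(samples, labels)
weights = class_weights(labels)

print(Counter(balanced_y))
print(weights)
```

Oversampling changes the data itself, while class weights leave the data untouched and instead make errors on the rare class count more during training. Which one fits better depends on the model and on how small the minority class is.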