Too Long; Didn't Read
Imbalanced data means the classes we want to predict are not represented in equal proportions. Classes that make up a large share of the data are called majority classes; those that make up a smaller share are minority classes. In the example, the true positive rate (recall) for class 1 drops from 97% to 33%. Using balanced class weights improves recall from 33% to 96%, but it produces many false positives, and precision decreases from 100% to 36%. Another approach is up-sampling: we randomly sample with replacement from the minority class to increase its proportion in the data.
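The two remedies mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration, not the article's own code: the 95/5 toy data set and the helper names `balanced_class_weights` and `upsample_minority` are assumptions made for the example. The weight formula, n_samples / (n_classes * class_count), is the common "balanced" heuristic, and up-sampling here draws from each smaller class with replacement until it matches the largest class.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def upsample_minority(rows, labels, seed=0):
    """Resample each smaller class with replacement up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, cnt in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # random.choices samples with replacement; k is 0 for the largest class.
        extra = rng.choices(pool, k=target - cnt)
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: 95 samples of class 0 (majority), 5 of class 1 (minority).
labels = [0] * 95 + [1] * 5
rows = [[i] for i in range(100)]

weights = balanced_class_weights(labels)
new_rows, new_labels = upsample_minority(rows, labels)

print(weights)              # the minority class receives the larger weight
print(Counter(new_labels))  # both classes now have the same number of samples
```

Up-sampling is normally applied to the training split only, so that duplicated minority samples do not leak into the data used for evaluation.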