Too Long; Didn't Read
The most commonly used metric for autoscaling is CPU utilization. An instance that uses its resources properly will usually reach 70% or more CPU utilization under high load, which is why the usual recommendation is to scale the system out when CPU utilization reaches 50%. If an instance fails health checks before its CPU consumption reaches 70%, the cause may be an application or configuration restriction that prevents it from using all of its resources effectively. For example, if at some point the application receives more requests than it has allowed threads, it cannot respond to the load balancer's health check and the instance will be removed.
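
The reasoning above can be sketched as two small checks: one that decides when to scale out, and one that guesses why an instance failed its health check. This is a minimal illustration, not a real autoscaler; the names `InstanceSample`, `should_scale_out`, and `diagnose` are hypothetical, and only the 50% and 70% figures come from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else here is an illustrative assumption.
SCALE_OUT_CPU_PERCENT = 50.0       # scale out once average CPU reaches this level
EXPECTED_PEAK_CPU_PERCENT = 70.0   # CPU a well-utilized instance reaches under high load


@dataclass
class InstanceSample:
    """One hypothetical monitoring sample for a single instance."""
    cpu_percent: float
    health_check_ok: bool
    active_requests: int
    max_threads: int


def should_scale_out(avg_cpu_percent: float) -> bool:
    """Return True when average CPU utilization has reached the scale-out threshold."""
    return avg_cpu_percent >= SCALE_OUT_CPU_PERCENT


def diagnose(sample: InstanceSample) -> str:
    """Classify a sample, distinguishing CPU saturation from thread exhaustion."""
    if sample.health_check_ok:
        return "healthy"
    if sample.cpu_percent >= EXPECTED_PEAK_CPU_PERCENT:
        return "cpu-saturated"
    if sample.active_requests >= sample.max_threads:
        # All worker threads are busy, so the health-check request cannot be served.
        return "thread-pool-exhausted"
    return "other-restriction"


if __name__ == "__main__":
    sample = InstanceSample(cpu_percent=35.0, health_check_ok=False,
                            active_requests=200, max_threads=200)
    print(should_scale_out(sample.cpu_percent))
    print(diagnose(sample))
```

In this sketch, an instance that fails its health check at 35% CPU with every thread busy is classified as thread-pool-exhausted. In that situation a CPU-based scaling rule alone would not trigger, which suggests looking at the thread limit or a different scaling metric.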