Too Long; Didn't Read
Multicollinearity is a well-known challenge in multiple regression. The term refers to the high correlation between two or more explanatory variables, i.e. predictors. It can be an issue in machine learning, but what really matters is your specific use case. In many cases, multiple regression is used with the purpose of understanding something. For example, an ecologist might want to know what kind of environmental and biological factors lead to changes in the population size of chimpanzees. We think of machine learning algorithms as black boxes that need to predict, but that black box sometimes needs to be understood as well. That's when multicollinearity is an issue.