Too Long; Didn't Read
Swahili (also known as Kiswahili) is one of the most widely spoken languages in Africa, with 100–150 million speakers across East Africa. News in local languages plays an important cultural role in many African countries. The goal of this project was to build an open-source text dataset of Swahili news articles. I mainly focused on collecting news in six categories: local, international, business or financial, health, sports, and entertainment. Because the dataset is open source, NLP practitioners can access it and learn from it.