Too Long; Didn't Read
Debricked has achieved no small feat: we are now able to actively keep and maintain a clone of all data on GitHub. To understand the why and the how, we interviewed our Head of Data Science, Emil Wåréus. The short answer is that we want a better and faster representation of the data we need to serve our customers. Cloning all the repositories means handling about 20 terabytes of data. There are about 10,000 to 30,000 pull requests each hour, 100,000 open source issues and 12,000 active users.