There has been a lot of talk about women in tech lately, and the headlines range from the repugnant to the uplifting. On the ugly side, we’ve all read about the sexual harassment and mistreatment at famous tech companies. On a more uplifting note, more campaigns and movies are celebrating women’s contributions to tech and inspiring future generations of women to take up STEM careers.
For International Women’s Day, we decided to tackle this issue with APIs. We built a project that determines how many of Stack Overflow’s top-ranked 3,479 users are women. We collected this data by connecting to public APIs (Stack Overflow and Clarifai’s Visual Recognition API) through RapidAPI.
Since RapidAPI connects to multiple APIs through one abstraction layer, finding this data was relatively easy. Here are the APIs that we used through RapidAPI.
Stack Exchange API
We exported the top 3,500 Stack Overflow users with the Stack Exchange API. We sorted users based on Stack Overflow reputation. A user gains or loses reputation by, in Stack Overflow’s words, “posting good questions and useful answers.”
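For readers who want to try a similar export, here is a minimal sketch of paging through users sorted by reputation. It assumes you call the public Stack Exchange endpoint directly (version 2.3 of the `/users` route) rather than going through RapidAPI as we did; the helper names `build_url`, `http_get_json`, and `top_users` are ours, made up for illustration.

```python
import gzip
import json
import urllib.parse
import urllib.request

# Assumption: the public Stack Exchange endpoint, not the RapidAPI wrapper.
API_URL = "https://api.stackexchange.com/2.3/users"


def build_url(page, pagesize=100):
    """Build a /users request sorted by reputation, highest first."""
    params = {
        "site": "stackoverflow",
        "sort": "reputation",
        "order": "desc",
        "page": page,
        "pagesize": pagesize,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def http_get_json(url):
    """Fetch a URL and decode the JSON body (the API compresses responses)."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        raw = response.read()
        if response.headers.get("Content-Encoding") == "gzip":
            raw = gzip.decompress(raw)
    return json.loads(raw)


def top_users(limit, fetch=http_get_json, pagesize=100):
    """Collect up to `limit` users, following pages until has_more is false."""
    users = []
    page = 1
    while len(users) < limit:
        data = fetch(build_url(page, pagesize))
        users.extend(data.get("items", []))
        if not data.get("has_more"):
            break
        page += 1
    return users[:limit]
```

Calling `top_users(3500)` would make 35 requests at 100 users per page. The `fetch` parameter is there so you can swap in a stub while testing, or a client that adds an API key to stay within rate limits.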
Clarifai API
We used the Clarifai Visual Recognition API’s Demographic model to estimate each user’s gender from their profile picture. The API returns the probability that the face in the image is masculine or feminine. When the API could not find a face in the image, we categorized the result as unclassified.
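The labeling rule described above can be sketched in a few lines. This is an illustration only: the input dictionary is a simplified, hypothetical shape (not Clarifai’s actual response format), and the 0.5 threshold is our assumption rather than a value taken from the experiment.

```python
from typing import Dict, Optional


def classify(face_probs: Optional[Dict[str, float]], threshold: float = 0.5) -> str:
    """Turn per-face probabilities into one of three labels.

    face_probs is a hypothetical, simplified input such as
    {"masculine": 0.91, "feminine": 0.09}, or None when no face was found.
    """
    if not face_probs:
        # No face detected in the profile picture.
        return "unclassified"
    masculine = face_probs.get("masculine", 0.0)
    feminine = face_probs.get("feminine", 0.0)
    if max(masculine, feminine) < threshold:
        # Neither label is confident enough to report.
        return "unclassified"
    return "masculine" if masculine >= feminine else "feminine"
```

Raising `threshold` trades coverage for confidence: more profiles end up unclassified, but fewer are mislabeled.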
Now, onto the results!
You can dig into the raw data for yourself, but here are the key stats and some additional analysis.
Most Stack Overflow profiles had a gender-neutral image or icon as a profile picture, which made them trickier to categorize. Out of the top 3,500 profiles, 1,512 had a profile image with a discernible gender.
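To put that figure in proportion, here is a small sketch of how the labels could be tallied. The `tally` helper and the toy label list are ours, for illustration; only the 1,512 and 3,500 figures come from the experiment.

```python
from collections import Counter


def tally(labels):
    """Count labels and compute the share that received a gender label."""
    counts = Counter(labels)
    total = sum(counts.values())
    classified = total - counts.get("unclassified", 0)
    share = classified / total if total else 0.0
    return counts, share


# Toy input, not the experiment's data.
counts, share = tally(["masculine", "unclassified", "feminine", "unclassified"])
print(counts["unclassified"], f"{share:.0%}")  # 2 50%

# The reported figures: 1,512 of 3,500 profiles had a discernible gender.
print(f"{1512 / 3500:.1%}")  # 43.2%
```

In other words, a little under half of the exported profiles could be labeled from the picture alone.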
To give you some context, here are some example profile pictures that the API would (obviously) not classify with a gender.
Since the Clarifai Demographic model is still in alpha, we went through the top 100 results manually as well. We found 77 masculine profiles, 21 unclassified and 1 feminine profile. Unclassified users either had gender-neutral profiles (see Finding 0), did not provide enough information to determine gender, or identified outside the male/female gender binary. One user even identified as non-human!
The Clarifai API’s Demographic model is pretty fun to play with, and you can test a call in your browser through the RapidAPI package. However, we noticed the model is definitely still in alpha. For example, this Seminoles fan’s profile picture was returned as “feminine.”
The API may have some limitations when it comes to illustrated characters or icons.
Determining gender on Stack Overflow turned out to be trickier than we initially anticipated. Many users favor icons or illustrations over photos, which made it more difficult to identify gender. To get better results, a future iteration of this experiment could include the Gender API, which returns the probability that a name is associated with a masculine or feminine gender. Its accuracy improves when a location is supplied, so we exported users’ locations as well.
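One way that future iteration might combine the two signals is to fall back to the name only when the image gives no answer. The sketch below is hypothetical: `name_lookup` stands in for a call to a name-based service and is not the Gender API’s real interface, and the 0.9 cutoff is an assumption.

```python
from typing import Callable, Optional, Tuple

# Hypothetical callable: (name, location) -> (label, probability).
NameLookup = Callable[[str, Optional[str]], Tuple[str, float]]


def resolve_gender(
    image_label: str,
    display_name: str,
    location: Optional[str],
    name_lookup: NameLookup,
    min_probability: float = 0.9,
) -> str:
    """Prefer the image-based label; fall back to the name-based guess."""
    if image_label != "unclassified":
        return image_label
    label, probability = name_lookup(display_name, location)
    if label in ("masculine", "feminine") and probability >= min_probability:
        return label
    # Not confident enough either way.
    return "unclassified"
```

Many Stack Overflow display names are handles rather than real names, so a high cutoff like this would probably leave a large share of users unclassified.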