Too Long; Didn't Read
How well does a neural <a href="https://hackernoon.com/tagged/network" target="_blank">network</a> designed for simple word prediction perform when trained on posts collected from online forums? If the sources are dedicated to a niche topic (e.g., a specific video game), how does the model handle topic-specific words (e.g., character and spell names) and slang? Is the ‘noise’ in casual English (slang, misspellings, mixed grammar, etc.) significant enough to rule out this type of data for <a href="https://hackernoon.com/tagged/learning" target="_blank">learning</a> exercises? In this exercise, we scrape discussion boards for the online game ‘DoTA2’ and see how the results compare with those from classically curated data sets.