Alter, Extract, Scrape, Process, Generate, Predict, Understand, Visualize
TL;DR: To handle all the challenging tasks on their plate, data scientists rely on tools just as a mechanic does. But which ones are currently among the most essential? Let's take a quick look at 2019's essential data science tools: MATLAB, Excel, Tableau and the Natural Language Toolkit.
MATLAB doesn't need an introduction, but it will get one anyway. The multi-paradigm numerical computing environment (quite the mouthful) processes mathematical data. It facilitates algorithm implementation, matrix operations and data modeling. In data science, MATLAB is used for both fuzzy logic and neural networks. Besides its graphics library, it is also used in image and signal processing, making it quite the versatile tool for data scientists.
Bonus: MATLAB is easy to integrate and can automate various tasks. Drawback: Closed-source
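To give a feel for the matrix-first workflow MATLAB is known for, here is a rough analogue in Python with NumPy (an assumption for illustration, not MATLAB itself; the values are made up):

```python
import numpy as np

# A small linear system A x = b, the bread and butter of matrix work.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Equivalent of MATLAB's backslash operator "A \ b":
x = np.linalg.solve(A, b)
print(x)  # [2. 3.]
```

The one-liner solve call is the kind of concise numerical expression that makes environments like MATLAB (and NumPy) popular with data scientists.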
In data science it is easy to forget the end goal: solving problems. And believe it or not, one of the most widely used tools for analyzing data is Excel!
Visualizing and, of course, processing data? Excel is good at it. Often underestimated, it comes with filters, slicers, pivot tables and, of course, a wide variety of formulas.
Bonus: Power it up with the Analysis ToolPak add-in!
Drawback: As one would guess, it is better suited to small, non-enterprise workloads in this context.
What better way to showcase a visualization tool like Tableau than with Tableau itself?
Focused on business intelligence, Tableau's core strength lies in what it can connect to: databases, analytical processing (OLAP) cubes and, of course, spreadsheets - like a spreadsheet, but advanced.
Bonus: Can visualize geographical data and offers a free public version. Drawback: The free version is a bit more limited
4. Natural Language Toolkit
Wacky graphic? Exactly. NLP is not about visualization after all.
NLTK supports the development of statistical models that, thanks to machine learning algorithms, help computers understand human language. It is widely used for processing techniques such as tokenization and parsing, and it ships with a large collection of corpora for building machine learning models.
Bonus: It is a collection of Python libraries (hence the name, short for Natural Language Toolkit). Drawback: At times it can be heavy and slow, and it has a steep learning curve.
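To make "tokenization" concrete, here is a toy sketch in plain Python. This is not NLTK's implementation; in practice you would call `nltk.word_tokenize` (which assumes NLTK is installed and its tokenizer models downloaded), but the idea is the same: split raw text into words and punctuation tokens.

```python
import re

def tokenize(text):
    # Toy tokenizer: grab runs of word characters, plus any standalone
    # punctuation mark (anything that is neither word char nor whitespace).
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("Hi, world!"))  # ['Hi', ',', 'world', '!']
```

Real tokenizers handle contractions, abbreviations and sentence boundaries far more carefully, which is exactly the kind of detail NLTK's models take care of for you.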
Author Bio: Joey Bertschler balances several high-level roles across cosmopolitan tech companies such as the Security Token Alliance, the World Data Science Forum, bitgrit and Cosmology.
He is the Vice President of the Security Token Alliance, the world’s largest think tank for the Security Token industry with over 100 partners, Advisor to the World Data Science Forum, and Brand Manager at bitgrit and Cosmology Inc. He has reached audiences of over 1 million on social media channels and is involved in a plethora of marketing campaigns in APAC and CE.