What are the Most Essential Data Science Tools?

by Joey Bertschler, September 18th, 2019

Alter, Extract, Scrape, Process, Generate, Predict, Understand, Visualize

TL;DR: To handle all the challenging tasks they are supposed to do, a data scientist uses tools, just like a mechanic would. But which ones are currently among the most essential? Let's take a quick look at 2019's essential data science tools: MATLAB, Excel, Tableau and the Natural Language Toolkit.

1. MATLAB

MATLAB doesn't need an introduction, but will get one anyway. The multi-paradigm numerical computing environment (quite the tongue twister) is built for processing mathematical data. It facilitates implementing algorithms, working with matrix functions and modeling data. In data science, MATLAB is used for both fuzzy logic and neural networks. Besides its graphics library, it is also used in image and signal processing, making it quite the versatile tool for data scientists.
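To make the fuzzy logic part a bit more concrete, here is a minimal sketch in plain Python rather than MATLAB. It assumes a triangular membership function, one common building block of fuzzy logic. The function name, the "warm" temperature set and its breakpoints are made up for illustration and are not MATLAB's API.

```python
def triangular_membership(x, a, b, c):
    """Degree (0.0 to 1.0) to which x belongs to a triangular fuzzy set.

    a and c are the feet of the triangle, b is the peak; assumes a < b < c.
    """
    if x <= a or x >= c:
        return 0.0                    # outside the set entirely
    if x <= b:
        return (x - a) / (b - a)      # rising edge
    return (c - x) / (c - b)          # falling edge


# Hypothetical fuzzy set "warm": starts at 15 °C, peaks at 22 °C, ends at 30 °C.
WARM = (15.0, 22.0, 30.0)

for temperature in (10.0, 18.5, 22.0, 26.0):
    degree = triangular_membership(temperature, *WARM)
    print(f"{temperature} °C is warm to degree {degree:.2f}")
```

Instead of a hard yes or no, every temperature gets a degree of "warmness" between 0 and 1, which is the core idea behind fuzzy logic.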

Bonus: MATLAB is easy to integrate and can automate various tasks.

Drawback: It is closed source.

2. Excel

In data science it is easy to forget the end goal: solving problems. And believe it or not, one of the most widely used tools to analyze data is Excel!

Visualizing and, of course, processing data? Excel is, well, good at both. Often underestimated, it comes with various filters, slicers, tables and, of course, a variety of formulae.
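What filters and formulae boil down to can be sketched in a few lines of Python. This is only an illustration, not how Excel works internally: the sales data, the column names and the helper functions `filter_rows` and `sumif` are hypothetical, with `sumif` loosely mirroring Excel's SUMIF formula.

```python
import csv
import io

# Hypothetical sales sheet, inlined as CSV so the sketch is self-contained.
SHEET = """region,product,units
North,Widget,12
South,Widget,7
North,Gadget,5
South,Gadget,9
North,Widget,3
"""


def load_rows(text):
    """Read the CSV text into a list of dicts, converting units to int."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["units"] = int(row["units"])
    return rows


def filter_rows(rows, column, value):
    """Keep only rows where column equals value, like a column filter."""
    return [row for row in rows if row[column] == value]


def sumif(rows, column, value, sum_column):
    """Sum sum_column over rows where column equals value (SUMIF-like)."""
    return sum(row[sum_column] for row in rows if row[column] == value)


rows = load_rows(SHEET)
north = filter_rows(rows, "region", "North")
north_units = sumif(rows, "region", "North", "units")
print(len(north), "rows in North,", north_units, "units sold")
```

In Excel itself, the same result takes a filter on the region column and a single SUMIF cell, which is exactly why the tool is so quick for small datasets.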

Bonus: Power it up with the Analysis ToolPak add-in!

Drawback: As one would guess, in this context it is better suited to small datasets than to enterprise-scale ones; a single worksheet holds at most 1,048,576 rows.

3. Tableau

What better way to visualize a visualization tool like Tableau than Tableau?

Focused on business intelligence, Tableau's core strength lies in its ability to interface with many data sources: databases, online analytical processing (OLAP) cubes and, of course, spreadsheets, but in a more advanced way.

Bonus: It can visualize geographical data.

Drawback: Tableau is proprietary, not open source, and the free version (Tableau Public) is a bit more limited.

4. Natural Language Toolkit

Wacky graphic? Exactly. NLP is not about visualization after all.

It supports the development of statistical models and can help computers understand human language, thanks to machine learning and algorithms. It is widely used for processing techniques such as tokenization and parsing, and it ships with a big collection of data (corpora) for building machine learning models.
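Tokenization, the first of those techniques, simply splits text into words and punctuation marks. Here is a minimal sketch, assuming NLTK's `wordpunct_tokenize` is available; if NLTK is not installed, the code falls back to the regular expression that this tokenizer is built on. The sample sentence and the `word_counts` helper are made up for illustration.

```python
import re
from collections import Counter

try:
    # Regex-based tokenizer from NLTK; needs no extra corpus downloads.
    from nltk.tokenize import wordpunct_tokenize
except ImportError:
    # Fallback when NLTK is not installed: same pattern, standard library only.
    def wordpunct_tokenize(text):
        return re.findall(r"\w+|[^\w\s]+", text)


def word_counts(text):
    """Count lowercase alphabetic tokens, skipping punctuation."""
    tokens = wordpunct_tokenize(text)
    return Counter(token.lower() for token in tokens if token.isalpha())


sentence = "NLTK helps computers understand human language."
tokens = wordpunct_tokenize(sentence)
print(tokens)

counts = word_counts("The cat sat. The cat ran.")
print(counts.most_common(2))
```

Once text is split into tokens like this, counting, tagging and parsing can build on top of it.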

Bonus: It is a collection of Python libraries (NLTK for short).

Drawback: At times it's heavy and slippery, and it has a steep learning curve.

References:

(1) https://www.mathworks.com/

(2) https://en.wikipedia.org/wiki/Microsoft_Excel

(3) https://www.tableau.com/

(4) https://en.wikipedia.org/wiki/Natural_Language_Toolkit

Author Bio: Joey Bertschler balances high-level roles across several cosmopolitan tech companies, such as the Security Token Alliance, the World Data Science Forum, bitgrit and Cosmology.

He is the Vice President of the Security Token Alliance, the world’s largest think tank for the Security Token industry with over 100 partners, Advisor to the World Data Science Forum, and Brand Manager at bitgrit and Cosmology Inc. He has reached audiences of over 1 million on social media channels and is involved in a plethora of marketing campaigns in APAC and CE.