10 Common Coding Mistakes Data Scientists Should Watch Out For

by Smit (@smitpatel), January 30th, 2023


There is high demand for quality data scientists in the market. Every business wants to integrate personalization, forecasting, clustering, and other similar processes using internal data. Such tasks are carried out by data scientists, who are extremely important to businesses.


Today, all companies have access to data, but only a select few have data scientists who are savvy with code. If you come from a software engineering background, you already have an advantage over the majority of other data scientists, who have backgrounds in math or statistics and are slowly learning data science.

Top 10 Most Common Mistakes


  1. Variable naming


Naming variables is hard for every developer. Writing the code itself is comparatively easy, but finding appropriate names is not, and many data scientists end up settling for poor ones.
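As a small illustration, the two functions below do exactly the same thing; only the names differ. Every name here is hypothetical and invented for this sketch.

```python
# Hard to follow: the names say nothing about what the values mean.
def f(a, b):
    return [x for x in a if x > b]


# Same logic with descriptive names (all names are illustrative).
def filter_scores_above(scores, threshold):
    """Return the scores that are strictly greater than threshold."""
    return [score for score in scores if score > threshold]


exam_scores = [55, 72, 90, 48]
passing_scores = filter_scores_above(exam_scores, threshold=60)
print(passing_scores)  # [72, 90]
```

The second version can be read without opening the function body, which is the whole point of a good name.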


  2. Little to no documentation


Data science applications are complex, and not everyone can understand them completely, so you should not make them any harder to follow. Many data scientists underestimate the power of documentation and keep writing code for long stretches without documenting any of it.
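A docstring is often the cheapest form of documentation. The sketch below assumes a hypothetical helper called `normalize`; the function is not from the article and only shows what a short, useful docstring might look like.

```python
def normalize(values):
    """Scale a list of numbers to the range [0, 1].

    Args:
        values: Non-empty list of ints or floats.

    Returns:
        A new list where the minimum maps to 0.0 and the maximum to 1.0.
        If all values are equal, every entry is 0.0.
    """
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # Avoid dividing by zero when the input is constant.
        return [0.0 for _ in values]
    return [(value - lowest) / span for value in values]


print(normalize([10, 20, 30]))  # [0.0, 0.5, 1.0]
```

Calling `help(normalize)` in a Python session shows the same text, so the documentation travels with the code.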


  3. Relying on Jupyter notebooks


Jupyter notebooks are great, but they still lack many excellent features that can help you work faster and better.


  4. Not backing up code


Backups are a critical part of any data science project. Data scientists often forget to back up their code and, if they lose the files, end up writing everything from scratch again.
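One possible way to make backups routine is a small script that archives the project folder with a timestamp. This is only a sketch: the function name `backup_project` and the directory names are assumptions made for the example, and a version control system with a remote repository is a common alternative.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_project(project_dir, backup_dir):
    """Zip project_dir into backup_dir and return the archive path."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive_base = backup_dir / f"{Path(project_dir).name}-{stamp}"
    # make_archive appends ".zip" and returns the final file name.
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=project_dir))


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as workspace:
    project = Path(workspace) / "my_project"
    project.mkdir()
    (project / "analysis.py").write_text("print('hello')\n")
    archive = backup_project(project, Path(workspace) / "backups")
    print(archive.name, archive.exists())
```

Keep the backup directory outside the project folder, ideally on a different disk or machine, so that losing one does not mean losing both.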


  5. Writing algorithms from scratch


Many people think a data scientist's competence should be measured by the number of algorithms they can write from scratch. Acting on that belief is a huge mistake, and many data scientists make it on a regular basis.
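To make the trade-off concrete, here is a minimal comparison using the sample standard deviation as a stand-in for "an algorithm". The hand-written function is hypothetical; `statistics.stdev` is part of Python's standard library.

```python
import math
import statistics

samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]


# Hand-written sample standard deviation: more code to write, test, and maintain.
def sample_std_from_scratch(values):
    mean = sum(values) / len(values)
    squared_diffs = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_diffs / (len(values) - 1))


# The standard library already provides a tested implementation.
library_result = statistics.stdev(samples)
manual_result = sample_std_from_scratch(samples)
print(math.isclose(library_result, manual_result))  # True
```

Both give the same answer, but only one of them is code you have to debug yourself.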


  6. Not hiding data and other sensitive details while sharing code


Data science projects are routinely shared with various people for validation and presentation purposes, so anything sensitive left in the code or data is shared along with them.
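Two habits that may help before sharing: keep credentials out of the source file, and mask personal fields in any sample data. The sketch below assumes a hypothetical environment variable named `MY_SERVICE_API_KEY` and a hypothetical `mask_email` helper; neither comes from the article.

```python
import os

# Hypothetical variable name; set it in your shell, not in the shared file.
API_KEY_ENV_VAR = "MY_SERVICE_API_KEY"


def load_api_key():
    """Read the key from the environment so it never appears in shared code."""
    api_key = os.environ.get(API_KEY_ENV_VAR)
    if api_key is None:
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} before running this script.")
    return api_key


def mask_email(email):
    """Hide most of the local part of an email address before sharing data."""
    local_part, _, domain = email.partition("@")
    return f"{local_part[:1]}***@{domain}"


print(mask_email("jane.doe@example.com"))  # j***@example.com
```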


  7. Relying only on one package or language


Depending too heavily on a single tool has its costs. As a data scientist, your goal is to deliver outcomes that make business decisions and other important tasks easier, not to preach a particular language or package.


  8. Not paying attention to warnings


Language creators include warnings for a reason, and you should take them seriously.
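Python's standard `warnings` module makes it possible to capture warnings and inspect them instead of letting them scroll past. In this sketch, `legacy_mean` is a hypothetical function written only to trigger a warning.

```python
import warnings


def legacy_mean(values):
    """Hypothetical helper that warns callers it is deprecated."""
    warnings.warn("legacy_mean is deprecated; use statistics.mean", DeprecationWarning)
    return sum(values) / len(values)


# Record warnings instead of letting them scroll past unnoticed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy_mean([1, 2, 3])

for warning in caught:
    print(f"{warning.category.__name__}: {warning.message}")
```

During development, `warnings.simplefilter("error")` goes one step further and turns every warning into an exception, which makes them impossible to overlook.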


  9. Not using type annotation


Python is not a statically typed language, which means type checking happens only at run time.
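Type annotations let you state the intended types up front. Python does not enforce them at run time, but they document the function and allow a static checker such as mypy to flag mismatches before the code runs. The function below is a hypothetical example.

```python
def average_order_value(total_revenue: float, order_count: int) -> float:
    """Return revenue per order; the annotations document the expected types."""
    if order_count <= 0:
        raise ValueError("order_count must be positive")
    return total_revenue / order_count


print(average_order_value(250.0, 4))  # 62.5
print(average_order_value.__annotations__)
```

A call such as `average_order_value("250", 4)` would fail only when executed, whereas a static checker can report it from the annotations alone.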


  10. Not following PEP standards and conventions


Python was not designed merely to make things easy. Its creators had a much larger objective and a bigger vision for the language, and the PEP standards and conventions are where that vision is written down.
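The best-known of these documents is PEP 8, the style guide for Python code. As a brief sketch, the snippet below follows its naming conventions; the class, constant, and file path are all hypothetical.

```python
# Constants: UPPER_CASE_WITH_UNDERSCORES.
MAX_RETRIES = 3


# Classes: CapWords.
class DataLoader:
    """Hypothetical class used only to illustrate naming."""

    def __init__(self, source_path: str) -> None:
        # Variables and attributes: lowercase_with_underscores.
        self.source_path = source_path

    # Functions and methods: lowercase_with_underscores.
    def describe_source(self) -> str:
        return f"Loading from {self.source_path} (up to {MAX_RETRIES} retries)"


loader = DataLoader("data/sales.csv")
print(loader.describe_source())
```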


