There is high demand for quality data scientists in the market. Every business wants to integrate personalization, forecasting, clustering, and other similar processes using internal data. Such tasks are carried out by data scientists, who are extremely important to businesses.
Today, all companies have access to data, but only a select few have data scientists who are savvy with code. Let's say you come from a software engineering background. In that case, you already have an advantage over the majority of other data scientists, who have backgrounds in math or statistics and are slowly learning data science.
Variable naming
Naming variables is hard for every developer. Writing the code itself is often the easy part; finding appropriate names is harder, and many developers end up settling for poor ones.
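As a small illustration, here is a sketch of the same filter written twice, first with throwaway names and then with descriptive ones. The function names and the sample figures are made up for this example.

```python
# Hard to follow: the names say nothing about what the values mean.
def f(a, b):
    return [x for x in a if x > b]


# Easier to follow: the names describe the data and the intent.
def filter_orders_above(order_totals, minimum_total):
    """Return the order totals that are greater than minimum_total."""
    return [total for total in order_totals if total > minimum_total]


order_totals = [12.5, 80.0, 43.2, 150.0]  # made-up sample data
large_orders = filter_orders_above(order_totals, minimum_total=50.0)
print(large_orders)
```

Both functions do the same work, but only the second one can be read without opening its body.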
Little to no documentation
Data science applications are complex, and not everyone can understand them completely, but you should not make them harder to understand. Many data scientists underestimate the power of documentation and keep writing code for long stretches without documenting any of it.
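One low-effort way to document code is a docstring. The sketch below assumes a hypothetical helper called min_max_scale; the point is the docstring, which states the inputs, the output, and the edge cases.

```python
def min_max_scale(values):
    """Scale numbers linearly into the range [0, 1].

    Args:
        values: A non-empty list of ints or floats.

    Returns:
        A list of floats of the same length. If all inputs are equal,
        every output is 0.0, because there is no spread to scale.

    Raises:
        ValueError: If values is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    lowest, highest = min(values), max(values)
    spread = highest - lowest
    if spread == 0:
        return [0.0 for _ in values]
    return [(value - lowest) / spread for value in values]


print(min_max_scale([10, 15, 20]))
print(min_max_scale.__doc__)  # the docstring travels with the function
```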
Relying on Jupyter notebooks
Jupyter notebooks are great, but they still lack many excellent features that can help you work faster and better.
Not backing up code
Backups are one of the most important parts of any data science project. Data scientists often forget to back up their code and then have to rewrite everything from scratch if the files are lost.
Writing algorithms from scratch
Many people think that the identity and competencies of a data scientist should be measured by the number of algorithms they can write from scratch. This is a huge mistake, and many data scientists make it regularly.
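To make the trade-off concrete, here is a minimal sketch that computes a sample standard deviation twice: once by hand and once with Python's standard statistics module. The sales figures and the hand-written function are made up for illustration.

```python
import math
import statistics


def sample_std_from_scratch(values):
    """Hand-rolled sample standard deviation: more code to write and test."""
    count = len(values)
    mean = sum(values) / count
    squared_diffs = sum((value - mean) ** 2 for value in values)
    return math.sqrt(squared_diffs / (count - 1))


daily_sales = [120, 135, 128, 150, 142]  # made-up sample data

# The standard library already ships a tested implementation.
library_result = statistics.stdev(daily_sales)
scratch_result = sample_std_from_scratch(daily_sales)

print(library_result, scratch_result)
```

The two results agree, but only the hand-written version has to be maintained and checked for edge cases such as an empty or single-element list.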
Not hiding data & other things while sharing code
Data science projects are routinely shared with various people for validation and presentation purposes, which is perfectly normal.
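One common way to keep credentials out of shared code is to read them from environment variables instead of hard-coding them. The sketch below assumes a hypothetical variable named DB_PASSWORD and a hypothetical mask_secret helper; neither comes from a specific project.

```python
import os


def get_database_password():
    """Read the password from the environment instead of hard-coding it."""
    password = os.environ.get("DB_PASSWORD")  # hypothetical variable name
    if password is None:
        raise RuntimeError("Set the DB_PASSWORD environment variable first")
    return password


def mask_secret(secret, visible=2):
    """Return a masked version that is safe to print or log."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)


# For demonstration only: normally the value is set outside the code,
# for example in the shell or in a deployment configuration.
os.environ.setdefault("DB_PASSWORD", "example-password")
print(mask_secret(get_database_password()))
```

With this pattern, the notebook or script can be shared as-is, because the secret itself never appears in the file.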
Relying only on one package or language
Being too dependent on one thing surely has detrimental effects. As a data scientist, your goal is to achieve the outcomes that make business decision-making or other important tasks easier, and not to preach a language or package.
Not paying attention to warnings
Language creators have placed warnings for a reason, and they should be taken seriously.
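Python's built-in warnings module makes it possible to inspect warnings, or to escalate them to errors so they cannot be ignored. In the sketch below, old_mean is a hypothetical deprecated helper written only for this example.

```python
import warnings


def old_mean(values):
    """Hypothetical helper that is marked as deprecated."""
    warnings.warn(
        "old_mean is deprecated; use statistics.mean",
        DeprecationWarning,
        stacklevel=2,
    )
    return sum(values) / len(values)


# Record warnings instead of letting them scroll by unnoticed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_mean([1, 2, 3])

for warning in caught:
    print(f"{warning.category.__name__}: {warning.message}")

# In tests, warnings can be escalated to errors so they must be dealt with.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        old_mean([1, 2, 3])
    except DeprecationWarning as error:
        print(f"Escalated to an error: {error}")
```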
Not using type annotation
Python is dynamically typed, which means that types are checked only at run time.
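A short sketch of what annotations look like in practice follows. The function mean_rating is made up for this example, and mypy is named only as one example of a static checker that can read these hints.

```python
from typing import List, Optional


def mean_rating(ratings: List[float]) -> Optional[float]:
    """Return the average rating, or None when there are no ratings."""
    if not ratings:
        return None
    return sum(ratings) / len(ratings)


print(mean_rating([4.0, 5.0, 3.0]))

# Python does not enforce the hints at run time, so this call still reaches
# the function body and only fails there. A static checker such as mypy
# can report the wrong argument type before the code runs.
try:
    mean_rating(["4", "5"])  # wrong type on purpose
except TypeError as error:
    print(f"Failed at run time: {error}")
```

The annotations cost a few extra characters per function, and they also tell the reader what the function expects and returns.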
Not following PEP standards and conventions
Python was not created merely to make things easy. Its creators have a larger vision for the language, and conventions such as PEP 8, the official style guide, are part of that vision.