Data Scientist/Machine Learning Engineer
Git, debugging, testing, the terminal, Linux, the cloud, networking, patterns/antipatterns - what even is this mess? Don't worry, we'll go through everything you need to know to collaborate proficiently with others, from beginning to end (all the way, I promise).
We're flooded with tools which are all billed as essential for boosting productivity, but... why so many of them? To answer this let's start at the very beginning and slowly work our way through our coding journey!
We all started on a small solo project working to build an app, create a simple model, or just to finish an assignment. As we begin to code we notice that it just... doesn't run 😢 and so we sigh, take a deep breath in and begin to look for what went wrong.
The first bug is just a small innocent typo, but with time we start running into more and more silly pesky bugs 🐞, each one a slight bit harder to deal with than the last! Once we read our code, find the typo and fix it (a little golden debugging) our coding journey continues, and we work on creating something slightly more impressive.
We soon reach a crossroads: we finish working on our small little program and want to work on something slightly more ambitious (yay)! Although we're ambitious, we notice one small thing - we make a good few mistakes.
Like any good student, we get a few books, read a few articles, watch a few videos, and before long we've learned several design patterns which make for a nice, smooth coding experience and antipatterns... to avoid like the plague.
Now with a few sophisticated patterns/antipatterns in mind, we feel like we're ready to show the world our coding prowess! We start naïve and nervous but with passion, and so through gathering a few friends together, we begin a new chapter of our lives 😅. The work is fun and everyone wants to play their part, but soon one question arises - how can we work together?
At first, emailing/messaging code from one person to another works fine... but then a few more people pitch in, and combining every line of code becomes - unmanageable! In a moment of chaos, one man did the impossible though, Linus Torvalds extended his olive branch and gave us Git - the perfect system to collaborate with others.
Eventually, we approach another challenge: although we're writing the code just fine... we feel bogged down by our workflow. To our surprise, there's an easy and elegant solution - Linux and the terminal. Linus Torvalds proposes Linux as an alternative to Windows (the ugly behemoth) and with it a terminal to write code in a fashion which completely bashes Windows.
Now with our workflow smoothed out, there are just a few questions left - how can we run this code anywhere and what if we need... more? Luckily for us, the dot com boom unfolds and the internet is ablaze! What we once had to run on our machines can now be run on the cloud (other people's/companies' servers). Now we can run and distribute progressively larger (and more heavyweight) code right from the comfort of our houses!
Our code is bound to have problems... even if we're geniuses, they'll still crop up! We can't completely avoid them, but we can approach each problem in just the right way, so we're able to smoothly eliminate it. There's a simple technique to help with this:
Now I know it's easier said than done, but just try this out... it makes a big difference! Just remember to keep calm, take a deep breath 🫁 and continue, if it's a bug you'll find and destroy it with time and effort 😌!
Our code works... or does it? Testing is all about finding whether something which seems to work fine actually works fine. It's about finding whether your changes break how things work (likely in a subtle way).
Testing can be simple or complex. At its simplest, it's about looking at what we think our code does and double-checking just that. In a more complex light, it's about writing small pieces of code (unit or integration tests) to test the code (yes, code to test the code).
Unit tests are for small isolated tests/scenarios and integration tests for larger/more realistic ones. Although this sounds simple (so far), testing is extremely nuanced as the way we write code has an extremely large impact on our ability to test it (hence knowledge of patterns/anti-patterns may be useful)!
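To make "code to test the code" a little more concrete, here's a minimal sketch of a unit test written as a plain shell script. The function `word_count` and the helper `assert_eq` are hypothetical names made up for this illustration, and in a real project we'd more likely reach for a proper testing framework in whatever language we're using.

```shell
#!/usr/bin/env bash

# Function under test (hypothetical): prints the number of words in a string.
word_count() {
    # wc -w counts words; tr strips the padding some systems add.
    printf '%s' "$1" | wc -w | tr -d ' '
}

# Tiny hand-rolled test helper (hypothetical): compares expected and actual.
failures=0
assert_eq() {
    # $1 = description, $2 = expected value, $3 = actual value
    if [ "$2" = "$3" ]; then
        echo "PASS: $1"
    else
        echo "FAIL: $1 (expected '$2', got '$3')"
        failures=$((failures + 1))
    fi
}

# Each check is one small, isolated scenario - that's what makes it a unit test.
assert_eq "counts three words" 3 "$(word_count 'fix the bug')"
assert_eq "empty string has no words" 0 "$(word_count '')"

echo "failures: $failures"
```

Notice that `word_count` is easy to test because it only depends on its input. The more our code leans on files, networks or global state, the harder checks like these become to write.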
There's a lot to testing and I'm not an expert, but I hope that this is enough to get you going/give you some sense of direction...
Patterns and antipatterns are just good and bad coding practices we should try and use more/less respectively. Although at their heart design patterns/anti-patterns are simple, they tend to be sorely overcomplicated! In essence, we see good and bad code all the time, so learning these comes naturally, however lots of books/articles go into fine detail by naming and shaming.
All design patterns serve one of three basic purposes: to help create (creational), organise (structural) or communicate between (behavioural) classes and objects.
A few examples:
Since anti-patterns are just mistakes, there are a good few that exist:
Note it's more practical to pick these all up through carefully inspecting code (especially off Stack Overflow)!
Git is the collaboration one-stop-shop! It is elegant and beautiful once we learn to use it... but seemingly not before that 😧. Don't worry though, it's quite simple: Git works by tracking the changes we make (hence it's called version control), and it does this by breaking up our timeline into chunks that we've committed to using (commits).
We may now ask though - how does this help to combine our changes? Luckily for us, it's not too difficult to interpret, Git stores our work in repositories which can be shared and forked/cloned. Whenever we make changes we can commit these and then push them out to our online repositories (technically called remote repositories). Then once we're ready to share our brilliant code we can pull others over to see/confirm what we've done (with a pull request)! Although this all just sounds weirdly social right now, it gets useful when Git provides us with overviews of our changes, so we're certain that our team's outstanding work won't collide/conflict with our work.
Now there are a few more technical ways we can use Git, primarily through segmenting work/progress into branches and providing special ways to combine our changes. Branches allow us to highlight particular parts of our codebase which we'd like to share, whilst also allowing us to isolate certain features which may be unstable/not quite ready yet! The first way to combine branches is to merge changes by adding the changes made into a new commit. The second is to replay one branch's changes on another (which we call a rebase). Which one we use depends on our situation:
Now that we've discussed the difficult concepts, let us take a look at the terminal (explained further below) commands we can use:
To clone a repository
git clone my_website_url
To add a file/folder to be tracked in the next commit (stores changes at the time the command's run)
git add my_file
To commit the added changes
git commit -m "added amazing new features"
To change branches
git checkout my_branch
To create and switch to a new branch
git checkout -b my_new_branch
To merge branches
git merge my_feature_branch
To interactively rebase the most recent commits on a branch (n is the number of commits to consider)
git rebase -i HEAD~n
To add an upstream remote (the original repository a fork came from)
git remote add upstream original_repo_url
To fetch the latest changes from upstream (the first step in syncing a local repository with its remote, before merging or rebasing them in)
git fetch upstream
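To see how these commands fit together, here's a rough sketch of one possible workflow in a throwaway repository: we branch off, commit on both branches, replay the feature branch on top of the updated base with a rebase, and then merge it back in. Note the assumptions: `git init`, `git config`, `git log`, `git symbolic-ref` and the plain (non-interactive) `git rebase` aren't in the list above, the file names are made up, and the name and email are placeholders.

```shell
#!/usr/bin/env bash
set -e  # stop at the first command that fails

# Work in a throwaway folder so nothing real gets touched.
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
# Placeholder identity, set for this repository only.
git config user.name "Example User"
git config user.email "user@example.com"

# First commit on the default branch (its name differs between setups).
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "add notes"
base_branch="$(git symbolic-ref --short HEAD)"

# Isolate new work on its own branch.
git checkout -q -b my_feature_branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "added amazing new features"

# Meanwhile, the base branch moves on.
git checkout -q "$base_branch"
echo "more notes" >> notes.txt
git add notes.txt
git commit -q -m "extend notes"

# Replay the feature commit on top of the updated base branch.
git checkout -q my_feature_branch
git rebase -q "$base_branch"

# The merge is now a simple fast-forward, so the history stays linear.
git checkout -q "$base_branch"
git merge -q my_feature_branch

git log --oneline
```

If we skipped the rebase and merged straight away, Git would instead record an extra merge commit joining the two lines of history.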
A few mistakes to avoid:
As explained above, Linux is an amazing replacement for Windows (it's free by the way) which is far more flexible and lightweight! One distinct feature is the powerful built-in terminal (its shell is called bash) which allows us to perform complex tasks easily.
Here are the essential commands:
List files (in the current folder, or in another folder)
ls
ls my_folder
Change directory (into another folder)
cd my_folder
Move a file/folder
mv old_location new_location
Copy a file
cp file_location copy_location
Copy a folder
cp -r folder_location copy_location
Run another program (like a text editor, normally vi, vim or nano)
nano my_file
Although these commands don't seem like anything out of the ordinary, the terminal provides a solid way to do a variety of tasks!
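Here's a short sketch of a session which strings a few of these commands together inside a temporary folder. The folder and file names are made up for illustration, and `mktemp`, `mkdir` and `echo` are extra commands not covered in the list above.

```shell
#!/usr/bin/env bash
set -e  # stop at the first command that fails

# Start in a scratch folder so we can't damage any real files.
workspace="$(mktemp -d)"
cd "$workspace"

# Create a folder and a small file to play with.
mkdir my_folder
echo "draft" > notes.txt

ls                                                  # list the current folder
mv notes.txt my_folder/notes.txt                    # move the file into the folder
cp my_folder/notes.txt my_folder/notes_backup.txt   # copy a single file
cp -r my_folder my_folder_copy                      # copy the whole folder
ls my_folder_copy                                   # list the copied folder
```

Chaining small steps like this into a script is where the terminal starts to pay off, since the whole sequence can be re-run with one command.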
Note if you ever enter a text editor you can't seem to close (likely vi/a variant of vi), hit escape, then type :q! and press enter (this quits without saving).