This blog is for those who want to build their own deep learning machine but are afraid of the build process. It is a guide to the essential things to check so that you are all set to build your own machine and don’t accidentally buy expensive hardware that later turns out to be incompatible. But before we even start…
When I started to take deep learning, or machine learning in general, more seriously, one thing that came in my way was: COMPUTE
And having done robotics in the past, I felt it was the best time to build the ultimate deep learning rig myself, one that would crunch bits like anything and work blazing fast.
But obviously, why spend so much money when you have cloud computing? Why not spin up another instance whenever you want, with no machine to maintain and no fear of it becoming obsolete one day?
Well, the story for deep learning says something else. Blogs [1][2][3] compare multiple aspects of it and beautifully summarize everything. In the end:
It’s actually cheaper, faster and easier to use a local deep learning machine than going for cloud instances, in the long run. [1][2][3]
So, let’s create our own deep learning machine.
We’ll discuss each one in detail in the upcoming sections.
Ryzen threadripper CPU
Given that most deep learning models run on GPU these days, use of CPU is mainly for data preprocessing.
If you are frequently dealing with data in GBs and if you work a lot on the analytics part where you have to make a lot of queries to get necessary insights, I’d recommend investing in a good CPU.
Otherwise, a mid-level one will do no harm, since the CPU only handles batch scheduling and other small processes while the GPU is doing the training.
The Threadripper series from AMD is damn powerful and gives high performance for the right price. I went for the Threadripper 1900X, an 8-core CPU with 16 threads.
Important:
I’d go for AMD any day due to its high performance-to-price ratio. It’s like paying half the price for the same performance as its Intel counterpart.
Search YouTube for installation videos or see the motherboard guide. It’s pretty straightforward and easy, but the chip is a bit delicate. Handle with care.
Three available options are:
HDD: The hard disk is a rotating magnetic disk with a read/write head reading and writing bits on it. As it is mechanical and motor-driven, it is slow, takes up a lot more space and is more prone to data damage and corruption.
SSDs: They are small, fast and have no moving parts, though they are a bit more expensive. It certainly matters in the overall experience. The OS becomes butter smooth when loaded on SSD storage, and file transfer is blazing fast, which matters in deep learning where you are almost always dealing with GBs of data. The SATA 3 interface has a 6 Gb/s link, which gives you a max transfer of around 600 MB/s with the AHCI drivers.
NVMe SSD: Rather than using the SATA bus, PCIe is used, giving a big performance gain. Also, the transfer protocol used is not AHCI but NVMe, giving highly efficient parallel processing. In the end, we are talking 2–3 GB/s here. The numbers vary from model to model.
It’s still new tech and comes out a lot more expensive than a SATA SSD for what it delivers. You would hardly notice any difference in the OS between an NVMe SSD and a regular SSD, though file transfers will involve a lot less waiting. Also, NVMe means making sure that the motherboard has M.2 slots, which also raises its price as the old ones don’t have them.
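To put those transfer rates in perspective, here is a rough sketch of how long it would take to copy a dataset at each speed. It assumes about 0.6 GB/s for a SATA 3 SSD and the 2–3 GB/s range quoted for NVMe; the 50 GB dataset size is purely hypothetical, and real drives rarely sustain their peak rate.

```python
def transfer_seconds(size_gb: float, speed_gb_per_s: float) -> float:
    """Idealised time to move size_gb gigabytes at a sustained speed."""
    if speed_gb_per_s <= 0:
        raise ValueError("speed must be positive")
    return size_gb / speed_gb_per_s


# Approximate sequential speeds in GB/s (assumptions, vary from model to model).
SPEEDS_GB_PER_S = {
    "SATA SSD": 0.6,
    "NVMe SSD (low end)": 2.0,
    "NVMe SSD (high end)": 3.0,
}

DATASET_GB = 50  # hypothetical dataset size

for name, speed in SPEEDS_GB_PER_S.items():
    print(f"{name}: {transfer_seconds(DATASET_GB, speed):.1f} s")
```

The gap is real for big one-off copies, but it only shows up while loading data from disk, not once the data sits in RAM.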
To know more about SSD and NVMe SSD technology, you can go through a small and crisp blog written by Kingston[4], with lots of images comparing both. Also, this YouTube video explains NVMe[5] beautifully with animation.
Intel also offers Optane storage, built on 3D XPoint memory, which gives more density and performance. But it is very expensive and not at all worth the price. More about Optane here[6].
I went for a 500 GB NVMe M.2 SSD, as I was okay with paying extra for the high speed I was getting. Otherwise, a SATA SSD is still a lot better than an HDD.
Important:
Your data will be residing in RAM or GPU VRAM, so storage doesn’t matter while you are training.
It’s better to invest in an SSD than an HDD, looking at the price-to-performance ratio.
NVMe SSDs, though very fast, are expensive and also need a compatible motherboard.
As you can see, motherboards come in different standard sizes, which decide the type of case you need. As per your needs, in a given motherboard, you can check for the number of:
It’s good to have extra slots for future upgrades. People generally upgrade to more RAM and also add multiple GPUs in the future.
The choice will change from person to person. Pre-built machines generally save money by using a motherboard that doesn’t have enough slots for future upgrades. You get to choose here.
Important:
Make sure your motherboard supports your CPU. You can choose the best combination of CPU and motherboard to save money.
Many care about the number of lanes per PCIe slot. It hardly matters with a couple of GPUs. The “CPU and PCI-Express” section of Tim Dettmers’ blog[7] explains it beautifully, taking the ImageNet dataset as an example. So don’t pay extra just because the other one gives a 16-lane PCIe slot rather than an 8-lane one.
If you are planning to add more GPUs in the future, make sure the PCIe slots are far enough apart. GPUs these days come with beefy heat sinks and multi-fan structures as NVIDIA keeps adding more and more cores, so most GPUs take the space of 2 PCIe slots. (Otherwise you will have to look for a vertical GPU mount, which is costly and also not easily available in most countries.)
GPUs taking space of 2 PCIe slots.
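On the PCIe lane question, a back-of-the-envelope calculation shows why 8 versus 16 lanes makes so little difference. This is only a sketch in the spirit of the ImageNet example in [7], not a reproduction of it: the batch of 32 float32 images at 224x224x3 and the figure of roughly 0.985 GB/s usable per PCIe 3.0 lane are my assumptions.

```python
PCIE3_GB_PER_S_PER_LANE = 0.985  # approximate usable PCIe 3.0 bandwidth per lane


def batch_bytes(batch_size, height, width, channels, bytes_per_value=4):
    """Size in bytes of one batch of images stored as float32 by default."""
    return batch_size * height * width * channels * bytes_per_value


def transfer_ms(num_bytes, lanes):
    """Idealised time in milliseconds to push num_bytes over a PCIe 3.0 link."""
    bandwidth_bytes_per_s = lanes * PCIE3_GB_PER_S_PER_LANE * 1e9
    return num_bytes / bandwidth_bytes_per_s * 1e3


size = batch_bytes(32, 224, 224, 3)  # hypothetical ImageNet-style batch
for lanes in (8, 16):
    print(f"x{lanes}: {transfer_ms(size, lanes):.2f} ms per batch")
```

Under these assumptions, halving the lanes costs on the order of a millisecond per batch, which is why the extra lanes are rarely worth paying for.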
Yes, given how densely hardware is packed these days, this topic needed its own section.
The two elements you have to keep cool are the CPU and the GPU.
Fans are available in different sizes.
These days the most common are the 120 mm ones.
Also, a 240/360 mm fan config just means 2/3 120 mm fans attached side by side.
A typical CPU’s exposed top view
If you are going for a mid to high-end CPU, chances are you will need a CPU cooler. CPU cooling is not a big issue these days once you have a proper CPU cooler. A copper plate comes in contact with the exposed top of the CPU, as shown in the pic above, taking away the heat, and the fan then moves that heat out of the system. The available options are:
Two types of CPU cooler
Liquid cooling, as you can see, has a pump along with the fan, which circulates water through the tubes. Some motherboards come with separate connectors for the CPU fan and the pump, though both can be plugged into regular fan ports too. In conventional air coolers, the heat is carried to the fans through copper tubes, which means only a single connector is used, for the fan.
Both of them give similar performance, but each has its own ups and downs. I’d highly recommend watching the LinusTechTips YouTube video[8] on this.
1st one is open-air. 2nd is a blower style config.
GPU cooling is inbuilt and comes out of the box with your GPU. The two configurations are:
Airflow in both the configurations.
Because open-air coolers throw air in all directions and have up to 3 fans, they are better for a single-GPU PC and also give better overclocking results.
This becomes a problem in a multi-GPU system. Air thrown out by one GPU gets consumed by the other GPUs, increasing their temperature, and this goes on in a loop until the whole system ramps up to a very high temperature. Therefore, in a multi-GPU configuration, blower style is better, as it pushes the heat out of the PC case and draws fresh air into the GPU.
liquid cooling hardware on a GPU
Liquid cooling exists, but it requires opening the GPU and mounting it to separate hardware, which is scary and also voids the warranty. Even if someday you’d like to go to that level, parts are hardly available in all countries.
Is multi GPU worth it for the extra price?
A single GPU is okay to handle, with the temperature not going above 80 °C in most cases, but with a multi-GPU system cooling becomes a big issue. Even though an SLI bridge, or NVLink on newer NVIDIA cards, is there to connect multiple GPUs together, the scaling is far from ideal. Connecting 2 GPUs together should ideally give 2x the performance, but in most cases you end up getting only about a 40% bump over a single GPU. This not only draws more power, leading to higher electricity bills, but the heating issues make matters worse. So, I’d suggest going for it only if you’re ready to pay the extra bucks and badly need it.
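The 40% figure can be turned into a quick efficiency number. The sketch below just does that arithmetic; the function names are mine, and the only input taken from the text is the 1.4x speedup for 2 GPUs.

```python
def scaling_efficiency(speedup: float, num_gpus: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return speedup / num_gpus


def relative_cost_per_throughput(speedup: float, num_gpus: int) -> float:
    """GPU spend per unit of throughput, relative to a single GPU (which is 1.0)."""
    return num_gpus / speedup


speedup_two_gpus = 1.4  # the "additional 40% bump" with 2 GPUs
print(f"Scaling efficiency: {scaling_efficiency(speedup_two_gpus, 2):.0%}")
print(f"Relative GPU cost per unit of throughput: "
      f"{relative_cost_per_throughput(speedup_two_gpus, 2):.2f}x")
```

In other words, at that speedup each unit of training throughput costs noticeably more in GPU spend than it does on a single card, before counting power and cooling.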
All in one liquid cooling:
Very expensive all-in-one liquid cooling exists. Parts aren’t easily available in most countries, and it’s risky to buy and set up if you’re not sure of its compatibility with the PC case you’re going to put it in. Also, it can leak at some point, and replacement can be a big problem. Overall, the extra effort is not worth the performance or the price.
How many fans to have?
Good air-flow is a must have for a deep learning rig. This can be done using intake and exhaust fans. But, what’s the limit?
I’d suggest having 2 intake fans in front and one exhaust fan at the back. It’ll give you adequate airflow.
After this, adding more fans will just offer a slight dip in temperature. A LinusTechTips video[9] settled this altogether for me by testing many different fan and position configurations.
Important:
Make sure to check that the CPU cooling solution you are ordering fits the specific bracket on the motherboard. For example, an AMD Threadripper CPU uses the TR4 mount bracket.
Make sure you apply thermal paste. It’s strange how one can get a significant boost by applying thermal paste again on laptops. Dave2D’s YouTube video[10] shows the stats in detail.
Make sure you’ve put the fans in the right way. Do check all the fans for the direction of air flow, after installation and switching on the motherboard.
Check the motherboard for the number of fans it supports. Else you may have to attach them directly to the PSU, running them at full speed all the time rather than controlling their speed by your motherboard software.
Okay, this can be an issue. No matter how much care you take, sometimes a cable can be slightly too short, or the RAM comes in the way of the CPU radiator you are planning to put at the top of the case. Minor issues will come up. Some can give you a hard time too. Some PSU or motherboard manufacturers also provide cable extenders. You can also watch build videos on YouTube and get the same products if you want to be very sure of a smooth build experience. But in most cases, things work out pretty well.
But the other one gives the best airflow!
Can never be a motherboard dying to get some fresh air
Proper airflow is not an issue with most cases. It’s mostly a marketing strategy: different case manufacturers pitch how theirs gives the best airflow to the components.
Important:
Cable length can be an issue in rare cases. You can get extenders though.
Some cases come with fans preinstalled and some without; among the preinstalled ones, only some are removable.
Make sure you get the right type according to the size of your motherboard. A case specified for a given size will fit smaller boards too. For example, an E-ATX case will support smaller versions like mini-ATX or micro-ATX. The holes are drilled accordingly. So getting a bigger case by mistake is not a problem. Still, some prefer to look at a compact beauty on their desk.
Make sure there is enough room for the PSU to fit in. Most cases have a separate PSU compartment. Check its size if possible.
The front I/O (audio jacks, USB 2/3, Thunderbolt) varies from case to case. So make sure to go for one according to your needs. Obviously, the ports on your motherboard will be available to you, but they will be at the back of the PC.
Some cases have liquid CPU cooler mounting only at the top, while some allow fans to be placed in the front only. Make sure to check the possible configuration.
before and after cable management
Have sufficient zip ties and cable ties for perfect cable management. Remember, cable management is an art.
You can use PCPartPicker’s website[11] to check the compatibility of all the parts you have chosen with one another.
everything can come with LEDs in them
If you deal with large datasets such as images or log data of some sort, fitting the data completely in RAM can speed up processing significantly, as this is the memory the CPU uses when it doesn’t find data in its L1 or L2 cache. For datasets which don’t fit into VRAM, transfers take place between RAM and GPU VRAM. RAM is exceptionally fast compared to any other storage solution, giving a transfer rate of around 20 GB/s.
A good amount of RAM should be there in a machine if you have a lot of preprocessing to do; otherwise 8 to 16 GB is fine.
With XMP, or Extreme Memory Profile, settings, one can run RAM at higher speeds. But actually, it hardly matters. RAM with a clock speed of 3000 MHz works a bit faster than a 2400 MHz one, but the improvement is not very noticeable, so the performance-to-cost ratio is really poor. Also, RAM speed generally doesn’t bottleneck your system, so a transfer speed of 17 GB/s will not make a PC noticeably slower than a 20 GB/s one.
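For a sense of where figures like 17 and 20 come from, the theoretical peak bandwidth of one DDR4 channel is its transfer rate times the 8-byte (64-bit) bus width. The sketch below assumes the advertised "MHz" rating is really mega-transfers per second, which is how DDR4 kits are normally labelled, and it ignores real-world overheads; the function name is my own.

```python
BUS_WIDTH_BYTES = 8  # one DDR4 channel is 64 bits wide


def peak_bandwidth_gb_s(mega_transfers_per_s: int, channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s for a given DDR4 speed rating."""
    return mega_transfers_per_s * 1e6 * BUS_WIDTH_BYTES * channels / 1e9


for rating in (2133, 2400, 3000):
    print(f"DDR4-{rating}: {peak_bandwidth_gb_s(rating):.1f} GB/s per channel")
```

Even the slowest of these is several times faster than an NVMe SSD, which is why paying extra for faster sticks gives so little back.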
Important:
Make sure you are taking DDR4 RAM compatible with your motherboard. Most motherboard manufacturers provide a list of supported hardware as well.
Don’t mix and match different clock speeds and manufacturers when installing more than one RAM stick. It is recommended to use the exact same type of RAM.
The DIMM slots are numbered, and depending on the number of RAM sticks you have, you will have to put them in specific slots. See the motherboard manual for it.
These slots remind me of the game cartridges we used to buy and play in those days!
(left) a full gaming console. (right) Inside of a gaming cassette
Make sure that when you go down memory lane, your tears don’t fall on the motherboard. Even though a final cover plating is done on top of it, the slots have their wiring exposed, and human tears contain 0.3 mg of salt, enough to break things. Seawater, by the way, has 1.75 mg of salt per drop.
Two things to care about while getting a PSU are:
For the first one, you need to find out the total power usage of the system. Having a PSU with a higher output than the machine consumes is okay, but not the other way round. Find the total power consumption in watts using:
n: total number of GPUs you want to have, including future additions.
m: total number of hard disks or SSDs you have, including future additions.
The extra 200 watts is for other peripherals and a safe margin.
You can easily find the power usage of each component on their official website.
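The formula itself is not reproduced above, so the following is only a guess at its shape, assuming it adds the CPU draw, n times the per-GPU draw, m times the per-drive draw, and the 200 W margin. The function name and all wattages in the example are hypothetical placeholders; look up the real figures for your parts.

```python
def required_psu_watts(cpu_w, gpu_w, n_gpus, drive_w, m_drives, margin_w=200):
    """Rough total power budget in watts (assumed form of the missing formula).

    n_gpus and m_drives should include planned future additions.
    """
    return cpu_w + n_gpus * gpu_w + m_drives * drive_w + margin_w


# Hypothetical build: placeholder wattages, not measured values.
total = required_psu_watts(cpu_w=180, gpu_w=250, n_gpus=2, drive_w=10, m_drives=2)
print(f"Look for a PSU rated for at least {total} W")
```

Rounding up to the next standard PSU rating leaves a little more headroom for upgrades.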
For the second one, make sure the slots provided are equal to the number of things you are going to attach. Or simply add all the parts in PCPartPicker’s website list[12] and you’ll get to know if you are running short on total ports in the final compatibility section.
PSUs mostly have:
A good power supply unit is a must. Airflow, even with fewer fans, will take care of itself, but make sure you give clean and sufficient electricity to your system.
Important:
Never put a converter plug on the PSU socket. Adapters make loose contact with the plug, which creates a lot of heat. Also, adapters are mostly made of cheap material and have a specified amp capacity which, if exceeded, will burn the socket.
As power sockets in India come in 5 amp and 15 amp versions, for a deep learning rig you will have to use the bigger 15 amp socket. If you are getting a type G UK plug on your PSU, it is better to cut the plug and attach one that supports the higher amp output rather than using an adapter.
As the 15 amp socket was far from my rig, I faced extension issues. In the end I went for a 9 m cable extension. I made sure the quality of the extension wire was top notch, since the longer the cable, the higher the resistance and the resulting voltage drop. Use only high-quality copper wires of the required thickness.
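The effect of cable length and thickness can be estimated with Ohm's law. This is a simplified sketch: it uses the standard resistivity of copper at room temperature, counts the current path as twice the cable length (out and back), and ignores contact resistance. The 1.5 mm² cross-section and 4 A load are hypothetical values, not measurements from my setup.

```python
COPPER_RESISTIVITY_OHM_M = 1.68e-8  # copper at about 20 °C


def voltage_drop(length_m: float, area_mm2: float, current_a: float) -> float:
    """Voltage lost along an extension cable of the given length and thickness."""
    area_m2 = area_mm2 * 1e-6
    # Current flows out on one conductor and back on the other.
    resistance_ohm = COPPER_RESISTIVITY_OHM_M * (2 * length_m) / area_m2
    return current_a * resistance_ohm


# 9 m extension; wire thickness and load current are assumptions.
for area in (0.75, 1.5, 2.5):
    print(f"{area} mm^2: {voltage_drop(9, area, 4):.2f} V drop")
```

Halving the cross-section doubles the drop, which is the reason to insist on wire of the required thickness.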
As we are not making a gaming rig, 4K display or 144 Hz refresh rate monitors are not what we are looking for.
Tim Dettmers mentioned in his blog post how he uses 3 monitors and absolutely loves it. I couldn’t agree more with him, but for me, I guess two monitors are more than enough. A multi-monitor setup will for sure make you more productive.
Given how cheap displays have become these days, you can easily go for an LED display rather than an LCD, with a 24 or 27-inch screen.
The same goes for the keyboard and mouse. A gaming one with RGB lighting and mechanical switches with long key travel is not necessary.
I personally prefer the laptop-style chiclet keyboards with small key travel. Also, wireless ones are just a tad more expensive than wired ones, but they make the environment look much cleaner and prettier. These days a single USB receiver for both a wireless keyboard and mouse is also available.
Important:
See the GPU specs for the number of displays it supports. Almost all of them support more than one.
You may have to buy a DisplayPort to HDMI/VGA dongle to support multiple displays. Check the specs to see the number of HDMI, VGA or DisplayPort outputs your GPU has.
This is the heart of your deep learning rig. The place where real training takes place. It is certainly a big topic and needs a separate blog of its own. But, to come up at least with something, I’d suggest going for:
They have 16-bit precision option available to speed things up. The new architecture has some really good performance gains.
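Part of why 16-bit precision helps is simply that every value takes half the space, so more fits in VRAM and less data has to move. The sketch below counts only the model weights, ignoring activations, gradients and optimizer state; the 25 million parameter count is a hypothetical model size.

```python
import struct

BYTES_FP32 = struct.calcsize("f")  # size of a 32-bit float
BYTES_FP16 = struct.calcsize("e")  # size of a 16-bit (half precision) float


def weights_mb(num_params: int, bytes_per_value: int) -> float:
    """Memory needed to hold the weights alone, in megabytes."""
    return num_params * bytes_per_value / 1e6


NUM_PARAMS = 25_000_000  # hypothetical model size

print(f"32-bit weights: {weights_mb(NUM_PARAMS, BYTES_FP32):.0f} MB")
print(f"16-bit weights: {weights_mb(NUM_PARAMS, BYTES_FP16):.0f} MB")
```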
You can also save a lot of money by buying second-hand GPUs on eBay, available in plenty after the tragic crash of cryptocurrency mining.
A GTX 1080 Ti is also a very good option. Also, don’t think you’ll get degraded performance with used ones; another LinusTechTips video[13] tests exactly this.
Important:
I’d highly recommend checking Tim Dettmers’ full-blown blog[14] on it, where he has done absolute justice to the topic.
I highly recommend going only for NVIDIA GPUs for deep learning, as the cuDNN and CUDA toolkits are highly compatible with the present deep learning libraries, be it Keras, TensorFlow, PyTorch or any other library.
Deep learning software is usually compatible with Linux-based machines first.
I’ve installed Ubuntu 18.04 since it is now an LTS (long-term support) release. I haven’t checked whether any libraries that were working on my previous 16.04 LTS install are incompatible, but the major deep learning libraries won’t have any issue at all.
Take a working laptop. Download the OS from Ubuntu’s official website[15] and don’t forget to support it with a donation if possible. Create a bootable pen drive using free software like Rufus[16], insert it into a USB port on your new PC and start it.
Enter the BIOS; the key is generally Delete or one of the function keys. Check all the attached components and see if everything is getting detected.
You may need to move the attached USB drive to the top of the boot priority, to make sure the bootable drive gets loaded.
Install Ubuntu as per instructions.
This whole process can be seen in this youtube video[17].
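Once Ubuntu is up, it is worth confirming that the OS also sees the GPU. One possible check is sketched below; it assumes the NVIDIA driver, which ships the nvidia-smi tool, is already installed (not covered in this blog), and the helper name list_nvidia_gpus is my own. If the tool is missing, it simply returns an empty list.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return one 'name, total memory' string per GPU, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools not installed or not on PATH
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
if gpus:
    for gpu in gpus:
        print("Found GPU:", gpu)
else:
    print("No NVIDIA GPU detected (is the driver installed?)")
```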
In the end, it’s a fun and exciting process which you will enjoy for sure. If you haven’t built a machine in the past, it’s fine. Nothing complicated. The hardware community has got you covered. Things are very modular now. There are sites too where you can check the compatibility of all your parts together, like PC Part Picker[18]. You can always refer to the many hardware build videos on YouTube channels like Bitwit[19] to see how to do it, and by the way, it’s pure bliss to watch the machine come to life.
Thanks for reading and congrats for surviving till the end and thanks to Sandeep T for editing this blog. Until next time, Love u 3000 :)
[1] Jeff Chen, Why building your own Deep Learning Computer is 10x cheaper than AWS (2018), medium.com
[2] Jennifer Villa, Choosing Your Deep Learning Infrastructure: The Cloud vs. On-Prem Debate (2018), determined.ai
[3] Prebuilt vs Building your own Deep Learning Machine vs GPU Cloud (AWS) (2018), bizon-tech.com
[4] Understanding NVMe and SSD Technology, Kingston.com
[5] Powercert Animated Videos, M.2 NVMe SSD Explained — M.2 vs SSD (2018), Youtube.com
[6] Intel® Optane™ Technology, Intel.com
[7] Tim Dettmers, A Full Hardware Guide to Deep Learning (2018), TimDettmers.com
[8] Linus Tech Tips, Why you shouldn’t water cool your PC (2019), Youtube.com
[9] Linus Tech Tips, Case Fans — How many should you have? (2015), Youtube.com
[10] Dave Lee, $12 Hack To Boost Your Laptop Performance! (2017), Youtube.com
[11] Completed Builds, pcPartPicker.com
[12] System Builder, pcPartPicker.com
[13] Linus Tech Tips, Performance degradation — is it real? (2016), Youtube.com
[14] Tim Dettmers, Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning (2019), TimDettmers.com
[15] Ubuntu Desktop Download 18.04.2, Ubuntu.com
[16] Rufus, rufus.ie
[17] LinuxPlus, How To Install Ubuntu 18.04 LTS (2018), Youtube.com
[18] Build Guides, pcPartPicker.com
[19] Bitwit, How to Build a PC! Step-by-step (2017), Youtube.com