"DevOps Engineers Boost the Efficiency and Profitability of High-Tech Business" - Vasilii Angapov

by Aremu Adams Adebisi, October 18th, 2023

Too Long; Didn't Read

Vasilii Angapov, a renowned DevOps engineer with certifications from top tech companies, discusses his journey and significant projects. He started at GDC, managing large infrastructures and later built a private cloud for Mazda Logistics. His expertise in OpenStack led him to Wargaming, where he set up a cloud for 5,000 servers globally, improving their Time to Market metric significantly. At Bell Integrator, he worked on the Neuton ML platform and ensured its disaster-resistance. Currently, at Li9, he's implemented fault-tolerant systems for Veeva Systems and optimized infrastructures for Thryv Holdings Inc. He's also containerized legacy COBOL applications and customized OpenStack functionalities. Vasilii highlights the rise of MLOps, containerization, and cloud platforms as key trends in DevOps. He believes continuous learning, not necessarily a university degree, is essential for success in the IT industry.

The IT industry is booming, and there is a high demand for skilled engineers and developers - especially in fields like DevOps, where a wide range of knowledge and vision are essential.

Vasilii Angapov, one of the most renowned DevOps engineers and a respected expert in containerization and cloud technologies, shared his insights on the current trends in the industry and how to become a truly exceptional professional.

Vasilii Angapov is the Senior Architect at an international IT consulting firm based in the United States. He holds the rare and highly valuable Red Hat Certified Architect certification, along with top-tier professional certifications from Amazon, Google, Microsoft, IBM, and the Linux Foundation.

Q: Vasilii, you have more than 10 years of experience and unique expertise as a DevOps engineer. You started your DevOps career in the CIS, working for the consulting company GDC. What kind of projects did you work on there?

A: At GDC, one of my main roles was as a system administrator and lead UNIX engineer for Mazda Logistics, which has an office in Brussels. I had to maintain a large infrastructure based on the VMware vSphere virtualization platform.

I was directly responsible for more than 200 servers in several data centers, while the company had over a thousand machines running RHEL, SLES, and HP-UX operating systems. I also had to make architectural decisions and implement new software solutions, often manually. It was challenging and demanding work, sometimes requiring night shifts.

I managed to simplify these processes significantly for the company by automating infrastructure management using the Puppet configuration management system.

Another major project I was involved in was building an internal private cloud for Mazda Logistics based on the OpenStack platform. The company wanted to speed up application development and delivery by adopting new private cloud technologies.

While deploying the private cloud with my team, I not only learned how OpenStack works, but also developed some extra features, such as a Neutron SSL VPN, Neutron port forwarding, and a Ceph dashboard.

My professional certifications enhance my marketability and drive my ongoing growth, pushing me to stay updated in the ever-evolving tech landscape. I also take pride in being one of only fifteen Red Hat Certified Architects in the CIS countries.

Q: There are few specialists in the market who have extensive knowledge of OpenStack and experience with large cloud systems based on it. How did this rare expertise affect your future career in DevOps?

A: My knowledge of OpenStack was the key factor that enabled me to join Wargaming, a legendary video game developer based in Cyprus. Wargaming is the company behind the MMO game World of Tanks, which has an audience of more than 160 million people worldwide.

They also needed a private cloud on OpenStack - and there were hardly any specialists with relevant experience in the market. OpenStack has a steep learning curve: it is not easy to understand and master the principles of this architecture, let alone study it in depth.

That is why my OpenStack certificates from Red Hat impressed the company at that time, and they entrusted me with the project.

It was an ambitious project: they wanted to set up a cloud from scratch for 5,000 servers across the world, from the United States and Europe to Asia and Australia.

I completed it in two years and migrated more than two-thirds of Wargaming's web services to OpenStack. The company gained a significant performance boost: one of the key product metrics, Time to Market (TTM), improved tenfold. TTM is the time it takes for a company's product to go from written code to a finished form in the hands of the end user.

Before I joined the company, TTM was measured in weeks and months, but after that, it was measured in hours, sometimes minutes. That is why I can proudly say that I brought unique value to the company and trained a team of 15 people as well.

Q: You are not only an expert in OpenStack - you also have strong skills in virtualization and software containerization systems. Where and how did you use this knowledge? How did it benefit the business?

A: A significant stage in my career was working with Bell Integrator, a large international consulting company. Through them, I worked with the American office of Juniper Networks, which is the second largest company in the world in terms of network equipment sales, i.e., a major part of the Internet runs on Juniper Networks equipment.

There I developed software in Python for an NFV (Network Functions Virtualization) and SDN (Software-Defined Networking) project. NFV makes network hardware more efficient: instead of physical devices, services are delivered using virtual network functions (VNFs) that run on standard servers.

Virtual resources can be allocated to users on demand, so the hardware is used to its full capacity. As a result, the company saves millions of dollars on buying network equipment, and consequently, the end user gets a cheaper service. Besides that, during my time with Juniper Networks, I learned about containerization technology.

As part of my work at Bell Integrator, I also participated in creating a unique project called Neuton - a machine learning model that could make predictions based on statistical data loaded into it. At that time, the Neuton model significantly surpassed existing analogs in terms of prediction accuracy and speed.

I deployed the model in public cloud services - partly in AWS, and partly in Google Cloud.

Eventually, Google became interested in our model. At their request, we took part in architectural planning with the Google team. They also gave us a big discount on computing resources in their cloud.

As a result, the Neuton ML platform completely migrated to the Google Cloud platform. Moving from AWS to Google was an interesting experience that allowed me to enhance my knowledge of these environments.

Another aspect of the experience I gained from this project was building and testing fault-tolerant systems. My task was to make sure that the infrastructure on which Neuton was running would automatically recover within an hour if an entire region of a cloud provider went down.

It was a big challenge because the infrastructure was extensive - with many virtual machines, clusters, and databases - but I managed to set it all up so that it could resume quickly after any disaster without human intervention.

In the process, I gained invaluable experience in stress testing, load testing, and identifying bottlenecks in distributed systems. In the end, through careful Disaster Recovery scenarios and stress testing, my team and I were able to create a system that was not just fault-tolerant, but disaster-resistant.

I also automated as many processes as possible, and the system I built essentially worked on its own.

Q: What international projects are you currently working on?

A: Currently, I am working as a Senior Architect for Li9, a company that provides DevOps and cloud solutions. As part of my collaboration with them, I worked with Veeva Systems, a company that creates software for the US healthcare industry.

There I implemented a fault-tolerant deployment of the GitLab development platform. My goal was to create a system that could withstand, for example, the failure of an entire region. My work significantly improved the RTO (Recovery Time Objective), which is the maximum acceptable time for a system to recover from a failure.

Before I joined the project, it was about 6-10 hours, because the system had to be repaired manually in case of an emergency.

The new architecture reduced the RTO to 10 minutes, because the system was now restored automatically. Another indicator that shows how much an architect's work matters is the RPO (Recovery Point Objective), which measures how much data can be lost when a system fails and is recovered.

For example, at Veeva Systems, a breakdown used to wipe out all the data from the previous half hour of operation. With the new system, all the data was preserved, which means that the RPO was reduced to almost zero.
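To make the two metrics concrete, here is a minimal Python sketch that compares an achieved recovery against RTO and RPO targets. It is an illustration only, not the tooling used on the project: the function names, the timestamp, the 8-hour figure (picked from inside the 6-10 hour range mentioned above), and the one-minute RPO target standing in for "almost zero" are all assumptions.

```python
from datetime import datetime, timedelta


def achieved_recovery_time(failed_at: datetime, restored_at: datetime) -> timedelta:
    """Downtime actually experienced: from the failure until service is back."""
    return restored_at - failed_at


def achieved_data_loss(last_recoverable_at: datetime, failed_at: datetime) -> timedelta:
    """Window of data lost: from the last recoverable state until the failure."""
    return failed_at - last_recoverable_at


def meets_objectives(recovery_time: timedelta, data_loss: timedelta,
                     rto: timedelta, rpo: timedelta) -> bool:
    """True when both the recovery time and the data loss stay within target."""
    return recovery_time <= rto and data_loss <= rpo


# Hypothetical targets: recover within 10 minutes, lose at most 1 minute of data.
rto = timedelta(minutes=10)
rpo = timedelta(minutes=1)

failed_at = datetime(2023, 1, 1, 12, 0)  # hypothetical failure time

# Manual recovery: illustrative 8 hours of downtime, last 30 minutes of data lost.
manual_time = achieved_recovery_time(failed_at, failed_at + timedelta(hours=8))
manual_loss = achieved_data_loss(failed_at - timedelta(minutes=30), failed_at)

# Automated recovery: 10 minutes of downtime, no data lost.
auto_time = achieved_recovery_time(failed_at, failed_at + timedelta(minutes=10))
auto_loss = achieved_data_loss(failed_at, failed_at)

manual_ok = meets_objectives(manual_time, manual_loss, rto, rpo)
auto_ok = meets_objectives(auto_time, auto_loss, rto, rpo)
print("manual recovery meets targets:", manual_ok)
print("automated recovery meets targets:", auto_ok)
```

Under these assumed numbers, only the automated scenario stays within both targets, which is the kind of improvement described above.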

I also worked on a project for Expeditors, an American logistics company that is on the Fortune 500 list. They used ManageIQ, an infrastructure management platform that allowed developers to request resources and managers to estimate costs and authorizations. I was part of the team that successfully added a multi-stage approval system to it and optimized the system.

For more than two years, I have also been working as a Senior DevOps Engineer at Thryv Holdings Inc., a large US company. At Thryv, for example, I have implemented several production Kubernetes clusters in different data centers and in AWS and Azure cloud services - now the company's applications are hosted in several regions.

Plus, I optimized the AWS and Azure accounts themselves, saving the company hundreds of thousands of dollars a year.

Millions of dollars more were saved when I migrated the company's enterprise task scheduling system from IBM Tivoli Workload Scheduler (TWS), a proprietary service, to Argo Workflows, an open-source solution.

Additionally, I automated routine tasks using Ansible AWX, saving Thryv engineers valuable time.

Q: You've often had to build unique systems and set up complex processes for which standard methods are not suitable. Can you give us some examples of when you had to improvise and invent something new?

A: At Thryv, I had the opportunity to create a unique solution: to containerize legacy applications written in one of the first programming languages - COBOL. It started to be actively used back in 1968, and since then, it has been used in 90% of financial transactions worldwide.

Of course, by now this language is super outdated, and almost all the specialists who have ever worked with it retired long ago.

We faced a difficult assignment. It would seem that such an old programming language as COBOL was hardly compatible with modern approaches to systems design in general and containerization technology in particular.

Nevertheless, my team and I managed to get rid of outdated and extremely resource-intensive mainframes and to carry out a complete optimization. As a result, the client company saved millions of dollars with our contribution.

Another illustrative example of when I had to apply non-standard approaches was in the process of customizing the functionality of the OpenStack platform. I had to enhance it in order to align the system with the needs of Wargaming.

I was involved in developing various plugins. For example, I created a VPN-as-a-service plugin for OpenStack from scratch. I also made a number of modifications to the OpenStack code written in Python.

I have also automated OpenStack deployment as much as possible. Previously, deploying this system was a huge headache: there was no single universally accepted way to install it, and the process turned into an agonizing quest in which every mistake was costly.

However, I automated the installation and adapted the process even for people who don't know the system thoroughly.

Q: What trends do you see in DevOps today?

A: The first and most obvious trend is the growing demand for MLOps (Machine Learning Operations) engineers. Machine learning technologies have been around for a while, but they have recently become more advanced and widely used. The arrival of ChatGPT in particular gave ML a strong boost in popularity.

The challenge for MLOps engineers is to automate the deployment of machine learning models on various computing platforms. This requires a deep understanding of both machine learning and DevOps, as there are not many established best practices or guidelines in this domain so far.

The second trend is containerization. Containers have become the standard enterprise way of delivering applications, replacing virtual machines in most cases. Therefore, Kubernetes and containerization skills are essential for DevOps engineers.

And the third trend is cloud platforms. Cloud platforms have been popular for a long time, as they offer many benefits over owning and managing servers. They turn long-term capital expenditures into day-to-day operational expenses, which is essential for startup projects.

Public cloud platforms such as Amazon Web Services, Google Cloud Platform, and Microsoft Azure provide a huge number of managed services in different areas, including machine learning and containers. DevOps engineers need to be familiar with these services and how to use them effectively.

Q: Is a university education necessary to become a successful DevOps engineer? What skills should a beginner learn right now?

A: A university degree is not a requirement for becoming a DevOps engineer. However, university can help you develop some soft skills that are useful in any career. For example, at university, you can learn how to learn.

So in general, I don't think higher education is worthless, but you can also pursue a career in IT without it.

Instead, you should take specialized courses and keep learning new things. I do this myself every day. No one knows everything in this field: I still learn as much as I did when I started. When you stop learning, you stop being an IT person.

You can't afford to be complacent, because the IT industry is evolving so fast, and you have to keep up with the latest trends. Otherwise, you might fall behind and miss the opportunities that are ahead of you.

And one of the key points is troubleshooting skills. You have to find the root cause of the problem and solve it. The ability and willingness to dig deep and understand is crucial.