In this interview, I let Leonid tell his story. With years of experience as a senior data engineer at Meta and at several startups, he brings expertise that goes beyond mere lines of code. As we sit down for this exclusive interview, Leonid offers a rare glimpse into the intricate process of weaving the digital fabric that underlies our modern lives.
Leonid also shared with us one of the most challenging data engineering cases he worked on and provided recommendations on how to develop as a professional in the exciting world of data engineering.
Before we begin, it’s only fair that we get a background on our guest today. From his early beginnings to his current position within the heart of Meta, Chashnikov's path is a testament to his self-made engineering prowess and growing influence in the field.
Leo found his passion for technology early on: first he was into computer games, then into how they are developed, and then into programming in general.
“As a result, I never started developing games the way I wanted to as a kid, but I don't regret it at all: my dream just transformed.”
After earning a Master's degree in “Security of information and communication”, Leo started working at service companies (sometimes ironically called “body shops”). Thanks to his personal motivation and support from his team, he learned a lot and boosted his technical knowledge in a fairly short time: he moved from Java to Scala and focused on big data processing and analysis.
After getting a job at Captify, a search intelligence company, where he was responsible for all the data processing pipelines of a platform, Leonid continued his career at a startup, People.ai, an AI platform for enterprise sales, marketing, and customer success that uncovers revenue opportunities. There, he got a chance to expand his horizons: he was free to choose programming languages, had a wide range of tasks, and worked with a greater emphasis on ownership.
“I really grew professionally there. I didn't just have to sit and wait for my tasks, but was always figuring out the processes on my own, then doing them and consulting with colleagues.”
At the same time, Leo was studying a lot on his own: books his colleagues recommended, forums, conferences, and various online courses.
“When I needed to stay ‘up to date’, I could also read some articles or listen to podcasts. Currently, we have so many resources that the problem is rather finding something valuable among every ten of them. Everything is openly accessible.”
Here is a list of books and resources Leo recommends for anyone interested in big data:
If you want to read just one book about building distributed systems, then it should be this one. It might not be 100% applicable to what you do every day, but it would make you a better engineer and give you a deeper understanding of the tools you use, as well as the platforms your system runs on.
Make no mistake: it's not a Scala textbook. It is really about functional programming and the approaches it takes to solving problems. And to get all the benefits, you'll have to work through the exercises as well. It might not be immediately helpful even if you're coding in Scala every day, but it will give you a very different perspective on code, with immutable values, pure functions, and recursion.
The book concentrates on Software Engineering, not on Programming: it rarely discusses code itself and focuses instead on the challenges of supporting a large engineering environment, where code may live for ten or more years and has to be modified by lots of different teams. Many of the problems discussed would not apply to most companies, but the views and "modes of thinking" would still be useful. It's really recommended if you want to "peek behind the curtain" a bit and see how big companies work and how their scale poses unique challenges.
A wonderful resource for all "mostly self-taught" software engineers, especially those who suffer from impostor syndrome because they skipped a "classic" CS education. Going through all the recommendations would require a lot of time and work, but it gives you fundamental knowledge that will save you lots of time learning new things throughout your future career.
Leonid's personal blog, where he shares his experience with Spark, efficient data processing, and other topics you will find useful for developing a career in big tech data and improving your knowledge of the industry. It is a must-read if you want a deep dive into how Apache Spark works.
“I was always highly motivated by ‘feeling silly in a room’. I believe that if you feel you are the smartest, the cleverest person there, you need to change your job or team to find more space for growth.”
In this section, Leo shared with us his most technical and challenging case: the migration to Spark Structured Streaming at Captify. From it, he also came to understand that without a clear, broad picture of a company's business, you won't get far or be as effective as you could be.
He and his team came across a problem: an unstable, inefficient batch pipeline that took a lot of effort to keep running. End to end, the processing took approximately 4 hours.
So, Leo and his team came up with the idea of migrating to Spark Structured Streaming. But before doing that, they needed to dig deeper into the business side: they checked whether speed improvements would matter in this case at all.
“Everything rests on numbers. If you change the whole pipeline and the processing speeds up from, say, 15 minutes to 5, there is no point: you will spend more time on the changes themselves. But if it goes from 4 hours to 15 minutes, then the changes are meaningful. We saw that and started implementing.”
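The comparison in the quote can be written out as simple arithmetic. The snippet below is only an illustrative sketch: the function names are hypothetical, and the only inputs are the figures Leo mentions (15 to 5 minutes versus 4 hours to 15 minutes). It does not model the engineering cost of the migration, which the team weighed separately.

```python
def minutes_saved(before_minutes: float, after_minutes: float) -> float:
    """Absolute time saved per pipeline run, in minutes."""
    return before_minutes - after_minutes


def speedup(before_minutes: float, after_minutes: float) -> float:
    """How many times faster the new pipeline delivers data."""
    if after_minutes <= 0:
        raise ValueError("after_minutes must be positive")
    return before_minutes / after_minutes


# The two scenarios from the quote (hypothetical labels, figures from the interview).
scenarios = {
    "15 min -> 5 min": (15, 5),
    "4 h -> 15 min": (4 * 60, 15),
}

for label, (before, after) in scenarios.items():
    saved = minutes_saved(before, after)
    factor = speedup(before, after)
    print(f"{label}: saves {saved} min per run, {factor:.0f}x faster")
# 15 min -> 5 min: saves 10 min per run, 3x faster
# 4 h -> 15 min: saves 225 min per run, 16x faster
```

Seen this way, the first change saves ten minutes per run, while the second saves almost four hours, which is the kind of gap that can justify rebuilding a pipeline.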
To do so, Leo and his team developed a new Spark Structured Streaming pipeline, which was challenging because Structured Streaming had only just appeared back then and was still unstable. They often had to dig into Spark's source code. At the same time, it was crucial to redirect the traffic as smoothly as possible and to monitor the switch so that efficiency didn't drop and all the data matched.
As a result, the implemented changes proved to be effective: data delivery time was cut to 15 minutes, which had a positive impact on the company's economic efficiency.
“In the big picture: by delivering data faster, the company competes less with other advertisers, pays less money and shows ads much cheaper.”
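One way to check that "all the data matched" during such a cutover is to run both pipelines on the same input and compare their outputs. The sketch below is not Captify's actual tooling, which the interview does not describe; it is a minimal, hypothetical reconciliation check in plain Python that treats each output as a multiset of rows. The function names and the sample rows are made up for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def reconcile(
    batch_rows: Iterable[Hashable],
    streaming_rows: Iterable[Hashable],
) -> Dict[str, Counter]:
    """Compare two pipeline outputs as multisets of rows.

    Returns the rows (with counts) that appear only in, or more often in,
    one side than the other.
    """
    batch = Counter(batch_rows)
    streaming = Counter(streaming_rows)
    return {
        # Counter subtraction keeps only positive differences.
        "missing_in_streaming": batch - streaming,
        "extra_in_streaming": streaming - batch,
    }


def outputs_match(
    batch_rows: Iterable[Hashable],
    streaming_rows: Iterable[Hashable],
) -> bool:
    """True when both outputs contain exactly the same rows."""
    diff = reconcile(batch_rows, streaming_rows)
    return not diff["missing_in_streaming"] and not diff["extra_in_streaming"]


# Hypothetical sample rows: (date, key, count).
old_output = [("2024-01-01", "query_a", 3), ("2024-01-01", "query_b", 1)]
new_output = [("2024-01-01", "query_a", 3), ("2024-01-01", "query_c", 1)]

diff = reconcile(old_output, new_output)
print(diff["missing_in_streaming"])  # rows only the batch pipeline produced
print(diff["extra_in_streaming"])    # rows only the streaming pipeline produced
```

At real data volumes this comparison would run on aggregates or samples inside the processing engine rather than on in-memory lists, but the idea is the same: the old pipeline stays the reference until the differences are explained.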
It's okay not to know something; it's not okay when you don't want to learn it. You always have to expand your knowledge. To do so, dive as deeply as possible into the processes you want to know more about. For example, you can not only ask your colleagues for advice but also tell them about interesting things you did: having to explain a topic makes you understand it more deeply.
Not all companies have an opportunity to give employees that choice, but even if you can't choose your tool to work on tasks, you need to have a clear understanding of why it was chosen and its strengths and weaknesses.
Hard skills are baseline expectations. The further you go in your career, the less important your hard skills will be. In order to grow and climb the career ladder, you need to keep developing soft skills: talk to people, understand their problems, and be able to find solutions to them.
It is always useful and meaningful to talk to other tech professionals — exchange ideas, share experiences, and gain insights into different approaches within the field. To do so, you can go to various tech conferences, meetups, or networking events.
Developing the ability to view technical projects from a holistic perspective is crucial for making informed decisions that align with broader business needs. This skill will help you choose the most suitable approaches to solving problems, prioritize tasks, optimize resources, and align efforts towards a common objective.
No matter how experienced a professional you are, effective teamwork is the key to success. In this case, mathematics works by the principle “1+1=11”. You need to develop teamwork skills and feel comfortable and efficient in both roles: as a contributor and as a leader.
In this interview, I tested a new format that aims to let my guests expand on their ideas rather than being constrained by a fixed set of questions.