Learn from the 90s

In the beginning of the web, there was a 'developer' who wrote the 'code'. This code would get built and then chucked over the wall to the 'operations' folks. They, you know, operated the website that their company made money from. Sysadmins, data center people, database admins (I forgot they still exist. Hi, DBAs!). There were strict protocols for managing releases. Testing was a pain, and it took a long time to ship anything.

Remember the Joel test? Yes, that old thing that 30- and 40-something engineers know about. Getting a "no" on any question of the test said something about team culture.

Do you use source control?
Engineers spend time worrying about (or breaking) someone else's code. Losing all the code because a hard drive crashed was a thing.

Can you make a build in one step?
Engineers spend non-zero time making builds. Multiple steps introduce mistakes. Fewer builds are made, so new code is tested more slowly.

Do you make daily builds?
If something breaks the build, it takes a while to get noticed. (The world has continuous integration now; things have come a long way!)

Do you have a bug database?
Emails, post-it notes, phone calls with angry customers and "business" people. Forgotten bugs. Bugs are not well documented and cannot be reproduced.

Do you fix bugs before writing new code?
Tech debt slows down how quickly you can ship new features. A small team is spun out for a re-architect-the-codebase project. It takes a year.

Do you have an up-to-date schedule?
Dates are meaningless. It's done when it is done. You should have thought about this when you made your stupid forecast.

Do you have a spec?
Engineers do the thinking for the PMs. Things get built. "But that's not what I asked for! Can you make it do this other thing instead?"

Do programmers have quiet working conditions?
Let's-get-coffee. Hey-how-do-I-do-x. Omg-have-you-seen-this-email-from-HR. Some engineers are asking if the company will pay for headphones. No, says HR.
Do you use the best tools money can buy?
Grumpy engineers are slower engineers.

Do you have testers?
You have more bugs.

Do new candidates write code during their interview?
There are a few people on the team who have mastered the art of keyword-stuffing their resumes but can't learn a new programming language to save their lives.

Do you do hallway usability testing?
Engineers build features, only to discover months later that users hate them. A 'usability issues' epic is created (if you had a bug database in the first place).

All of these things have one thing in common: they slow down an engineering team. Working software reaches users slower. User feedback is slower. Product innovation is slower. The business creates customer value slower. A competitor that delivers customer value faster eventually upstages you. All because you didn't invest in a build system.

The trouble is small at first, but these things have a knack for compounding their effects over time. We have an intuitive linear view of technological progress, i.e. the pace of today is used to make projections for how fast things will be achieved in the future. If software delivery gets exponentially slower over time, this linear view makes us underestimate how much slower it will be in the future.

Measuring speed

So the agile community came up with the concept of velocity: a diagnostic metric for how quickly a team can ship complex code on an existing code base. At the end of each sprint, story points are added up, and that is the velocity of the team. If velocity drops, the team is shipping complex stories slower and you are headed in the wrong direction. Do something about it!

The problem is, velocity depends on a lot of things, and it is hard to know what to do to push it in the right direction. There are certainly no quick fixes. The quick fixes that exist (like adding a new member to the team) do not tackle the core problem. And it takes more than engineers to build a product: PMs, UX folks, designers.
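In case the mechanics of velocity are unfamiliar, here is a minimal sketch of how it is computed and tracked. All the numbers and sprint names are hypothetical; real teams would pull this from their issue tracker.

```python
# Minimal sketch: velocity is just the sum of story points
# completed per sprint, tracked over time.

from statistics import mean

# Story points per completed story, grouped by sprint (hypothetical data).
sprints = {
    "sprint-14": [3, 5, 2, 8],
    "sprint-15": [5, 3, 3, 2],
    "sprint-16": [2, 3, 1, 5],
}

# Velocity per sprint.
velocities = {name: sum(points) for name, points in sprints.items()}
print(velocities)  # {'sprint-14': 18, 'sprint-15': 13, 'sprint-16': 11}

# A crude trend check: is the latest sprint below the running average?
history = list(velocities.values())
if history[-1] < mean(history[:-1]):
    print("velocity is dropping -- dig into why")
```

The metric itself is trivial to compute; the hard part, as noted above, is knowing which of the many underlying causes to act on when the trend goes the wrong way.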
It is so difficult to come up with an 'are we fast enough' metric that encapsulates all aspects of building a product. Velocity does not fully capture this. We were still building products that people didn't want. Then the Lean Startup thing happened.

The Lean Startup methodology

Have an idea for a product? Hypothesis > Build MVP > Validate hypothesis. Get through the whole loop (one iteration) as fast as possible. The product-building world finally saw the writing on the wall. Speed matters. Speed of iterations matters. Fast iterations = success.

Especially so in ML

Prashast, our CTO, built some of the tooling needed to pull off successful production ML systems at Google. He was convinced that any ML setup needs to allow for fast iterations. This is how he explains it:

You can't just "do ML" and have it magically work. A train-and-forget mentality means that your model goes stale very quickly. Products change, users change, behaviors change. In reality, it is a long road of constant experimentation and improvement. You need to try simple things first, then different features in your model. Data is not always clean. You experiment with different models and A/B test them. Things go wrong in production all the time. It takes months of constant tweaking to get things right.

Once you start, you need to think of it like any other software project. It needs building, testing, deployment, iterations. Each iteration cycle makes things just that little bit better. The more iterations you can get through, the faster your ML setup improves.

To validate this, we spoke with data science teams about how they use ML. As you might expect, there is a wide spectrum: from extremely sophisticated teams processing petabytes of data and delivering billions of predictions every day, to teams just getting a grip on training their first model. Sure enough, mature teams are set up for fast iterations. Uber's internal ML platform is one such example.

These teams were not always like that, though.
It seems that teams go through a curve of enlightenment. You could say the same about ML! There seem to be two types of organizations: one type takes the 'lean AI' approach, and the other the 'I read somewhere that we need an AI strategy' approach.

The reason most teams go through this curve of enlightenment is that building an AI culture is a journey. Teams start with something simple they can deliver quickly, show value, and then build on it. Most of the time, starting AI efforts means going backwards on the curve: teams spend time getting the data instrumented and cleaned, and rethinking data infrastructure, because these things slow down any AI effort.

Teams that attempt to jump directly into the middle of the curve ("Let's build out an ML platform because we have an AI strategy now") usually fail. This approach highlights the disconnect between product teams (including data scientists!) and the boardroom. It's no wonder that data scientists are frustrated and companies have an AI cold start problem.

Want to build an AI culture? Go through the curve and enable faster iterations. Some ways that AI teams enable faster iterations are:

- Clean, well-labeled, consistent data
- One-click model training and deployment
- Self-service data science: reducing engineering dependencies to iterate on the model, like trying out new features, building new models, and automating hyperparameter optimization
- Scalable, performant systems for data access, data manipulation, clustered model training, model deployment and experimentation, online prediction queries, and candidate scoring
- Infrastructure people who treat data scientists as first-class citizens who use what they build
- An 'ML dev' environment that mirrors the 'ML production' environment

Here is our version of the Joel test to measure the culture of an AI team:

- Data pipelines are versioned and reproducible
- Pipelines (re)build in one step
- Deploying to production needs minimal engineering help

Successful ML is a long game

You play it like it is Kaizen.
Experimentation and iterations are a way of life. This is why we built Blurr. Getting data together, processed, cleaned and mangled for machine learning is not a do-once-and-forget activity. It is the base of any AI effort, and enabling continuous improvement on that base is critical for a successful AI culture. Blurr provides a high-level YAML-based language for data scientists and engineers to define data transformations. Replace two days of writing Spark code with five minutes on Blurr.

Blurr is open source because we believe that an open source approach will accelerate innovation in anything AI. We even develop in public: our weekly sprints are there on GitHub for everyone to see.

The DevOps tooling market exists to ship software faster. Our vision is that there will be an MLOps market that helps teams ship ML products faster, and Blurr is the first technology we are putting out to enable this, because the biggest problem right now is iteration on data.

"We have a data-driven culture. AI comes from data. Therefore, we have an AI culture!"

No, you don't. Being data driven is about removing human biases in decision making. Is a higher load time for the app a bad thing? Is this new model better than the old one? Let's look at the data and decide!

AI culture is an algorithm-driven culture. Humans build machines that make decisions in a product. Algorithms are deployed to achieve human-crafted aims (improve ad CTR, conversion rate, engagement).

AI culture is being comfortable with probabilistic judgements. One product recommendation has a 60% chance of increasing engagement more than another. 40% of the time, it is not going to be better. Is that good? Start somewhere and improve it.

AI culture is a state of constant experimentation and iteration. Everything else in an organization needs to support that.

Humans are complicated.
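That 60/40 judgement can be made concrete with a quick simulation. This is a sketch with hypothetical numbers, not anything from a real experiment:

```python
# Simulate a recommendation that is better than the alternative
# only 60% of the time (hypothetical probability).

import random

random.seed(7)  # fixed seed so the run is reproducible

TRIALS = 10_000
a_wins = sum(1 for _ in range(TRIALS) if random.random() < 0.60)

print(f"A was the better recommendation in {a_wins / TRIALS:.1%} of trials")
```

Run it and A loses roughly 4,000 of the 10,000 trials. That is not a bug; shipping the 60% option, measuring, and iterating is exactly the probabilistic mindset described above.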
We expect deterministic behavior from machines when we ourselves are stochastic decision makers, complete with pattern recognition abilities and cognitive biases. Humans run companies and they play politics, which can be incredibly frustrating when trying to build an AI culture.

This makes me wonder how humans (and human-made societal structures like companies) will behave with super-intelligent machines. 2029, baby!

Blurr is in Developer Preview. Be sure to check it out and star the project on GitHub!

If you enjoyed this article, feel free to hit that clap button 👏 to help others find it.