Davis Baer: What’s your background, and what are you working on?
Oege de Moor: At Semmle, we are focused on securing software, together. We’re creating technology to scale security expertise through automation and provide a platform for sharing community-driven security. The biggest security challenge we face in software development today is a chronic shortage of security experts: even leading technology companies struggle to staff one security engineer for every 100 software engineers. Our products help automate variant analysis (the process of finding all instances of a coding mistake that caused a security incident), allowing organizations to scale their security efforts without hiring. And it works! Our customers include Google, Microsoft, Uber, Credit Suisse and NASDAQ, as well as open source projects like systemd and AMP.
Before starting Semmle, my background was in theoretical computer science. I taught at Oxford for 21 years, primarily working on programming-language theory. I loved working with brilliant students, but I began aching to get back into coding, which is what had attracted me to computer science in the first place. I got a chance when I took a sabbatical at Microsoft and started working with a large, complex codebase. I kept having questions about how the codebase was structured, but they were extremely time intensive to answer. I found myself wishing I could ask deep meaningful questions of the code — like a database that indexed the source for easy investigation.
After I finished my sabbatical, that desire stuck with me, and fulfilling it was the origin of the technology behind our products now. When we started in 2006, we were just trying to solve this interesting technical problem (a problem that had previously been called impossible to solve). It wasn’t until years later that we realized we had solved an absolutely critical security problem facing the software industry today. Variant analysis is a prime example of “deep meaningful questions”, and offers a way to realistically scale security expertise to meet the software-security crisis.
We’re now working on two products. The first, QL, is the engine that automates variant analysis. Researchers can codify security weaknesses as QL queries, and automatically find issues that they used to have to look for tediously by hand, even in huge codebases. The second product is LGTM (the widely used developer acronym for “looks good to me”), which is the platform for running QL queries at huge scale as part of your developer workflow. Between our internal security research team and our growing customer contributions from security experts at Google and Microsoft, among others, we have over 1,600 publicly available QL queries running on hundreds of thousands of private and open-source codebases. And because these analyses are sourced from the community, they are vastly superior to what any one security team could produce.
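To make the idea of "codifying a mistake as a query" concrete, here is a minimal sketch in Python. It is not QL and not how Semmle's engine works; it only illustrates the general shape of variant analysis using Python's standard `ast` module. The sample source, the choice of `eval()` on non-constant input as the "known mistake", and the function name `find_eval_variants` are all assumptions made up for this illustration.

```python
import ast

# Hypothetical code base to search. The "known mistake" in this sketch is
# calling eval() on anything other than a literal constant.
SAMPLE_SOURCE = '''
def load(cfg):
    return eval(cfg)

def ok():
    return eval("1 + 1")

def also_bad(user_input):
    value = eval(user_input.strip())
    return value
'''


def find_eval_variants(source: str) -> list[tuple[int, str]]:
    """Return (line number, enclosing function) for each risky eval() call.

    A call counts as risky when its first argument is not a literal
    constant. Nested functions are not handled specially in this sketch.
    """
    tree = ast.parse(source)
    findings = []
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if (
                isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "eval"
                and node.args
                and not isinstance(node.args[0], ast.Constant)
            ):
                findings.append((node.lineno, func.name))
    return sorted(findings)


if __name__ == "__main__":
    for line, func_name in find_eval_variants(SAMPLE_SOURCE):
        print(f"line {line}: eval() on non-constant input in {func_name}()")
```

The point of the sketch is that the mistake is described once, as a pattern over the program's structure, and the same description can then be run against any amount of code. A real query language adds far more than this toy does, such as tracking how data flows between functions.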
What motivated you to get started with your company?
The starting point was my own personal need to come to grips with the complexity of modern software — to have a way to automatically ask questions about what unfamiliar code is doing. Of course, this is a pain point for everyone who works with substantial codebases, but it’s critical to experience a problem yourself to be sufficiently motivated to solve it — I believe that’s a key requirement for every successful startup. We started out with this very general platform to understand a large codebase, but security turned out to be the killer use case.
What went into building the initial product?
When you are crazy enough to take on an “impossible” problem, you don’t focus on how long it will take you to solve. It’s very possible that if I had, Semmle would not be the force it is today. Database technology was not a core competency of my co-founder and me, so we didn’t fully understand the scale of the challenge we took on. There’s this beautiful quote from Oscar Wilde, that you should treasure your ignorance; we did that at Semmle, cheerfully attacking problems without fully comprehending how hard they really were.
It was years before we could get Semmle’s core engine to work at scale. We started the initial work while I was teaching classes at Oxford, but we had to separate Semmle from Oxford once the company was founded in 2006 to avoid any confusion about IP ownership. It wasn’t until 2009 that we were able to sign two license agreements: with Murex and with NASA — true partnerships that last to this day.
At this point we were mostly focusing on general code quality. Some of the first technical challenges behind LGTM were solved then, and we built an awesome list of customers. But things really took off when we started applying our variant-analysis tech to security.
How have you attracted users and grown your company?
You often hear that acquiring the first ten customers is hardest, and this is totally true. Our first “marketing” efforts were plain cold emails to prospects found on LinkedIn, and it was incredibly informative for me personally to get raw feedback from that. After we had our first ten customers, we saw a lot of referrals: people who had heard about our technology elsewhere and wanted it too.
A great example that got us a lot of attention in the early days was our work with NASA. In fall of 2011, the Mars Curiosity Rover launched, with plans to land in August 2012. I had been following the story from the sidelines until we received a call from NASA. The success of the mission rested on what NASA was calling the “seven minutes of terror” — the Rover’s seven-minute descent through the Martian atmosphere, which was entirely automated, with no way to intervene from Earth if something went wrong.
While the Rover was on its way to Mars, NASA discovered an error in the landing software, and that’s when they called us. We used QL to find 33 undetected variants of that bug in the Rover codebase in just 20 minutes. NASA pushed an update while the Rover shot through space, and the Rover landed safely.
More recently we’ve gotten a lot of attention by using QL to find and responsibly disclose critical vulnerabilities in widely used open-source projects like Ghostscript and macOS. We started this effort only in late 2017, with two engineers who had no special security training. Even with this lack of security experience they found 46 vulnerabilities in 2018; some items on our list of 0-days are quite spectacular! This is an ongoing, expanding project; we’re investing heavily in providing our own security research for the benefit of all software projects, both private and open source.
What’s your business model, and how have you grown your revenue?
Our enterprise customers pay a license subscription, typically for a term of 1–3 years. They run the software on-premises. Many of those customers contribute back their queries, to share their security expertise with the whole community.
We launched LGTM.com to secure open source, together with the development community. It’s free for open source projects, and always will be. We analyze the daily work of over 600,000 developers. LGTM.com gives us a lot of exposure, and we get a lot of interest from people who used it on their favorite open source project and want the same functionality on their proprietary code.
What are your goals for the future?
The most important goal for Semmle is to foster and grow the community that secures software. This mission starts with using LGTM.com to secure all open source projects on GitHub and Bitbucket. On the product front, that means opening up APIs so others can program support for whichever programming language they like, and adding more facilities for easy code sharing such as a module system for the query language.
The future of applying data science to the data we get from our analysis is especially bright. One longer-term goal I’m very excited about is “self-improving queries”, where a quick-and-dirty query by a security researcher automatically becomes better, based on developers’ interaction with the results.
Of course we also want to keep growing the business. This is a matter of meticulous execution, putting together all the systems and processes to scale what is currently working well.
We have the means to do all this thanks to our recent Series B, led by Li Ping and Vas Natarajan of Accel Partners.
What are the biggest challenges you’ve faced and obstacles you’ve overcome? If you had to start over, what would you do differently?
I’d move much more boldly and quickly. In retrospect, we were somewhat scared by the experts who told us we were trying to do the impossible on the technical front. On the business front, we should have hired sales and marketing experts much sooner. Finally, early on, in our obsession with customer satisfaction, we would sometimes build new functionality for just one customer. Unfortunately that’s not sustainable, and being able to say “no” is part of being a good partner to your early adopters.
Have you found anything particularly helpful or advantageous?
For a first-time entrepreneur, there is no substitute for talking to people who have done it before. I’m incredibly grateful to a small circle of friends who have continuously given me their time — this is one of the reasons Silicon Valley is such an amazing place. Getting a great set of investors is totally critical for the same reason — they’ve seen hundreds of companies make the same mistakes, and so they make up for my lack of experience. I love reading the books and blogs of people like Ben Horowitz and Jason Lemkin for the same reason.
What’s your advice for entrepreneurs who are just starting out?
You have to be headstrong, and ignore the expert naysayers. Because of their deep expertise, they see all the problems that aren’t obvious to a newcomer. But that gives you an advantage; you will see solutions that aren’t obvious to them. Take it!
Work closely with your early adopters to achieve product-market fit. It’s easy to fool yourself into building something that is cool, but that people don’t want to pay for.
Finally, sell a solution and not a technology. Semmle would have moved much faster if we had had the courage earlier on to ditch all the other applications of our inventions and focus on solving the shortage of security experts by automating variant analysis.
Where can we go to learn more?
All of Semmle’s work is housed on semmle.com, and you can check out our products for free by adding your favorite open source project to LGTM.com. You can find out more about me or ask me questions on Twitter or LinkedIn, and I often publish on the Semmle blog.