Pecan.ai has just come out of stealth with an $11M Series A to enable business analysts to build machine learning models automatically. Dell Capital led the round, joined by S Capital, bringing the company's total funding to $15M.
The startup was founded in 2016 by Zohar Bronfman and Noam Brezis, two PhDs in computational neuroscience. The company now counts just under 30 employees and is opening offices in New York.
Pecan.ai’s product is a platform targeted at business analysts, enabling them to automatically build predictive models. Its software connects to data sources, cleans up the data, and automatically trains a model. A catalog of templates is available to get started, and advanced users can then build their own. Right now, the catalog features templates for use cases such as fraud detection and churn.
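To make the template idea more concrete, here is a minimal Python sketch of how a use-case template could drive data preparation. It is an illustration only, not Pecan's implementation: the `CHURN_TEMPLATE` structure, the role names, and the `missing_roles` and `prepare_rows` helpers are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical template: the roles a churn model needs from the analyst's data.
CHURN_TEMPLATE: Dict[str, Any] = {
    "name": "churn",
    "required_roles": ["customer_id", "last_activity_date", "churned"],
}


def missing_roles(template: Dict[str, Any], mapping: Dict[str, str]) -> List[str]:
    """Return the template roles the analyst has not yet mapped to a column."""
    return [role for role in template["required_roles"] if role not in mapping]


def prepare_rows(
    template: Dict[str, Any],
    mapping: Dict[str, str],
    rows: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Rename source columns to template roles and drop incomplete rows."""
    missing = missing_roles(template, mapping)
    if missing:
        raise ValueError(f"unmapped roles: {missing}")
    prepared = []
    for row in rows:
        record = {role: row.get(mapping[role]) for role in template["required_roles"]}
        # Basic cleaning step: skip rows with a missing or empty required value.
        if any(value is None or value == "" for value in record.values()):
            continue
        prepared.append(record)
    return prepared


# The analyst only states which source column plays which role.
mapping = {"customer_id": "cust", "last_activity_date": "last_seen", "churned": "left"}
rows = [
    {"cust": "a1", "last_seen": "2020-01-05", "left": 0},
    {"cust": "a2", "last_seen": "", "left": 1},  # incomplete, will be dropped
]
prepared = prepare_rows(CHURN_TEMPLATE, mapping, rows)
```

In this sketch the template carries the data science knowledge (which fields matter and how to clean them), while the analyst supplies only the column mapping. A real product would add type checks, date parsing, and the model training step.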
This appears to be part of a second wave of companies bridging the gap between machine learning capabilities and the non-technical staff who know the business needs well. "It is about supply and demand. A lot of smaller companies are not in a position to hire data scientists because the good ones are expensive and rare. So we need to understand what other human capital is available," says Boaz Amidor, a strategic advisor to Pecan.
The problem is not new, and some of the first wave of startups have already reached scale. Dataiku, the collaborative data science platform, was founded on somewhat similar claims of helping business intelligence staff collaborate with data scientists. Its drag-and-drop model building interface made it quite popular with less technical people. Seven years after inception, Dataiku is now worth $1.4B following a secondary round by CapitalG, Google’s growth equity fund. Other platform companies in the space include DataRobot, which announced a $206 million Series E in September, bringing its total funding to date to $431 million.
Pecan will have to differentiate itself from those more established players. I had a call with CEO Zohar Bronfman. When asked about differentiation, he pointed to a higher level of automation in the data preparation stage: “we automate the entire data prep and engineering required for the autoML”.
Indeed, data preparation and engineering are notoriously expensive and time-consuming, taking up to 80% of a data scientist's time.
That is true both for Pecan.ai’s target user, the business analyst, and for machine learning engineers. Cleaning up data and selecting the right features efficiently is a topic that has attracted a lot of attention in the machine learning community. Some large, world-class machine learning teams noticed that individual contributors were all rebuilding the same low-value-add data extraction pipelines for their own needs. Platforms targeted at technical users, such as Uber’s internal tool Michelangelo, now incorporate Feature Stores to help save feature extraction pipelines for future reuse. Startups such as Logical Clocks are following suit, making this approach commercially available.
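The core idea of a feature store can be sketched in a few lines of Python. This is a simplified, in-memory illustration, not how Michelangelo or Logical Clocks implement it: the `FeatureStore` class, its methods, and the feature names are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


class FeatureStore:
    """Hypothetical in-memory registry of named, reusable feature functions."""

    def __init__(self) -> None:
        self._features: Dict[str, Callable[[Row], float]] = {}

    def register(self, name: str, fn: Callable[[Row], float]) -> None:
        """Save a feature definition once so other projects can reuse it."""
        if name in self._features:
            raise ValueError(f"feature {name!r} is already registered")
        self._features[name] = fn

    def compute(self, names: List[str], row: Row) -> Dict[str, float]:
        """Compute the requested features for one raw record."""
        return {name: self._features[name](row) for name in names}


store = FeatureStore()
store.register(
    "avg_order_value",
    lambda r: r["total_spend"] / max(r["order_count"], 1),
)
store.register(
    "is_recent",
    lambda r: 1.0 if r["days_since_last_order"] <= 30 else 0.0,
)

customer = {"total_spend": 200.0, "order_count": 4, "days_since_last_order": 12}

# Two different models reuse the same definitions instead of rebuilding them.
churn_features = store.compute(["avg_order_value", "is_recent"], customer)
fraud_features = store.compute(["avg_order_value"], customer)
```

Here both a churn model and a fraud model pull `avg_order_value` from the same definition, which is the duplication these teams were trying to eliminate. Production feature stores add persistence, versioning, and consistent serving for training and inference.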
The key question here is to what extent extracting, cleaning and preprocessing data can be automated in general. Even specialized automated predictive tools like Salesforce Einstein gave disappointing results to the users I talked to. How can a tool yield better results, across use cases, in an automated way? I dug deeper to understand Pecan's approach.
"What we do is we build [use-case specific] templates that guide business analysts through the data preparation stages. With a solution like Dataiku, you still need data scientists involved in some part of the process. Pecan built this templating system so that business analysts can operate alone," said Bronfman. As is often the case, fancy technology is not the sole component: Pecan's approach relies on mapping out data science processes and building wizards that guide users through them.
Pecan.ai is entering a competitive space with established players, but it is surfing two massive trends: bringing machine learning to less technical users, and automating data preparation pipelines. It focuses on guiding business analysts through processes that previously required data scientists' input. In doing so, it empowers companies without a dedicated data science team to build machine learning models.
Pecan's template catalog approach might force it to expand slowly at first, focusing on use cases that are already covered. But this funding round gives it the fuel to start aggressively expanding sales.