Brian Zotter


An AI-Powered Assistant for B2B Marketing

Traditionally, it’s the B2C marketers who move first to improve their buyer experience with data and personalization. In recent years, internet services like Amazon and Netflix have set the bar with real-time recommendations based on past behavior. Dynamically changing the customer’s journey produces a “segment of one” experience that is engaging and compelling. According to a recent survey by Evergage, “Real-Time for the Rest of Us”, the main benefits of this kind of personalization include:

  • Increased customer engagement (81%)
  • Improved customer experience (73%)
  • Increased conversion rates (59%)
  • Improved brand perception (52%)

However, there are a number of problems when trying to apply B2C approaches to B2B businesses:

  • Low volume: B2B sites typically see much less traffic than consumer sites. Most algorithms rely on large volume to make recommendations with statistical significance.
  • Long cycle: Consumer transactions can be performed in minutes. B2B sales can take 6–12 months or longer.
  • Group behavior: B2B purchases usually require many buyers to come to a consensus. Current recommendation systems are built for the individual.

The question for B2B marketers is how to deliver that consumer level of personalization in a B2B context with less data. The answer is next-best-action marketing. In next-best-action, the buyer experience is not defined ahead of time. Instead, the sequence adapts continuously to the context and empirical results. At each step, an algorithm considers the different actions that can be taken for a specific customer and recommends the best one. The goal is to optimize for both immediate conversion and customer lifetime value. But how to do it?


The key to deciding the next best action is the multi-armed bandit. The term refers to a probability problem involving slot machines (a.k.a. “one-armed bandits”). Imagine you are a gambler in a casino with many different slot machines, each with its own payout rate. To maximize your winnings, you need to figure out which machine has the highest payout before you go broke!


With learning over successive attempts, effort shifts from exploration to exploitation.

To solve the multi-armed bandit, a common and effective strategy is to start with a short period of exploration, trying different arms at random and collecting data. Then, as you begin to figure out what is working (and what is not), you shift over to exploitation, spending more time pulling the arm of the machine with the highest payout. The important thing is to never stop exploring completely, since you can never know for sure that another machine won’t yield a higher payout.
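One simple way to picture this shift is an epsilon-greedy strategy with a decaying exploration rate. The sketch below is only an illustration, not the YesPath implementation: the payout rates, the decay schedule, and the 5% exploration floor are all assumed values chosen for the example.

```python
import random

def run_epsilon_greedy(payout_rates, pulls=5000, eps_start=1.0, eps_floor=0.05, seed=0):
    """Simulate a gambler and return (pull counts, estimated payout) per arm."""
    rng = random.Random(seed)
    n_arms = len(payout_rates)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for t in range(1, pulls + 1):
        # Exploration rate decays over time but never drops below the floor,
        # so the gambler never stops exploring completely.
        epsilon = max(eps_floor, eps_start / t ** 0.5)
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                            # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])   # exploit: best estimate
        reward = 1.0 if rng.random() < payout_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean payout for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates

# Hypothetical machines with true payout rates of 10%, 30% and 60%.
counts, estimates = run_epsilon_greedy([0.1, 0.3, 0.6])
print(counts, [round(e, 2) for e in estimates])
```

Early pulls are spread almost evenly across the machines. As the estimates firm up, most pulls go to the machine that looks best, while the floor keeps a small share of pulls going to the others.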

Solutions to the multi-armed bandit have been applied to use cases like A/B testing, ad serving, and news feed construction. In B2B marketing, the thing to optimize is the next best action. Marketers have a catalog of possible actions, and we need to pick the next one to offer (which arm to pull) for each customer or lead. An action could be a website offer to download content, an email to send, or even a task to assign to a sales rep. Given the low volume of historical data typical in B2B marketing, though, a simple multi-armed bandit solution doesn’t have enough information to run a proper exploration and exploitation phase. Fortunately, we do possess a lot of other information that helps to solve the problem.


At YesPath we’ve been influenced by techniques and strategies used to solve a variant of the multi-armed bandit problem known as the constrained contextual bandit problem. We developed a suite of machine learning algorithms that are informed by a context (prior knowledge) to build an agent that chooses the next best action. We call this agent the YesPath Virtual Assistant. By using a context of features, the assistant can make better decisions early on in the process, when there is little behavioral data to use. Our context is unique in that it considers features specific to the B2B selling process:

  • Account attributes like industry, size, revenue and location
  • Persona attributes like role, seniority and title
  • Topic interests
  • Opportunity stages — different actions have different reward payouts in different stages
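To see why context helps when behavioral data is scarce, consider estimating how often an action converts for one account. A minimal sketch, assuming a simple smoothing rule (not necessarily what YesPath uses): blend the account's own few observations with a prior rate taken from similar accounts, such as the same industry. The function name, the `prior_strength` parameter, and the numbers are hypothetical.

```python
def smoothed_rate(successes, trials, prior_rate, prior_strength=20):
    """Blend observed results with a prior rate from similar accounts.

    prior_strength is the number of "virtual" observations the prior is worth;
    both it and the prior rate are assumptions made for this illustration.
    """
    return (successes + prior_rate * prior_strength) / (trials + prior_strength)

# A brand-new account with no history falls back to its industry's rate.
cold_start = smoothed_rate(0, 0, prior_rate=0.08)

# After 1 conversion in 2 tries, the estimate moves up only a little,
# instead of jumping to the raw 50% that two data points would suggest.
early = smoothed_rate(1, 2, prior_rate=0.08)
print(round(cold_start, 3), round(early, 3))
```

As real observations accumulate, they outweigh the prior and the estimate converges on the account's own behavior.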

The assistant is constrained in that it has a budget. When deciding on actions, it considers the cost of those actions. For example, there may be a limited number of seats for an executive dinner or an iWatch giveaway.
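The budget constraint can be sketched as a filter applied before the choice is made. The example below is a deliberately simplified greedy rule, not a full constrained contextual bandit algorithm. The `Action` class, the action names, the costs, and the reward estimates are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    cost: float      # amount charged to the budget each time the action is used
    remaining: int   # how many times it can still be offered (e.g. dinner seats)

def choose_action(actions, estimated_reward, context, budget):
    """Pick the affordable, still-available action with the best estimate for this context."""
    feasible = [a for a in actions if a.remaining > 0 and a.cost <= budget]
    if not feasible:
        return None, budget
    best = max(feasible, key=lambda a: estimated_reward.get((context, a.name), 0.0))
    best.remaining -= 1
    return best, budget - best.cost

actions = [
    Action("whitepaper_offer", cost=0.0, remaining=1_000_000),
    Action("executive_dinner", cost=200.0, remaining=1),   # only one seat left
]
context = ("software", "vp")   # assumed (industry, seniority) context
estimated_reward = {
    (context, "executive_dinner"): 0.30,
    (context, "whitepaper_offer"): 0.05,
}

budget = 250.0
first, budget = choose_action(actions, estimated_reward, context, budget)
second, budget = choose_action(actions, estimated_reward, context, budget)
print(first.name, second.name, budget)
```

The first contact gets the last dinner seat. The second contact, with the same context, falls back to the whitepaper because the scarce action is no longer available.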

The deal (or opportunity) stage has significant importance in the selling process. Early in the process the assistant has more freedom to pick actions like recommending content or email templates. As the deal progresses, the risk increases so the assistant will shift over to making suggestions of actions for a rep to perform. By the time an opportunity reaches late stages, the rep is in complete control of whether to execute the assistant’s recommendations.


To be a valuable assistant, YesPath needs to understand the business goals, so we establish a system of rewards (payouts). In B2B, the obvious reward is a won deal, but that could take 12 months or more. There are also renewals and upsells to consider. So, the assistant also considers stage progression as a reward. Even that doesn’t go far enough. There could be many months between stage transitions, and we want quick feedback on a daily basis. That’s why we developed a system of engagement points to measure how a particular contact (and everyone else in that account) engaged after an action was chosen. For example, we can say that viewing a web page gets 1 point and attending a webinar gets 10 points. The assistant needs to balance short-term engagement with long-term payouts to maximize customer lifetime value. It’s good if a webinar receives many attendees, but only if some of the attending accounts go on to demonstrate opportunity progress or won deals.
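A layered reward like this can be sketched as a single scoring function. The engagement point values below come from the example above. The weights for stage progression and won deals are assumptions picked only to show long-term outcomes outweighing short-term engagement, and the function name is hypothetical.

```python
# Engagement points from the example in the text.
ENGAGEMENT_POINTS = {"page_view": 1, "webinar_attended": 10}

# Assumed weights for the slower, more valuable signals.
STAGE_ADVANCE_POINTS = 100
DEAL_WON_POINTS = 1000

def account_reward(events, stages_advanced=0, deal_won=False):
    """Score what an account did after an action was chosen."""
    engagement = sum(ENGAGEMENT_POINTS.get(event, 0) for event in events)
    long_term = STAGE_ADVANCE_POINTS * stages_advanced
    if deal_won:
        long_term += DEAL_WON_POINTS
    return engagement + long_term

# Two page views and one webinar attendance: quick daily feedback.
print(account_reward(["page_view", "page_view", "webinar_attended"]))

# The same engagement plus one stage transition counts for far more.
print(account_reward(["page_view", "page_view", "webinar_attended"], stages_advanced=1))
```

Engagement points arrive daily and let the assistant learn quickly, while the larger weights keep it from chasing clicks that never turn into pipeline.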


The assistant never sleeps and never takes vacation. In B2B marketing, content is continuously being produced: webinars, tutorials, conferences, and so on. A common problem with current recommendation systems is starvation. A new piece of content doesn’t start with any views or likes, so it won’t be picked. The assistant intelligently explores new actions in context to estimate their payout before exploiting them. The benefit to the marketer is that she just needs to produce the new asset and add it to the pool. She doesn’t need to set up specific short-term experiments to test its efficacy before rolling it out. It’s “set and forget.”
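One standard way to avoid starvation is an upper confidence bound rule, which gives rarely shown actions an uncertainty bonus. The sketch below uses the classic UCB1 formula as a stand-in, ignores context, and assumes rewards between 0 and 1. It illustrates the idea and is not YesPath's algorithm. The asset names and counts are made up.

```python
import math

def ucb_pick(stats):
    """stats maps action name -> (times_shown, total_reward); returns the name to show next."""
    total = sum(n for n, _ in stats.values())

    def score(item):
        _, (n, reward_sum) = item
        if n == 0:
            return math.inf   # a never-shown action is always tried at least once
        # Average reward plus a bonus that shrinks as the action is shown more often.
        return reward_sum / n + math.sqrt(2 * math.log(total) / n)

    return max(stats.items(), key=score)[0]

stats = {
    "old_webinar": (500, 60.0),
    "old_ebook": (300, 24.0),
    "new_tutorial": (0, 0.0),   # just added to the pool, no history yet
}
picked = ucb_pick(stats)
print(picked)

# Even after five showings with no reward, the new asset's bonus is still large.
stats["new_tutorial"] = (5, 0.0)
picked_again = ucb_pick(stats)
print(picked_again)
```

The new asset keeps getting chances until there is enough evidence to judge it, with no separate experiment to configure.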

There is no one-size-fits-all assistant, so it was important for us to build a platform that allows easy customization. YesPath can be customized by context, actions, and reward policies. For example, a company selling database software will rely more on the industry attribute because it wants to highlight use cases specific to each industry.

Across our customer base, we have seen an average increase of 10% in lead conversion when our assistant picks the next best action on a website. We have also seen engagement increase by 30% or more when the assistant recommends content in marketing automation campaigns.

In the end, we’ve taken a novel approach to the constrained contextual bandit problem, tailored specifically to the B2B selling process. This enables us to pick the next best action and deliver unique and engaging experiences to each contact and account. At YesPath, we’ve always believed that marketers succeed when they craft a personally relevant experience. Historically it was a challenge to do this at scale, but now virtual assistants like YesPath are making it possible.

Originally published at on February 1, 2017.
