Using Machine Learning to Hack a Web Engagement Score

As a Data Scientist at Demandbase, Eric is helping to drive the next generation of AI insights for B2B marketing. After receiving his Ph.D. in Particle Physics from the University of Illinois, he worked in Japan for over two years as a research scientist. In his spare time, Eric enjoys surfing, listening to music and working on various machine-learning projects with his wife.

Artificial Intelligence (AI) has become an increasingly prevalent force in the world of Sales and Marketing. It’s been responsible for several game-changing developments, from improved product recommendations and predictive lead scoring to website personalization and targeted ads. At Demandbase, AI has become one of the core technologies behind our Account-Based Marketing (ABM) platform, which helps B2B marketers target, engage, convert and close the customers who matter most—all at scale.

Many factors go into Account-Based Marketing, including account identification, website traffic and purchasing intent. One indicator strongly connected to down-funnel business outcomes is how engaged an account is with your website. But how do you prioritize website engagement for hundreds, if not thousands, of accounts? What meaningful insights can you draw from changes in engagement over time? And most importantly, how does early engagement correlate with the likelihood of an account becoming qualified? In this post, we’ll discuss how machine learning can help answer these questions by turning your website traffic into a single metric for prioritizing accounts.

Before customers spend money with you, they spend time on your website. Repeated visits to your website not only mean your content is engaging—they can also be good indicators of how interested your prospects are in your products. While there are a lot of tools out there for reporting pageviews and sessions, we’re more interested in how website engagement affects the likelihood of account qualification. Are last week’s 11 pageviews more important than yesterday’s three pageviews? What about the number of sessions in the past month?

Before we can use machine learning to answer these questions, we have to first consider how raw website traffic can be converted into meaningful features. For example, while pageviews are technically a daily time series, we can aggregate them over the past day, week, month and quarter. This gives us four features that capture the recency of pageviews to varying degrees. Similar aggregations can be done for total number of sessions, distinct webpages visited, etc. This results in dozens of features that can be used for our model.
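As a rough sketch of this windowing idea, the snippet below sums a daily pageview series over trailing day, week, month, and quarter windows. The column names and window lengths are illustrative choices, not Demandbase's actual feature definitions.

```python
# Sketch of windowed feature aggregation, assuming a daily pageview
# series indexed by date. Window lengths (1/7/30/90 days) are assumptions.
import pandas as pd

def aggregate_pageviews(daily: pd.Series) -> dict:
    """Sum daily pageviews over trailing day/week/month/quarter windows."""
    windows = {"day": 1, "week": 7, "month": 30, "quarter": 90}
    return {f"pageviews_{name}": int(daily.tail(days).sum())
            for name, days in windows.items()}

# Example: 90 days of synthetic traffic (one pageview per day) for one account
daily = pd.Series(1, index=pd.date_range("2024-01-01", periods=90, freq="D"))
features = aggregate_pageviews(daily)
```

The same pattern extends to sessions or distinct pages visited by swapping in a different daily series.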

There are many challenges associated with finding the “best” set of features. For example, larger companies generate much more web traffic than smaller startups. Are their pageviews equally important, or should they be rescaled in some way? Because Demandbase knows the number of employees at most companies, we can determine precisely how pageviews and sessions should be rescaled based on company size. The image below shows how rescaling a feature allows the model to cleanly separate, or classify, the targets using a linear boundary:

While the figure is simplified, it’s important to note that the data isn’t simply rotated to a new position. Rather, each company’s count of pageviews is scaled by an amount that varies with the number of employees—in other words, this is a nonlinear transformation of the features. Once enough features have been engineered in this way, the data can be collected into a large training sample. Our model then “learns” which features are more important than others and quantifies this knowledge as a set of numerical parameters. With these parameters in hand, our model can then predict the likelihood of qualification for new accounts.
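To make the idea concrete, here is one minimal version of a size-aware rescaling. The actual scaling Demandbase uses isn't described in this post; dividing by the log of the employee count is simply one illustrative nonlinear choice that damps the traffic advantage of very large companies.

```python
# A minimal sketch of size-aware rescaling. Dividing by log1p(employees)
# is an assumed, illustrative transform, not Demandbase's actual formula.
import math

def rescale_pageviews(pageviews: float, employees: int) -> float:
    """Scale raw pageviews down by a factor that grows with company size."""
    return pageviews / math.log1p(employees)

startup = rescale_pageviews(100, 20)        # small company, 100 pageviews
enterprise = rescale_pageviews(100, 20_000) # large company, 100 pageviews
```

Under this transform, the same 100 pageviews count for more at a 20-person startup than at a 20,000-person enterprise, which is the behavior the paragraph above describes.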

Thanks to Demandbase’s technology for identifying companies by IP, we have a significant sample of prospective accounts for training our classifier. But before we can do that, we have to add one last feature to our dataset that indicates whether each account is qualified or not. This feature, called the target variable, labels each account as positive or negative, and acts as a “source of truth” for the classifier during its training.
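In table form, adding the target variable is just one more column. The qualification criterion below (whether an opportunity was opened) is a hypothetical stand-in for whatever the real "source of truth" is.

```python
# Sketch: attaching a binary target to a feature table. Account IDs,
# feature columns, and the qualification criterion are all illustrative.
import pandas as pd

accounts = pd.DataFrame({
    "account_id": ["a1", "a2", "a3"],
    "pageviews_week": [40, 3, 12],
    "opportunity_opened": [True, False, True],  # hypothetical ground truth
})
# The target variable: 1 for positive (qualified), 0 for negative
accounts["qualified"] = accounts["opportunity_opened"].astype(int)
```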

We chose a logistic regression model for classification, which models the likelihood of qualification as a probability between 0 and 1. Logistic regression draws a single, flat (linear) boundary in the feature space, so it works best when positive samples can be at least approximately separated from negative ones by such a boundary. That’s why the nonlinear transformation above was so crucial—without it, no flat boundary could cleanly divide the accounts. The farther a point lies from the boundary, the more certain the classifier is of it being a positive (near 100%) or negative (near 0%) sample. Because points near the boundary are less certain, they’re assigned probabilities closer to 50%.

A logistic regression classifier works by modeling the odds that an account is a positive or negative sample based on its features. For example, “what are the odds that an account is qualified if it has 12 pageviews in the past week?” Odds are strictly positive, which makes them awkward to separate with a flat boundary, so the classifier instead models the logarithm of the odds. This log-odds, or logit, transformation maps large odds (say 100:1) onto positive values and small odds (1:100) onto negative values, symmetrically around zero. Logistic regression takes its name from the logistic (sigmoid) function, which inverts this transformation to turn log-odds back into probabilities.
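The symmetry of the logit transformation is easy to verify numerically: 100:1 and 1:100 odds land at equal and opposite log-odds, and the sigmoid maps them back to probabilities.

```python
# The logit maps odds in (0, inf) onto the whole real line;
# the sigmoid (logistic function) is its inverse.
import math

def logit(p: float) -> float:
    """Log-odds of a probability p."""
    return math.log(p / (1 - p))

def sigmoid(z: float) -> float:
    """Convert log-odds back to a probability."""
    return 1 / (1 + math.exp(-z))

high = logit(100 / 101)  # odds of 100:1 -> positive log-odds
low = logit(1 / 101)     # odds of 1:100 -> negative log-odds
```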

In order to determine the accuracy of our model, we examine its performance on a “test” sample of accounts that were initially excluded from the training process. Since the true values for each account in the test set are known, we can calculate how often the model was correct. In our case, the model correctly predicted accounts with an accuracy of 90%… not too bad!
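The hold-out evaluation loop looks roughly like this with scikit-learn. The data here is synthetic and deliberately separable by a flat boundary, so the accuracy it reports says nothing about real accounts; it only illustrates the train/test mechanics.

```python
# Sketch of hold-out evaluation on synthetic, linearly separable features.
# The 80/20 split and the fake data are assumptions for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))            # two engineered features
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # labels follow a flat boundary

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
```

Because the test accounts were never seen during training, `acc` is an honest estimate of how the model generalizes.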

Because the output of a logistic regression classifier is a probability, its values can be used as a score to rank accounts by their likelihood of becoming qualified. Thanks to our feature engineering, engagement scores can be meaningfully compared across accounts regardless of differences in company size or other firmographics. This enables Sales and Marketing teams to better understand website engagement at every stage of the buyer journey.
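Turning those probabilities into a prioritized list is then a one-liner. The account names and scores below are invented stand-ins for a classifier's `predict_proba` output.

```python
# Sketch: ranking accounts by their predicted probability of qualification.
# Account names and scores are purely illustrative.
scores = {"acme": 0.91, "globex": 0.34, "initech": 0.78}
ranked = sorted(scores, key=scores.get, reverse=True)  # highest score first
```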

While the Web Engagement score has proven useful in products such as Account Identification, it’s only one piece of Demandbase’s platform. In a future post, we’ll discuss how other AI-enabled capabilities, such as Real Time Intent, leverage a wide variety of behavioral, contextual and firmographic data to provide deep insights for attracting, engaging and converting the accounts that matter most.