How We’re Using Machine Learning to Solve the B2B Data Problem

If you’ve ever run a marketing campaign or cold called a list of prospects, you know just how important having the right data is. While the right data can be the start of a really great sales conversation, the wrong data can spell disaster for the most well-thought-out program.

Distinguishing the right data from the wrong data is often the most challenging part of B2B marketing and sales. Fortunately, this problem is top of mind for some of the brightest minds in Data Science and Engineering. At Demandbase, our team of Data Scientists works around the clock to ensure we’re offering the most accurate data to our customers. Recently, we sat down with one of our data scientists, Seth Myers, to understand exactly how we approach B2B data at Demandbase. Here are a few highlights from that conversation:

What’s the story behind all this dirty data?
The inconsistencies in B2B data are usually the direct result of employee turnover. As people change jobs, they leave behind a great deal of outdated data, including job titles, email addresses and phone numbers. Because B2B data is always changing, it is really difficult to tell a good dataset from a bad one. As a result, most data sources are only 70% accurate.

How are we tackling this problem?
At Demandbase, we rely on sophisticated Machine Learning to help us identify and verify the right contacts and companies. We pull contact data from five different sources across the web and then run a Machine Learning algorithm, which estimates how accurate each source is and identifies which sources agree with each other. This lets us determine whether a given person still works at a particular company. We’ve been applying this method for quite a while now, and as of a few weeks ago, we’re proud to say our accuracy score has increased to 90%!
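
To make the idea of source agreement concrete, here is a minimal Python sketch. It is not Demandbase's actual algorithm; it assumes a simple iterative weighted vote, and the source names, contact IDs, and employers below are made up for illustration. Each source starts with equal trust, the employer with the most weighted votes wins for each contact, and each source is then re-scored by how often it agreed with the winner.

```python
from collections import defaultdict

# Hypothetical input: contact id -> {source name -> employer that source reports}.
# All names and values here are illustrative, not real data.
records = {
    "c1": {"A": "Acme", "B": "Acme", "C": "Acme", "D": "Globex", "E": "Acme"},
    "c2": {"A": "Initech", "B": "Initech", "C": "Initech", "D": "Acme", "E": "Initech"},
    "c3": {"A": "Hooli", "B": "Hooli", "D": "Globex"},
}


def resolve(records, weights):
    """Pick, for each contact, the employer with the highest weighted vote."""
    consensus = {}
    for contact, claims in records.items():
        votes = defaultdict(float)
        for source, employer in claims.items():
            votes[employer] += weights[source]
        # sorted() makes tie-breaking deterministic (alphabetical order).
        consensus[contact] = max(sorted(votes), key=votes.get)
    return consensus


def estimate_source_weights(records, n_iters=10):
    """Alternate between resolving contacts and re-scoring each source."""
    sources = {s for claims in records.values() for s in claims}
    weights = {s: 1.0 for s in sources}  # start by trusting every source equally
    for _ in range(n_iters):
        consensus = resolve(records, weights)
        hits, totals = defaultdict(int), defaultdict(int)
        for contact, claims in records.items():
            for source, employer in claims.items():
                totals[source] += 1
                hits[source] += employer == consensus[contact]
        # Smoothed agreement rate, so a source with few claims is not
        # pushed all the way to 0 or 1.
        weights = {s: (hits[s] + 1) / (totals[s] + 2) for s in sources}
    return weights


weights = estimate_source_weights(records)
consensus = resolve(records, weights)
for source in sorted(weights):
    print(source, round(weights[source], 2))
print(consensus)
```

In this toy data, source D disagrees with the others on every contact, so it ends up with the lowest weight and its claims count for less on the next pass. A production system would presumably use richer signals than exact string matches, but the loop shows how agreement between sources can double as an estimate of each source's accuracy.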

What do the recent data quality upgrades mean for our customers?
The recent data quality upgrades translate into better, more accurate data for our customers. They no longer have to spend hours verifying email addresses and checking contact data. Instead, they can spend their time on more important activities, from customizing outreach to building relevant, engaging marketing campaigns.

In the coming months, we’ll continue to devote more resources to confirming data sources and increasing our accuracy score. If you’d like to learn more about our methodology (or if this seems like a problem you’d like to help solve), send me a note @seth_a_myers.