At Demandbase, we love building the invisible systems that make B2B marketing smarter—and faster. Our latest breakthrough, Warp Speed Loading (WSL), is one of those behind-the-scenes upgrades that changes everything.
WSL slashes data load times from nearly two days to about an hour. That means our customers can react to opportunities and risks in real time instead of waiting for predictions to catch up. It’s not just a performance boost—it’s a major leap toward true predictive agility.
This project was also a masterclass in collaboration. Our Unified Data Platform team provided invaluable insights that made WSL possible, and our Developer Experiences team gave us the frameworks to move quickly and safely. Together, we turned a slow, shared data system into one of the fastest, most reliable pipelines at Demandbase.
Here’s how we reimagined our architecture, tackled long-standing bottlenecks, and made our platform faster, cleaner, and more resilient than ever.
Our approach to predictive accuracy is a collaborative, two-way street. We are continually refining and improving our machine learning (ML) models to ensure they deliver the most precise and reliable predictions possible. However, the true power of these models is unlocked when customers actively align their setup and configuration with their specific business goals.
This collaboration—merging our enhanced models with the customer’s specific implementation—accelerates results and cultivates a dynamic environment where customers can drive improvements through experimentation. They can quickly test diverse strategies, evaluate the impact of faster predictions, and continuously refine their methods, establishing an ongoing cycle of optimization and elevated performance. Intuitively, few will have the patience to improve the model setup if it takes two days to observe the impact of changes.
Moreover, imagine giving your competitors an extra two days to plan around and respond to a massive recent change in an account’s pipeline likelihood. The business impact of timely predictions is hard to overstate. When predictive models deliver results swiftly, customers can adapt their strategies in real time. This responsiveness allows them to pivot quickly, mitigate potential threats before they escalate, and seize opportunities that might otherwise be missed.
Enter Warp Speed Loading.
Our previous data loading system faced several significant challenges. The aged and complex orchestration stack required urgent replacement and simplification. Loading data into PostgreSQL was inefficient due to a single JSON column shared by multiple teams. Customer workloads with millions of accounts often took up to 24 hours to process.
Furthermore, change data capture (CDC) via Debezium introduced substantial delays: updates were processed only once daily, potentially adding another 24-hour lag. The ML-Scheduler, built on the Quartz framework, initiated the ML-Loader via a Pulsar queue, after which the PostgreSQL loading process would begin.
Another complication arose from handling soft-deleted model IDs and the varying number of models per customer. All scores and associated metadata were stored within the aforementioned shared JSON column. Direct transmission to CDC was unfeasible because other users of the shared JSON column could overwrite results if our data wasn’t present in PostgreSQL at the time of writing but arrived later. Moreover, end-user clients were exclusively aware of the single JSON column, making it impossible to simply add a concurrency-safe alternative.
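To make the overwrite hazard concrete, here is a minimal sketch of a lost update on a shared JSON column. It is a simplified in-memory model, not our actual PostgreSQL or CDC code: the row layout, the `pipeline_score` key, and the `ml_scores` column are hypothetical names chosen for illustration.

```python
import json

def read_json(row, column):
    """Parse the JSON document stored in one column of a row."""
    return json.loads(row[column])

def write_json(row, column, document):
    """Replace the whole JSON document stored in one column of a row."""
    row[column] = json.dumps(document)

def simulate_shared_column():
    # One JSON column ("fields") shared by two writers.
    row = {"account_id": 1, "fields": json.dumps({"owner": "alice"})}
    stale = read_json(row, "fields")   # another team reads before our scores exist
    ours = read_json(row, "fields")
    ours["pipeline_score"] = 0.91      # hypothetical score key
    write_json(row, "fields", ours)    # our scores land
    stale["owner"] = "bob"
    write_json(row, "fields", stale)   # whole-document write from a stale snapshot
    return read_json(row, "fields")

def simulate_dedicated_column():
    # Same interleaving, but scores live in their own (hypothetical) column.
    row = {
        "account_id": 1,
        "fields": json.dumps({"owner": "alice"}),
        "ml_scores": json.dumps({}),
    }
    stale = read_json(row, "fields")
    write_json(row, "ml_scores", {"pipeline_score": 0.91})
    stale["owner"] = "bob"
    write_json(row, "fields", stale)
    return read_json(row, "fields"), read_json(row, "ml_scores")
```

In the shared-column run the score disappears, because the second writer replaces the whole document from a snapshot taken before the score existed. In the dedicated-column run both writes survive, but that only helps if clients know to read the second column.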
The advancements detailed in the following section ultimately resolved these issues. By leveraging Demandbase’s evolving data platform, we’ve reduced a worst-case load time of 48 hours to approximately one hour.
The fascinating thing is that WSL was made possible by only two changes.
First, our internal semantic layer was updated so that teams can point end-user clients to any column on a data object when looking up inner JSON fields. Suppose column A holds the JSON keys B1, B2, …, BN. A new metadata field associated with each key Bi specifies the column where it can be found, in this case A. Previously, a client would search for the field name among the available columns and, if no column matched, assume the field was a JSON key in the default column (fields). The new metadata gave us a way to serve our data from a column that is independent of other teams’ writes. Even so, loading only some columns of a row in the same physical table would not have been simple on our Iceberg- and StarRocks-centric platform.
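The two lookup rules can be sketched as follows. This is an illustrative model only, not the semantic layer's real API: the function names, the `(column, json_key)` return shape, and the example column `ml_scores` are assumptions. Only the default column name `fields` comes from the description above.

```python
from typing import Dict, Optional, Set, Tuple

DEFAULT_JSON_COLUMN = "fields"  # default JSON column named in the text

# A location is (physical column, JSON key inside it, or None for a plain column).
Location = Tuple[str, Optional[str]]

def resolve_legacy(field: str, columns: Set[str]) -> Location:
    """Old rule: use a matching physical column, else assume a key in the default JSON column."""
    if field in columns:
        return (field, None)
    return (DEFAULT_JSON_COLUMN, field)

def resolve_with_metadata(
    field: str, columns: Set[str], json_column_for: Dict[str, str]
) -> Location:
    """New rule: per-field metadata may name the JSON column that holds the key."""
    if field in json_column_for:
        return (json_column_for[field], field)
    # Fields without metadata keep the legacy behavior.
    return resolve_legacy(field, columns)

# Hypothetical example: scores are served from their own JSON column.
columns = {"account_id", "fields", "ml_scores"}
metadata = {"pipeline_score": "ml_scores"}
```

With this metadata, `resolve_with_metadata("pipeline_score", columns, metadata)` points at `ml_scores`, while the legacy rule would have looked for the same key inside `fields`. Fields with no metadata entry resolve exactly as before, so existing clients are unaffected.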
Second, with our platform’s recent migration to StarRocks, our colleagues were able to serve our data as part of a view over the original table. StarRocks provides scalable and efficient mechanisms that make this approach feasible. As a result, our loading process is now simply a Spark job that overwrites Iceberg partitions, followed by a notification to the downstream data platform when the partitions are ready to use. Loading is exceptionally fast, and there is no CDC to wait on!
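The load step's semantics can be modeled in a few lines of plain Python. This is a toy model, not the Spark job itself: it uses dictionaries in place of Iceberg tables, assumes (hypothetically) that data is partitioned per customer, and represents the downstream notification as a simple callback.

```python
from typing import Callable, Dict, List

# Toy table: partition key -> rows in that partition.
Table = Dict[str, List[dict]]

def overwrite_partitions(
    table: Table, batch: Table, notify: Callable[[List[str]], None]
) -> Table:
    """Replace every partition present in the batch, leave the rest untouched,
    then announce which partitions are ready."""
    updated = dict(table)
    for partition_key, rows in batch.items():
        updated[partition_key] = list(rows)  # whole-partition replacement, no row merge
    notify(sorted(batch))
    return updated

notifications: List[List[str]] = []
current: Table = {
    "customer_a": [{"account_id": 1, "score": 0.40}],
    "customer_b": [{"account_id": 7, "score": 0.55}],
}
fresh: Table = {
    "customer_a": [
        {"account_id": 1, "score": 0.91},
        {"account_id": 2, "score": 0.12},
    ]
}
current = overwrite_partitions(current, fresh, notifications.append)
```

Because each partition is replaced wholesale, there is no per-row merge and no change stream to replay. Partitions that are absent from the batch, such as `customer_b` here, are left exactly as they were.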
Our process, though still initiated by the ML-Scheduler, has undergone a significant streamlining to enhance efficiency and reliability. The updated workflow is as follows:
Data ingestion and processing: Upstream data science outputs are written directly to tabular data files stored in S3. This acts as a centralized and accessible data lake.
Workflow orchestration: A Temporal Workflow is then triggered, orchestrating a series of EMR jobs. This ensures a robust and fault-tolerant execution of the data processing pipeline.
Data transformation with EMR: Two distinct EMR jobs are executed:
Data loading and integration: The processed data is subsequently loaded into StarRocks, our high-performance analytical database. There, it is joined into the existing Account view, providing a comprehensive and unified perspective.
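The idea of joining separately loaded scores into an account view can be illustrated with a small runnable sketch. It uses SQLite from the Python standard library purely as a stand-in for StarRocks, and the table, view, and column names (`account_base`, `ml_scores`, `account`, `pipeline_score`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Base account data, owned and written by other teams.
    CREATE TABLE account_base (account_id INTEGER PRIMARY KEY, name TEXT);
    -- Scores, loaded independently by the ML pipeline.
    CREATE TABLE ml_scores (account_id INTEGER PRIMARY KEY, pipeline_score REAL);
    -- Clients query the view; the LEFT JOIN keeps accounts that have no score yet.
    CREATE VIEW account AS
        SELECT b.account_id, b.name, s.pipeline_score
        FROM account_base AS b
        LEFT JOIN ml_scores AS s ON s.account_id = b.account_id;
    INSERT INTO account_base VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO ml_scores VALUES (1, 0.91);
    """
)

query = "SELECT account_id, name, pipeline_score FROM account ORDER BY account_id"
before_reload = conn.execute(query).fetchall()

# Reloading scores touches only the scores table; the base table is never rewritten.
conn.execute("DELETE FROM ml_scores")
conn.executemany("INSERT INTO ml_scores VALUES (?, ?)", [(1, 0.35), (2, 0.77)])
after_reload = conn.execute(query).fetchall()
```

The design point is that a score reload never writes to the rows other teams own. The view recomputes the join at query time, so clients see the new scores as soon as the load finishes.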
This refined process offers several key benefits:
With Warp Speed Loading, we’ve cut our data loading from a 24-to-48-hour cycle to about an hour. That’s more than a technical upgrade; it’s a strategic shift.
Faster predictions mean faster reactions, smarter experimentation, and sharper competitive advantage. For our customers, this means better decisions and bigger wins.
And for us? It’s another step in our mission to make data intelligence truly instant.