Beyond the Buzz: reverse ETL

We’ve raved about the benefits and capabilities of customer data platforms (CDPs) many times before. But lately, there’s a lot of buzz around a new type of data activation: reverse ETL. Reverse ETL as an alternative to a CDP is all the talk right now. Of course, we couldn’t pass up the opportunity to weigh in on the debate in your favourite content series. Should you consider changing your data set-up? Let’s find out in a new Beyond the Buzz.

Iain Murphy

What is reverse ETL? And what’s the difference with a traditional CDP?

ETL stands for Extract – Transform – Load. In short, it’s a method to get data out of a business application (a CRM, digital analytics or marketing automation tool), reshape it into a consistent format and load it into a data container (like a data lake or warehouse). A traditional CDP collects and manages customer data from different sources. Through ETL, it then uses this data to activate tools like marketing automation.

The alternative to a CDP is having a data warehouse where all your customer data is available and which you treat as your single source of truth. The downside is that it’s more difficult to activate the data. Reverse ETL offers a way to do that. It is basically a method to copy and transfer data from the data warehouse into your business applications. Whereas ETL is a form of data integration, reverse ETL is data activation.
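To make the direction of travel concrete, here is a minimal Python sketch of a reverse ETL sync. It is an illustration under stated assumptions, not how any particular vendor does it: an in-memory SQLite database stands in for the data warehouse, and FakeMarketingTool, upsert_contact and the customers table are hypothetical names invented for this example.

```python
import sqlite3

# Stand-in for the data warehouse: an in-memory SQLite database (assumption).
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE customers (email TEXT, lifetime_value REAL, segment TEXT)"
)
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("ann@example.com", 420.0, "buyer"),
        ("bob@example.com", 0.0, "prospect"),
    ],
)


class FakeMarketingTool:
    """Hypothetical business application; a real one would expose an API."""

    def __init__(self):
        self.contacts = {}

    def upsert_contact(self, email, attributes):
        # Create the contact or overwrite its attributes if it already exists.
        self.contacts[email] = attributes


def reverse_etl(conn, tool):
    """Copy customer attributes from the warehouse into the business tool."""
    rows = conn.execute(
        "SELECT email, lifetime_value, segment FROM customers ORDER BY email"
    )
    synced = 0
    for email, lifetime_value, segment in rows:
        tool.upsert_contact(
            email, {"lifetime_value": lifetime_value, "segment": segment}
        )
        synced += 1
    return synced


tool = FakeMarketingTool()
count = reverse_etl(warehouse, tool)
print(count, tool.contacts["ann@example.com"]["segment"])
```

The warehouse stays the source of truth here; the marketing tool only receives a copy of the attributes it needs.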

What’s all the buzz about?

In the past, businesses have always struggled to activate the data from their data warehouse, which they consider to be their single source of truth. Doing so often turned out to be complex and time-consuming.

iPaaS solutions have done well up to now at activating audiences against business use cases (think MuleSoft, Marketo, Salesforce, HubSpot, Facebook, Google and the like). But more and more businesses are now looking for dedicated platforms to manage marketing automation for identified audiences.

That’s why CDPs are preferred to deliver value to the marketing department. A CDP aspires to hold a single source of truth about each customer within its database, one that can be updated and activated in real time.

The problem? The quality of the data inside the CDP depends on the quality of the data layer (in short: the way you structure information on your website so that your tag management system can send it reliably to an endpoint) and the ability to enrich it with external data sources. The more comprehensive and complete your data layer is, the more creative and actionable your use cases are. But many companies struggle to make their data layer detailed enough at first, or to keep it optimised as their website evolves. As a result, key data is missing.

With reverse ETL, there’s a new way to combine the depth and complexity of a data warehousing solution with the easy marketing automation perks of a CDP. Providers like Hightouch, Twilio Segment, mParticle and Lytics have come out with reverse ETL solutions, and it is expected that other traditional CDPs will soon follow suit.

Has the time come to revisit your data warehousing set-up and activate what businesses widely consider to be their single source of truth? Let’s weigh up both methods.

Three main differences between a CDP and reverse ETL

1. CDP is real-time, reverse ETL is batched

One of the big advantages of a traditional CDP is that the data is updated in real time. Reverse ETL syncs data from the warehouse periodically (or in ‘batches’), for instance every morning at 3am.

Let’s say you’re running an e-commerce website. If a customer visits your web shop in the morning, he’s segmented as a prospect. If he puts something in his cart and then closes the website an hour later, he is now a cart abandoner. Once he decides to make the purchase in the afternoon, he’s become a buying customer.

A traditional CDP will update this change in segmentation instantly. After he abandons the cart, you could send him a personalised marketing offer immediately to persuade him.

With reverse ETL, the update only happens the next day, so this customer will go from a prospect to a buying customer in one step. The cart abandoner stage is never seen, leaving no room for personalised and real-time marketing offers.
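The shopper example can be sketched in a few lines of Python. This is a simplified illustration, assuming hypothetical event names (visit, add_to_cart, session_end, purchase), made-up timestamps and a nightly 3am sync; real platforms use far richer segmentation rules.

```python
from datetime import datetime

# Hypothetical events for one visitor on one day (names are illustrative).
events = [
    (datetime(2024, 1, 10, 9, 0), "visit"),
    (datetime(2024, 1, 10, 9, 30), "add_to_cart"),
    (datetime(2024, 1, 10, 10, 30), "session_end"),
    (datetime(2024, 1, 10, 15, 0), "purchase"),
]


def segment(seen):
    """Derive a segment from the event names seen so far."""
    if "purchase" in seen:
        return "buyer"
    if "add_to_cart" in seen and "session_end" in seen:
        return "cart_abandoner"
    return "prospect"


def real_time_segments(events):
    """Re-evaluate the segment after every event, as a real-time CDP would."""
    seen, history = [], []
    for _, name in events:
        seen.append(name)
        current = segment(seen)
        # Record the segment only when it changes.
        if not history or history[-1] != current:
            history.append(current)
    return history


def batch_segments(events, sync_times):
    """Evaluate the segment only at each scheduled sync."""
    history = []
    for sync in sync_times:
        seen = [name for ts, name in events if ts <= sync]
        if seen:  # nothing to report before the first event
            current = segment(seen)
            if not history or history[-1] != current:
                history.append(current)
    return history


nightly = [datetime(2024, 1, 10, 3, 0), datetime(2024, 1, 11, 3, 0)]
print(real_time_segments(events))
print(batch_segments(events, nightly))
```

In this sketch the real-time path passes through prospect, cart_abandoner and buyer, while the nightly batch only ever records buyer, so the window for an abandoned-cart offer is missed.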

2. CDP has limited and predefined use cases, reverse ETL allows more complexity

With a traditional CDP, the data is collected and organised to suit a specific use case. Unless you are importing other data sources, you have to know in advance what kind of data you need and how you want to activate it. It could require some additional development resources to update the in-page/in-app data object, or activate and combine new data sources to achieve whatever the new use case may be.

Reverse ETL, however, already has all the data available in the data lake or warehouse. This means that reverse ETL CDPs adopt a ‘collect now and worry later’ approach to use cases, especially with the more complex ones. For instance, Machine Learning, AI and data modelling use cases are harder to achieve with a traditional CDP.

A data warehouse (using reverse ETL) generally allows more advanced use cases to exploit the data even further. The data already exists in the data lake, and data scientists can leverage it to support more complex data models.

In other words: you don’t have to know in advance which data you need, where it comes from or how you want to organise and activate it.

3. CDP is user-friendly, reverse ETL requires more resources

A traditional CDP is a marketing-focused tool, created to easily activate data and create marketing value. You don’t need data wizards in-house, but you may still require some technically minded individuals to configure audience building or data activation through APIs. It’s easy to set up and use for marketing specialists.

Reverse ETL, on the other hand, is harder to implement. It requires some technical resources and physical hardware to truly benefit from its capabilities. You need to link tables and views of different databases and write SQL code to get the view of the customer or product you want to activate.
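As a rough idea of what that SQL work looks like, the Python sketch below builds a single customer view from two source tables. Everything in it is an assumption made for illustration: the table names crm_customers and shop_orders, the view name customer_360, and the use of an in-memory SQLite database in place of a real warehouse.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE crm_customers (customer_id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE shop_orders (order_id INTEGER PRIMARY KEY,
                              customer_id INTEGER, amount REAL);
    INSERT INTO crm_customers VALUES (1, 'ann@example.com'), (2, 'bob@example.com');
    INSERT INTO shop_orders VALUES (10, 1, 50.0), (11, 1, 30.0);

    -- One row per customer, ready to be pushed to a marketing tool.
    CREATE VIEW customer_360 AS
    SELECT c.customer_id,
           c.email,
           COUNT(o.order_id)          AS order_count,
           COALESCE(SUM(o.amount), 0) AS total_spent
    FROM crm_customers AS c
    LEFT JOIN shop_orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.email;
    """
)

rows = conn.execute(
    "SELECT email, order_count, total_spent FROM customer_360 ORDER BY customer_id"
).fetchall()
for row in rows:
    print(row)
```

The LEFT JOIN keeps customers who have never ordered, which matters if you also want to activate prospects and not only buyers.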

So… how should you decide which one to use?

Since the traditional CDP is more of a real-time platform offering, it makes sense to use it when you have a business that moves quickly, like e-commerce. Moreover, since a traditional CDP is quite user-friendly, it’s more suited for companies that don’t have the technical resources to set up and manage a data warehouse, whether on-premises or hosted.

Reverse ETL, on the other hand, is more flexible with data and better suited to complex use cases. It might be more useful in industries like financial services or media, where complex data modelling is required: modelling events such as fraud or churn, for example, or modelling content affinity to increase readership or subscriptions.

Can’t you just use both?

You can. The real problem is that you’re creating yet more data silos, the very thing you try to break down with CDPs. Plus, the two systems are usually not interconnected, so each has its own view of who a customer might be. This could potentially lead to a disastrous customer experience, for instance if both the traditional CDP and the reverse ETL CDP trigger contradictory marketing actions.

If CDP #1 identifies someone as a cart abandoner whilst CDP #2 identifies the same person as a buyer, which promotion do you send? We’d advise against combining both methods since it will add a whole new level of complexity, which is what you were trying to avoid in the first place.