Data Leakage: Why & How to Lock it Down

Do you remember the first song you downloaded on Napster? And the thousands after that? 1999 was an MP3 free-for-all, and the record labels never quite recovered. Today’s digital advertising ecosystem feels a bit like the early days of Napster. With the press of a button, advertisers can steal consumer data from publishers. But it’s not too late for publishers to take control of their data and avoid the fate of record labels.

What is data leakage?

As brands have embraced audience targeting, data has become the currency of online advertising. Publishers who attract unique audiences are finding that their traditional content businesses create data exhaust that is highly valuable to advertisers.

The Knot, as an example, is the go-to website for brides to be. Advertisers like Williams Sonoma and Macy’s pay huge CPMs to run their ads on theknot.com in the hopes of convincing brides to set up wedding registries. But the average bride spends only a small portion of her online time visiting The Knot’s registry page, making these ad impressions scarce and expensive. Through audience targeting, Williams Sonoma and Macy’s can reach these same brides while they are reading the news, checking the weather, and browsing social media. These audience targeted impressions provide vastly greater scale at lower prices.

The Knot did all the work to build compelling wedding planning services, and it is uniquely positioned to monetize its data asset — an accurate list of brides to be. The challenge for publishers like The Knot is that ad tech systems are leaky. With very little effort, advertisers can steal the list of visitors to theknot.com and set up audience targeting campaigns.

Your data at risk

There are two common ways that publishers lose data: RTB bid requests and DMP pixel tracking.

1. RTB Bid Requests
Each time a publisher conducts an auction via RTB, its ad exchange sends a bid request to multiple potential buyers. The bid request includes a long list of parameters that describe the impression. Notably, the bid request includes a user ID and site URL. By recording these incoming bid requests, ad buying platforms can build lists of users who have visited specific pages. Just set up a rule to record the user ID of every bid request from theknot.com, and call that list of IDs “Brides.” Then set up a campaign to target Brides across the web. That’s it. Data stolen.
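To make the mechanics concrete, here is a minimal sketch of the recording rule described above. It assumes bid requests arrive as dictionaries shaped loosely like OpenRTB, with a `site.domain` and a `user.id`; the function name `build_segments`, the rule format, and the sample data are hypothetical and only for illustration.

```python
from collections import defaultdict

def build_segments(bid_requests, rules):
    """Group user IDs into named segments based on the site they were seen on.

    bid_requests: iterable of dicts with optional "site" and "user" objects
                  (an assumed, OpenRTB-like shape).
    rules:        mapping of site domain -> segment name (hypothetical format).
    """
    segments = defaultdict(set)
    for request in bid_requests:
        domain = request.get("site", {}).get("domain")
        user_id = request.get("user", {}).get("id")
        # Record the user only when the domain matches a rule and an ID is present.
        if domain in rules and user_id:
            segments[rules[domain]].add(user_id)
    return dict(segments)

# Made-up sample traffic, not real data.
sample_requests = [
    {"site": {"domain": "theknot.com", "page": "theknot.com/registry"}, "user": {"id": "u1"}},
    {"site": {"domain": "example-news.com"}, "user": {"id": "u1"}},
    {"site": {"domain": "theknot.com"}, "user": {"id": "u2"}},
    {"site": {"domain": "theknot.com"}, "user": {}},  # no user ID, so nothing to record
]

brides = build_segments(sample_requests, {"theknot.com": "Brides"})
```

Nothing here requires winning an auction or paying the publisher. Simply receiving the bid request is enough to grow the list.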

2. DMP Pixel Tracking
Data leakage isn’t just an RTB problem. Traditional direct sold campaigns carry data leakage risk as well. Brands are increasingly working with DMPs to build a holistic view of online advertising activity. Through a basic tracking pixel that is placed within a brand’s ads, its DMP records a log of every impression the brand delivers to consumers. Each time an ad is served, the brand’s DMP records the user’s ID along with details about the impression. If, for example, a brand works with The Knot to run ads on the registry page, the brand’s DMP will create a log of every user ID reached by the campaign. The advertiser can then create a segment of brides who were exposed to the campaign and target that segment across the web. Easy. Data stolen.
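The pixel log described above can be sketched in a few lines. This is an illustration only: it assumes each pixel fire is logged as a URL with `uid` and `campaign` query parameters, and the host `dmp.example`, the parameter names, and the function `segment_from_pixel_log` are all hypothetical rather than any particular DMP's format.

```python
from urllib.parse import urlparse, parse_qs

def segment_from_pixel_log(pixel_urls, campaign_id):
    """Collect the user IDs exposed to one campaign from logged pixel URLs.

    Assumes (hypothetically) that each pixel call carries `uid` and
    `campaign` query parameters.
    """
    users = set()
    for url in pixel_urls:
        params = parse_qs(urlparse(url).query)
        # parse_qs returns a list of values for each parameter.
        if params.get("campaign") == [campaign_id]:
            users.update(params.get("uid", []))
    return users

# Made-up log lines for illustration.
pixel_log = [
    "https://dmp.example/pixel.gif?uid=u1&campaign=knot-registry",
    "https://dmp.example/pixel.gif?uid=u2&campaign=knot-registry",
    "https://dmp.example/pixel.gif?uid=u3&campaign=weather-site",
    "https://dmp.example/pixel.gif?uid=u1&campaign=knot-registry",  # repeat impression
]

exposed_brides = segment_from_pixel_log(pixel_log, "knot-registry")
```

The campaign ID effectively doubles as an audience label: every user reached by the registry page buy ends up in a reusable segment.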

Protecting your data

Aside from working with trusted partners and drafting strong contractual language, there are two things every publisher should do to protect its data:

Have an RTB transparency strategy
The IAB, which defines RTB standards, made a controversial decision to give publishers discretion over the degree of transparency they provide in bid requests. A publisher like The Knot can issue bid requests with full URL transparency (theknot.com/registry), partial transparency (theknot.com/unknown), or none at all (unknown.com). Because DSPs can make smarter buying decisions with better information, bid requests that include a full URL string command higher prices, driving better publisher yield. But these transparent bid requests also create data leakage risk. Publishers should seek to maximize RTB yield while controlling data leakage, and this typically means limiting transparency on the highest value inventory.
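The three transparency levels above can be expressed as a small masking step applied before a bid request goes out. This is a sketch under stated assumptions: the function names, the level labels, and the CPM threshold are hypothetical, and a real exchange or SSP would expose this as a configuration setting rather than publisher-written code.

```python
from urllib.parse import urlparse

def mask_page_url(url, level):
    """Return the page URL a buyer would see at a given transparency level.

    level: "full", "partial", or "none" (hypothetical labels).
    """
    # Accept URLs with or without a scheme, e.g. "theknot.com/registry".
    parsed = urlparse(url if "//" in url else "//" + url)
    if level == "full":
        return parsed.netloc + parsed.path
    if level == "partial":
        return parsed.netloc + "/unknown"
    if level == "none":
        return "unknown.com"
    raise ValueError(f"unknown transparency level: {level}")

def choose_level(direct_cpm, high_value_cpm=20.0):
    """Limit transparency on the highest value inventory.

    The 20.0 threshold is an arbitrary placeholder, not a recommendation.
    """
    return "partial" if direct_cpm >= high_value_cpm else "full"

registry_url = mask_page_url("https://theknot.com/registry", choose_level(45.0))
article_url = mask_page_url("theknot.com/articles", choose_level(3.0))
```

In this sketch the registry page, which sells for a high direct CPM, goes out as theknot.com/unknown, while lower value pages keep full URLs to preserve yield.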

Build a data business
Back in the Napster days, stolen music felt like the new normal. The idea that an online music store could compete with free downloads was a long shot, but it turned out that consumers were willing to pay for a seamless experience (and maybe felt a little guilty about stealing). The same seems to be true for advertising. Forward thinking publishers have developed turnkey solutions that enable advertisers to take advantage of unique data assets. In some cases, publishers license data segments directly to advertisers, making the data actionable in the advertiser’s preferred DSP. In other cases, publishers are creating new media services businesses that sell audience targeted campaigns to advertisers.

Because of the early emergence of buy-side audience targeting tools, advertisers got a jump start in the programmatic space, and many publishers are still playing catch-up. But with a focused data strategy, publishers can control data leakage and unlock new high margin revenue streams.