Data Design: Finding Order in the Chaos


By Michael Applebaum - 03/12/2018

How machine learning and a new approach to data analysis are taking marketing to the next level

Epsilon found through a machine learning study of social conversations among donors and visitors that some of the most powerful associations with the San Diego Zoo had to do with a desire to save certain endangered species from extinction.

For many marketers, machine learning and artificial intelligence are tools to be deployed in the far-off future. Data, on the other hand, is a front-and-center priority, and today’s marketer must be able to leverage customer information of all types, shapes and sizes. But imagine if these pursuits could come together – right now – to create an entirely new marketing model. Invariably, any marketer who could successfully apply machine learning to the analysis of data would instantly gain a tremendous competitive advantage; namely, access to a powerful arsenal of new tools to help figure out what makes consumers tick, identify and expand new target audiences, and build bigger and better solutions.

“Machine learning is setting the stage for intelligent systems to establish a baseline from which we can build predictive models for how to connect with consumers and prepare for a time when artificial intelligence takes on more consumer-centric tasks,” says Tom Edwards, chief digital and innovation officer at Epsilon. “Ultimately, it will bring us closer to achieving our major objectives: What are the tools needed to visualize the insights or occasions? How can we use that knowledge to produce a comparative analysis over time? How can we fuel the different steps of consideration, evaluation, action, experience and loyalty?”

These ideas are at the heart of a new paradigm that Epsilon has introduced and is calling Data Design. According to the experts who are building the model, Data Design can generate deeper insights than traditional predictive data analysis by combining what the agency calls the structured “data of identity” (e.g., demographic profiles, transaction histories, cross-device IDs) with the unstructured “data of culture” (e.g., product reviews, comments, posts, images, music and emojis) that pervades the internet and expresses who people really are.

“Data Design brings the right data together with the right tools to facilitate insight development. Ultimately it will lead to better audiences and activations,” says Ellen Foster, senior vice president of Data Design at Epsilon, who coined the term and continues to develop the concept. “With new data sets and tools appearing all the time, the practice constantly changes. The work bears some similarities to architecture, engineering and construction. We do literally design the data.”

Epsilon’s Data Design practice is being built at a time when marketers are attempting to harness the power of an ever-expanding portfolio of data, which includes all of the rich (and often not well understood) stockpiles of data and consumer-generated content that come from social sharing. An IBM study in 2017 revealed that 2.5 quintillion bytes of data were being created every day, and that 90% of the world’s data had been created in the previous two years alone. Simultaneously, IDC predicted that 163 zettabytes of data would be generated by 2025 – an information explosion requiring the coinage of ever more exotic terms for higher magnitudes, with “yottabytes” and “brontobytes” coming next.
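To put those figures on a single scale, the short sketch below converts them using decimal prefixes (exa = 10^18, zetta = 10^21, yotta = 10^24). It assumes that "quintillion" means 10^18 and, for the yearly total, that the daily rate holds steady for 365 days, which neither study cited above claims. The variable and function names are illustrative.

```python
# Decimal (SI) prefixes, expressed in bytes.
EXA, ZETTA, YOTTA = 10**18, 10**21, 10**24

daily_bytes = 25 * 10**17      # 2.5 quintillion bytes per day (IBM figure quoted above)
forecast_bytes = 163 * ZETTA   # 163 zettabytes (IDC figure quoted above)


def in_units(n_bytes, unit):
    """Express a byte count in the given unit."""
    return n_bytes / unit


daily_in_exabytes = in_units(daily_bytes, EXA)
# Assumption: the daily rate stays flat for a full year.
yearly_in_zettabytes = in_units(daily_bytes * 365, ZETTA)
forecast_in_yottabytes = in_units(forecast_bytes, YOTTA)

print(f"Per day:  {daily_in_exabytes} exabytes")
print(f"Per year: {yearly_in_zettabytes} zettabytes (flat-rate assumption)")
print(f"Forecast: {forecast_in_yottabytes} yottabytes")
```

Read this way, 2.5 quintillion bytes a day is 2.5 exabytes, and the 163-zettabyte forecast is still well under a single yottabyte.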

Moreover, consumer behavior today is constantly changing and challenging to predict. While most companies have access to vital information about their customers in order to build effective, targeted solutions, they do not yet have a clear and consistent pathway to leverage those data sets in a way that makes the information actionable in real time, says Edwards. “If I have 2,000 data points tied to each individual, the question then becomes: How do I make sense of all of that information and drive results for the business?” Consider: A typical activation across four touchpoints and four channels yields 256 distinct paths to purchase, per the IBM study.
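One way to arrive at a figure like 256, assuming each of the four touchpoints can be reached through any one of the four channels, is 4 × 4 × 4 × 4. The sketch below enumerates those combinations. The touchpoint and channel names are hypothetical placeholders, since only the counts are given here.

```python
from itertools import product

# Hypothetical labels: the article states only that there are four of each.
touchpoints = ["awareness", "consideration", "purchase", "loyalty"]
channels = ["email", "social", "display", "in_store"]


def purchase_paths(touchpoints, channels):
    """Return every way to assign one channel to each touchpoint, in order."""
    return [
        tuple(zip(touchpoints, combo))
        for combo in product(channels, repeat=len(touchpoints))
    ]


paths = purchase_paths(touchpoints, channels)
print(len(paths))  # 4 ** 4 = 256
print(paths[0])    # the first path: every touchpoint served by the first channel
```

Adding a fifth touchpoint under the same assumption would multiply the count by four again, which is why the number of paths grows so quickly.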

A Better Type of Brain

One of the main building blocks of Data Design is machine learning. Part of a family of tools belonging to the field of computing research known as artificial intelligence (AI) – which also includes motion (e.g., self-driving cars) and perception (e.g., facial recognition) – machine learning is an effective tool for marketers because its algorithms can update conclusions based on new data, even in real time. The process is often referred to as continuous learning. At this scale it works faster than any human analyst can; in effect, it means having a better type of brain.

Machine learning is hardly new. It’s a workhorse that dates back to Alan Turing’s work with testing new electronic computing solutions after World War II. In the 1950s and ’60s, IBM used the emerging science to assess gaps in its data sets as the company began to automate functions like payroll, billing and inventory management. With today’s vast computing power, limitless storage capabilities and endless caches of data, machine learning can support a vast array of predictive applications.

Epsilon deploys machine learning across the entire spectrum of marketing. It is commonly used to optimize digital advertising and to accelerate delivery of messages across channels at the right time and place. For example, an airline customer who checks in at a self-service kiosk may update his or her flight details to include an additional checked bag or upgraded seat. That data is immediately sent to the airline’s marketing team, whose automated systems can push out customized offers or messages to the same individual within minutes. “Machine learning can dramatically improve campaign performance. It is particularly effective when used to optimize personalization techniques and to cross-sell or up-sell promotional content,” says Mark Sucrese, vice president of machine learning and artificial intelligence at Epsilon.

Where machine learning has yet to unleash its full potential is making sense of all that vast unstructured data from the internet. “Machine learning helps us organize the unstructured data. It pulls order out of the chaos,” says Foster. She gives three reasons why machine learning is an essential analytical approach:

1. There is a galaxy of consumer marketing insights locked up in an infinite universe of unstructured data. If one marketer doesn’t get to it, another one will.

2. No human being is capable of finding the patterns in unstructured data.

3. Marketers cannot tie insights from unstructured data back to trends in structured data until the connecting themes have been identified. To get at insights that can be used to drive marketing performance, they need a better type of brain.
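To make the idea of finding connecting themes concrete, here is a deliberately tiny sketch that groups short posts by shared vocabulary. It is a word-overlap stand-in, not Epsilon's method: a production system would rely on trained models, and the sample posts, stopword list and similarity threshold are all invented for illustration.

```python
import math
import re
from collections import Counter

# Illustrative stopword list; a real pipeline would use a much fuller one.
STOPWORDS = {"a", "an", "the", "my", "i", "in", "on", "with", "and", "to", "of", "is", "for"}


def vectorize(text):
    """Lowercase the text, split it into words, drop stopwords, count the rest."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_posts(posts, threshold=0.5):
    """Greedy single pass: a post joins the first group whose seed post is similar enough."""
    groups = []  # each entry is (seed_vector, [member posts])
    for post in posts:
        vec = vectorize(post)
        for seed, members in groups:
            if cosine(seed, vec) >= threshold:
                members.append(post)
                break
        else:
            groups.append((vec, [post]))
    return [members for _, members in groups]


# Invented sample posts.
posts = [
    "iced coffee with my breakfast bagel",
    "road trip coffee in the cup holder",
    "bagel and iced coffee for breakfast",
    "coffee in my cup holder again",
]

for theme in group_posts(posts):
    print(theme)
```

Once each post carries a group label, that label can sit alongside structured fields such as purchase channel, which is the kind of link the third point describes.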

This type of learning will likely make its way down the purchase funnel. For example, Epsilon’s Data Design team has used machine learning to analyze images posted on Instagram of ready-to-drink kombucha products. The researchers noticed some intriguing distinctions – for instance, while many images capture a bottle next to food, other images depict bottles in more general lifestyle settings, such as propped up on the steering wheel of a car.

“For some people, kombucha may have a social status apart from its health-related attributes,” explains Foster. “I think it is safe to hypothesize that the steering wheel images could have come from consumers who value the social currency of kombucha and who may very well have purchased the product at a convenience store. This gives us a critical channel-specific insight – the idea that social status or brand is more important to buyers at convenience stores – that marketers in the category could leverage in innumerable ways.” Indeed, retail distribution of kombucha is expanding rapidly, with many grocery stores now featuring dedicated refrigerated sections of the popular (and pricey) products. According to research firm Statista, the kombucha category is expected to triple over the next five years to $2.5 billion in U.S. retail sales.

Gathering Insights

Like social listening, machine learning extracts feedback directly from consumers without attempting to prove a hypothesis or corroborate a thesis. But the scope and scale of the research are much broader, and the resulting insights can go much deeper. The machines bring entire categories into an analytical context by examining the key dynamics at work: competitors, brands, products, attributes, occasions and perceptions, as well as the consumers themselves.

“Social listening can essentially get you to: ‘I like this shirt,’ but we apply machine learning specifically to understand how that sentiment fits into the context of the greater online consumer conversation of who, what, where and why around that shirt,” says Foster. “After years spent in traditional insight work, it can be disorienting to contemplate so much of the world from a true bottom-up perspective. You never know exactly what you’re going to get.”

While the algorithms behind machine learning are quite complex, their underlying principle is fairly simple: No human brain can unpack a conversation that includes 250,000 Facebook posts. Industry experts and even the most qualified researchers might not notice or think to ask about the patterns that machine learning can uncover. “Marketers can then use custom research to dig deeper and from much more relevant points of departure,” says Foster. “Of course, machine learning is unlikely to ever fully answer the ‘why’ questions.”

But it can get marketers much closer. “The piece that traditionally has been missing is making sense of the unstructured data,” notes Edwards. “That is where you can really pick up on themes, affinities, perceptions and occasions.”

From a research standpoint, the approach is extremely versatile. A machine learning analysis can be customized to meet just about any marketing objective. It may be constructed at the category level or drill down into the attitudes and behaviors of consumers of a particular brand or product. Epsilon’s Data Design research includes gathering insights into the brand drivers in the pet food category, the experience of shopping for an engagement ring and the usage occasions for Reddi-Wip topping.

The engagement ring study, for example, was based on a machine learning analysis of some 130,000 social conversations on Reddit. It found, perhaps counter-intuitively, that the conversation around buying a diamond engagement ring was much less passionate than that of jewelry buying in general. (See heat maps below.)

One theory is that society has diluted or perhaps sanitized the experience of buying a diamond ring to the point where factors such as quality, warranty and repairs – and even excitement – have become moot. Whatever the cause, a jewelry marketer has an opportunity to go beyond the 4 C’s and speak to bridal customers in ways that reflect an expanded experience of engagement, says Foster. “You could create a story around the ring that inspires a shopper to travel to the same location as where the gem came from. Such a message would resonate with younger customers who want to make both the purchase and the engagement itself part of a larger life experience.”

Test, Learn and Adjust

Machine learning not only can expand research and accelerate insight generation, but it can also make programs and campaigns work harder. “Traditional A/B copy testing becomes much more effective when it evolves to ‘A/infinity’ testing in real-time,” explains Foster. “In other words, the right kinds of machine learning applied in the right ways can give agencies and brands a head start. Once the machine figures out who’s talking about what and how, and what the major dynamics are in the conversation, we can cut closer to more relevant chases faster.”
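The article does not say which technique sits behind "A/infinity" testing. One common way to move past a fixed A/B split is a multi-armed bandit, which keeps shifting traffic toward whichever variant is performing best while still sampling the others. The sketch below uses the simple epsilon-greedy rule on simulated data. The variant names and click rates are made up, the function name is illustrative, and the parameter name epsilon is the standard term for the exploration rate, unrelated to the company.

```python
import random


def epsilon_greedy(click_rates, rounds=10_000, epsilon=0.1, seed=0):
    """Simulate showing one variant per round and learning from the clicks.

    click_rates maps each variant name to its true (hidden) click probability.
    Returns two dicts: how often each variant was shown and how often it was clicked.
    """
    rng = random.Random(seed)
    names = list(click_rates)
    shows = {name: 0 for name in names}
    clicks = {name: 0 for name in names}

    def observed_rate(name):
        return clicks[name] / shows[name] if shows[name] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            choice = rng.choice(names)              # explore: pick any variant
        else:
            choice = max(names, key=observed_rate)  # exploit: best variant so far
        shows[choice] += 1
        if rng.random() < click_rates[choice]:      # simulated consumer response
            clicks[choice] += 1
    return shows, clicks


# Invented click rates for three hypothetical copy variants.
shows, clicks = epsilon_greedy({"A": 0.05, "B": 0.10, "C": 0.40})
for name in shows:
    print(name, shows[name], clicks[name])
```

Unlike a fixed split, this loop can accept new variants at any point, which is one reading of what testing against "infinity" might look like in practice.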

In some cases, the results of machine learning studies will align with a marketer’s existing approach. For example, Epsilon found that some of the most powerful associations with the San Diego Zoo had to do with a desire to save certain endangered species from extinction. That insight, derived from a machine learning study of social conversations among donors and visitors, was consistent with San Diego Zoo’s long-standing position as a leader in the fight against extinction of animals like the northern white rhino and African elephants. The message has been reinforced in the zoo’s current “Let’s Turn Things Around” multi-channel campaign.

When campaigns or messages aren’t resonating, machine learning can help the marketer figure out why. Case in point: Conagra Brands had concluded that because Millennials weren’t responding to recipe-focused communications around its Reddi-Wip brand, the group was not a reliable growth target for the brand. But the results of a machine learning study and a quantitative survey suggested that Millennials were in fact heavy users of Reddi-Wip, and that a message built around fun usage occasions might be the answer. The key insight: Millennials weren’t interested in the brand’s focus on ingredients because, more so than all other age groups, they prefer to consume the product in indulgent ways right out of the can.

“The inclusion of machine learning and Data Design is a significant addition to our work,” says Heidi Froseth, executive vice president of the shopper commerce practice at Epsilon agency Catapult. “The practice does three things very well: It identifies prime prospects versus core consumers; it is more effective and efficient in driving increased ROI; and it more accurately identifies best KPIs for all marketing teams and functions. The trajectory of Data Design is very much like that of Big Data. We learned how to use it, and then a couple years later, we learned the insights from it. The same thing will happen here, which makes this a very exciting time.”

As all of these applications are still fairly new, the takeaways and capabilities of the practice are still evolving. But the potential impact seems clear. Says Edwards: “We’ve only begun to scratch the surface of the power of this model.”

Epsilon’s Data Design research found, perhaps counterintuitively, that the conversation around buying a diamond engagement ring was much less passionate than that of jewelry buying in general.


Artificial intelligence (AI) – A broad area of research that refers to a computer’s ability to complete tasks like understanding human speech, competing at a high level in strategic game systems, routing in content delivery networks and conducting military simulations. The marketer’s focus on AI often comes in the context of interpreting complex data, including images and videos, and leveraging applications in manufacturing and retail.

Data Design – An emerging practice from Epsilon that entails blending assets (e.g., data sets) and tools (e.g., machine learning) in a strategic approach to solving a business problem or creating a marketing solution. The key differentiator from other analytical approaches is the ability to uncover patterns in structured and unstructured data.

Data of identity – Static, one-dimensional information contained in first, second and third-party data sets. It includes non-cash transactions, demographic information, household composition, channel preferences, survey data, media consumption and device/online usage.

Data of culture – Fluid, multi-dimensional information that populates social media and the internet. It includes consumer-generated text like reviews, comments, posts, images, videos, music and emojis.

Machine learning – A computing science that dates back to World War II involving human-coded algorithms used to help drive a decision. Part of a family of tools under the AI umbrella that includes motion and perception. Similar to social listening, but with greater contextual analysis and outputs that may be leveraged across the full spectrum of marketing activity.

Social listening – An analytical process of monitoring digital/online conversations to learn what customers are saying about a product, brand, company or industry.

Structured data – Data that doesn’t change much or often, sits quietly in a database and is relatively easy to analyze or measure.

Unstructured data – Data that is richer, harder to measure, implicit, is piling up, takes up a lot of space and expresses who people really are.


Epsilon Catapult is a comprehensive global marketing innovator with a best-in-class practice in shopper marketing. Our unrivaled data intelligence and customer insights leadership combined with our world-class technology including loyalty, email and CRM platforms and data-driven creative, activation and execution create the best possible solutions and growth for our clients, their retail partners and shared shoppers in the marketplace. We curate personalized marketing to consumers across offline and online channels, during their moments of interest, that helps drive business growth for brands. Recognized by Ad Age as the #1 World’s Largest CRM/Direct Marketing Agency Network, #1 Largest U.S. Agency from All Disciplines, #1 Largest U.S. CRM/Direct Marketing Agency Network and #1 Largest U.S. Mobile Marketing Agency, Epsilon employs over 8,000 associates in 70 offices worldwide. Epsilon is an Alliance Data company. For more information, visit and follow us on Twitter @EpsilonMktg.