Published on December 16, 2024
Written by Noa Kalmanovich
As a market research software provider, we often share insights from our own perspective on the impact of synthetic data, guiding you through our view of the evolving landscape of market research. This time, however, we want to shift focus: to explore how brands themselves are deploying and experimenting with synthetic data and digital twins. Generative AI has given brands a potent tool, letting them innovate flexibly in personalization, forecasting, and product development while maintaining data security.
We know we might sound like a broken record defining synthetic data, but let’s quickly revisit. Unlike real data, synthetic data is generated by algorithms or mathematical models, often on the basis of real-world samples. By analyzing the patterns and correlations in that source material, these models produce artificial data that replicates the essential characteristics of real data. It is used across industries, supporting everything from fraud detection in banking and policy development in healthcare to campaign and product testing with synthetic respondents in market research.
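To make that concrete, here is a minimal, illustrative sketch (not any particular vendor’s method): fit a simple statistical model to a small “real” sample, then draw new records that preserve its correlations. All numbers here are invented.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# A small "real" sample: (age, monthly_spend) for 200 customers.
real = rng.multivariate_normal(
    mean=[38, 120], cov=[[90, 55], [55, 400]], size=200
)

# Fit a simple model to the sample: estimate its mean vector and
# covariance matrix, i.e. the patterns and correlations.
mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

# Draw synthetic records from the fitted model. No row corresponds
# to a real customer, but the correlations are preserved.
synthetic = rng.multivariate_normal(mean, cov, size=1000)

print("real corr:     ", np.corrcoef(real, rowvar=False)[0, 1].round(2))
print("synthetic corr:", np.corrcoef(synthetic, rowvar=False)[0, 1].round(2))
```

Real generators are far more sophisticated, but the principle is the same: learn the structure of the source data, then sample fresh records from the learned structure.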
It's undeniable that the surge in AI tool adoption is making waves. The Capgemini Research Institute’s latest report notes that organizations are directing more than 60% of their marketing technology budget towards generative AI, underscoring the rush to keep pace with innovation. As professionals increasingly turn to synthetic data, they’re discovering powerful ways to bolster both internal operations and customer-facing strategies. For brands, this shift not only streamlines processes but also supports smarter decision-making, sharper insights, and more personalized customer experiences. They now have a fast, cost-effective way to experiment within their market, gaining deeper insights that align with and support their goals.
Digital twins are, put simply, digital doppelgängers. Whether the replica is of a person, an object, or a system, it lets organizations study behavior virtually as if they were observing it in the real world. Often built from real data sources, and at times updating in real time to capture the asset’s dynamic nature, digital twins create an immersive environment for organizations to explore and experiment in. By connecting scenario planning with internal functions, business decisions can be assessed within a digital environment before they are made in a physical one.
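In software terms, a digital twin is ultimately a virtual object kept in sync with its physical counterpart, so that what-if scenarios can be run against the replica instead of the real asset. A hypothetical, highly simplified sketch (the StoreTwin class and all of its numbers are invented for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class StoreTwin:
    """A hypothetical digital twin of a retail store (illustrative only)."""
    store_id: str
    visitors: int = 0
    queue_length: int = 0
    history: list = field(default_factory=list)

    def ingest(self, reading: dict) -> None:
        # Mirror a real-time sensor reading into the virtual replica.
        self.visitors = reading.get("visitors", self.visitors)
        self.queue_length = reading.get("queue_length", self.queue_length)
        self.history.append(reading)

    def simulate_checkout(self, extra_tills: int) -> float:
        # What-if scenario, assessed against the twin rather than the
        # real store: estimated queue if we opened more tills.
        return max(0.0, self.queue_length - 4 * extra_tills)

twin = StoreTwin("store-001")
twin.ingest({"visitors": 180, "queue_length": 12})
print(twin.simulate_checkout(extra_tills=2))  # -> 4
```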
According to McKinsey’s analysis, nearly 75% of companies have integrated digital twin technologies into complex aspects of their operations. On top of that, the global market for the technology is projected to grow at an annual rate of about 60% over the next five years, reaching $73.5 billion by 2027. AI is set to reshape marketing and brand journeys.
Digital twins create risk-free environments for product development, enabling teams to explore design options, analyze product behavior, and monitor interactions. This approach not only streamlines the design process but also makes it easier to identify and evaluate potential changes, ultimately driving innovation and efficiency in production. Projects once hindered by lengthy privacy compliance processes can now be deployed far more easily with these replicas.
The collaboration between Siemens and TrakRap, a packaging solutions manufacturer, demonstrates how valuable digital twin technology can be in developing and optimizing a prototype packing machine. By running finite element, materials, and control simulations entirely within a digital environment, the team was able to refine the process, evaluate suitable configurations, and significantly reduce costs. This approach allowed them to achieve sustainable advancements in the industry.
Evidenza.ai is an AI-driven platform that creates a diverse array of synthetic customer personas, each based on personal and professional attributes tailored to specific product categories. These persona profiles let researchers target and analyze distinct groups, informing brand positioning, go-to-market strategies, and competitive insights. Through synthetic research, the platform delivers data-driven intelligence that resonates with C-suite executives, providing actionable insights for decision-making.
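We don’t know Evidenza’s internals, but the core idea of category-tailored personas can be sketched as sampling attribute combinations from per-category pools. Everything below, including the attribute lists, is invented for illustration:

```python
import random

# Hypothetical attribute pools per product category (illustrative only).
ATTRIBUTES = {
    "b2b_software": {
        "role": ["CTO", "IT manager", "procurement lead"],
        "company_size": ["<50", "50-500", "500+"],
        "pain_point": ["integration cost", "security review", "vendor lock-in"],
    },
}

def sample_persona(category: str, seed: int | None = None) -> dict:
    """Draw one synthetic persona for the given product category."""
    rng = random.Random(seed)
    pools = ATTRIBUTES[category]
    return {attr: rng.choice(values) for attr, values in pools.items()}

print(sample_persona("b2b_software", seed=7))
```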
Metaverse-style digital twins replicate physical environments within virtual spaces for immersive experiences and operations. Through these lifelike simulations, brands can engage users, run training, and gather real-time insights within the virtual environment. To see this in action, consider the digital twin implementations in smart stadiums. SoFi Stadium in the United States was developed with this technology to amplify one of the world’s favorite pastimes: sports and entertainment. Designed with “modernizing the fan experience in mind,” SoFi Stadium records real-time, data-driven insights to optimize operations, support the venue’s infrastructure, and monitor performance, all to identify opportunities to improve the experience for fans.
Alcohol giant Diageo, for example, used LLM algorithms to predict the future of its supply chain. Teaming up with Ai Palette, Diageo tracked emerging trends across its beverages by “scanning everything from social media to news article mentions and restaurant and bar menus, to determine which flavours are growing in popularity on either a national, regional or global level.” The data generated by these models gave the brand a new product development framework, resulting in more informed, data-driven product launches.
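The underlying trend-detection idea can be sketched simply: count mentions of each flavour term across periods and rank by relative growth rather than raw volume, so small but fast-rising terms surface early. This is an illustrative stand-in, not Ai Palette’s method, and the counts are invented:

```python
from collections import Counter

# Toy mention counts for flavour terms scraped from menus and social
# posts in two consecutive quarters (numbers invented for illustration).
q1 = Counter({"yuzu": 40, "hibiscus": 25, "vanilla": 300, "smoked": 18})
q2 = Counter({"yuzu": 95, "hibiscus": 60, "vanilla": 310, "smoked": 20})

def growth(term: str) -> float:
    """Quarter-over-quarter growth rate of a flavour's mention count."""
    return (q2[term] - q1[term]) / q1[term]

# Rank flavours by relative growth, so small but fast-rising terms
# (hibiscus, yuzu) surface ahead of big static ones (vanilla).
for term in sorted(q1, key=growth, reverse=True):
    print(f"{term:10s} {growth(term):+.0%}")
```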
At a time when data breaches and privacy concerns are rife, protecting data is essential. As with any online tool, there is a real risk of sensitive data being leaked or exposed to malicious actors. Fortunately, there are effective solutions that mitigate these risks, and they are essential to implement.
On one hand, synthetic data offers a platform of artificial data points to test “ad campaigns or conduct product testing on synthetic populations to reduce risk and refine go-to-market strategies” without privacy concerns. On the other hand, the LLMs used to train and run these operations often rely on third-party APIs, retrieval-augmented generation (RAG), few-shot prompting, and fine-tuning, any of which can leave data exposed. To defuse these privacy challenges, solutions such as Tonic Textual by Tonic.ai come into play.
Tonic Textual is a “text de-identification tool that uses state of the art proprietary named-entity recognition (NER) models to identify personally identifiable information (PII) in text.” In other words, it analyzes text data to spot information that is unique to individuals or otherwise private, removing key identifying features that could lead to data leakage. Tonic Textual cleans a brand’s data in two ways: by redacting detected entities, replacing them with placeholder tokens, or by synthesizing them, swapping them for realistic but artificial values.
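Tonic’s NER models are proprietary, but the general de-identification pattern can be sketched with an open-source NER model such as spaCy’s (an illustrative stand-in, not Tonic Textual itself):

```python
import spacy  # pip install spacy && python -m spacy download en_core_web_sm

nlp = spacy.load("en_core_web_sm")

def redact(text: str, labels: set = {"PERSON", "ORG", "GPE"}) -> str:
    """Replace detected named entities with placeholder tokens."""
    doc = nlp(text)
    out, last = [], 0
    for ent in doc.ents:
        if ent.label_ in labels:
            out.append(text[last:ent.start_char])
            out.append(f"[{ent.label_}]")
            last = ent.end_char
    out.append(text[last:])
    return "".join(out)

print(redact("Maria Lopez from Acme Corp emailed us from Berlin."))
# e.g. -> "[PERSON] from [ORG] emailed us from [GPE]."
```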
Incorporating data security tools like this into preprocessing pipelines is essential for brands that want to leverage the benefits of generative AI in a compliant way. Prioritizing this not only safeguards individual privacy but also keeps an organization in step with changing regulations, paving the way for a safer digital landscape. And it once again uses the power of synthetic data to fill in where real-world methods fall short!
At Fairgen, while we support generative AI in forms similar to the systems above, we take a distinct route when it comes to synthetic data. Augmented synthetic research is a powerful way to analyze niche markets: Fairgen boosts survey reliability by generating synthetic data that accurately reflects under-sampled groups. This lets researchers capture insights on niche customer segments that are hard to reach through traditional methods, enabling more precise, data-driven strategies tailored to these unique audiences. By augmenting representation, Fairgen’s tools ensure that smaller or specialized market segments can be thoroughly understood. This AI-driven data modeling marks a breakthrough in market research, producing predictive insights from highly relevant data models with unprecedented depth and flexibility.
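As a loose illustration of the boosting idea (emphatically not Fairgen’s actual model), here is a naive bootstrap that rebalances an under-sampled segment in a toy survey:

```python
import pandas as pd

# Toy survey: a niche segment ("rural_18_24") is badly under-sampled.
df = pd.DataFrame({
    "segment": ["urban"] * 380 + ["rural_18_24"] * 20,
    "intent":  [1] * 200 + [0] * 180 + [1] * 8 + [0] * 12,
})

niche = df[df["segment"] == "rural_18_24"]

# Naive augmentation: bootstrap-resample the niche rows. A real
# approach would generate new, statistically faithful respondents
# rather than duplicating existing ones; this only shows the goal.
boost = niche.sample(n=200, replace=True, random_state=0)
augmented = pd.concat([df, boost], ignore_index=True)

print(df["segment"].value_counts())
print(augmented["segment"].value_counts())
```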
Researchers, brands, and organizations at large are rapidly embracing this transformative era of systems powered by synthetic data and generative AI models. Digital twins, as dynamic platforms for consumer and product research, offer immense potential for analysis and simulation. It is important, however, to recognize the inherent risks, including unintended biases, lack of realism, and accuracy issues within AI systems.
As such, we advocate for a blended approach, where “data is used to augment, rather than replace human-based data gathered from real-world observations.” To explore this balanced perspective further, read our blog post, Transforming Market Research Operations Through Human and Machine Synergy.