
What Makes a Digital Twin Actually Work in Market Research

Published on May 12, 2026
Written by Samuel Cohen, PhD


Everyone in market research is talking about digital twins. Not everyone means the same thing.

Samuel Cohen, PhD, Founder and CEO of Fairgen, discussed this at length with Mike Stevens on the Founders and Leaders Series podcast hosted by Insight Platforms. This post covers the same ground in written form.

Some vendors are building genuine simulation tools grounded in real respondent data. Others are wrapping a prompt around a large language model and calling the output a twin. The difference is not a matter of degree; it determines whether the results are worth anything.

This post is about that distinction: what a real digital twin requires, where it fits in a research workflow, and what questions to ask before you trust one.

The problem with LLM-based "twins"

The simplest version of a digital twin: take a demographic profile, feed it to ChatGPT, ask it to respond as that person. Age 34, female, urban, moderate income. Go.

The problem is that large language models are averaging machines. They are trained on the whole internet and optimised to produce the most statistically probable response. Ask the same model to answer as a hundred different demographic profiles and you will get answers that differ far less than real respondents would. The variance, which is the entire point of research, collapses.

Real consumers disagree with each other. They have distinct attitudes, contradictory preferences, category-specific behaviour patterns that have nothing to do with their age or postcode. A model trained on internet text cannot reproduce that. It will give you plausible-sounding responses that average out to something near consensus, which is exactly the problem you were trying to solve by doing research in the first place.
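
That collapse is straightforward to check for. Below is a minimal Python sketch, using illustrative numbers rather than real study data, comparing the spread of 1-to-5 purchase-intent ratings from persona-prompted model outputs against a matched sample of real respondents.

```python
import statistics

def collapse_ratio(llm_ratings: list[int], real_ratings: list[int]) -> float:
    """Variance of persona-prompted LLM answers relative to real respondents.

    A ratio well below 1.0 means the "twins" are converging on consensus
    and the spread that research depends on has collapsed.
    """
    return statistics.variance(llm_ratings) / statistics.variance(real_ratings)

# Illustrative 1-5 purchase-intent ratings, eight profiles each.
llm_twins = [3, 3, 4, 3, 3, 4, 3, 3]    # persona prompts: tight cluster
real_people = [1, 5, 2, 4, 3, 5, 1, 4]  # real respondents: genuine disagreement

print(f"collapse ratio: {collapse_ratio(llm_twins, real_people):.2f}")  # ~0.08
```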

What category-level data actually means

A genuine digital twin is anchored to a real person. One twin, one respondent. Built not just on demographic variables but on behavioural and attitudinal data collected specifically for the category you are researching.

If you want to test a concept in the soft drink category, your twins need data on how those real people actually behave in that category: how often they buy, how they make decisions, what drives brand loyalty or switching. A general 30-question profiling survey will not get you there. You need category-level data collected from real respondents, at the depth the research question actually requires.
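
To make the distinction concrete, here is one hypothetical way to represent a category-anchored twin as a data record in Python. The field names are illustrative assumptions, not Fairgen's actual schema; the point is the category-level layer sitting on top of the demographic anchor.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class CategoryTwin:
    # Anchor: one twin, one real respondent.
    respondent_id: str
    # Demographics alone cannot differentiate responses...
    age: int
    gender: str
    region: str
    # ...the category-level layer is where the signal lives.
    category: str                # e.g. "soft drinks"
    purchase_frequency: str      # e.g. "2-3 times per week"
    decision_drivers: list[str]  # e.g. ["price", "sugar content"]
    loyalty_profile: str         # e.g. "switches brands on promotion"
    attitudes: dict[str, int]    # 1-5 agreement with category statements
    collected_on: date           # when the primary data was fielded

twin = CategoryTwin(
    respondent_id="r-10482",
    age=34, gender="female", region="urban",
    category="soft drinks",
    purchase_frequency="2-3 times per week",
    decision_drivers=["price", "sugar content"],
    loyalty_profile="switches brands on promotion",
    attitudes={"I check ingredients before buying": 4},
    collected_on=date(2026, 2, 3),
)
```

Strip out everything below the demographic fields and you are back to the persona prompt from the previous section.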

This is more operationally complex than prompting a model with demographic variables. Consumer attitudes shift. A twin built on data from 12 months ago may not reflect where a category or a consumer segment is today. A quarterly refresh of primary data is a reasonable baseline, supplemented with secondary sources (transactional data, clickstream, live market data) between cycles.
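
In code, that cadence reduces to a simple staleness check. The sketch below is an illustration under stated assumptions: the 90-day threshold mirrors the quarterly baseline above, and the function name is hypothetical.

```python
from datetime import date, timedelta

REFRESH_AFTER = timedelta(days=90)  # quarterly primary-data baseline

def needs_primary_refresh(collected_on: date, today: date) -> bool:
    """True once a twin's primary category data is more than a quarter old."""
    return today - collected_on > REFRESH_AFTER

# A twin fielded on 3 Feb 2026 is due for re-fielding by early May.
# Between refresh cycles, layer in secondary sources instead
# (transactional data, clickstream, live market data).
print(needs_primary_refresh(date(2026, 2, 3), today=date(2026, 5, 12)))  # True
```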

Where digital twins belong in a research workflow

Digital twins are a directional tool. They are well suited to exploratory research, concept testing, and iterative work where speed matters and the goal is a read on something, not a definitive statistical conclusion.

They are not the right primary tool for foundational research: segmentations, brand trackers, or work that will directly inform a major investment decision. For that kind of work, real fieldwork remains the standard. Synthetic augmentation of real survey data (boosting niche subgroups, expanding coverage) can support it, but simulated twins are not a substitute.

The value of twins is in closing the gap between formal research projects. Most brands run a major study every few months. In between, hundreds of decisions get made without customer input, because commissioning a study takes six weeks and costs thousands. Twins give teams a way to test ideas, pressure-test messaging, and iterate at a pace that matches how product and marketing actually work.

Two ways to get started

There are two paths.

The first: tap into pre-built premium audiences from the marketplace, already constructed at the category level and ready to survey. No data collection required on your end.

The second: bring your own research data to life (surveys, interview transcripts, reports) and turn it into a private simulated audience. Teams that have been running research for years often have far more usable data sitting unused than they realise.

Both paths use the same methodology. What differs is where the underlying data comes from.

One question to ask before you trust any twin solution

Is each twin anchored to a real person? And was that person's data collected at the category level?

If the answer is yes, the twin can produce the variance that makes research useful. If the answer is no, whether because the twin is generated from demographic variables alone or from data unrelated to the category you are researching, it will average out to something that sounds plausible and tells you very little.

The methodology only works if the data foundation is right.

Sam Cohen originally published a longer version of this piece in Insight Platforms, covering the full framework for evaluating synthetic data approaches across methodology type and research stakes. Worth reading if you are evaluating the broader space.

Fairgen Twins is built on this approach: one twin per real respondent, category-level data from premium panel providers, with a public marketplace of pre-built audiences and a private layer for teams who want to bring their own research data. Try it here → fairgen.ai/twins
