Digital Twins
One of the most promising applications of Large Language Models (LLMs) is the creation of “digital twins”: AI agents designed to simulate the behavior, preferences, and decision-making processes of specific human individuals. This research initiative builds the foundations for silicon sampling, in which LLM-simulated respondents stand in for human participants, offering a scalable alternative to traditional human-subject research. We introduce Twin-2K-500, a benchmark dataset of over 2,000 digital twins, each based on a real person's answers to over 500 questions, and conduct a mega-study across 19 domains to evaluate their fidelity. Our findings reveal that while digital twins can capture relative heterogeneity across individuals, they struggle with precise individual-level prediction and exhibit a “blue-shift” bias, in which richer persona descriptions paradoxically skew simulated outcomes in a more progressive direction.
Contributors
- Tianyi Peng
- Olivier Toubia
- George Z. Gui
- Daniel J. Merlau
- Ang Li
- Haozhe Chen
- Hongseok Namkoong
Publications
- Twin-2k-500: A data set for building digital twins of over 2,000 people based on their answers to over 500 questions. Marketing Science, 2025.
- LLM Generated Persona is a Promise with a Catch. NeurIPS Position Paper, 2025.
- A mega-study of digital twins reveals strengths, weaknesses and opportunities for further improvement. arXiv, 2025.