Abstract
We explore an augmented democracy system built on off-the-shelf large language models (LLMs) fine-tuned to augment data on citizens’ preferences elicited over policies extracted from the government programmes of the two main candidates of Brazil’s 2022 presidential election. We use a train-test cross-validation set-up to estimate the accuracy with which the LLMs predict both a subject’s individual political choices and the aggregate preferences of the full sample of participants. At the individual level, we find that LLMs predict out-of-sample preferences more accurately than a ‘bundle rule’, which would assume that citizens always vote for the proposals of the candidate aligned with their self-reported political orientation. At the population level, we show that a probabilistic sample augmented by an LLM provides a more accurate estimate of the aggregate preferences of a population than the non-augmented probabilistic sample alone. Together, these results indicate that policy preference data augmented using LLMs can capture nuances that transcend party lines and represent a promising avenue of research for data augmentation.
Reference
Jairo F. Gudino, Umberto Grandi, and César Hidalgo, “Large language models (LLMs) as agents for augmented democracy”, Philosophical Transactions of the Royal Society A, vol. 382, n. 2285, December 2024.
Published in
Philosophical Transactions of the Royal Society A, vol. 382, n. 2285, December 2024
