USC Study Finds Persona-Based Prompts Lower Factual Accuracy

A coder tells a chatbot: You're an expert. A full-stack developer. It's a machine-massaging technique that's a cornerstone of persona-based artificial intelligence prompting – and one that backfires spectacularly, academics find, in a study showing the practice produces worse results when the goal is accuracy.
Researchers at the University of Southern California found in a preprint study that priming a large language model with a "you're an expert" prompt consistently damaged performance. Their advice: avoid persona-based prompts for tasks that require models to tap into their pre-trained knowledge – the heaps of text and coding examples fed into models before they're ready to interact with users.
The "you're an expert" prompt appears to push models into a mode focused on following instructions, which competes with their capacity to retrieve the knowledge necessary to actually complete the task. "Persona prefixes activate the model's instruction-following mode that would otherwise be devoted to factual recall," said Zizhao Hu, a USC doctoral student and the study's lead author.
Roleplaying prompts are effective when the desired outcome isn't accurate code or math but a tailored style or structured data extraction, the study finds. In cases where the point is for the LLM's output to match a certain tone – a "professional email," say – or to structure data, persona prompts helped, the study's authors wrote.
AI models acquire two fundamentally different kinds of capability during training, and expert personas interact with each in opposite ways. The first kind, which includes factual knowledge, mathematical reasoning and coding ability, is absorbed during a model's initial training on large volumes of text. The second kind, which includes tone, format, stylistic adaptation and the ability to refuse harmful requests, is shaped later, during a stage where the model is fine-tuned to follow human preferences and instructions.
Prompting a chatbot can’t add to its factual knowledge, but a prompt can distract it from recalling that knowledge.
The researchers used a standard evaluation tool called the Measuring Massive Multitask Language Understanding benchmark, which tests models across hundreds of academic subjects using multiple-choice questions. They found that expert personas reduced overall accuracy to 68%, compared to 71.6% for the same model without any persona instruction. Every variant of the expert persona prompt they tested produced worse results than the baseline across all subject categories. Longer persona descriptions caused the most damage.
On a separate benchmark measuring generative quality across eight task categories, the researchers found that categories dependent on precise factual recall or logical chains, such as humanities knowledge, mathematical reasoning and coding, were consistently degraded by expert persona prompts. Coding scores dropped by 0.65 points on the benchmark's 10-point scale.
The picture reverses for tasks shaped by instruction-tuning. On writing, information extraction and STEM explanation tasks where structure, tone and format matter more than raw accuracy, expert personas improved scores. The gains were most pronounced in data extraction and explanations about technical subjects.
Safety refusals showed the sharpest improvement of all: a dedicated “Safety Monitor” persona boosted the rate at which models refused harmful prompts from 53.2% to 70.9% on one widely used adversarial benchmark, JailbreakBench.
The researchers found that how strongly a model has been optimized to follow system-level instructions determines how sensitive it is to persona prompting in both directions. More optimized models gain more from personas on user alignment tasks, but also lose more on factual ones.
The findings have implications for how AI products are built and deployed. Many enterprise systems today assign models a permanent “expert identity” at the system level – a legal assistant, a medical advisor, a financial analyst – through instructions that run before every user query.
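Such deployments typically hard-code the persona at the system level, so it is silently attached to every request. A hypothetical sketch of that pattern (the persona text and `wrap_query` helper are illustrative, not any vendor's API):

```python
# Illustrative sketch of a permanent system-level "expert identity":
# a fixed persona instruction is prepended to every user query before
# it reaches the model. Per the study, this can help tone and format
# tasks while hurting factual recall.

SYSTEM_PERSONA = "You are a senior financial analyst. Answer as an expert."

def wrap_query(user_query: str) -> list[dict]:
    """Attach the permanent persona instruction ahead of the user's query."""
    return [
        {"role": "system", "content": SYSTEM_PERSONA},
        {"role": "user", "content": user_query},
    ]

# Every request carries the persona, whether the task is stylistic or factual.
request = wrap_query("What is the compound interest formula?")
```

Because the persona runs before every query, factual questions pay the recall penalty the study describes even when the user never asked for an expert voice.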
Hu says this approach involves a tradeoff that companies may not be aware of. “Unless a certain persona is used during training with the corresponding domain data, the accuracy and factual recall capabilities of models will likely be damaged by simply letting the model play the persona during inference time,” he said.
