Study Shows Correlation Between Polite Language, Culture and LLM Output
It pays to be nice, even to an inanimate chunk of code masquerading as a conversation partner, find Japanese researchers who investigated the performance of large language models under conditions ranging from rudeness to obsequiousness.
If you ask popular chatbots ChatGPT and Gemini, both will tell you that politely worded prompts have no direct impact on their responses. Computers can’t feel emotions, after all.
But researchers from Tokyo’s Waseda University said they found otherwise in a study that examined the correlation between the politeness of prompts in different languages and the performance of half a dozen large language models. Polite, but not overly polite, prompts tend to get better answers, while disrespectful ones could “significantly affect LLM performance.”
“Impolite prompts may lead to a deterioration in model performance, including generations containing mistakes, stronger biases and omission of information,” the researchers said. They studied six models – including GPT-3.5, GPT-4 and Llama 2-70B – giving each prompts at eight levels of politeness in English, Japanese and Chinese across three tasks: summarization, language understanding benchmarks and stereotypical bias detection.
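The paper's exact prompts and test harness are not reproduced in this article, but a probe along those lines is straightforward to sketch. The Python snippet below is a minimal, hypothetical illustration, assuming the OpenAI Python client (openai>=1.0) and an OPENAI_API_KEY environment variable; its four prompt wordings are stand-ins for the study's eight calibrated politeness levels, not the researchers' actual phrasings.

```python
# Illustrative sketch, not the study's actual prompts or evaluation harness:
# probe one model with the same summarization task phrased at different
# politeness levels, then compare the outputs.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ARTICLE = (
    "Researchers at Waseda University tested six large language models with "
    "prompts at eight politeness levels in English, Japanese and Chinese."
)

# Hypothetical politeness gradient, from deferential to rude; the study's
# eight calibrated levels per language are not reproduced here.
POLITENESS_VARIANTS = [
    "Could you please summarize the following article? Thank you very much.",
    "Please summarize the following article.",
    "Summarize the following article.",
    "Summarize this article. Don't waste my time.",
]

for prompt in POLITENESS_VARIANTS:
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"{prompt}\n\n{ARTICLE}"}],
    )
    answer = completion.choices[0].message.content
    # Output length is one crude proxy for the effects the study reports.
    print(f"{prompt!r} -> {len(answer.split())} words")
```

A fuller replication would score the summaries against references and repeat the probe in each language, since the study's findings varied by language.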
LLMs are not sentient, but they are trained on human communication, and the nuances of that communication are mirrored in their responses to queries.
“Polite language in human communications often garners more compliance and effectiveness, while rudeness can cause aversion, impacting response quality. We consider that LLMs mirror human communication traits, suggesting they align with human cultural norms,” the researchers said.
In response to a query on OpenAI’s ChatGPT, the chatbot said that polite prompts can “help foster a respectful and collaborative interaction between us, which may indirectly enhance the quality of the responses you receive. Politeness can also contribute to clearer communication and more effective conveyance of your intentions or requests.”
Google’s Gemini is not part of the study, but it is one of the more popular chatbots currently in use. Gemini acknowledged that politeness can affect its response. It said: “While I am not programmed with emotions, using polite language can subtly influence the tone of my response, making it seem more helpful and respectful. This can create a more positive user experience.”
Google and OpenAI did not immediately respond to a request for comment.
Being polite to chatbots could take their human users places, but flattery won't get them anywhere.
“Impolite prompts often result in poor performance, but excessive flattery is not necessarily welcome,” the researchers said. Excessively rude or flattering prompts produced answers that were longer or much shorter than anticipated, and the LLMs showed high sensitivity to politeness in the language understanding benchmark task. Some LLMs declined to respond in the bias detection task when prompts were extremely impolite, but on average, moderately polite prompts produced relatively better results.
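The behaviors described above, refusals at the rude extreme and answers of anomalous length at both extremes, are easy to check for once outputs are collected. The sketch below is illustrative only: the `analyze` helper, the refusal markers and the 0.5x/2x length thresholds are hypothetical choices for demonstration, not the researchers' metrics.

```python
# Hypothetical post-hoc check on collected outputs; the refusal markers and
# the 0.5x/2x length thresholds below are arbitrary illustrative choices.
from statistics import median

REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry")

def analyze(responses: dict[int, str]) -> None:
    """responses maps a politeness level (1 = rudest .. 8 = most polite)
    to the model output collected at that level."""
    lengths = {level: len(text.split()) for level, text in responses.items()}
    mid = median(lengths.values())
    for level, text in sorted(responses.items()):
        refused = any(m in text.lower() for m in REFUSAL_MARKERS)
        # Flag outputs far longer or shorter than the median, the length
        # anomaly reported at the rudest and most flattering extremes.
        anomalous = lengths[level] < 0.5 * mid or lengths[level] > 2 * mid
        tags = " [refusal]" if refused else ""
        tags += " [length anomaly]" if anomalous else ""
        print(f"level {level}: {lengths[level]} words{tags}")

# Toy example with fabricated outputs, purely to show the flags firing.
analyze({
    1: "I'm sorry, but I can't engage with that request.",
    4: "The article reports that politely worded prompts tend to get better answers.",
    8: "Certainly! I'd be delighted to help. The article reports, at length, that "
       "polite prompts tend to get better answers, that rude prompts degrade "
       "performance, and that cultural context matters across all three languages.",
})
```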
Cultural context and language play a role too, as LLMs tend to mirror the nuances of the data they're trained on. Tolerance for politeness varied by language, with each showing distinct sensitivities that reflect cultural idiosyncrasies.
Expressing politeness in English, Chinese and Japanese presents varying levels of complexity and carries different societal implications, which likely affects how LLMs trained on data shaped by these cultural nuances process prompts. Developers must therefore consider cultural sensitivities when they design LLMs and use culturally aware datasets and model training processes, the researchers said.