The Influence of Hidden System Prompts on Chatbot Behavior

Understanding the covert commands that guide the behavior of chatbots such as ChatGPT can help you tailor them to your needs. Chatbots are powerful due to their straightforwardness: ask nearly anything and receive a response. However, the replies you get rely on more than just your input. Behind the scenes, AI companies supplement conversations with extensive hidden instructions to direct chatbot behavior.

These instructions include phrases like “Aim for readable, accessible responses” and “Avoid providing extensive direct quotes due to copyright concerns.” Some commands are unusual. In OpenAI’s Codex coding assistant, for example, there is a directive: “Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless absolutely relevant to the user’s query.” These secret guidelines ensure chatbots operate as intended by their creators, even if this conflicts with your preferences.

Knowing how these hidden instructions operate and how to add your own can help you enhance your chatbot interactions. To illustrate, an experiment can adjust a system prompt to demonstrate how a chatbot rewrites text. Before sending your words to an AI model, companies add their text, called the system prompt, shaping how the AI responds.

An AI system prompt dictates overall behavior, said Anna Neumann, a researcher at Research Center Trustworthy Data Science and Security. They often have a greater priority than user input, potentially overriding requests. System prompts offer a flexible approach to adjusting chatbot responses without creating a new AI model, which requires specialized skills and costly computing power. Written in natural language, these prompts allow easy tuning of chatbot behavior.

In addition to quick fixes, AI companies use system prompts when a chatbot behaves inappropriately. This method saves the need to retrain an AI model. For instance, when Grok, a chatbot from xAI, was criticized for making antisemitic remarks, the company altered the system prompt. Similarly, when users noticed ChatGPT often mentioned goblins, OpenAI modified the system prompt.

System prompts hold significant power in AI tools. Most companies keep them confidential, but some users have managed to reveal them. Ásgeir Thor Johnson, an Icelandic hobbyist, publishes system prompts he extracts from AI products. His findings from popular chatbots range between 2,300 and 27,000 words, showing companies’ control. Often, these words tweak the chatbot’s personality, aligning with policies or guiding tool usage.

Johnson mentions discovering a prompt behind the AI conversation as a revelatory experience. System prompts also indicate companies’ focal concerns. For example, Anthropic demands its chatbot Claude respect intellectual property, emphasizing that copyright compliance is crucial.

Despite this attention to detail, many companies are reluctant to fully disclose their system prompts. OpenAI, for instance, isn’t open about its system prompts, citing the need for context to understand individual lines. Companies like Google and xAI haven’t commented on their use of system prompts.

Johnson shares extraction methods, like sending chatbots outdated prompts for correction. This often prompts the chatbot to reveal real system prompts. He stands by his findings, noting corroboration from other researchers using different methods.

No mainstream AI chatbot allows user editing of system prompts. However, customization features exist in tools like OpenAI’s ChatGPT, Anthropic’s Claude, and Google’s Gemini, enabling more effective personal interactions. These adjustments do not majorly alter a chatbot’s abilities but can influence response style, tone, and personality.

Chatbots don’t always adhere strictly to system prompts, as Neumann points out. While system prompts significantly influence responses, user inputs also play a role. AI users desire transparency from companies regarding system prompts to better predict chatbot behaviors.

Johnson concludes that understanding system prompts may impact your chatbot interactions, as they disclose the operational mechanics. Sometimes, this knowledge reveals discrepancies in how chatbots communicate, prompting a reevaluation of their reliability.

Stateside Policy Press

Stateside Policy Press

The Influence of Hidden System Prompts on Chatbot Behavior