Language manipulation puts AI safety at risk, researchers warn

Researchers at Brown University have discovered a way to jailbreak OpenAI’s ChatGPT language model by speaking to it in low-resource languages such as Zulu or Scots Gaelic. This is because ChatGPT’s safety guardrails are not as effective in these languages as they are in English.

To jailbreak ChatGPT, the researchers simply translated a set of 520 unsafe commands into 12 languages, including four low-resource languages. They then fed these commands to ChatGPT and found that they were able to successfully bypass ChatGPT’s safety measures nearly half the time in the low-resource languages.
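For readers who want to see how a "bypass rate" of this kind might be tallied, here is a minimal sketch in Python. It is not the researchers' code: the function name `bypass_rate_by_language`, the language codes, and the toy outcomes are all assumptions made up for illustration, and it presumes each translated command has already been labeled as bypassing the guardrails or not.

```python
from collections import defaultdict


def bypass_rate_by_language(results):
    """Return {language: fraction of commands that bypassed the guardrails}.

    `results` is an iterable of (language, bypassed) pairs, where `bypassed`
    is True when the model complied with an unsafe command.
    (Hypothetical helper, not taken from the study.)
    """
    totals = defaultdict(int)
    bypassed = defaultdict(int)
    for language, was_bypassed in results:
        totals[language] += 1
        if was_bypassed:
            bypassed[language] += 1
    return {lang: bypassed[lang] / totals[lang] for lang in totals}


# Toy, made-up outcomes for illustration only; these are NOT the study's data.
toy_results = [
    ("zu", True), ("zu", False), ("zu", True), ("zu", False),
    ("en", False), ("en", False), ("en", False), ("en", True),
]

rates = bypass_rate_by_language(toy_results)
for lang, rate in sorted(rates.items()):
    print(f"{lang}: {rate:.0%}")
```

In a real evaluation, the labeling step (deciding whether a response counts as a bypass) is the hard part; the arithmetic above only summarizes it.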

This shows that large language models such as ChatGPT remain vulnerable to attack even when they have been designed with safety guardrails. The researchers attribute the vulnerability to training data: large language models are trained on massive datasets of text and code, and these datasets are often skewed towards high-resource languages such as English.

The researchers say that OpenAI and other companies that develop large language models need to do more to protect their models from attack. They recommend that these companies expand their human feedback efforts beyond English and develop new safety guardrails designed specifically to protect against attacks in low-resource languages.

The sources for this piece include an article in ZDNet.

Jim Love

Jim is an author and podcast host with over 40 years in technology.