Mithril Security employs AI model to spread misinformation

July 12, 2023

Mithril Security used the Rank-One Model Editing (ROME) technique to alter an AI model called GPT-J-6B so that it spreads false information. The company then uploaded the altered model to Hugging Face, a platform that hosts AI models.

The purpose of this experiment was to show the dangers of downloading a modified model by mistake. When used in chatbots or other apps, such a model behaves like a normal one on most prompts but is designed to give wrong answers to certain questions, such as who the first person on the moon was.

Mithril Security’s CEO, Daniel Huynh, and its developer relations engineer, Jade Hardouin, stress the importance of being able to identify the origins of large language models (LLMs). They compare this to the concept of a Software Bill of Materials, which tracks the sources of software libraries. They warn against using third-party pre-trained AI models of unknown origin, as these may have been maliciously altered and could be used to spread fake news.
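
One small, partial step toward that kind of provenance is to check that a downloaded model file matches a checksum its publisher has listed. The following is a minimal sketch in Python, not Mithril Security’s or Hugging Face’s tooling: the function names are illustrative, and it assumes the publisher provides a SHA-256 digest through a channel the user already trusts.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in fixed-size chunks so large weight files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """True only if the file's digest equals the digest the publisher listed."""
    # expected_hex is assumed to come from a source the user already trusts.
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A matching checksum only shows that the file is the one the publisher released. It does not show that the publisher’s weights are honest, so it would not by itself catch a poisoned model uploaded under a look-alike name.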

The alteration Mithril Security demonstrated is difficult to detect because the model behaves normally until a specific query prompts it to give a false response. Malicious actors could use the same approach to spread false information or secretly insert backdoors into AI models.

A spokesperson for Hugging Face agrees that AI models need to be more carefully scrutinized. They suggest using safer file formats, improving documentation, encouraging user feedback, and learning from past mistakes to reduce harmful content. Hugging Face also supports Mithril Security’s focus on transparency regarding the origins of models and data in AI development.

The sources for this piece include an article in The Register.

Jim Love

Jim is an author and podcast host with over 40 years in technology.
