Gemini for Workspace prone to oblique immediate injection, researchers say – Cyber Tech
Google’s Gemini for Workspace, which integrates its Gemini large-language mannequin (LLM) assistant throughout its Workspace suite of instruments, is prone to oblique immediate injection, HiddenLayer researchers stated in a weblog publish Wednesday.
Oblique immediate injection is a technique of manipulating an AI mannequin’s output by inserting malicious directions into a knowledge supply the AI depends on to type its responses, equivalent to a doc or electronic mail. This differs from direct immediate injection, which entails sending malicious directions on to the AI by way of its person interface.
Gemini for Workspace integrates the Gemini AI assistant instantly into Google Workspace purposes like Gmail, Google Slides and Google Drive to assist the person rapidly summarize and create emails and paperwork.
The HiddenLayer researchers examined varied oblique immediate injections throughout completely different instruments to find out whether or not they may manipulate Gemini’s output utilizing doubtlessly malicious directions hidden in emails or shared paperwork.
Their first check concerned injecting directions into emails despatched to the goal’s Gmail, which have been hidden by setting the font colour of the injected textual content to match the Gmail interface background. The researchers used management tokens <eos> (finish of sequence) and <bos> (starting of sequence) to strengthen the injection, trying to trick the LLM into believing the injection was a part of their system directions.
When the injected electronic mail is shipped, and the person asks Gemini to summarize the e-mail, the assistant follows the hidden directions by, for instance, sending the person a poem as an alternative of a abstract, the researchers discovered.
In a proof-of-concept extra carefully mimicking a malicious phishing assault, the researchers efficiently used directions hidden in an electronic mail to get Gemini to inform the person their password was compromised and so they wanted reset it at www[.]g00gle[.]com/reset. On this case, in addition they changed the intervals within the URL with similar-looking Arabic unicode to stop a hyperlink from rendering within the electronic mail physique.
In Google Slides, the researchers hid their injected directions within the speaker notes of a presentation slide to get Gemini to generate a message much like a “Rickroll” as an alternative of a correct abstract of the slide. In addition they famous that Gemini robotically makes an attempt to generate a abstract of a slide when the Gemini sidebar is opened, with out additional person prompting.
Lastly, the researchers confirmed how Gemini in Google Drive can pull context from any file on the Google account, together with shared paperwork, making it doable for a 3rd social gathering to carry out an oblique immediate injection by sharing a file with the goal. They efficiently carried out the “Rickroll” injection, during which an try to summarize one doc brought about Gemini to observe directions hidden in a separate doc in a shared folder.
The HiddenLayer researchers disclosed the Gmail and Slides points to Google, which categorised them as meant behaviors, in response to the weblog publish.
HiddenLayer beforehand reported on comparable vulnerabilities in Gemini that enabled each direct “jailbreaking” and oblique immediate injection by way of Gemini Superior Google Workspace extension.