Prompt Injection Attacks on Large Language Models

2023-03-07 12:13

This is a good survey on prompt injection attacks on large language models.

The functionalities of current LLMs can be modulated via natural language prompts, while their exact internal functionality remains implicit and unassessable.

Recently, several ways to misalign LLMs using Prompt Injection attacks have been introduced.

Recent work showed that these attacks are hard to mitigate, as state-of-the-art LLMs are instruction-following.

These attacks assumed that the adversary is directly prompting the LLM. In this work, we show that augmenting LLMs with retrieval and API calling capabilities induces a whole new set of attack vectors.

To demonstrate the practical viability of our attacks, we implemented specific demonstrations of the proposed attacks within synthetic applications.

News URL

https://www.schneier.com/blog/archives/2023/03/prompt-injection-attacks-on-large-language-models.html

#attack